The Data Mine
A generation of internet users spent two decades writing — for free, for fun, for strangers — millions of arguments, reviews, confessions, and jokes on a sprawling network of forums. They were told it was a community. It turns out it was a mine. The company that owns those forums has discovered that the unpaid writing of its users is one of the most valuable substances on earth in the age of artificial intelligence: the authentic, human, conversational text that the AI models are starving for. It now sells that text to Google and others for tens of millions of dollars a year, with a licensing pipeline it says could reach into the hundreds of millions. The stock has been re-rated as an AI play. But the same AI that is buying the data is also learning to answer questions without sending anyone back to the source — which means the mine is selling the very ore that may make the mine obsolete. This is the anatomy of a company monetizing its users' words at the exact moment the technology buying them threatens to make those words unnecessary.
There is a phrase that has become a cliché of the internet age, and like most clichés it is true: "if you're not paying for the product, you are the product." Reddit, the sprawling network of online communities where hundreds of millions of people discuss everything from particle physics to their failing marriages, is the purest expression of that maxim ever to trade on a public exchange. Its users have, for twenty years, produced an almost unimaginable volume of text — opinions, expertise, arguments, recommendations, the accumulated conversational knowledge of a generation — entirely for free, in exchange for the social rewards of community and the dopamine of internet points. And in the age of artificial intelligence, that text has turned out to be one of the scarcest and most valuable raw materials in the world, because it is exactly what the large language models need and cannot easily get: authentic, human, multi-turn, opinionated conversation, the real voice of real people arguing about real things. Reddit is sitting on a mountain of it, and it has realized, belatedly and lucratively, that it owns a mine.
The economics of the discovery have been transformative for the company. Reddit has signed data-licensing agreements — most prominently with Google, in a deal reported to be worth around $60 million a year, to allow its content to be used to train AI models — and it has signaled that its broader licensing portfolio could grow into the hundreds of millions of dollars, with figures of around $550 million floated as the deals expand and renew. This has recast the company's entire investment story. Reddit is no longer just an advertising business; it is now pitched as a data business, an owner of one of the few large troves of high-quality human text not already scraped and exhausted, a toll collector on the AI industry's insatiable hunger for training material. And the financials have, for now, been spectacular: in the first quarter of 2026, Reddit reported revenue of around $663 million, up 69% year over year, with advertising growing even faster, daily active users approaching 127 million, and genuine GAAP profitability with healthy margins. The market has rewarded the AI-data narrative, and on the surface the story looks like one of the cleaner AI winners — a company that owns the scarce input rather than burning cash to build the models.
The mine that sells the ore that makes it obsolete
But there is a contradiction buried at the heart of the data-mine business, and it is the kind of contradiction that the forensic eye exists to surface, because it is invisible in the current financials and potentially fatal to the thesis. The same technology that makes Reddit's data valuable — generative AI — is also the technology that threatens to destroy the engine that produces the data. Reddit's value, as a mine, depends on a continuous flow of fresh human conversation: people coming to the site, asking questions, arguing, answering, contributing new text day after day. That flow has historically been fed, in large part, by search — for years, a huge share of Reddit's traffic arrived from Google, as people searching for "best X" or "how do I Y" were funneled to Reddit threads where real humans had discussed exactly that. And as the chapter on Google's predicament described, AI is now collapsing that funnel: AI-generated answers, including Google's own AI Overviews, increasingly satisfy the user's question directly, synthesizing the answer — often from Reddit's content — without sending the user to Reddit at all.
See the trap. Reddit sells its conversations to Google to train the AI. The AI then uses that training to answer users' questions itself, on the search results page, without sending those users to Reddit. So the users who would have arrived at Reddit, read the threads, and — crucially — contributed new posts of their own never make the trip. The flow of fresh human conversation that constitutes the mine's ore is choked off at its source by the very AI that the mine is selling its existing ore to feed. Reddit is, in the starkest possible terms, selling the ore that may make the mine obsolete — monetizing its accumulated human text to an industry whose product reduces the human traffic that generates new human text. The data-licensing revenue is real and growing now; the question is whether it is a durable new business or a one-time harvest of an asset whose replenishment the buyer is simultaneously strangling.
The number that matters most is the one being threatened
This is why the most important metric for Reddit is not its licensing revenue but its traffic — specifically, its daily active users and the flow of new content, because those are the renewable resource on which everything else depends. A mine is worth a fortune only as long as it keeps producing ore; a mine being depleted faster than it refills is worth far less, no matter how high the current price of the ore. Reddit's user numbers have, encouragingly for the bulls, continued to grow — approaching 127 million daily actives — which suggests the engine is still running. But the threat is structural and growing, and it operates with a lag: as AI answers capture more and more of the searches that used to deliver Reddit its traffic, and as the open web's referral economy decays under the weight of zero-click AI summaries, the funnel that feeds Reddit's content engine narrows. The company is trying to compensate — building its own on-platform search, encouraging direct engagement, leaning into the community loyalty that brings users back without Google's help — and it may succeed. But the headwind is real, and it is the same headwind that is dismantling the open web's traffic economy generally, in which a generation of content sites that depended on search referrals are watching that traffic evaporate as AI answers intercept it upstream.
So Reddit faces a genuinely double-edged future, and the two edges are wielded by the same hand. On the upside, it owns a uniquely valuable trove of human conversational data, with real and growing licensing revenue, real advertising growth, real profitability, and a defensible community that machines cannot easily replicate. On the downside, the AI revolution that makes its data valuable is also the AI revolution that threatens its traffic, its content-generation engine, and the long-term replenishment of the very asset it is selling. The bull and the bear are looking at the same fact — the rise of AI — and reaching opposite conclusions, exactly as they were in the Google chapter, because AI is simultaneously Reddit's best customer and its most dangerous competitor for the attention of its users.
Does the buyer keep paying?
There is a question about the durability of the data-licensing revenue that goes to the heart of whether Reddit deserves to be valued as an AI-data annuity, and it is rarely asked plainly: once an AI company has trained its model on Reddit's corpus, does it need to keep paying for the same data? A large language model is trained on a body of text and then it has learned from it; the historical Reddit archive, once ingested, does not need to be re-purchased every year. What an AI company would pay recurring money for is access to the fresh, ongoing flow of new content — the new posts, the new arguments, the up-to-date conversations that keep the model current. In other words, the recurring, durable, annuity-like portion of Reddit's data-licensing value depends precisely on the freshness of the data — the continued production of new human conversation — which is exactly the thing the AI-driven collapse of search traffic threatens.
This sharpens the trap considerably. If data licensing were a one-time sale of a static archive, it would be a windfall, not an annuity, and a stock valued on a one-time windfall as if it were a perpetual stream would be badly mispriced. If, instead, it is a recurring payment for the ongoing flow of fresh content, then its durability depends entirely on Reddit continuing to produce vast quantities of new human conversation, which depends on traffic, engagement, and contribution, which depend in turn on the very funnel the AI is choking. Either way, the AI threat undercuts the licensing thesis: the windfall interpretation makes the revenue non-recurring, and the annuity interpretation makes it dependent on a flow the AI is strangling. The licensing dollars look like a clean, recurring, high-margin AI revenue stream on the income statement today. Whether they are actually recurring, and actually durable, hinges on questions about data freshness, about traffic, and about whether AI companies keep paying once their models are trained, all of which the current valuation simply assumes away in the stock's favor.
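The gap between the two readings can be made concrete with a little present-value arithmetic. The sketch below is illustrative only: the $60 million figure is the reported Google deal size cited earlier, while the 10% discount rate and the 20% annual decay rate are hypothetical assumptions chosen for round numbers, not estimates of Reddit's actual economics. The function names are likewise invented for this example.

```python
def value_one_time_sale(payment: float) -> float:
    """Windfall reading: the archive is sold once and the cash arrives today."""
    return payment


def value_growing_perpetuity(annual_payment: float,
                             discount_rate: float,
                             growth_rate: float = 0.0) -> float:
    """Annuity reading: a payment stream that continues forever.

    Standard growing-perpetuity formula, with the first payment one year out:
        value = annual_payment / (discount_rate - growth_rate)
    A negative growth_rate models a stream that shrinks every year.
    """
    if discount_rate <= growth_rate:
        raise ValueError("discount_rate must exceed growth_rate")
    return annual_payment / (discount_rate - growth_rate)


# Reported size of the Google deal, in millions of dollars per year.
ANNUAL_LICENSE_FEE = 60.0
# Hypothetical assumptions, for illustration only.
DISCOUNT_RATE = 0.10   # assumed required return
DECAY_RATE = 0.20      # assumed yearly shrinkage if fresh content dries up

windfall = value_one_time_sale(ANNUAL_LICENSE_FEE)
permanent_annuity = value_growing_perpetuity(ANNUAL_LICENSE_FEE, DISCOUNT_RATE)
decaying_stream = value_growing_perpetuity(
    ANNUAL_LICENSE_FEE, DISCOUNT_RATE, growth_rate=-DECAY_RATE
)

for label, value in [
    ("one-time windfall", windfall),
    ("permanent annuity", permanent_annuity),
    ("decaying stream", decaying_stream),
]:
    print(f"{label:>18}: ${value:,.0f}M")
```

Under these assumed inputs the same $60 million is worth roughly ten times as much when treated as a permanent annuity as when treated as a one-time sale, and a stream that decays as fresh content thins out lands in between. The numbers are invented; the point is only how much of the valuation rests on which reading is correct.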
The most volatile mine on the market
It is fitting, given the company, that Reddit's own stock has behaved like the speculative frenzies its forums are famous for hosting. Reddit went public in March 2024 at $34 a share, soared to lofty heights on the AI-data narrative, and has since traded with extreme volatility, lurching up and down on each data-deal headline, each earnings report, each shift in sentiment about whether AI is its savior or its assassin. The stock is a favorite of the very retail-trading crowd that populates its WallStreetBets forum: a recursive joke in which the traders who gather on Reddit to gamble on stocks gamble, among other things, on Reddit. This volatility is not incidental; it reflects the genuine, unresolved binary at the center of the company. A stock that swings violently on every piece of news about its AI exposure is a stock the market cannot decide how to value, because the same fact, the rise of AI, points in two opposite directions, and the price oscillates between pricing the dream (a durable data-and-ad compounder) and the nightmare (a content engine being throttled at the source).
And the valuation, at its more euphoric moments, has priced the dream — assigning Reddit a multiple that assumes the data-licensing revenue is a large, durable, growing annuity and the advertising business keeps compounding and the traffic engine survives the AI transition, all at once. That is three optimistic assumptions stacked and multiplied, in the now-familiar pattern of this series, and the volatility of the stock is the market repeatedly discovering and then forgetting that the assumptions are contingent rather than certain. The data mine is real, and at the right price it is a fine business. At the wrong price — at a valuation that capitalizes a contingent, AI-threatened, freshness-dependent revenue stream as a permanent annuity — it is the same overpayment for a real asset that runs through every chapter here, distinguished only by the recursive irony that the asset is the unpaid chatter of the very people most likely to be trading the stock.
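To see why stacking assumptions matters, consider a minimal sketch of the multiplication. The 70% odds assigned to each assumption are hypothetical placeholders, and treating the three as independent is itself a simplifying assumption (in practice they are probably correlated, since all three depend on traffic); the function name is invented for this example.

```python
from math import prod
from typing import Sequence


def chance_all_hold(probabilities: Sequence[float]) -> float:
    """Probability that every assumption holds, assuming they are independent."""
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(probabilities)


# Hypothetical odds, for illustration only.
assumptions = {
    "licensing revenue is a durable, growing annuity": 0.70,
    "advertising keeps compounding": 0.70,
    "traffic engine survives the AI transition": 0.70,
}

combined = chance_all_hold(list(assumptions.values()))
print(f"Chance all three hold together: {combined:.1%}")
```

Even when each assumption is individually more likely than not, the chance that all three hold at once is considerably lower than any single one of them, which is the sense in which a valuation that needs every one to come true is pricing the dream.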
Whose data is it, anyway?
There is a deeper and more uncomfortable dimension to the data-mine story, and it is one the financial framing tends to skip: the moral and legal question of whose data Reddit is actually selling. The text that Reddit is licensing to Google for a reported $60 million a year was not written by Reddit. It was written by Reddit's users: millions of unpaid people who posted their knowledge, their opinions, their hard-won expertise, and their late-night confessions to communities they thought of as their own, with no expectation that their words would one day be sold, in bulk, to train commercial AI models that would then compete with the very forums they were posting on. Reddit's terms of service grant it the right to do this, and legally the arrangement is sound. But the substance of it (a company monetizing, at enormous scale, the unpaid creative and intellectual labor of its users, selling their words to AI companies without compensating the people who wrote them) sits at the center of one of the defining unresolved fights of the AI era: who owns the value of the human-created data that the AI models are built on, and who should be paid for it.
This is not merely a philosophical concern; it is a business risk. The users whose contributions constitute Reddit's entire value are not contractually bound to keep contributing, and they have, in the past, revolted — staging mass protests and blackouts when the company's monetization moves felt like a betrayal of the community. A user base that comes to feel exploited — that concludes its free labor is being strip-mined and sold while it receives nothing — is a user base that can disengage, post less, leave, or turn hostile, and the entire data mine depends on those users' continued, voluntary, unpaid production. There is a fundamental tension between maximizing the monetization of the users' data and maintaining the goodwill of the users who produce it, and that tension grows sharper as the dollar figures grow larger and more visible. The mine's ore is dug, for free, by people who can stop digging at any time, and who are increasingly aware that what they dig for fun is being sold for fortunes. The more loudly Reddit celebrates its data-licensing revenue to Wall Street, the more clearly its users hear that their unpaid words are someone else's product.
A real asset, an uncertain annuity
None of this is a claim that Reddit is a bad company or a doomed stock. It is a genuinely valuable and increasingly profitable business with a real, scarce, hard-to-replicate asset, a loyal community, growing advertising, and a legitimate new revenue stream in data licensing; it may navigate the AI transition successfully, defend its traffic, keep its users contributing, and turn its data trove into a durable, compounding annuity. The bull case has real substance, and the 69% revenue growth and genuine profitability are not illusions. The warning is narrower and specific to the nature of the asset: that Reddit is being valued, increasingly, as an AI-data play, on the premise that its trove of human conversation is a durable and growing source of value, at the exact moment that the AI buying that data is throttling the traffic that produces it and the users generating it are growing aware that their unpaid words are being sold. The data-licensing revenue is being priced today as if it were a permanent annuity; it may instead be the proceeds of harvesting a renewable resource whose renewal the harvest itself endangers.
The deepest irony is the one in the name. A data mine, like any mine, is valued on the assumption that there is more ore where the current ore came from — that the resource is deep and replenishing, not a finite seam being worked out. Reddit's ore is human conversation, and human conversation is, in principle, infinitely renewable, as long as the humans keep coming and keep talking. The threat is that the AI which makes the existing ore so valuable is the very thing reducing the flow of humans to the mine and the incentive for them to talk — intercepting the questions before they reach the forums, answering them from the stolen-and-purchased past rather than sending the asker to contribute to the future. A mine selling its ore to the company building the machine that empties the mine is a strange and precarious business, and the market, dazzled by the licensing dollars flowing in today, is not yet pricing the possibility that the seam is being worked out faster than it refills. The words were written for free, by people who thought they were talking to each other. They turned out to be talking to a machine that is learning to no longer need them — and to a company that is being paid, handsomely, for the transcript.
That is the quiet tragedy beneath the tidy AI-winner story, and it is a tragedy with a financial edge. The thing that made Reddit valuable was never the technology or the brand; it was the people, and the strange generosity with which they gave away their knowledge to one another. The AI era has found a way to put a price on that generosity — and, in the same motion, to undermine the conditions that produced it, by intercepting the questions, answering from the archive, and quietly informing the contributors that their gift has a market value they will never see a cent of. Reddit has discovered it is sitting on a goldmine. The unanswered question — the one the licensing revenue cannot settle and the volatile stock keeps relitigating — is whether a goldmine whose gold is human goodwill can keep producing once the people doing the digging realize what their labor is worth, and once the machine they are feeding has learned enough to stop asking them anything at all.
Disclaimer
This article is produced for informational and educational purposes only and does not constitute investment advice, a solicitation, or a recommendation to buy or sell any security. All data cited reflects information available as of the publication time noted above. Market conditions may change materially between publication and when you read this. Past performance of any strategy referenced is not indicative of future results. Consult a qualified financial advisor before making investment decisions.