Data is the only moat

frontierai.substack.com

158 points by cgwu 17 hours ago


Nevermark - 11 hours ago

Data.

Vertical integration.

Horizontal integration.

Cross- and/or mass-relationship integration.

Individual relationship investment/artifacts.

Reputation for reliability, stability, or any other desired dimension.

Constant visibility in the news (good, neutral, sometimes even bad!)

A consistent attractive story or narrative around the brand.

A consistent selective story or narrative around the brand. People prefer products designed for "them".

On the dark side: intimidation. Ruthless competition, acquisitions, lawsuits, reputation for dominance, famously deep pockets.

To keep someone is easier. Tiny things hold onto people: An underlying model that delivers results with fewer irritations/glitches/hoops. Low to no-configuration installs and operation. Windows that open, and other actions that happen, instantly. Simple attention to good design can create fierce loyalty among those for whom design or friction downgrades feel like torture.

Obviously, many more moats in the physical world.

jackfranklyn - 2 hours ago

Building in a niche B2B space and this resonates. The data moat isn't just volume though - it's the accumulated understanding of edge cases.

In my domain, every user correction teaches the system something new about how actual businesses operate vs how you assumed they did when you wrote the first version. Six months of real usage with real corrections creates something a competitor can't just replicate by having more compute or a bigger training set.

The tricky part is that this kind of moat is invisible until you try to build the same thing. From the outside it looks simple. From the inside you're sitting on thousands of learned exceptions that make the difference between "works on demos" and "works on real data."

light_triad - 13 hours ago

Distribution, brand, network effects, regulatory positioning, and execution speed all create defensibility; "data helps" doesn't imply "data is everything"

Also as foundation models improve, today's "hard to solve" problems become tomorrow's "easy to solve" problems

weinzierl - 7 hours ago

> Why is it that we have agents that can prospect for sales leads and answer support tickets accurately, but we don’t seem to be able to consistently generate high quality slides?

I don't know about prospecting, but "answer support tickets accurately"? Seriously, this must be ironic, right?

netdevphoenix - 2 hours ago

Efficiency will ultimately decide if LLMs become feasible long-term. Right now, the LLM industry is not sustainable. Investors were promised literally the future in the present, and it is now undeniable that ASI, AGI, or even moderately competent general-purpose quasi-autonomous systems won't happen anytime soon. The reality is that there is not room for all these players in the market in the long term. LLMs won't go away, but the vast majority of mainstream providers definitely will.

whatever1 - 14 hours ago

Information was always the moat for everything. We literally have spies who risk their lives to try to gain access to information.

burntcaramel - 12 hours ago

Don’t forget people’s minds.

- Which brands do people trust?
- Which people do people of power trust?

You can have all the information in the world but if no one listens to you then it’s worthless.

NiloCK - 9 hours ago

Data has historically been a moat, but I think now more than ever it's a moat of bounded size / utility.

The biggest data hoarders now compress their data into oracles whose job is to say whatever to whoever - leaking an ever-improving approximation of the data back out.

DeepSeek was a big early example of adversarial distillation, but it seems inevitable to me that frontier models can and will always be siphoned off in order to produce reasonably strong fast-follow grey market competition.

andy99 - 11 hours ago

What if the only moat is domains where it’s hard to judge (non superficial) quality?

Code generation, you don’t see what’s wrong right away, it’s only later in the project lifecycle that you pay for it. Writing looks good to skim, but is embarrassingly bad once you start reading it.

Some things (slides apparently) you notice right away how crappy they are.

I don’t think it’s just better training data, I think LLMs apply largely the same kind of zeal to different tasks. It’s the places where coherent nonsense ends up being acceptable.

I’m actually a big LLM proponent and see a bright future, but believe a critical assessment of how they work and what they do is important.

dangoodmanUT - 9 hours ago

Saying they swear by the Cursor Composer model doesn't give me a ton of confidence.

ralusek - 13 hours ago

I feel like algorithmic/architectural breakthroughs are still the area that will show the most wins. The thing is that insights/breakthroughs of that sort tend to be highly portable. As Meta showed, you can just pay people 10 million to come tell you what they're doing over there at that other place.

inb4 "then why do Meta's models still suck?"

jongjong - 13 hours ago

Attention is the only moat.

Companies always try to make it seem like data is valuable. Attention is valuable. With attention, you get the data for free. What they monetize is attention. Data plays a small part in optimizing the sale of ads, but attention is the important commodity.

Why else are celebrities so well paid?

guelo - 11 hours ago

What's annoying is that companies capture user data and then lock it into their platforms, transform it, and resell it. But it is really the user's data that they're selling back to us. I would like regulation here: if you capture my data, then I get to pick who you must and must not share it with.