The Scale AI shockwave
Source: Meta's $14.3B investment deal coverage + Surge AI
The data labeling industry's biggest player just lost its neutrality - and with it, most of its biggest customers. Google (Scale’s largest customer), OpenAI, Microsoft, and xAI are all reportedly cutting ties. Just like that, the real ‘data wars’ have begun.
When the Scale news hit last week, I didn’t realize the impact it would have, nor the rabbit hole it would send me down. Google was reportedly spending upwards of $150M annually on Scale alone. Surge AI, a fully bootstrapped data labeling company, charges almost 3x Scale's rates for its top-quality data. The floodgates are open.
Expert knowledge capture next?
Source: Garrett Lord, Handshake (TBPN) + Meta’s deal coverage
Everyone wants PhD-level expertise. Nobody knows how to get it at scale, not even Scale. As generic data gets commoditized, specialized domain knowledge remains scarce and expensive. Handshake, the career networking platform, has shifted to address this market. They’re leveraging their vast network of PhDs and domain experts to fill the gaps where traditional labeling companies fall short.
The pivot from ‘help students find jobs’ to ‘students are the product’ is nothing short of genius. This platform shift creates immediate opportunities for specialized service providers, and more relevant to me, consultancies. So what’s the best way to capture this market?
Synthetic pivot
Inspiration: Mike Knoop - Ndea (TBPN)
With money comes opportunity. Entrepreneurs have noticed, and they’re finding ways to capitalize. This is a sprint. Mechanize, Morph, Habitat - all betting that the future will rely on computer-generated data, and all started in the last several months. The economics are obvious: synthetic data scales infinitely. The open question is whether data quality will hold up in real-world applications.
The opportunity in specialized domains where you already have expertise or access is intriguing. Generic synthetic data will be commoditized within 18 months. Domain-specific generation with expert validation? That's defensible and profitable. Is synthetic data the solution to the expert knowledge scarcity problem? I really like this one.
Other commentary
The Los Angeles Lakers just sold for $10B (the largest sports sale ever), but the franchise's value didn't outperform the S&P 500 over the 45-year period the Buss family owned it. There are many reasons to own a sports franchise, but lol.
Cluely’s virality pays off with a $15M check led by a16z. The founder might be clinically insane.