These last few weeks have been a whirlwind! Even this week, a few things happened that were personally exciting to me.
- The "no moats" draft was released/leaked, and the AI internet went crazy. To me, the claimed technical moats of big tech are eroding (and were maybe overstated). The business and resource moats aren't going anywhere, but I'll talk about this viewpoint a bit more later. I'm excited that large companies are waking up to the Linux moment in AI.
- Personally, I was thrilled to see our little bricks of contributions like RedPajama, FlashAttention, and longer sequence models featuring prominently in reproductions from Together, MosaicML, and Berkeley and in the BigCode models. Last year at this time, I was constantly engaged in hand-wringing about whether academia and open source would contribute to what happens next, and yes--we very much have--and I believe we will continue to do so.
- Also, there were immensely valuable discussions and contributions from folks to whom we as a community owe a ton, like Stella and the EleutherAI folks, to further refine data mixtures. This community is vibrant and exciting! Let's live up to the best of what we can do.
Putting these two points together, I wanted to frame some thoughts on the "moat." It's true that the technical moat for AI models is rapidly eroding as the models commoditize. We've been on this for a long time [link, link]. However, the moat of search is a real one. I use ChatGPT regularly, but I still don't know many folks who use Bing for search. I'm thrilled that we have open copilots, and I hope they take off, but VS Code is amazing. So the moats are shifting and a bit uncertain, and this is what is so exciting about open source. The power is not with whoever has the model, but with whoever can deliver the value people want--and that hopefully means more AI-infused services in the world that are cheaper, safer, and more useful. We're moving, as Alex Ratner says, from GPT-X to GPT-You. Awesome.
Now back to Google's role in the moats... I love Google. I have valued my interactions with their researchers, executives, and founders immensely through the years. They hold a special place in our department. Let me give a (perhaps unfair) take on some of Google's recent work in AI on infrastructure, AI's core algorithms, and the open-source path they have followed in other examples. Again, these are brilliant people to whom we owe a ton of gratitude. They have built world-class products that we use most of our day. It's amazing!
Infrastructure: TensorFlow. My view of Google's TensorFlow experience was the following. When it first came out, it was huge! Wow! I rewrote so much of our code in TensorFlow when it came out. It looked unbeatable. It wasn't the first: there were packages like Torch, Theano, Caffe, and others. Fast forward, and it's not as widely used as PyTorch. What happened? Well, two things, I think. First, I think the openness of PyTorch won the day, along with its focus on making researchers productive and listening to the community (Soumith and team are amazing!). But also, there was something to learn about collaboration with the community. When we put out DAWNBench (which became MLPerf), my understanding is that Google was shocked it didn't solidly win the benchmark. To me, the reason was kind of clear: they were using TensorFlow on TPUs with models they designed. It was an all-Google stack, so they had to be great everywhere and all the time. Being the tip of the spear is expensive and hard in one place, let alone every place! On the other side was a range of folks from NVIDIA to Meta to Microsoft, etc. Now, we see the power of a larger community almost every week (and today, shameless plug, with Tri Dao's FlashAttention used everywhere).
Contributions to AI's Core. The narrative I hear now from Googlers, that Google developed AI, is also a bit strange to me. They made tremendous advances, and we owe them a ton--no question about it--but claiming it could have (or should have) been done alone seems off to me. I intend to take nothing away from their genius, but it is built on others' work: the vaunted Transformer paper wasn't titled "Here Is Attention"; it was called "Attention Is All You Need," because researchers already knew about attention. Attention had been developed in papers from the academic labs of Bengio, Manning, and others--led of course by great graduate students [2]. Google demonstrated some important concepts in the paper, but it wasn't a solo effort.
Google knows the other way. We've also seen examples of Google embracing open source in other areas (Android and Chromium, just to name a few), which brings about a vibrant community. The T5 model family being released in the open was a huge positive step, and it has been very much welcomed by the community. We have always been a community in AI. That's why I joined the area, and even though I'm still an oddball with weird taste, I've always felt the love from the community (THANK YOU!). Big companies--and definitely grad students--please don't believe anything different: AI has been and will be a community effort.
This article focused on Google because of that "no moats" draft, but it's not just Google. The rumors are that Google/DeepMind and OpenAI are thinking of going it alone. They've made great contributions, so this change subtracts from the pool of brain power thinking about these projects in the open way that I believe wins out. I think our work together has a chance to bring about an age of AI abundance. Let's get there!
Footnotes
1. Title thanks to ChatGPT.
2. On a personal note, check out "Is AI rare or everywhere?" This seems like a really fun academic question to me: what is the “simplest” model that gets us there? More coming soon on dramatically more scalable architectures.