AI has been on quite a tear, with exciting progress across modalities: language (ChatGPT from OpenAI), vision (e.g., DALL·E 2, also from OpenAI, and Stable Diffusion from Stability and Runway), and many more. In the midst of all the remarkable things from the past year, I contend one trend is both exciting and often overlooked:
AI is having its Linux moment.
Since the beginning of the deep learning era, AI has had a strong open-source tradition. By "Linux moment," I mean something else: we may be at the start of the age of open-source models, and of major open-source efforts that build substantial, long-lasting, and widely used artifacts. Many of the most important dataset efforts (e.g., LAION-5B) and model efforts (e.g., Stable Diffusion from Stability and Runway, GPT-J from EleutherAI) were driven by smaller, independent players working in the open. We saw this year how these efforts spurred a huge amount of further development and community excitement. This blog post is my wildly optimistic take on what that trend might mean for 2023 and beyond.
A brief, perhaps overly rosy recap of Linux.
I wasn’t there, but here is the story I like to believe. A bunch of open-source hackers came together and built a freely available operating system. At first, their efforts seemed kind of lowbrow (insert reference to hacky GNU counter-culture devs and their duct-taped operating systems). But then they came to dominate a huge number of applications. For example, I type this on an OS based on BSD, talking to servers likely running a flavor of Linux. Microsoft, the big player, continued to have a large install base with Windows and made amazing products that people want, but open source came to play a huge role in computing.
The open-source community was motivated by principles of free software and improving the world. It’s certainly true this community has had challenges, but to a first approximation the open-source and Linux movement changed the world for good. Open source embraced permissive licensing and allowed a broader set of people to be represented in the creation of important computing tools; it also brought down barriers so that more people could use technology in myriad ways.
Can we do the same (or even better) for AI? We might need to break new ground in how we think about open-source software, and reimagine what this movement means in the context of AI systems. AI is more than the code used to train models: it’s also the compute-intensive training process and the model and dataset artifacts that surround it. I want to believe that the same principles should still apply!
Let’s first recap the last year, and then I’ll give a few notable technical differences.
Last year in open-source AI.
Last year was amazing even by AI’s standards! The amount of progress made was mind-boggling, and apologies if my attempt to summarize it misses great work (send us notes!).
- The Setup. Models are open-sourced like never before thanks to Hugging Face’s model hub. THANK YOU! We put our models there and you should too! They’ve made a huge contribution to the field and to open source. (For a taste of how low the barrier is, see the first sketch after this list.)
- The Community. Hacker collectives are welcoming (LAION, how are you so nice?). They aren’t alone! Many have been very effective (EleutherAI, CarperAI, and more). They released models like GPT-J-6B, OpenCLIP, Stable Diffusion, and many more! I lurk on these Discords, and they are some of the most energizing technical places I can imagine. People from all over the world, from incredibly varied backgrounds, are building things the field has been dreaming of for decades.
- The Models. Foundation models are popping up in many areas, and they are quickly improving. Let me highlight two major areas and their developments, though there is much I’ve missed (e.g., biology, via great folks at OpenFold, Meta, and others!):
  - Language. One major question is: how far behind are open models? To answer this question, we created the HELM benchmark, a major effort that I’m really proud to be involved in. When we started, people would ask me “what open source models?” There were only a few, but now there are 15, with more coming soon! We are measuring these models with the help of Together, and it’s amazing how quickly this community is improving. Of course, we owe a huge debt to OpenAI for making their APIs available to show us the way, but I’m so happy that this technology is democratizing so quickly.
  - Images. This year was amazing for images. The DALL·E 2 folks wowed us with magazine covers and impressive art generation. Very quickly after, Stable Diffusion, via Stability and Runway, was open-sourced. The open-source models enabled the community to build a huge number of use cases. Most importantly, Stable Diffusion could be remixed by an endless set of folks, who tweaked its capabilities for new image-generation applications (see the second sketch after this list). It showed the power of community and open source.
- Compute. Compute is opening up instead of being controlled exclusively by large corporations. Stability has given away a huge amount of compute and helped to foster amazing projects, governments are getting in the game, and folks like Together are enabling decentralized contribution of hardware.
- Algorithms. AI algorithm development has always benefited from open-source toolkits and the culture of posting your results to arXiv. Many individuals not affiliated with large corporations are making contributions. For example, the fastest implementation of attention in transformers (FlashAttention) was done by a graduate student (from our own lab, a shameless plug for the amazing Tri Dao!).
- Datasets. As architectures have stabilized, modifying the data is arguably the most reliable way to improve model performance. For most people, data is a more accessible medium than code for expressing their ideas, which suggests that many more people could participate in building these cutting-edge AI systems, which are thrilling even to people outside the AI/ML core. Indeed, the community has responded with massive datasets like The Pile, C4, and LAION-5B, and with efforts on alignment. Thanks to the Hugging Face Datasets library and hub, we have a central place to see and access them all (see the third sketch after this list)!
- Tools. The community’s investment in open-source supporting software has made it easy for anyone to participate in AI. This software has continued to improve over the last year, e.g., PyTorch (Meta), Keras (Google), Transformers (Hugging Face), Megatron-LM (NVIDIA), DeepSpeed (Microsoft), and many more! There are also people like lucidrains who absolutely tear through new work in a couple of days to give the community great implementations of new models!
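To make the hub point above concrete, here is a minimal sketch of pulling an open model off the Hugging Face hub, using the GPT-J-6B model mentioned above. The prompt and generation settings are just illustrative assumptions, not a recipe.

```python
# A minimal sketch of loading an open model from the Hugging Face hub.
# Assumes `pip install transformers torch`; the model is EleutherAI's GPT-J-6B
# (mentioned above), which needs roughly 24 GB of memory in full precision.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("AI is having its Linux moment because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```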
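The image story is just as accessible. Here is a sketch of running the open-sourced Stable Diffusion weights through Hugging Face’s diffusers library; it assumes a CUDA GPU, and the model ID and prompt are illustrative.

```python
# A minimal sketch of running the open-sourced Stable Diffusion weights.
# Assumes `pip install diffusers transformers accelerate` and a CUDA GPU;
# the model ID and prompt are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

image = pipe("an astronaut lounging in a tropical resort, oil painting").images[0]
image.save("astronaut.png")
```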
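And the same goes for data: the Datasets library puts these massive corpora a few lines away. Here is a sketch of streaming a few documents from C4 without downloading the whole thing first; the dataset name and config are assumptions about what you might want to peek at.

```python
# A minimal sketch of streaming an open dataset from the Hugging Face hub.
# Assumes `pip install datasets`; streaming avoids downloading all of C4 up front.
from datasets import load_dataset

c4 = load_dataset("c4", "en", split="train", streaming=True)
for example in c4.take(3):  # peek at the first three documents
    print(example["text"][:200])
```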
So open-source AI models, data, and compute are already here in a major way, thanks to the pioneering efforts of many great individuals and a budding community. As with OSS, more people are represented in the field, and these technologies are being applied to more of humanity's problems. What an awesome time to participate in the field!
What’s different from the Linux era?
There are some important differences; perhaps three are most salient to me:
- People are excited. The sheer number and variety of people excited about these AI models is quite a bit larger than the number excited by the internals of operating systems. Go on any Discord or peruse Twitter and you can feel the amazing vibe! Unlike operating systems, these models feel tangible, are directly applicable to daily life, and are approachable to anyone.
- People are building. People are excited about building applications with these models. They are rapidly incorporating them into their software stacks or day-to-day workflows. From Excel to Gmail to VSCode plugins, these models are being used everywhere we look.
- Data makes contribution easier. Contributing to Linux required a fairly narrow set of specialized skills. In contrast, much of the AI process is about the data and refining it, which appears to be dramatically more accessible to a much wider range of people (along with a few decades of amazing tooling!). Look at the incredible impact LAION has had! I’m in awe of that group.
Linux succeeded without these ingredients, which suggests the AI community has a chance to be much larger, more representative, and much more broadly impactful.
A Model Oligopoly or Power to the People?
One possibility is that OpenAI, Google, or someone else builds a mega-model capable of performing all economically valuable human activity (kind of awesome, if they do! Please take a shot!). However, here are two reasons why I think we might enter an age of many models rather than a monolithic few.
- AI with your values. We can build many different AI systems imbued with the values of their creators and groups, and it’s easier than ever before. Professional ethics and norms may have to adapt. Doctors (PubMedGPT), lawyers (Pile of Law), materials scientists, biologists, and those who work on drug discovery will each have to wrestle with this technology. A decade ago, ML and open-source old heads like me were talking about democratizing ML (we couldn’t say AI then) so more people could use this (much less impressive) magic… and it happened! Now many more people who care about their domain are going to use these tools to bring new ideas to a range of exciting solutions. They’ll each have to sort out their own sticky ethical and professional-norm questions, but this new tool is exciting.
  - We’re trying to do our part by putting up materials and talks (CS324 and the associated podcast) as well as models, methods, and datasets. We’ll have workshops; we need community, so get in touch!
  - Check out Rob Reich’s upcoming talk in CS324. A brief conversation with Rob convinced me that making norms for these models explicit is an important activity for both the open-source community and academia.
- Companies may want their own tools. It’s going to be awesome to use and build software as GPT-3 and its cousins are incorporated into a huge number of products. Why won’t every component of our software be smarter and better able to take advantage of our input? Smarter, more interesting actions are going to fundamentally alter how we engage with models. However, my guess is that companies will still need to build applications on their unique data, from their unique viewpoint, to accomplish their unique goals: the amount and variety of data in enterprises hugely outstrips public data, and that data is really a manifestation of their business or activity, which makes it unlikely they will contribute to a single, widely available model. This makes me think the classic “build versus buy” divide will open up in many places, with open source on the build side. We have long had professional-grade tools from huge companies in areas like data management, but open source still came to dominate in data.
The community in AI has been amazing in many ways. One of the reasons I broadly identify with it is that it has been welcoming to me, and I owe it a lot. I feel like this is a way we can contribute back to what has been the most intellectually rewarding experience of my life so far.
Call to Action
If there is one combined call to action, it’s this: get involved, and be considerate of others! The first part was probably implicit from the beginning of this post. But in this age of polarization, we’re not going to agree on everything. What we can agree on is that we want to build open tools that help people. We want these incredible tools to serve the best interests of everyone, not just be shaped by a few big players. To do so, we need to maintain incredible goodwill as an open-source community, where we can build on each other's advances and push back against harmful oversteps. It’s going to be a big challenge, but the opportunity is worth it! Like the Linux era, our challenge is to figure out how we can embody the principles of open source as individuals and organizations, reimagined for the AI moment. In the same way that the Linux community laid the foundation for decades of computing, we may have a chance to build the foundation of the next generation of computing with AI, together.
p.s. I can change my mind whenever I want, so please don’t take anything I say too seriously, I certainly don’t.
Acknowledgements. Thank you to Percy Liang for his always insightful comments and his tireless efforts to lead CRFM and its important projects like HELM. Also thanks to Michael Zhang, Karan Goel, Avanika Narayan, and Ce Zhang for their comments and contributions to this post. Finally, again thank you to the wonderful budding AI open source community!