Motivations and strategy for open sourcing AI model weights

Andrew Marble
July 7, 2023

An important historical use of open source licensing has been the promotion of “software freedom” – freedom to run, copy, distribute, study, change and improve software [1]. This aim has largely succeeded: free software now exists for most applications. There are many other motivations for writing open source software, and freedom has somewhat receded as an aim in light of past successes. As AI becomes increasingly important, it’s worth revisiting software freedom in terms of how AI model weights are released and licensed. Licensing can be a strategic tool for pursuing “freedom” with the goal of maintaining a robust open ecosystem of AI models.

There is an ongoing discussion about how concepts and licenses from open source software can be applied to AI models, in particular the weights. Model weights are the set of parameters calculated during training that tell the generic architecture how to calculate its output. In modern models they number in the billions and may be the product of millions of dollars’ worth of computation. Weights share similarities with software – in fact they are an integral part of the computer code that’s needed to run an AI model – but weights and software differ in important ways. Currently, open source weights are typically released under existing open source software licenses, such as Apache 2.0 or MIT. However, it’s generally acknowledged that these may not be sufficient, and there has been recent discussion that new terminology and categories are necessary [2].
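
To make the distinction between architecture and weights concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: the tiny one-hidden-layer network, the names forward, weights_a, weights_b and count_parameters, and the hand-picked numbers (real weights come from training, not from typing them in). The point is only that the architecture is a few lines of generic code, while the behavior lives in the numbers passed to it.

```python
import math


def forward(weights, x):
    """Generic architecture: one tanh hidden layer followed by a linear output.

    The code is identical for every weight set; only the numbers change.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights["W1"], weights["b1"])
    ]
    return sum(w * h for w, h in zip(weights["W2"], hidden)) + weights["b2"]


def count_parameters(weights):
    """Count every scalar in a (hypothetical) nested weight dictionary."""
    total = 0
    for value in weights.values():
        if isinstance(value, list):
            for item in value:
                total += len(item) if isinstance(item, list) else 1
        else:
            total += 1
    return total


# Two made-up weight sets for the same architecture (assumed values, not trained).
weights_a = {"W1": [[1.0, 0.0], [0.0, 1.0]], "b1": [0.0, 0.0], "W2": [1.0, 1.0], "b2": 0.0}
weights_b = {"W1": [[0.0, 0.0], [0.0, 0.0]], "b1": [0.0, 0.0], "W2": [1.0, 1.0], "b2": 0.5}

x = [1.0, 0.0]
print(forward(weights_a, x))        # same code, one set of weights
print(forward(weights_b, x))        # same code, different weights, different output
print(count_parameters(weights_a))  # 9 here; billions in a modern model
```

In this toy the architecture could be released under one license and each weight dictionary under another, which is roughly the separation the licensing questions later in this article turn on.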

One area that needs more discussion is the motivation behind open sourcing, and how that motivation informs the licensing strategy that is used. While weights are different from software, community members who train and release them have many of the same motivations as traditional software actors. We can therefore look at what motivates open source software as a guide to thinking about open source AI as a strategic tool to further personal, business, and societal goals.

Modern software development runs on open source. Most of the major languages have open source compilers and frameworks, and a rich community that a developer can draw on to build something. Through that lens, some common motivations (and advantages) of contributing to open source are:

To some extent, the reasons above are all “developer centric”: there is either a personal or business reason for open source software, and the same reasons can apply to AI model weights. But the overall ecosystem works well because personal and business incentives are aligned with community goals of building software. While the discussion here is about contributing to open source, there are also important reasons for using open source, including security and auditability, which may be especially relevant to AI.

Open source contributors have historically had goals related to “software freedom” as well.

“The world has changed – and code freedom is being overtaken by developer freedom.” [3]

This quote is from a commercial open source project, TerminusDB, explaining their decision to switch from GPLv3, a “free software” open source license, to Apache 2.0, a commercially oriented open source license. Licenses like GPLv3 (called GPL licenses here) essentially prohibit closed source commercial additions to licensed software, with the goal of keeping any derived uses “open”. The Free Software movement began amidst the rise of proprietary operating systems like Unix and Windows, software copyrights, and more “ideological” concerns [4] about the freedom of users to use computers and software the way they wanted, rather than being dictated to by a company. It was felt that requiring software to include source code (and virally applying that requirement to derivatives) would keep software free. This suggests three additional motivations:

With respect to freedom as a concept, we see e.g. Free Software Foundation projects such as GNU that often end up emphasizing the philosophical aspects of their definition of software freedom, even when (I’d argue) impractical. This is in keeping with the more ideological commitments of the free software movement and not necessarily a bad thing. More practically, Linux itself is probably one of the best known examples of a credible open source software alternative released under an open source license that requires derived works to remain open. This combination of being freely available and modifiable, but requiring anyone making modifications to apply the same license, has very successfully carved out an ecosystem of highly adopted and fully featured open source operating systems.

An important consequence of the way tools like Linux are licensed is that the viral licensing requirement prevents the rise of, for example, a closed “BigCo Linux” that contains closed source proprietary changes. It has thus, to some extent, successfully fended off a world of only closed software that is not controlled by its users (well, except Android). The same is true for various other software products, such as Blender (3D drawing software), Inkscape (2D drawing), LaTeX (typesetting), etc.

Implications for AI Model Weights

As open source software has become more mature and the threat of a closed operating system landscape has receded, open source has focused mostly on the first set of goals – developer freedom over software freedom. This has been beneficial for everyone, allowing commercial exploitation but also a robust open ecosystem, and major software and tech companies have seen fit to make major open source contributions in alignment with their business. (Note: after writing this I found a good recent article lamenting the decline of freedom as a focus, “We need more of Richard Stallman, not less”, not that I agree with everything in the article.) AI is much less mature, and particularly with the rise of commercial foundation models, we are perhaps seeing a glimpse of the same growth pathway that software experienced, including the threat of a closed, proprietary landscape. The most popular commercial models are closed, and even those that are available often come with restrictions or built-in limitations that prevent “free” use. Additionally, there are ongoing lobbying and legislative efforts to restrict how people can use AI models, and by extension their computers. In that context, it’s worth thinking about what the free software movement did for open computing and the lessons that can be applied to AI.

“Movement. It moves a certain distance, then it stops, you see? A revolution gets its name by always coming back around in your face.” – Tommy Lee Jones in Under Siege, in response to being told the movement is dead.

As Linux has shown, open licensing can be a powerful tool in maintaining a competitive open software landscape, although the experience of Linux does not necessarily translate one-for-one to AI. There have already been strong (somewhat) open AI rivals to the closed OpenAI products (confusing, I know). For example, Stability AI has already presented a credible alternative to DALL-E with Stable Diffusion, although they released their weights under a restrictive “Open Rail” license [5]. With respect to training foundational AI models, I suggest people interested in maintaining a free (as in Linux) AI ecosystem need to think about:

What license terms can further software freedom?

GPL and similar software licenses require that users share code and apply the same license to derived works. Some licenses include additional clauses that also require sharing the source when running software on a server or as part of a closed system. Do similar constraints make sense for model weights? What else needs to be in place to maintain “free” models? Should models that are “fine-tuned” from a foundation model be similarly licensed? What about those served through an API? Foundational models are also being used to generate training data for other models [6]. How would a license apply in these cases, and should it? What potential circumventions exist that would render a license ineffective?

How can license terms respect the role of AI models in the overall software value chain?

Foundation models play many different roles in AI development. A foundation model may be fine-tuned into a narrow application-specific model. As an older example, ImageNet pre-training is often used as a starting point for computer vision models, which are then fine-tuned on proprietary data into an application-specific image classifier. Or a foundation model may be further refined into another foundation model. For example, language models like LLaMA have been further fine-tuned into chat-like models such as Vicuna. In some cases, models are not refined at all, but are passed contextual data during inference for few-shot learning. The foundation model plays a different role in each case, and the attractiveness of sharing the “derived” model differs.

How might license terms conflict with adoption?

Terms that are too onerous will limit how models are used. However, evidence suggests license terms may not be a big deterrent. For example, Meta’s LLaMA was released with a non-commercial license arguably more restrictive than GPL. In spite of this, there is a vibrant ecosystem of publicly available foundation models that have been fine-tuned from LLaMA (Alpaca, Vicuna, Guanaco…) while acknowledging and respecting the license terms. Had Meta released LLaMA under a GPL-like viral license, it would likely only have further cemented it as a strong free and open contender in the language model space.

How might these conflict with other aims or requirements of open source software and developer freedom?

Are there other pragmatic trade-offs necessary to continue supporting the other motivations for contributing to open source or its benefits? There have been past arguments about the merits of various clauses in the GPL licenses in particular and the limits they impose [7].

What other practical considerations guide AI model freedom?

Are there instances when requirements to share training data or metadata related to training may also be appropriate (or not)? Models differ from software in that they are an opaque set of weights derived from training – closer to (though not the same as) software distributed as a binary file rather than source. They could only be reproduced with the training data (and even then generally not deterministically). Is some consideration necessary in the license between the role of the model and the weights? For example, could model weights be open source and the model itself be closed?

What kinds of AI projects are well suited to “freedom” focused model licenses?

As with software, GPL-style licensing will not make sense in all cases, and the “freedom” focus will be more effective when used pragmatically. In software, I’d argue that smaller component libraries may be worse off if they impose their license on derived works. For “finished” products, the requirement to share derivatives can be an important tool in maintaining an open ecosystem. How does this translate to model weights? Does it imply that foundation models and foundation models derived from them could stay open, while the requirement is less important for narrow models? Or are there different considerations for weights?

The list is not meant to be exhaustive, and importantly I’m not advocating the focus on freedom as something that should be universal or even always desirable in licensing. The goal is to point out that licensing’s potential has not yet been realized and AI is facing some of the same challenges of being walled in by commercial interests that software previously faced. Considering these questions when examining licensing options may be able to help make them a strategic tool that ultimately (like Linux) will prevent local minima and make the whole ecosystem better.

As a closing thought, the Linux kernel works as GPL open source software because it is a general purpose system that serves as a broad foundation, and it is too good (and now too ingrained) to ignore. Organizations will accept the source code obligations of its license because they are outweighed by its benefits. There is already a state-of-the-art foundational language model that, for now, matches these criteria. Various sources have called ChatGPT AI’s iPhone moment [8]. If OpenAI wanted, it could have been its Linux moment.





  5. At the risk of going on a rant, it seems like there is a preoccupation with imposing particular opinions of “ethics” on AI models that was thankfully absent from earlier open source software discussions, and is in opposition to user freedom. People can choose how they want to license their work; hopefully more weight will be given to how it furthers the kind of freedom we’re talking about here.

  6. For example, “Textbooks are all you need”