Proprietary vs. Open Source Foundation Models

Sophia Lu, Vice President

In 2022 we saw a Cambrian explosion of foundation models across domains like language, code, image, video, 3D, and music led by mega caps, AI pioneers, and researchers. For users looking to build on foundation models, a large selection of both proprietary and open-source models is certainly good news. However, choosing which foundation models to build on takes work, as it potentially impacts margin, time-to-market, security and governance, and many other aspects of your business. 

This post will discuss a framework that can help guide you through the decision process by comparing the tradeoffs between proprietary vs. open-source foundation models along five key dimensions: cost structure, time-to-market, latency, flexibility and transparency, and security and governance. 

However, this article does not compare the technical aspects of individual foundation models.

Framework Takeaways

| Dimension | Proprietary models | Open-source models |
|---|---|---|
| Time to Market | Easy to deploy out of the box (OOB), faster time to market | Requires time and expertise to set up self-hosting |
| Cost Structure | Usage-based (definition of usage varies), additional costs to fine-tune | Models are distributed freely, might require resources to fine-tune |
| Latency | Longer response time, might negatively impact the user experience for real-time scenarios | Depending on use cases, models can be smaller and therefore faster |
| Flexibility & Transparency | No visibility into code | Offers the most code transparency and flexibility |
| Security & Governance | Offer added security and governance capabilities; however, data management and governance remain opaque | Lack security and governance capabilities, but can be brought within businesses’ security perimeter and securely fine-tuned on local data |

Time to Market 

The first thing to weigh when building with foundation models is ease of use, which primarily affects your product's time to market. OpenAI models such as GPT-4 and Dall-E are exposed through a set of unified API endpoints that developers can interact with. They are also fully managed, so you don’t need to worry about setting up a self-hosting environment.

However, with open-source models, you need to set up self-hosting environments or look for MLOps platforms to host the models and run fine-tuning. This lengthens the go-to-market timeline and potentially requires technical talent and resources to kick off projects.

If you do not have sufficient tech talent to set up self-hosting and time to market is critical, proprietary models are the faster and easier way to build AI capabilities into your products. 

Cost Structure 

Foundation models differ in cost structure: proprietary models have pricing tiers based on the number of tasks or monthly use, while many open-source models are distributed freely (depending on the specific model license and data). Understanding how the costs of using the base models, embedding, fine-tuning, and so on affect your margin is an important first step in your decision.

Anecdotally, we have heard that using a commercial model such as GPT can be ~10x more expensive than open-source alternatives for specific tasks. Below are a few examples of proprietary language and image model pricing:

  • GPT-3 is consumption-based, priced per 1K tokens
    • For the pricing of this model, it is important to note that “tokens” are pieces of words for NLP; 1K tokens = ~750 words. For base models, you are billed by the number of tokens in your prompt and the number of tokens in the output.
    • If you are fine-tuning GPT-3 models, the training and usage of fine-tuned models bring additional costs
      • Training tokens = tokens in the training data set multiplied by the number of training cycles (defaults to 4)
      • Usage tokens = requests sent to fine-tuned models (prompt + output) 
  • Cohere is consumption-based, per task
    • Pricing for fine-tuning language generation and embedding is approximately 2x the price for the base model, and the token limit is unknown
  • Dall-E is consumption-based, per image generated
    • Lower-resolution images cost slightly less, and every prompt results in a net new image generation
  • Midjourney is priced per user per month
    • Midjourney has two options for use, a free tier and a paid tier
      • Free tier: Developers have access through Discord, and everyone can see the content generated in the Midjourney channel 
      • Paid tier: The content generated is private and only seen by the creator and those given access  
  • Stable Diffusion is open source and free to use
    • The provider of Stable Diffusion, Stability.ai, offers a web-based interface that is priced by consumption
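
The token arithmetic described for GPT-3 above can be sketched in a few lines of Python. This is a rough illustration only: the function names are hypothetical, and the per-1K-token prices are placeholder assumptions, not actual provider rates. Only the ~750 words per 1K tokens ratio and the default of 4 training cycles come from the text.

```python
# Rough token-cost arithmetic for a consumption-priced language model.
# All prices are HYPOTHETICAL placeholders; substitute your provider's rates.

WORDS_PER_1K_TOKENS = 750  # approximation quoted in the text


def estimate_tokens(word_count: int) -> int:
    """Approximate a token count from a word count (1K tokens ~= 750 words)."""
    return round(word_count * 1000 / WORDS_PER_1K_TOKENS)


def training_tokens(dataset_tokens: int, cycles: int = 4) -> int:
    """Training tokens = tokens in the training set x number of training cycles."""
    return dataset_tokens * cycles


def usage_tokens(prompt_tokens: int, output_tokens: int) -> int:
    """Usage tokens = prompt tokens + output tokens for one request."""
    return prompt_tokens + output_tokens


def cost(tokens: int, price_per_1k: float) -> float:
    """Cost of a number of tokens at a given price per 1K tokens."""
    return tokens / 1000 * price_per_1k


if __name__ == "__main__":
    HYPOTHETICAL_TRAIN_PRICE = 0.03  # assumed $ per 1K training tokens
    HYPOTHETICAL_USAGE_PRICE = 0.12  # assumed $ per 1K usage tokens

    train = training_tokens(dataset_tokens=500_000)  # 500K tokens x 4 cycles
    per_request = usage_tokens(prompt_tokens=300, output_tokens=200)
    print(f"training cost: ${cost(train, HYPOTHETICAL_TRAIN_PRICE):.2f}")
    print(f"cost per request: ${cost(per_request, HYPOTHETICAL_USAGE_PRICE):.4f}")
```

Because training tokens scale with the number of cycles, the fine-tuning bill depends on how many passes are made over the data set as well as on its size.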

Many startups leverage API-based, proprietary models in their experimental phase to kickstart product development. Once they find product-market fit, it might make sense for some of them to transition into self-hosting, especially for startups that might want to fine-tune their models or target high-throughput use cases.
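
One way to reason about when that transition pays off is a simple break-even calculation. The sketch below assumes a deliberately simplified cost model (a fixed monthly self-hosting cost plus a per-request cost); the function name and all numbers are hypothetical and only illustrate the shape of the tradeoff.

```python
import math
from typing import Optional


def break_even_requests(
    fixed_monthly_cost: float,
    api_cost_per_request: float,
    self_hosted_cost_per_request: float,
) -> Optional[int]:
    """Monthly request volume at which self-hosting becomes cheaper than the API.

    Returns None when self-hosting is never cheaper per request,
    in which case no volume recovers the fixed cost.
    """
    saving_per_request = api_cost_per_request - self_hosted_cost_per_request
    if saving_per_request <= 0:
        return None
    return math.ceil(fixed_monthly_cost / saving_per_request)


if __name__ == "__main__":
    # HYPOTHETICAL inputs: the API is 10x the self-hosted per-request cost,
    # echoing the anecdotal ~10x figure, plus an assumed fixed hosting bill.
    volume = break_even_requests(
        fixed_monthly_cost=5_000.0,
        api_cost_per_request=0.01,
        self_hosted_cost_per_request=0.001,
    )
    print(f"self-hosting breaks even at about {volume} requests per month")
```

A real comparison would also account for engineering time, idle capacity, and fine-tuning costs, which this sketch leaves out.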

Latency 

Latency can be considered a downside of proprietary models, especially larger-scale models. GPT-4, for example, is hosted behind an inference API with a response time ranging up to 20 seconds, which will negatively impact user experience around real-time use cases such as short-form personal writing assistance, code generation, or seller insights. 

On the other hand, if you are building on top of open-source models, you can choose the model best suited to your use cases. Depending on the task, you might find smaller open-source models that work just as well as larger proprietary models. It’s also easier to layer model serving optimization on top of open-source models, reducing latency.  
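
When comparing candidates on latency, it helps to look at tail percentiles rather than a single average. The following is a minimal sketch using only the Python standard library; `fake_model_call` is a hypothetical stand-in for whatever API request or local inference call you are measuring.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(call: Callable[[], object], runs: int = 20) -> List[float]:
    """Time repeated invocations of `call` and return the durations in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Return the median, 95th percentile, and maximum of the latency samples."""
    ordered = sorted(samples)
    # n=20 yields 19 cut points at 5% steps; the last one is the 95th percentile.
    cuts = statistics.quantiles(ordered, n=20, method="inclusive")
    return {"p50": statistics.median(ordered), "p95": cuts[18], "max": ordered[-1]}


def fake_model_call() -> None:
    """HYPOTHETICAL stand-in for a real model request."""
    time.sleep(0.01)


if __name__ == "__main__":
    stats = summarize(measure_latencies(fake_model_call, runs=20))
    for name, seconds in stats.items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Running the same harness against a hosted API and a self-hosted model gives a like-for-like view of the response times your users would actually see.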

Flexibility & Transparency

One of the biggest challenges with large-scale models is the lack of visibility into how they work, which prevents users from understanding model behavior and makes it challenging for businesses to offer a safe and consistent user experience. For example, Dall-E will sometimes not respond to certain words in the prompt but might respond to the same terms with toxic or biased output at other times.

Some proprietary models offer more flexibility by allowing users to assign relative importance to certain parts of a prompt. For example: 

  • “hot dog:1.5” makes it more about the animal and less about the food, meaning a dog that is hot and looking for water
  • “hot dog:0.5” makes it more about the food instead of the animal

Open-source models offer more transparency and flexibility, especially in the hands of users who are familiar and experienced with AI models. This allows users to gain better insights into the predictions and behaviors of the foundation models and better align models with their business or product goals.

In general, there are certain use cases, such as financial services and healthcare, where transparency is crucial due to federal and state privacy regulations. Because more concerns are rising about how people use AI, there is increasing awareness and interest in AI model transparency. Creators like OpenAI and Google must invest in providing transparency and explainability to their products. 

Security & Governance 

Overall, there are many gaps around the security and governance of large language models (LLMs) and generative models. Proprietary and open-source models both exhibit risks in different aspects. Proprietary models offer added security and governance capabilities that open-source models lack.

For example, OpenAI models have built-in filtering and content moderation capabilities that can flag and prevent sensitive, violent, hateful or any other content that violates OpenAI’s content policy. OpenAI also offers an enterprise SKU through Azure, including role-based access control (RBAC) and private networks to ensure security.

However, due to data compliance and security concerns, many enterprises avoid using or fine-tuning proprietary models. Using external models on sensitive personal and client information might result in data leakage. For example, sensitive data used as part of the training set for GPT-5 could potentially be retrieved through adversarial attacks. Although out-of-the-box open-source models lack security and governance capabilities, they can be brought within businesses’ security perimeter and securely fine-tuned on local data.

One Last Thought 

The landscape of foundation models is rapidly evolving and becoming increasingly fragmented. The choice of building on proprietary vs. open-source models is primarily driven by understanding their relative strengths and limitations and understanding what you value most when it comes to the use cases and companies you are serving.

Soon, companies might start to work with a mixture of open-source and proprietary models and will need to re-evaluate the underlying models routinely. Consider investing in designing a model evaluation framework and an architecture that allows you to work with the right mix of foundation models best suited to you.
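
As a rough illustration of what such an architecture might look like, the sketch below hides each model behind a common interface and scores candidates on the same test cases. Everything here is hypothetical: `TextModel`, the two toy models, and the exact-match scorer are illustrative names, and a real setup would wrap actual API clients or self-hosted models and use task-appropriate metrics.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class TextModel(Protocol):
    """Minimal interface that any proprietary or open-source model wrapper implements."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """HYPOTHETICAL stand-in for one model: returns the prompt unchanged."""

    def complete(self, prompt: str) -> str:
        return prompt


class UppercaseModel:
    """HYPOTHETICAL stand-in for another model: returns the prompt in upper case."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the output equals the expected text, else 0.0."""
    return 1.0 if output.strip() == expected.strip() else 0.0


def evaluate(
    model: TextModel,
    cases: List[Tuple[str, str]],
    score: Callable[[str, str], float],
) -> float:
    """Average score of a model over (prompt, expected output) cases."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    total = sum(score(model.complete(prompt), expected) for prompt, expected in cases)
    return total / len(cases)


def pick_best(
    models: Dict[str, TextModel],
    cases: List[Tuple[str, str]],
    score: Callable[[str, str], float],
) -> str:
    """Return the name of the highest-scoring model on the given cases."""
    results = {name: evaluate(model, cases, score) for name, model in models.items()}
    return max(results, key=lambda name: results[name])


if __name__ == "__main__":
    cases = [("hello", "HELLO"), ("OK", "OK")]
    candidates: Dict[str, TextModel] = {"echo": EchoModel(), "upper": UppercaseModel()}
    print(pick_best(candidates, cases, exact_match))
```

Because application code depends only on the shared interface, re-running the evaluation and swapping the underlying model becomes a routine step and not a rewrite.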

This image shows examples of open-source and proprietary models developed by megacaps, AI startups, and researchers across domains.