BCG Henderson Institute


How CEOs Can Evaluate New Generative AI Models

As GPT-4 and other new models are released, leaders need to constantly assess the implications for their business.

The steady march of new generative AI models continues. The release of GPT-4 to replace GPT-3.5 (the model used by ChatGPT) is just one example; other companies are also working rapidly on the next advancements. Each release ushers in a flurry of analysis of the model’s technical capabilities. But in a recent article, we argue that the priority for CEOs isn’t to fully immerse themselves in the technology; it is to identify how generative AI will help them build competitive advantage. This focus becomes even more critical in an era of constant innovation.

To keep pace, company leaders will need to evaluate each new generative AI model along three key dimensions, in relation to their organization’s business model: the truth function, proprietary data, and the economics.

The Truth Function

What use cases are supported by the new model’s accuracy? What are the trade-offs between functionality and accuracy?

Generative AI is, at its core, a probabilistic model: it generates output by predicting likely continuations rather than by maximizing truth. As a result, it sometimes produces incorrect assertions, also known as “hallucinations.” However, generative AI’s emphasis on creativity (as opposed to accuracy) is not a bug but a valuable design feature. And, with adequate human supervision, this feature can also be an asset for businesses.

For example, in use cases with a high tolerance for error—such as product design or drug discovery—generative AI’s hallucinated creativity can lead to breakthrough innovations when paired with human expertise. As Wharton professor Ethan Mollick writes, “Indeed, the most astonishing feats of AI seem to rely on their ability to be creative through ‘hallucination’…it also allows them to provide unique and original replies by connecting unlikely sources of inspiration and finding surprising linkages.”

However, any error is detrimental for applications such as medical knowledge retrieval, where accuracy is critical. For such use cases, a combination of traditional machine learning (providing precision) and generative AI systems (providing creativity) may be powerful. Microsoft has already implemented this trade-off as a user-controlled feature of Bing AI, enabling users to choose between a “more creative” and a “more precise” mode.
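At a technical level, the “creative” versus “precise” dial typically corresponds to how sharply the model’s sampling distribution concentrates on its most probable next token, often controlled by a temperature parameter. A minimal sketch of temperature-scaled sampling, using a hypothetical toy vocabulary and made-up scores rather than any real model:

```python
import math

def softmax(logits, temperature):
    """Convert raw model scores into a probability distribution.
    Lower temperature -> sharper, more deterministic ("precise");
    higher temperature -> flatter, more exploratory ("creative")."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates and raw scores (illustrative only).
tokens = ["aspirin", "ibuprofen", "unicorn-dust"]
logits = [4.0, 3.5, 1.0]

precise = softmax(logits, temperature=0.2)   # mass piles onto the top token
creative = softmax(logits, temperature=2.0)  # unlikely options stay in play

print({t: round(p, 3) for t, p in zip(tokens, precise)})
print({t: round(p, 3) for t, p in zip(tokens, creative)})
```

At low temperature the implausible option is effectively suppressed, which suits accuracy-critical retrieval; at high temperature it retains meaningful probability, which is exactly the behavior that fuels creative recombination.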

Proprietary Data

Can companies use their proprietary data in new ways to strengthen their competitive advantage? What level of data protection and privacy is the LLM provider offering with the new model?

Data is one of the most important and differentiating assets that modern corporations own. CEOs must identify the models that use their proprietary data in unique ways for competitive advantage—and each new LLM release provides this opportunity. For instance, GPT-3.5 is a text-only model and cannot be used on its own for images, sounds, or biological data. But GPT-4 and other models offer possibilities such as combining multiple modalities in a single model. Beyond data modalities, new techniques for fine-tuning each model to unlock additional functionality are also emerging rapidly. CEOs should work with their tech teams to stay on top of the latest methodologies.

CEOs must also investigate whether new LLM models allow for more extensive (or easier) fine-tuning, while keeping data protected and private. Businesses have been concerned about sending proprietary data to foundation model providers during usage or fine-tuning, for fear that their data may be used without their permission.

Model providers have begun to address these concerns to encourage greater customer adoption. For example, OpenAI announced on March 1, 2023, that it will no longer use customer data to train its models unless customers explicitly opt in. However, other model providers still claim the right to use customer data. Executives should confirm which policies apply to new releases when deciding if and how to integrate their proprietary data.

The Economics

What are the total operating costs to use the model, and what internal capabilities or partnerships will be required to train, fine-tune, or infer from this model? How should companies evaluate the economics to retain competitive advantage?

To achieve their current quality, LLMs have rapidly grown in parameter count. The resulting models have high computational and data requirements to train, fine-tune, and run inference on, and are difficult to operate on-premises. However, recent developments seem to be reversing this trend. Some newer models, such as Meta’s LLaMA, use far fewer parameters without sacrificing quality, making them easier to fine-tune and run, even on high-end consumer laptops.

Executives must evaluate these new models and partner with model providers to understand how training and inference costs evolve with each development. They should also bear in mind that usage costs may fall as a model is optimized. For instance, roughly three months after its original release, OpenAI announced a 90% reduction in ChatGPT costs through system-wide optimizations.
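These cost dynamics can be modeled at a back-of-the-envelope level before any provider negotiation. A sketch using entirely hypothetical per-token prices and request volumes (actual pricing varies by provider, model, and release):

```python
def monthly_inference_cost(requests_per_day, tokens_per_request,
                           price_per_1k_tokens, days=30):
    """Rough monthly spend: total tokens processed times the unit price."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * price_per_1k_tokens

# Hypothetical workload: 50,000 requests/day at ~1,500 tokens each.
before = monthly_inference_cost(50_000, 1_500, price_per_1k_tokens=0.02)
after = monthly_inference_cost(50_000, 1_500, price_per_1k_tokens=0.002)

print(f"At $0.020 per 1K tokens: ${before:,.0f}/month")  # $45,000/month
print(f"After a 90% price cut:   ${after:,.0f}/month")   # $4,500/month
```

Even with invented numbers, the exercise makes the strategic point concrete: a provider-side optimization of the kind OpenAI announced can change the build-versus-buy calculus for a use case by an order of magnitude.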

Beyond the costs of operation, the evolution of the economics raises a deeper question that executives must consider: How can companies preserve their unique competitive advantage in the era of foundation models owned by advanced tech players? These models are complex to build; the technical talent and computational requirements needed to create new models are a barrier to all but a few specialized tech players.

Given these obstacles, executives need to determine how to maximize the competitive advantage and value (for example, cost savings or profits) they can capture when implementing generative AI for their golden use cases. They’ll have to determine how much of the development process to either build in-house or outsource to players like service providers or software companies. When making this decision, they’ll need to balance the cost of outsourcing, the importance of retaining IP, and the scale of internal capabilities needed to build and maintain each aspect of the application (such as the hosting model itself, fine-tuning, and building APIs).

As generative AI models continue to advance at breakneck speed, CEOs need the ability to sift through the information deluge that comes with each new model release—enabling them to critically evaluate how it can impact their businesses.

For a more holistic view on priorities for company leaders, read “The CEO’s Guide to the Generative AI Revolution.”
