BCG Henderson Institute


Generative AI will be a powerful enabler of competitive advantage for companies that crack the code of adoption. In a first-of-its-kind scientific experiment, we found that when GenAI is used in the right way and for the right tasks, it performs so well that people's efforts to improve the quality of its output can backfire. But it isn't obvious when the new technology is (or is not) a good fit, and the persuasive abilities of the tool make it hard to spot a mismatch. This can have serious consequences: When it is used in the wrong way, for the wrong tasks, generative AI can cause significant value destruction.

We conducted our experiment with the support of a group of scholars from Harvard Business School, MIT Sloan School of Management, the Wharton School at the University of Pennsylvania, and the University of Warwick.[1] With more than 750 BCG consultants worldwide as subjects, it is the first study to test the use of generative AI in a professional-services setting, through tasks that reflect what employees do every day. The findings have critical implications across industries.

The opportunity to boost performance is astonishing: When using generative AI (in our experiment, OpenAI’s GPT-4) for creative product innovation, a task involving ideation and content creation, around 90% of our participants improved their performance. What’s more, they converged on a level of performance that was 40% higher than that of those working on the same task without GPT-4. People best captured this upside when they did not attempt to improve the output that the technology generated.

Creative ideation sits firmly within GenAI’s current frontier of competence. When our participants used the technology for business problem solving, a capability outside this frontier, they performed 23% worse than those doing the task without GPT-4. And even participants who were warned about the possibility of wrong answers from the tool did not challenge its output.

References
1. We designed the study with input from Professor Karim R. Lakhani, Dr. Fabrizio Dell'Acqua, and Professor Edward McFowland III of Harvard Business School; Professor Ethan R. Mollick of the Wharton School at the University of Pennsylvania; Professor Hila Lifshitz-Assaf of the University of Warwick; and Professor Katherine C. Kellogg of the MIT Sloan School of Management. Our academic colleagues analyzed our data. Please see our scholarly paper for more details.