What the Scarlett Johansson ChatGPT voice controversy reveals about AI’s impact on the future

Hollywood star Scarlett Johansson says OpenAI copied her voice for its generative AI model – after she turned down an invitation to voice it. The ChatGPT-4o demo, which included the voice of “Sky,” was released on May 15. In a statement released five days later, Johansson called the voice “eerily similar” to her own and said she has hired lawyers who are asking the company to disclose how the voice was developed.

OpenAI says the voice was not designed to mimic Johansson’s and was developed from a different actress using her own natural speaking voice.

We spoke to Brenda McPhail, director of executive education in McMaster’s Master of Public Policy in Digital Society program, about this case and the considerations needed as society navigates the changing landscape of AI.

We’ve heard concerns about the lack of protection for creative likenesses and the use of deepfakes when it comes to AI – is this just an issue for people in Hollywood?

The newsworthy examples of deepfakes we’ve seen so far tend to involve people who live some part of their lives on the public stage, whether it’s people in Hollywood, people in politics, or people in prestigious positions. It makes sense: they’re people who tend to make news, so when their image or voice is faked, it gets attention.

Scarlett Johansson’s accusation that the “Sky” voice option in ChatGPT-4o sounds like hers despite her refusal to allow her voice to be used, together with OpenAI’s denial that it deliberately imitated her, is a new thread pulled from that same cloth.

But AI’s capacity to generate videos, pictures or audio to look or sound like a real person is ultimately a risk to everyone’s privacy and security. We’ve seen cases of using famous voices to make disinformation seem more credible, such as when a fake of President Biden’s voice was used in a U.S. election-related robocall — a risk which doubles when the goal is to manipulate democratic participation.  

But your own voice could also be used in a scam, such as a fraudster using AI to clone your voice to get past voice-based identity verification during a telephone banking transaction. With biometric identifiers that use your voice or face being pitched as the new wave of ID, we should be very aware of the risks of identity theft using deepfakes.

Of course, synthetic media are not just used for financial fraud. Deepfake pornographic images and videos are one of the most common uses of the technology. Such fakes are also increasingly appearing in child sexual abuse material, to an extent that has investigators warning it is impeding their work and consuming investigative time.

Many people are wary of AI and its use, both now and in the future. Is this unique to this technology?

It is nothing new that people are scared of the potential impacts of a new technology. Modern privacy laws are to some degree a result of worries about the incursions into private spaces and moments that portable cameras created back in the late 1800s.  

AI seems to up the ante when it comes to societal anxiety, however, in part because it is a technology of general application and, as you’ll know if you listen to the news, it is poised to revolutionise [insert your industry of choice here].  

Seriously though, while there are signs we’re coming slightly off the extreme hype cycle, the reality remains that these tools do have applications across many of the sectors that impact our lives, including education, health, financial services, manufacturing, public safety and national security.  

And they do raise risks: not just the ones we’ve talked about, which are linked largely to malicious actors, but also more mundane yet equally consequential risks central to the way AI is trained and to its data-intensiveness. These are risks that lay bare the ways in which all the failures of humankind, not least systemic racism, are often reflected in the data we create or leave behind.

What’s needed here to alleviate some of those concerns? Does the government need to step in and regulate? Is more transparency needed?  

We’re seeing many attempts to mitigate some of the risks of AI, and realistically, there is no silver bullet solution. We need regulation, but regulating in a rapidly evolving space is hard.  

Europe got the jump on much of the world and recently passed quite comprehensive legislation, the EU AI Act. But even there, civil society groups warn that many of the most rigorous protections they advocated for, particularly those relating to the use of biometrics, ultimately failed to make it into the law.

In Canada, we have an omnibus bill, Bill C-27, currently being studied by the House of Commons Standing Committee on Industry and Technology (INDU), which includes, in Part 3, the Artificial Intelligence and Data Act (AIDA). Here, a coalition of civil society actors and individual advocates recently wrote a letter warning of the many ways in which AIDA is flawed, largely because it was produced without meaningful or broad consultation.

We need standards, but standards development is time-consuming and usually relies on voluntary uptake and compliance. We need transparency, but transparency is only half the solution: without accountability, knowing what is wrong with a technology or a process as it is being applied to you isn’t especially comforting if you have no recourse. And the concept of recourse for harm takes us back to legislation, which only works if the full range of possible individual and social harms is appropriately scoped into the law.

OpenAI has removed the voice in question for now. How important is what the company does next? Do they have to work to regain trust?

The Johansson case looks bad for OpenAI: they asked Ms. Johansson whether she would allow her voice to be used, she said no, and the allegation is that one of the voices in the product is virtually indistinguishable from hers.

It’s important to remember that this story sits in a larger context. It comes at a time when there are active lawsuits from writers saying their intellectual property was stolen for AI training without compensation or consent. It also comes at a time when the fairness of scraping the internet for information is reasonably being questioned by private citizens who wonder why, when they shared that picture or personal anecdote online so grandma could see it, another company thinks it is fair game to scoop it up and profit from it.

OpenAI did the right (and legally safest) thing in removing the “Sky” voice immediately. It will be interesting to see how the story unfolds, but I think there will need to be radical transparency about how that voice was actually created for any trust to emerge, and, ideally, larger public conversations about the company’s data collection and use, addressing wider IP and fairness concerns, if true social license is to persist.

How does this story impact AI and AI companies as a whole?  

We’re living through a time when a technology that we’re being told again and again will be transformational is just beginning to show what it might do to our lives and our world.  

There are so many questions. Are these energy-hungry tools responsible in the age of climate change? Can they actually transform economies and raise productivity? Where’s the proof? If they can, how can we support those workers whose jobs vanish or dramatically change on the one hand — educators are one category of workers who may be at risk — while also protecting the thousands of largely invisible workers toiling to make the tools function from exploitation? How do we require transparency while respecting trade secrets, and when they clash — as they often do — how do we make sure public interest carries the weight that it must? How do we make sure that the values and assumptions embedded in AI products are those that we want building the foundation of an AI-enabled world?  

Whether or not OpenAI stole Scarlett Johansson’s voice is not even the tip of the iceberg; it’s a snowflake on the tip. But it’s got people talking, and that is what’s needed right now to help us get from questions to answers about where, when, and how AI should (not just can) change our lives.
