Inside the AI Debate – Tim Friedlander

Like it or not, AI is here, and it will only get better. Where does that leave Voice Artists, Podcasters and Content Creators who currently have no protections in terms of owning their voice?

Tim Friedlander is an award-winning voice actor, studio owner, advocate, and educator. Tim is also the Founder and President of NAVA, the National Association of Voice Actors, as well as co-owner and editor of The Voice Over Resource Guide. His work with NAVA puts him at the coal face of negotiations with the likes of voices.com and the AI seeding debate. We have him on the show next week to give us an insight into where we might be headed in terms of a compromise, what protections we might be able to put in place, and, most troublingly, the short amount of time we have to get it done before it may effectively be too late.

A big shout out to our sponsors, Austrian Audio and Tri Booth. Both these companies are providers of QUALITY Audio Gear (we wouldn’t partner with them unless they were), so please, if you’re in the market for some new kit, do us a solid and check out their products, and be sure to tell em “Robbo, George, Robert, and AP sent you”… As a part of their generous support of our show, Tri Booth is offering $200 off a brand-new booth when you use the code TRIPAP200. So get onto their website now and secure your new booth…

Tri-Booth

https://tribooth.com/

And if you’re in the market for a new Mic or killer pair of headphones, check out Austrian Audio. They’ve got a great range of top-shelf gear.

https://austrian.audio/

We have launched a Patreon page in the hopes of being able to pay someone to help us get the show to more people and in turn help them with the same info we’re sharing with you. If you aren’t familiar with Patreon, it’s an easy way for those interested in our show to get exclusive content and updates before anyone else, along with a whole bunch of other “perks”, just by contributing as little as $1 per month. Find out more here.
 
 
 
George has created a page strictly for Pro Audio Suite listeners, so check it out for the latest discounts and offers for TPAS listeners.

https://georgethe.tech/tpas

If you haven’t filled out our survey on what you’d like to hear on the show, you can do it here:

https://www.surveymonkey.com/r/ZWT5BTD

Join our Facebook page here:

https://www.facebook.com/proaudiopodcast

And the FB Group here:

https://www.facebook.com/groups/357898255543203

For everything else (including joining our mailing list for exclusive previews and other goodies), check out our website

https://www.theproaudiosuite.com/

“When the going gets weird, the weird turn professional.”

Hunter S Thompson

 

  
Summary
In this episode of Pro Audio Suite, we explore the controversial topic of AI voices with special guest Tim Friedlander. Voices.com has reportedly promised not to use people’s voices from their database without permission, but the potential misuse of audition files by clients remains a concern. We discuss the fairness of voice synthesis, highlighting NAVA’s call for consent, control and compensation for voice actors. Listeners will gain insight into the problematic quality of some AI voice samples and the potential threat to new voice actors as AI begins to replace human voices in certain sectors. We also delve into the future role of agents as potential AI voice libraries, and the necessity for clear licensing fee structures and strong protections before the end of the year to prevent misuse.

#VoiceAIControversy #FairVoicesCampaign #FutureOfVoiceActing
  
Timestamps
(00:00:00) Introduction
(00:00:43) Voices.com’s Promise
(00:03:31) Copyright Laws and AI Voices
(00:11:50) Review of AI Voice Samples
(00:12:59) Risks of Recorded Audio
(00:14:25) Dangers of AI
(00:19:57) AI Replacing Human Voices
(00:23:26) AI’s Impact on Visual Artists
  
Transcript
Speaker A: Y’all ready be history.
Speaker B: Get started.
Speaker A: Welcome.
Speaker B: Hi. Hi. Hello, everyone, to the Pro Audio Suite.
Speaker A: These guys are professional and motivated, with tech to the VO stars George Whittam, founder of Source Elements Robert Marshall, international audio engineer Darren “Robbo” Robertson, and global voice Andrew Peters. Thanks to Tribooth, Austrian Audio, making passion heard, Source Elements, George the Tech Whittam, and Robbo and AP’s International Demo. To find out more about us, check theproaudiosuite.com.
Speaker B: Learn up learner. Here we go.
Speaker C: And welcome. And don’t forget, if you want to get a discount of $200 off your Tribooth, TRIPAP200 is the code you need. Now, this week, very topical. Of course, this AI thing will just not go away. And I know that there was a conversation about that place. I don’t even like saying it. Anyway, I will say it. Voices.com supposedly have promised not to farm out people’s voices from their database. Tim Friedlander has been involved in this and has written an article, which is what I saw. And Tim is joining us. G’day, Tim.
Unidentified speaker: Hello. Hello. I’m here.
Speaker C: So what’s the backstory to this and how did you get involved?
Unidentified speaker: The backstory to the AI Voices.com thing goes back to about May, when David Ciccarelli at Voices.com announced that they were releasing Voices AI, and for the voice acting community that was a huge concern, the main part being that many people have been uploading audio through their website for 20 years. So theoretically, Voices.com or any of these sites has 20 years of very high quality data and audio that they could use to synthesize our voices.
So through NAVA, which is the association that I run along with Carin Gilfry and a board of directors, we reached out to David and Stephanie and had a week of conversations with them to get the assurance that they had never been uploading or using or doing anything with auditions or files that had been uploaded through their website. And out of that came our Fair Voices campaign, or the Fair Voices pledge, that we launched. And we reached out to the other online casting sites, six other sites, to get the same assurances from them and also to make sure that they had changed their terms of service. So Voices.com at the time changed their terms of service to very explicitly say they would not be using any audio files uploaded through their site for machine learning or synthesizing voices.
Speaker C: Was that backdated or is that from that point onward?
Unidentified speaker: The terms of service were from that point onward, but they publicly at the time, and in various blog posts and other written areas, have said that they have never used audio files for that. The caveat being that once the audio files are uploaded and sent to a client, it’s possible that the client then could take those audition files and use them. We don’t know and haven’t seen any companies per se who we know are doing that, but over the last ten or so years a lot of these companies have been working in the AI TTS sphere and very potentially could have been using that audio for training. We haven’t seen it yet explicitly that we know of, but the inability to track our audio files and to know where the audio goes once we’ve emailed it out or uploaded it through a website makes that a real possibility.
Unidentified speaker: So to give this some perspective, is there any sort of copyright law or anything in place at the moment that protects someone from having their voice turned into an AI voice without their permission?
Unidentified speaker: That’s a great question. Short answer is no. We’ve been working with the Copyright Office.
I gave a presentation to the FTC last week at a roundtable. I’ve spoken with multiple lawyers and people across the country and across the world. We’re working with a group in Europe to help with the EU AI Act. Most actors, voice actors, we give away our files as a work for hire, and the understanding is that that audio will be used for this very specific project. Unfortunately, that also basically gives the person we’ve given the audio file to the copyright and the ability to do whatever they want to with that. We’re currently looking at the possibility that since most voice actors record from home, from like a music perspective, we could theoretically be the owners of the master files, because a lot of times there’s no contracts that are signed. But we’re in the early stages of exploring that. Copyright law does not currently protect the voice actor. It protects the copyright holder, which 99% of the time is the company who hired us. Wow. The only other thing we could fall back on is right of publicity. But those laws are only really in California and New York, where the strongest laws are, and then there’s possibly biometric and privacy laws, but those really are only strongest in Illinois and Texas, of all places. Privacy rights.
Speaker C: So is there a way of, you know, we’ve talked about this before, having some kind of fingerprint, you know, so if anybody uses your voice, it’s quite obvious it’s yours because it shows some kind of a fingerprint in the waveform, potentially? I don’t know how that would work, but there must be someone who’s got something.
Unidentified speaker: Nobody does currently that we know of. I’ve spoken with people at DARPA and at NASA. We are currently working. We’ve gone very deep in this conversation to try and figure out a way to do this, what we can do.
And actually, I’m working on this with another company that I started about three years ago to create voice prints that we can then use to match a human voice to a synthetic voice and also to match a human voice to a human voice to say that they’re the same person. You could theoretically, if we can get that software in place, lock down a voice. So if somebody tries to upload it to a synthetic voice site, it would be locked and would be flagged. Essentially DRM for voice is what we’re trying to do. But the only thing really that you could do that might stay is some kind of spread spectrum watermarking that you could do within that. But it’d have to be embedded so deeply in there that you could rip this into Pro Tools or rip it into something else, right, and transfer it between audio files or different DAWs and strip out. If it’s frequency, then it’s very easy to pull out frequencies. Most of the stuff that’s out there, watermarking, is pretty easy to bypass currently.
Speaker C: Well, you just have to get clarity or something and it’s gone.
Unidentified speaker: Yeah, exactly. Yeah.
Unidentified speaker: So what’s the compromise future from your perspective then? Would it be a point where Darren Robertson is selling his voice sample disc to AI people? Or would you rather not see AI at all?
Unidentified speaker: I’m a musician primarily. I was in Seattle in the ... was on the cusp of playing live and really exploring music when Napster and everything hit. And from a consumer perspective, that was one of the most eye-opening things that I’d ever seen. The ability to now have access to a massive amount of audio that I’d never heard before. Not anti-technology by any means and definitely not anti-AI. I’ve worked with a synthetic voice company. I know people who are working with synthetic voice companies.
The issue right now is that a lot of the foundational models, a lot of the foundations of these AI generative engines, synthetic voice engines, are built on somebody’s data, and more than likely they are being built on the literal voices of voice actors. So we become the foundation of a lot of these models. What NAVA has been asking for is consent, control and compensation. And it’s the same thing that all artists are asking for, musicians are asking for, models are asking for, if you’re going to take my data and what makes the essence of me: my voice or my image, or the way I walk or the way that I speak, the cadence that I have, the way that I stand. All of those things are very personal to all of us individually. And that is basically being turned into data, right? What makes us is being turned into data and put into these synthetic voice engines or generative AI to produce images and videos and photos and voices that are based on real humans and sound like and look like real humans. So we try to find consent, control and compensation for those, and really consent to say yes or no, you can make a synthesized version of my voice.
Speaker C: So if we’re talking about AI voices, we’re not going to stop it. It’s already out. I mean, the thing’s going to happen.
Unidentified speaker: They’re out there. Yes, correct.
Speaker C: How do you perceive we control it?
Unidentified speaker: The only thing that we can currently do right now, and this is part of what this discussion at the FTC came up with last week, is really, I think, from a consumer perspective, a consumer safety perspective. I think that there is so much danger in disinformation and false information and just absolute lies that are out there that can now be easily replicated and put into a video or an audio or something that is not very easily detectable. It’s almost impossible to tell a synthetic voice from a human voice when it’s done well. It’s hard to tell a synthetic image from a factual image.
Laws and legislation, I think, is currently the only thing that we can really do on a broad scale to help stem the tide of the damage that’s been done already. And going forward, we have to have very clear contracts and agreements in place that either do or do not allow for somebody’s voice to be used in a synthetic voice or generative AI. That’s partially what the WGA and SAG-AFTRA strikes are about. AI is at the top of that list of concerns, and it’s a top concern for anybody who is in the arts right now that creates anything, that any of that could be put into a synthetic engine of some kind and have a new creation made out of that. We just came out of a pandemic where we relied on artists, on musicians and filmmakers and actors and voice artists. And the first thing we do out of that pandemic is try and replace those people. That’s really essentially what’s happening. There is some accessibility. There are places where there is an argument to be made for doing things that a human couldn’t generate. But when it’s done to replace somebody, when it’s done just to save money, that’s where the concern comes in. And we know that those savings are not going to be passed along to the consumer. A video game is not going to be cheaper for somebody to buy because it has synthetic voices. A movie is not going to be cheaper at the movie theater because it’s synthetically generated. So they cut out the people who actually make this work, and then that money just goes to the company that gets to save that money at the expense of everybody.
Unidentified speaker: Why would Voices.com say the quiet part out loud? They’re a bit like Uber basically going like, hi, please work for us. Make us money, and then we’re going to put all of our money into figuring out how to make driverless cars so we don’t need you. See ya, bitches.
Speaker C: Yeah, exactly.
Unidentified speaker: They did.
I don’t know if anybody saw the news last week, but David Ciccarelli is out, and Morgan Stanley, is it Morgan Stanley who was the venture capital, whoever gave them the money, they replaced him at the top. My guess is that they either went all in on AI and it’s not paying off, or they weren’t seeing ... this is all purely speculation. This is just what we can have for conjecture in this place. So I know nothing for fact, but they invested a massive amount of money in them, what, $18 million, $15 to $18 million, seven years ago. And if they went all in on AI, I don’t know if anybody’s heard.
Unidentified speaker: They lost all of it.
Unidentified speaker: Yeah, they lost all of it. Have you guys heard their AI? The Voices AI, their samples. They’re terrible.
Unidentified speaker: Never heard it.
Unidentified speaker: They’re terrible. They are terrible. But they were done with consent, control and compensation.
Unidentified speaker: Is it better or worse than voicealo?
Unidentified speaker: I haven’t heard that one. But most of what I deal with, ElevenLabs and PlayHT are the two that I use most often, for example, for samples. And both of those are phenomenal. They are really good. And Voices AI is nowhere. It sounds about ten years old, the technology, from what I heard, and some of the voice actors who had their voices synthesized, who participated in this, are not happy with how that voice sounds.
Speaker C: Yeah, I was going to say, just to lighten up a bit, there’s an old gag that could actually be modernized and you can ask the question, how many voiceover artists does it take to change a light bulb? And the answer is none. You get an AI to do it.
Unidentified speaker: That was a drummer joke.
Speaker C: I know, we can update it.
Unidentified speaker: It just hasn’t happened quite yet.
Unidentified speaker: I was going to say. Yeah, exactly. I’ve heard that one before somewhere. So the thing that occurs to me though, Tim, is it’s great that we’re protecting voice actors and all that sort of stuff, but obviously there’s a crapload more voice samples out there. I mean, how many podcasts are there out there?
And YouTube content creators and all the rest of it? All these places they could go mining for voices.
Unidentified speaker: How do we protect, you know ... Currently we can’t. Currently there is no protection for, you know ... This goes into, you know, we talk about this being more ... anybody who has recorded audio is at risk. And voice actors just happen to be the ones who make a living off of our recorded voice most of the time, but that doesn’t mean that others aren’t making a living off of what they have on the podcast and YouTube. And even those who are just hobbyists at this, who just have a little bit of recorded audio, some Twitch stream. I can currently record all the audio off this and make a synthetic voice of anybody on this conversation right now, as can anybody who’s listening to it.
Speaker B: Right.
Unidentified speaker: And it’s easy.
Unidentified speaker: What work does it really kill, truly kill? Like in the short term? I can see it taking out a crapload of e-learning and other things like that.
Unidentified speaker: It takes that out. That’s any of the stuff that is purely factual, a lot of times, factual stuff where I just need information read. A lot of that stuff gets taken out right away, which, if you can license your voice to that, then you can still have a career as a voice actor. One of the things that I think is the dangerous part of this, and this goes for any of the arts, is that a lot of these places that are going to be replaced first are where a lot of voice actors, a lot of artists, learn. This is how you cut your teeth and you come up through the industry. You do the free jobs, you do the cheap jobs, you do the entry level jobs. Those entry level jobs go away right away because it’s cheaper. But a lot of the time it’s better. Unfortunately, it is better.
The audio quality of a voice actor who’s just starting out, who is using a USB mic in their living room with hardwood floors and the refrigerator running and the AC, is going to be at risk for sure, and I think rightfully so.
Unidentified speaker: I’ll give you another one, the company that doesn’t hire anybody, right? And they just see the AI voices as, it’s better than having Mary Jo read it because it’s going to take her a long time and whatever. And so just type it into the system. And there’s our video. It’s our instruction video on how to use our garden hose, absolutely, or something. And yeah, it’s going to take out ... I don’t see it initially taking out real voice acting, but I agree, just like conveying voice, it’s just going to ... plenty of AI voices I’d rather hear.
Unidentified speaker: Instead of the president of the auto workers union, for example.
Unidentified speaker: One of the things that we’ve seen, I think, that’s been most hopeful in this is that those who work with voice actors already don’t want to replace voice actors. Those people who are already working in the creative sphere, who are the producers, who are the directors, they’re the people who say, I would never replace a voice actor. But it’s all of those people who don’t, who just need a voice actor for this one time, need a voice actor for this one training video, this one thing here, where they would go to a friend or a referral or wherever it might be, to the online casting site, and cast somebody who’s new. They’re not going to do that anymore. And it’s very hard to tangibly find the damage from this because we’re not seeing auditions going out where they’re saying we’re going to audition a human versus an AI, and the AI gets the job. They’re just not even going to bother to do the auditions in the first place. And we’re never even going to know if it was a synthetic voice. So this is partially why, again, laws and legislation.
There’s a Senate bill out that NAVA is endorsing, Senate Bill 2691, which is the AI Labeling Act of 2023, which is going to require anything AI generated to be labeled, marked, the same thing as you would with food. I think consumers have a right to know if what they’re taking in is synthetic or human, whether it’s emotional, spiritual, food. We have a right to know what we’re interacting with, I think.
Unidentified speaker: I want to know when I’m in the Matrix, personally.
Unidentified speaker: Right, exactly. Yeah, you want to know you’re in the Matrix.
Unidentified speaker: I’m sure it puts to bed a lot of political issues. I mean, you know, imagine sitting there listening to a radio broadcast of Joe Biden declaring war on Russia when it’s actually not really, you know, there’s all sorts of issues that this raises.
Unidentified speaker: Well, that as well, but also it raises the possibility of doubt. And the Donald Trump tape from years ago, he could say, well, I never said that, that’s a synthetic voice, prove that it’s not a synthetic voice. Prove I actually said that. Right, so you’re running into proving both sides of that, and we’re coming into an election.
Unidentified speaker: All sorts of possibilities raised, considering some of the possible candidates, right?
Unidentified speaker: Yes, absolutely.
Speaker C: Is there a way for a voice actor to say, okay, I’m going to actually upload to, say, someplace where you can license a voice from, you actually give them all the information of your voice, and then there’s a license fee? If people want to grab it and use it for something, then they pay you a license fee, the same as you would do with library music.
Unidentified speaker: Absolutely. I’ve been pushing that example for a while. I think that one of the ways that both Europe with the GDPR and the FTC are approaching this is that we don’t need to make new laws or new regulations. We just need to enforce the ones that exist and put this into use. The precedent, I think, of music licensing can directly go into voice. You have a licensing fee, you have a usage fee, you have a generation fee.
If you generate new content from this, then I get paid a certain amount for the generation. There’s companies out there that do that. VocaliD, Veritone, was one of the earlier ones that did that. And there’s a licensing fee that they have in place for that. And the actors who do that have the consent to know where their voice goes. We’re working with a TTS company who reached out to us, and we’re helping them with this exact same thing, helping to license their deployments so that the voice actor knows where their voice is being used, but also gets paid for the original creation of that model and then knows where the voice goes from there. There’s lots of possibilities. Unfortunately, none of those things really exist right now. The only possibility that happens is people can just upload your voice anywhere they want to create a synthetic voice and use it. And there’s nothing really stopping anybody, even the AI sites. Right now, all you have to do is click a button that says, yes, I have the right to upload this voice.
Unidentified speaker: And at what point do you stop?
Unidentified speaker: I mean, at what point do you stop anybody?
Unidentified speaker: If you blend two people’s voices or three people’s, at a certain point, you’re, like ...
Unidentified speaker: It becomes, you know, I mean, that’s what Siri, Alexa, Google Voice, those are, you know, they’re all blended voices, multiple people put in together to create a new voice. So now you’re talking about songwriting splits, right? Now you’re going to talk about splits and points on a song, right? So I’ve got three voices. Do we all get an equal split of the usage of that voice, or does it not become an issue because it doesn’t sound like anybody, therefore there’s no conflict? Voice actors, you’re also going to run into conflict, right? What if my human voice is doing Pepsi? My synthetic voice can’t do Coca-Cola. And if it does, who’s going to be held responsible for it? Or a voice that just sounds like me?
At what point, how do you draw the line there? How do you even know? This voice sounds a lot like me. Is it my voice or is it not my voice? It’s a voice that sounds a lot like me. Do I get into conflict because of the similarity?
Unidentified speaker: It’s just like actors are impersonated. It has to be like, all voices are synthesized, right?
Unidentified speaker: Yeah, exactly.
Unidentified speaker: From a synthetic voice saying that all voices are synthesized, including this voice.
Unidentified speaker: Yeah. Right.
Speaker C: But can you see, like, if you look into the future of the role of the agent, will the agent all of a sudden become a library of voices that can potentially be used for AI? Would that be the shift?
Unidentified speaker: I honestly have no idea. I think there’s going to be ... we’re already starting to see a split of human only, no AI, and then those who are willing to have a conversation with it and explore it. I’m not by any means advocating to replace humans with AI voices, but we also know that this technology has been around for years, right? And it’s been being built for the last 20 years, ten years solidly for synthetic voices. It’s here, and we can just pretend that it’s not going to have an impact and hope that it doesn’t have an impact. Or we can go directly to these companies, which is what we’ve been doing. I’ve been speaking with the CEOs of these companies to try and talk with them about, great, this is why voice actors are concerned. This is why artists in general are concerned. This is what we’re concerned about. And we know you have a lot of money. ElevenLabs, they’re worth $100 million, or they got an investment of $100 million, a month ago or so. Right. They have the money to pay the voice actors fairly for the foundation. And if they can license that, the better audio they have, the better foundational model they can create. So those voice actors who want to do that have the right to say yes. It’s the right to say yes as much as it is the right to say no.
You should have the right to say yes if you want to, I think.
Speaker C: I reckon there’s going to be a scramble with voice actors all trying to get themselves uploaded onto one of these business sites so they can be licensed out.
Unidentified speaker: Yeah, some of them have. Right now, there’s really no clear understanding of what that licensing fee would be. We’ve seen similar jobs on the casting sites where one job is paying $500 and the next job is paying $20,000. And they don’t appear to be any different. A lot of people who are casting don’t have enough information to know about where those files are going to be used. Voice actors don’t really know enough about how they’re going to be used either to know what to ask, and agents don’t know what to ask either. There’s just so many unknowns out there about what to even ask to come up with what a fair usage would be. Because there’s potentially so many uses out there that we can’t even comprehend right now, that we can’t even imagine, that they could be used for. So it’s really hard to tell. A generation fee is kind of what we’re looking at, what we’re really interested in.
Speaker C: Well, it’s going to be interesting to watch how this all unfolds, but it’s ...
Unidentified speaker: A massive can of worms, isn’t it?
Speaker C: It is incredible.
Unidentified speaker: It is a massive can of worms. Yeah. Visual artists are being hit massively, obviously, right now. They’re some of the hardest hit because those images are so distinctive and the styles are so distinct that when they come out, it’s obvious it was trained on those artists. There’s two lawsuits, multiple lawsuits, against AI companies right now from authors who have had their books ingested into these and used as foundational models to train these things. And the thing is, once it’s trained, you can’t untrain it.
Unidentified speaker: Well, AP, was it
you saying that there’s a film in the can starring James Dean?
Speaker C: Yeah, that’s what I’m told is sitting there waiting to go. So James Dean is going to be a co-star of a new, you know ... they’ve used motion capture. So they’ve got an actor that actually can walk and move like James Dean. They’ve just done a motion capture and then they built James Dean over the top of his skeleton, so to speak. And if that thing becomes a hit, you can see they’re going to drag them all out.
Unidentified speaker: Right.
Unidentified speaker: And then Elvis really isn’t dead.
Unidentified speaker: Yeah, right, exactly. We talk about that for VO. Like speech to speech, too.
Unidentified speaker: Well, that’s the thing. How would you license that, Tim?
Unidentified speaker: It’s performed, you know, the James Dean performed by so-and-so. You want to give the motion capture person the credit for it. Like speech to speech, I could, you know ... Carin Gilfry, vice president, uses an example a lot, which is she could narrate The Audacity of Hope and then put Barack Obama’s voice over it. So it would be the voice of Barack Obama performed by Carin Gilfry.
Unidentified speaker: Right.
Unidentified speaker: So as read by Barack Obama, performed by Carin Gilfry. Yeah. As puppetry.
Speaker B: Yeah.
Speaker C: If I was the ad agency for 7-Eleven, I would actually get an AI of Elvis and have him in a 7-Eleven. And finally, it’s true.
Unidentified speaker: Slurpee in one hand, donut in the other. Is that what you’re saying?
Unidentified speaker: When does Elvis become public domain?
Unidentified speaker: A long time. Long time. It’s a space to watch, isn’t it? It really is.
Speaker C: And the space will be filled by AI.
Unidentified speaker: Yeah, it’s interesting. And I think we’ve got three months left. I think we have about three months before something dramatically ...
Unidentified speaker: So you think there is a time frame on this? Because I was actually sitting here thinking, God, how long will this take to sort? But you’re saying you think there might be a time frame on it?
Unidentified speaker: I think, if anything, any legitimate and strong protections need to be in place before the end of the year.
By the end of the year, it’s going to be too late for us to have any kind of protection. The technology is moving too quickly. It’s exponential. And it’s going to be beyond our control, or potentially beyond the control of those who actually are running the systems. At one point, without fully taking your entire system offline and destroying your models, it could potentially get to the point where there is no control, there is no ability to consent, there is no ability to even know whose voice is being used. There’s just a multitude of generic voices where one company gets paid when you use their voice, but nobody has any idea who the human behind it is or where the content came from anymore.
Unidentified speaker: Watch this space, people.
Speaker C: Yes, indeed. Indeed. Exactly. By the way, this is actually really not me. I’m on holiday.
Unidentified speaker: This is my ... not hard to do.
Speaker B: Well, that was fun. Is it over?
Speaker A: The Pro Audio Suite, with thanks to Tribooth and Austrian Audio. Recorded using Source Connect, edited by Andrew Peters and mixed by Voodoo Radio Imaging, with tech support from George the Tech Whittam. Don’t forget to subscribe to the show and join in the conversation on our Facebook group. To leave a comment, suggest a topic or just say g’day, drop us a note at our website, theproaudiosuite.com.
    




