At the end of April, the artificial intelligence development firm OpenAI released a new neural net, Jukebox, which can create mashups and original music in the style of over 9,000 bands and musicians.
Alongside it, OpenAI released a list of sample tracks generated with the algorithm that bend music into new genres or even reinterpret one artist’s song in another’s style — think a jazz-pop hybrid of Ella Fitzgerald and Céline Dion.
It’s an incredible feat of technology, but Futurism’s editorial team was unsatisfied with the tracks OpenAI shared. To really kick the tires, we went to CJ Carr and Zack Zukowski, the musicians and computer science experts behind the algorithmically-generated music group DADABOTS, with a request: We wanted to hear Frank Sinatra sing Britney Spears’ “Toxic.”
And boy, they delivered.
An algorithm that can create original works of music in the style of existing bands and artists raises unexplored legal and creative questions. For instance, can the artists that Jukebox was trained on claim credit for the resulting tracks? Or are we experiencing the beginning of a brand-new era of music?
“There’s so much creativity to explore there,” Zukowski told Futurism.
Below is the resulting song, in all its AI-generated glory, followed by Futurism’s lightly-edited conversation with algorithmic musicians Carr and Zukowski.
Futurism: Thanks for taking the time to chat, CJ and Zack. Before we jump in, I’d love to learn a little bit more about both of you, and how you learned how to do all this. What sort of background do you have that lent itself to AI-generated music?
Zack Zukowski: I think we’re both pretty much musicians first, but also I’ve been involved in tech for quite a while. I approached my machine learning studies from an audio perspective: I wanted to extend what was already being done with synthesis and music technology. It seemed like machine learning was obviously the path that was going to make the most gains, so I started learning about those types of algorithms. SampleRNN is the tool we most like to use — that’s one of our main tools that we’ve been using for our livestreams and our Bandcamp albums over the last couple years.
CJ Carr: Musician first, motivated in computer science to do new things with music. DADABOTS itself comes out of hackathon culture. I’ve done 65 hackathons, and Zack and I together have won 15 or so. That environment inspires people to push what they’re doing in some new way, to do something provocative. That’s the spirit DADABOTS came out of in 2012, and we’ve been pushing it further and further as the tech has progressed.
Why did you make the decision to step up from individual hackathons and stick with DADABOTS? Where did the idea come from for your various projects?
CJ: When we started it, we were both interns at Berklee College of Music working in music tech. When I met Zack — for some reason it felt like I’ve known Zack my whole life. It was a natural collaboration. Zack knew more about signal processing than I did, I knew more about programming, and now we have both brains.
What’s your typical approach? What’s going on behind the scenes?
CJ: SampleRNN has been our main tool. It’s really fast to train — we can train it in a day or two on a new artist. One of the main things we love to do is collaborating with artists, when an artist says “hey I’d love to do a bot album.” But recently, Jukebox trumped the state of the art in music generation. They did a really good job.
SampleRNN and Jukebox, they’re similar in that they’re both sequence generators. It’s reading a sequence of audio at 44.1k or 16k sample rate, and then it’s trying to predict what the next sample is going to be. This net is making a decision at a fraction of a millisecond to come up with the next sample. This is why it’s called neural synthesis. It’s not copying and pasting audio from the training data, it’s learning to synthesize.
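To make the "predict the next sample" idea concrete, here is a minimal Python sketch of an autoregressive generation loop. It is not code from SampleRNN or Jukebox: the function `toy_next_sample_distribution` is a hypothetical stand-in for a trained network, and the eight amplitude levels are an assumption chosen to keep the example small. At a 16 kHz sample rate, each sample covers 1/16,000 of a second, or 0.0625 milliseconds, which is why the net has to make a decision every fraction of a millisecond.

```python
import random

SAMPLE_RATE = 16_000  # samples per second, one of the rates mentioned in the interview


def toy_next_sample_distribution(history, levels=8):
    """Hypothetical stand-in for a trained network.

    Returns one probability per quantized amplitude level,
    conditioned on the samples generated so far.
    """
    last = history[-1] if history else levels // 2
    # Favor levels close to the previous sample so the waveform moves smoothly.
    weights = [1.0 / (1 + abs(level - last)) for level in range(levels)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_samples, levels=8, seed=0):
    """Generate audio one sample at a time, feeding each output back in."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        probs = toy_next_sample_distribution(samples, levels)
        # Sample the next amplitude level from the predicted distribution.
        next_sample = rng.choices(range(levels), weights=probs, k=1)[0]
        samples.append(next_sample)
    return samples


# 10 milliseconds of "audio" at 16 kHz is 160 sequential predictions.
audio = generate(SAMPLE_RATE // 100)
```

Nothing in the loop copies stored audio. Every value is drawn from a distribution computed from what came before, which is the sense in which the output is synthesized rather than pasted together.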
What’s different about them is that SampleRNN uses a “Long Short-Term Memory” (LSTM) architecture, whereas Jukebox uses a transformer architecture. The transformer has attention. This is a relatively new thing that’s come to popularity in deep learning, after RNN, after LSTM. It especially took over for language models. I don’t know if you remember fake news generators like GPT-2 and Grover. They use transformer architecture. Many of the language researchers left LSTM behind. No one had really applied it to audio music yet, and that’s the big enhancement for Jukebox. They’re taking a language architecture and applying it to music.
They’re also doing this extra thing, called a “Vector-Quantized Variational AutoEncoder” (VQ-VAE). They’re trying to turn audio into language. They train a model that creates a codebook, like an alphabet. And they take this alphabet, which is a discrete set of 2048 symbols (each symbol is something about music), and then they train their transformer models on it.
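The codebook idea can be sketched in a few lines of Python. This is an illustration of vector quantization in general, not OpenAI's implementation: in a real VQ-VAE both the encoder and the codebook are learned from audio, whereas here the codebook is filled with random numbers and the vector dimension `DIM` is an arbitrary assumption. Only the figure of 2048 symbols comes from the interview.

```python
import random

CODEBOOK_SIZE = 2048  # number of symbols in the "alphabet", per the interview
DIM = 4               # hypothetical vector size, chosen only for illustration

rng = random.Random(0)
# Stand-in codebook: random vectors instead of learned ones.
codebook = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(CODEBOOK_SIZE)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector):
    """Return the index of the codebook entry nearest to the vector."""
    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))


def encode(latents):
    """Turn a sequence of continuous vectors into a sequence of discrete symbols."""
    return [quantize(v) for v in latents]


def decode(symbols):
    """Map each symbol back to its codebook vector."""
    return [codebook[s] for s in symbols]


# Five made-up "audio feature" vectors become five symbols from the alphabet.
latents = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
symbols = encode(latents)
```

Once audio is a sequence of integers between 0 and 2047, it has the same shape as tokenized text, which is what lets a language-style transformer be trained on it.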
What does that alphabet look like? What is that “something about music?”
CJ: They didn’t do that analysis at all. We’re really curious. For instance, can we compose with it?
Zack: We have these 2048 characters, and so we wonder which ones are commonly used. Like in the alphabet we don’t use Zs too much. But what are the “vowels?” Which symbols are used frequently? It would be really interesting to see what happens when you start getting rid of some of these symbols and see what the net can do with what remains. The way we have the language of music theory with chords and scales, maybe this is something that we can compose with beyond making deepfakes of an artist.
What can that language tell us about the underlying rules and components of music, and how can we use these as building blocks themselves? They’re much higher-level than chords — maybe they’re genre-related. We really don’t know. It would be really cool to do that analysis and see what happens by using just a subset of the language.
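The kind of analysis described here, finding the "vowels" and then working with only a subset of the alphabet, might start out looking like the Python sketch below. The symbol sequence is invented for illustration, and both function names are hypothetical. Filtering an existing sequence is also only a crude proxy for the experiment Zukowski describes, which would mean restricting what the net is allowed to generate.

```python
from collections import Counter


def symbol_frequencies(symbol_sequence):
    """Count how often each codebook symbol appears in a sequence."""
    return Counter(symbol_sequence)


def restrict_to_most_common(symbol_sequence, keep):
    """Drop every symbol that is not among the `keep` most frequent ones."""
    allowed = {s for s, _ in Counter(symbol_sequence).most_common(keep)}
    return [s for s in symbol_sequence if s in allowed]


# Made-up sequence of symbols drawn from a 2048-entry alphabet (values 0 to 2047).
sequence = [5, 17, 5, 2047, 5, 17, 3, 5, 17, 42]

counts = symbol_frequencies(sequence)
top_two = counts.most_common(2)  # [(5, 4), (17, 3)]: the "vowels" of this toy sequence
filtered = restrict_to_most_common(sequence, keep=2)  # [5, 17, 5, 5, 17, 5, 17]
```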
CJ: They’ve come up with a new music theory.
Well, it sounds like the three of us have a lot of the same questions about all this. Have you started tinkering with it to learn what’s going on?
CJ: We’ve just got the code running. The first example is this Sinatra thing. But as we use this more, the philosophical implications here are that as musicians, we know intuitively that music is very language-like. It’s not just waves and noise, which is what it looks like at a small scale, but when we’re playing we’re communicating with each other. The bass and the drummer are in step, strings and vocals can be doing call-and-response. And OpenAI was just like “Hey, what if we treated music like language?”
If the sort of alphabet this algorithm uses could be seen as a new music theory, do you think this will be a tool for you two going forward? Or is it more of an oddity to play around with?
CJ: Maybe I should correct myself. Instead of being a music theory, these models can train music theory.
Zack: The theory isn’t something that we can explain right now. We can’t say “This value means this.” It’s not quite as human interpretable, I guess.
CJ: The model just learns probabilistic patterns, and that’s what music theory is. It’s these notes tend to have these patterns and produce these feelings. And those were human-invented. What if we just have a machine try to discover that on its own, and then we ask it to make music? And if it’s good at it, probably it’s learned a good quote-unquote “music theory.”
Zack: An analogy we thought of: Back in the days of Bach, and these composers who were really interested in having counterpoint — many voices moving in their own direction — they had a set of rules for this. The first melodic line the composer builds off is called cantus firmus. There was an educational game new composers would play — if you could follow the notes that were presented in the cantus firmus and guess what harmonizing notes were next, you’d be correct based on the music of the day.
We’re thinking this is kind of the machine version of that, in some ways. Something that can be used to make new music in the style of music that has been heard before.
I know it’s early days and that this is speculative, but do you have any predictions for how people might use Jukebox? Will it be more of these mashups, or do you think people will develop original compositions?
CJ: On the one hand, you have the fear of push-button art. A lot of people think push-button art is very grotesque. But I think push-button art, when a culture can achieve this — it’s a transcendent moment for that culture. It means the communication of that culture has achieved its capacity. Think about meme generators — I can take a picture of Keanu Reeves, put in some inside joke and send it to my friends, and then they can understand and appreciate what I’m communicating. That’s powerful. So it is grotesque, but it’s effectual.
On the other side, you’ll have these virtuosos — these creators — who are gonna do overkill and try to create a medium of art that’s never existed before. What interests us are these 24/7 generators, where it can just keep generating forever.
Zack: I think it’s an interesting tool for artists who have worked on a body of albums. There are artists who don’t even know they can be generated on Jukebox. So, I think many of them would like to know what can be generated in their likeness. It can be a variation tool, it can recreate work for an artist through a perspective they haven’t even heard. It can bend their work through similar artists or even very distantly-stylized artists. It can be a great training tool for artists.
You said you’d heard from some artists who approached you to generate music already — is that something you can talk about?
CJ: When bands approach us, they’ve mostly been staying within the lane of “Hey, use just my training data and let’s see what comes out — I’m really interested.”
Fans though, on YouTube, are like “Here’s a list of my four favorite bands, please make me something out of it.”
So, let’s talk about the actual track you made for us. For this new song, Futurism suggested Britney Spears’ “Toxic” as sung by Frank Sinatra. Did the technical side of pulling that together differ from your usual work?
CJ: This is different. With SampleRNN, we’re retraining it from scratch on usually one artist or one album. And that’s really where it shines — it’s not able to do these fusions very well. What OpenAI was able to do — with a giant multimillion-dollar compute budget — they were able to train these giant neural nets. And they trained them on over 9,000 artists in over 300 genres. You need a mega team with a huge budget just to make this generalizable net.
Zack: There are two options. There’s lyrics and no lyrics. No lyrics is sort of like how SampleRNN has worked. With lyrics it tries to get them all in order, but sometimes it loops or repeats. But it tries to go beginning to end and keep the flow going. If you have too many lyrics, it doesn’t understand. It doesn’t understand that if you have a chorus repeating, the music should repeat as well. So we find that these shorter compositions work better for us.
But you had lyrics in past projects that used SampleRNN, like “Human Extinction Party.” How did that differ?
CJ: That was smoke and mirrors.
Zack: That was kind of an illusion. The album we trained it on had vocals, so some made it through, too. We had a text generator that made up lyrics whenever it heard a sound.
In a lot of these Jukebox mashups, I’ve noticed that the voice sounds sort of strained. Is that just a matter of the AI-generated voice being forced to hit a certain note, or does it have something more to do with the limitations of the algorithm itself?
Zack: Your guess sounds similar to what I’d say. It was probably just really unlikely that those lyrics or the phonemes, the sounds themselves of the words, showed up in a similar way to how we were forcing it to generate those syllables. It probably heard a lot more music that isn’t Frank Sinatra, so it can imagine some things that Frank Sinatra didn’t do. But it just comes down to being somewhat different from any of the original Frank Sinatra texts.
When you were creating this rendition of Toxic, did you hit any snags along the way? Or was it just a matter of giving the algorithm enough time to do its work?
CJ: Part of it is we need a really expensive piece of hardware that we need to rent on Amazon Cloud at three dollars per hour. And it takes — how long did it take to generate, Zack?
Zack: The final one I had generated took about a day, but I had been doing it over and over again for a week. You have so little control that sometimes you just gotta go again. It would get a few phrases and then it would lose track of the lyrics. Sometimes you’d get two lines but not the whole chorus in a row. It came down to luck — waiting for the right one to come along.
It could loop a line, or sometimes it could go into seemingly different songs. It would completely lose track of where it was. There are some pretty wild things that can happen. One time I was generating Frank Sinatra, and it was clearly a chorus of men and women together. It wasn’t even the right voice. It can get pretty ghostly.
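For a rough sense of what those numbers add up to, the arithmetic can be written out in Python. The three-dollar hourly rate is from the interview. Treating "about a day" as exactly 24 hours, and the week of retries as continuous use of the machine, are assumptions made here, so the second figure is an upper bound and not a reported cost.

```python
HOURLY_RATE_USD = 3.0  # rental price quoted in the interview


def rental_cost(hours, rate=HOURLY_RATE_USD):
    """Cost in dollars of renting the machine for a given number of hours."""
    return hours * rate


one_day_run = rental_cost(24)        # 72.0, assuming "about a day" means 24 hours
week_of_retries = rental_cost(7 * 24)  # 504.0, assuming the machine ran nonstop all week
```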
Do you know if there are any legal issues involved in this kind of music? The capability to generate new music in the style or voice of an artist seems like uncharted territory, but are there issues with the mashups that use existing lyrics? Or are those more acceptable under the guise of fair use, sort of like parody songs?
CJ: We’re not legal people, we haven’t studied copyright issues. The vibe is that there’s a strong case for fair use, but artists may not like people creating these deepfakes.
Zack: I think it comes down to intention, and whatever the law decides they’ll decide. But as people using this tool, artists, there’s definitely a code of ethics that people should probably respect. Don’t piss people off. We try our best to cite the people who worked on the tech, the people who it was trained on. It all just depends how you’re putting it out and how respectful you’re being of people’s work.
Before I let you go, what else are you two working on right now?
CJ: Our long-term research is trying to make these models faster and cheaper so bedroom producers and 12-year-olds can be making music no one’s ever thought of. Of course, right now it’s very expensive and it takes days. We’re in a privileged position of being able to do it with the rented hardware.
Specifically, what we’re doing right now — there’s the list of 9,000-plus bands that the model currently supports. But what’s interesting is the bands weren’t asked to be a part of this dataset. Some machine learning researchers on Twitter were debating the ethics of that. There are two sides of that, of course, but we really want to reach out to those bands. If anyone knows these bands, if you are these bands, we will generate music for you. We want to take this technology, which we think is capable of brand-new forms of creativity, and give it back to artists.
More on DADABOTS: Researchers Trained a Neural Net Using a Cannibal Corpse Album
Original post: https://futurism.com/mind-melting-ai-frank-sinatra-toxic-britney-spears