These days roughly 20% of the songs coming through our platform for promotion are AI-generated. Roughly 75% of them are honest and declare their AI usage - but another 25% try to hide it. Some of them are actually writing scripts to "clean" their audio so that it can bypass detection.
Don't try to solve an unsolvable problem; you'll end up hurting real users far more than you might imagine. Picture new, enthusiastic users trying your platform and getting hit with an AI label because of inevitable false positives.
'Detecting AI' is not a problem with real solutions; the only avenue is something supply-side like SynthID. But that harms users too, by introducing further barriers for indie artists.
I train music generation models. They are trivial to detect. In fact, detecting them and then training them to evade the detection model is a big part of training them! But the detectors win instantly without some hardcore regularization: turn that off and you've instantly got a perfect classifier.
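To make that concrete, here's a toy sketch (mine, not the commenter's actual pipeline) of why raw model output separates so cleanly: I fake a decoder artifact as simple band-limiting above 4 kHz, and a single spectral-centroid feature with a threshold already acts as a near-perfect classifier. The 4 kHz rolloff and every number here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
SR, N = 16000, 16000  # 1 second of "audio" at 16 kHz

def real_clip():
    # "real" recording stand-in: broadband noise, full high-frequency content
    return rng.standard_normal(N)

def generated_clip():
    # toy stand-in for a decoder artifact: energy rolls off above ~4 kHz,
    # loosely mimicking the band-limiting a neural vocoder can leave behind
    x = rng.standard_normal(N)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(N, 1 / SR)
    spec[freqs > 4000] *= 0.2
    return np.fft.irfft(spec, n=N)

def spectral_centroid(x):
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / SR)
    return (freqs * spec).sum() / spec.sum()

# "training": pick a threshold halfway between the two classes' mean centroids
train_real = [spectral_centroid(real_clip()) for _ in range(50)]
train_gen = [spectral_centroid(generated_clip()) for _ in range(50)]
threshold = (np.mean(train_real) + np.mean(train_gen)) / 2

def is_generated(x):
    return spectral_centroid(x) < threshold

# evaluate on fresh clips
hits = sum(is_generated(generated_clip()) for _ in range(100))
false_alarms = sum(is_generated(real_clip()) for _ in range(100))
print(hits, false_alarms)
```

A real detector learns far subtler fingerprints than this, but the shape of the argument is the same: as long as the generator leaves any systematic statistical residue, a classifier finds it.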
This isn't like text classification: the signal is many orders of magnitude higher bitrate, so many more corners need to be cut. It's likely going to be nearly impossible, or at least not remotely worth it, to generate an audio signal that is truly undetectable in the foreseeable future.
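A quick back-of-envelope on the bitrate gap (my numbers; the text-side figures are rough reading-speed assumptions, not from the thread):

```python
# raw bitrate of CD-quality audio vs. plain English text
audio_bps = 44_100 * 16 * 2        # sample rate * bit depth * stereo channels
text_bps = 250 * 5 * 8 / 60        # ~250 words/min * ~5 chars/word * 8 bits, per second
ratio = audio_bps / text_bps
print(audio_bps, round(text_bps))  # 1411200 167
print(f"audio carries ~{ratio:,.0f}x more raw bits per second")
```

That's roughly four orders of magnitude more signal per second for a generator to get statistically "right" everywhere at once.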
You are right, the output of a model that generates music directly is, for now, easy to categorize as AI.
But this big flux of AI-generated music online isn't really that. It's a tiny bit of autogenerated stuff and a whole lot of automatically remixed stuff. The reason it can't be easily classified as AI is that quite a bit of human-produced music is also exactly that, and you'd just shut out real users.
Today. Trying to detect AI is like extracting water from puddles in a lake that is quickly drying up. What's the point in the short term if it's impractical in the long term? It will catch some low-hanging fruit in the best case, and produce false positives in the worst.
My point is you should consider creating truly undetectable audio end to end with AI to be effectively impossible for the foreseeable future (i.e., I would bet money it is still trivially detectable five years from now). It won't be detectable to humans, though, only models.
in the broad strokes of ai-generated, i wouldn't be so sure.
if the ai picked a bunch of samples, combined them together, and mastered via an mcp to a DAW, how is that particularly distinguishable from a person doing the same thing badly?
i can see how an llm generating pictures of spectrograms is easy to spot, but much less so with tool following.
even worse if you use a vla to have it actually play the guitar and use the recording as a sample.
there's some time and setup to make it happen, sure, but somebody could put all that in a studio and expose an mcp.
This is an aside, but thank you for doing this work! As a musician who plays real instruments and submits real songs to Submithub, it's nice to know that hard work is going into validation and prevention of scammers passing off AI as their own talent. Keep fighting the good fight.
"AI detectors" are fun like horoscopes are fun, until they flag your music as AI generated, and distribution channels blacklist you and your label sues you. On the bright side, you can sue the creator of the AI detector in return.
I've had my digital art flagged a few times for various reasons (automatic copyright-infringement and NSFW filters), so this is nothing new; in particular, the flagged artwork blocked the upload of some artists' songs. The key thing is to have a reasonable appeal process. In all cases we got an automated approval after appeal, but it can introduce an untimely delay.
Honestly, I hope an AI filter would do much better in terms of false positives than the aforementioned ones, if only because it should be easier via statistical methods.
The only reason you're saying that is because you haven't tried to build such a detector yourself. It's not like text where it's impossible to tell reliably if something's AI generated or not, from a technical perspective it's very trivial to detect anything coming straight out of a Suno/Udio prompt.
Nobody has open-sourced their detection algorithm, since that would just trigger a cat-and-mouse game between Suno/Udio and the detection platform (and Suno/Udio have way more VC money than you do), but plenty are sold as a service and work very reliably.
> …from a technical perspective it's very trivial to detect anything coming straight out of a Suno/Udio prompt.
It's trivial to vibe-code something that detects watermarked output and accidental model fingerprints. But next week the watermarks will be defeated, and the accidental fingerprints will change and ultimately disappear. It's not possible to generally solve the "To what degree is this audio AI generated?" problem, any more than it has been to solve the same problem for text and images. https://mitsloanedtech.mit.edu/ai/teach/ai-detectors-dont-wo...
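To illustrate both halves of that claim, here's a toy spread-spectrum watermark (entirely my own construction; it is not how Suno/Udio or SynthID actually mark output, and every parameter is invented): embedding and detection are a few lines, and equally mild post-processing visibly erodes the mark.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 48_000  # a few seconds of toy "audio"

# pseudorandom +/-1 key shared between the watermarker and the detector
key = np.random.default_rng(42).choice([-1.0, 1.0], size=N)

def embed(audio, strength=0.05):
    # add the key at low amplitude (spread-spectrum style)
    return audio + strength * key

def detect(audio):
    # normalized correlation with the key; near zero for unmarked audio,
    # near the embedding strength for marked audio
    return float(np.dot(audio, key) / len(audio))

clean = rng.standard_normal(N)
marked = embed(clean)

print(detect(clean))   # ~0
print(detect(marked))  # ~0.05

# the cat-and-mouse part: even a naive 5-tap moving average erodes the mark
filtered = np.convolve(marked, np.ones(5) / 5, mode="same")
print(detect(filtered))  # roughly a fifth of the marked score
```

Detecting a known, cooperative watermark is the easy direction; keeping it robust against an adversary who filters, resamples, or re-encodes the audio is the part that never stays solved.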
You're discussing pure hypotheticals, I'm discussing what you can do today with very little effort. If that changes, it changes, but so far it's trivially easy.
The question I'm more interested in is why other music streaming services are not interested in doing this trivially easy work to get rid of spam, even if it's just for the short run as you assume it will be.
Has this happened to you? Or anyone you know? Or do you know of a lawsuit by a label against an artist for making AI music, followed by a lawsuit by that same artist against an AI detector for flagging a false positive? This story seems extremely implausible.
Aside, your analogy doesn’t make sense. Horoscopes are generally not in the business of signal detection, and are usually enjoyed by the reader of the horoscope, like any other art. If you had used a sudoku solver your analogy would make a bit more sense.
Do you have any idea what percentage of musicians use AI to create the song and then also generate the sheet music so they can play it themselves? That seems like a decent workflow: use AI to get the song right, then record yourself playing it with your own creative tweaks. That's kind of how I do AI-assisted coding.
I assume you're not a musician, because that sounds insane. If you're good enough to play at full speed from brand new sheet music, then you don't need the AI. Playing from sheet music isn't like typing.
There are some composers who use a workflow like this - Suno is a scratchpad that can be used to quickly trial ideas, clarify concepts with collaborators, etc. I don't think it's common, though, either among composers or among Suno users at large.
I'm curious how your platform might avoid false positives with intentionally repetitive music, in particular techno (either produced via a DAW or hardware).
Our relationship to slop is a bit more complicated than that, no?
Whether it's terrible music or not is somewhat irrelevant. Plagiarized music doesn't sound worse than the original.
What seems to matter more is the story behind the work. Basically, if the author is a grifter trying to make a quick buck, then it's slop. You can make an argument that Taylor Swift qualifies as slop, but most people will disagree. The public will be the final arbiter. All I really want is a big red lever to cast my vote.