I can't disagree with your AI doomerism perspective. I firmly believe that AI companies should buy one copy of whatever work they use for training. While this won't provide the never-ending royalty stream on copyrighted material that corporations strive for, it would foster the mindset that AI companies must pay society in some way. And I truly think that if AI companies are going to train on all the knowledge in the world, their profits should go back to everyone in the world. In other words, LLMs are a public good.
I have an almost unshakable conviction that LLM-type AI systems should become a repository of all human knowledge. When LLMs give you an answer, you should be able to ask: what are the sources behind your answer? Most people won't do this, but curious, wanting-to-learn people will. Which leads to one of the important questions: how do you keep people curious?
But all these companies violated that on a massive scale. It's done. They're not paying. Oh, and when asked what the consequences are for people doing illegal downloading, ChatGPT helpfully answers:
> About $750 to $30,000 per copyrighted work
> Can go up to $150,000 per work if it’s considered willful
... it was definitely willful. And these are amounts that would bankrupt even OpenAI. But I guess only you and I will have to pay these sorts of amounts, not big companies ...
I don't disagree with either of you regarding the doomerism, but Anthropic just paid out the largest US copyright settlement ever, driven by the liability exposure they faced of up to $150k per copyrighted work.
I, for one, haven't gotten my $150k (like a lot of people, I wrote an IT book whose sentences ChatGPT can repeat with about 95% fidelity), and nobody I know has gotten theirs either.
The settlement is for $3k per protected work of class members. Are you a class member? You should've been contacted by your publisher if you were. If you weren't in the shadow library, then you are not in the settlement.