Yes, I think there's a range from "give it away, I don't care what happens after" (public domain), through "just cite me bro" (BSD/MIT style), to full copyleft, where you have the right to see and modify the source but also obligations if you redistribute it.
AI can't follow all the rules across that range, but infringement of GPL-style licenses can be hard to show even when only humans are involved, which is one reason you see projects like the Linux kernel encode licensing at module boundaries (MODULE_LICENSE declarations and GPL-only symbol exports such as EXPORT_SYMBOL_GPL).
Even there, something like io_uring might be GPL, but if I use the same techniques to build my own library around a send/receive queue concept, as an AI might generate, is that still copyright infringement? I would argue the open source model has never really been about preventing developers from reimplementing architectural styles or approaches (that's the logic of proprietary software authors); it has been about protecting specific distributable libraries and applications.
These are dicey questions even outside of AI; AI just brings them front and center because the link between training data and generated output seems more direct. But humans learn the same way, by studying source code when possible, and even if they later write something from scratch, they can't go and unlearn what they've learned.