Hacker News | derektank's comments

I’m not a lawyer, but I read the decision, and how is this section not a ruling on fair use?

“To summarize the analysis that now follows, the use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use under Section 107 of the Copyright Act. And, the digitization of the books purchased in print form by Anthropic was also a fair use but not for the same reason as applies to the training copies. Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies. However, Anthropic had no entitlement to use pirated copies for its central library. Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.”

Or in the final judgement, “This order grants summary judgment for Anthropic that the training use was a fair use. And, it grants that the print-to-digital format change was a fair use for a different reason.”


There are two parts here.

The first:

> it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library

It is only fair use where Anthropic had already purchased a license to the work. That has zero to do with scraping: a purchase was made, an exchange of value, and that comes with rights.

The second, which involves a section of the judgement a little before your quote:

> And, as for any copies made from central library copies but not used for training, this order does not grant summary judgment for Anthropic.

This is where the court refused to make any ruling. There was no exchange of value here, such as would happen with scraping. The court made no ruling.


I believe you are misinterpreting the ruling. Remember that a copyright claim must inherently argue that copies of the work are being made. To that end, the case analyzes multiple "copies" alleged to have been made.

1) "Copies used to train specific LLMs", for which the ruling is:

> The copies used to train specific LLMs were justified as a fair use.

> Every factor but the nature of the copyrighted work favors this result.

> The technology at issue was among the most transformative many of us will see in our lifetimes.

Notable here is that all of the "copies used to train specific LLMs" were copies made from books Anthropic purchased. But also of note is that Anthropic need not have purchased them, as long as they had obtained the original sources legally. The case references the Google Books lawsuit as an example of something Anthropic could have done to avoid pirating the books they did pirate: in that case, Google obtained the original materials on loan from willing and participating libraries, and did not purchase them.

2) "The copies used to convert purchased print library copies into digital library copies", where again the ruling is:

> justified, too, though for a different fair use. The first factor strongly favors this result, and the third favors it, too. The fourth is neutral. Only the second slightly disfavors it. On balance, as the purchased print copy was destroyed and its digital replacement not redistributed, this was a fair use.

Here one might argue that the use of GPL code is different in that, in making the copy, no original was destroyed. But it's also very likely that this wouldn't apply at all in the case of GPL code because there was also no original physical copy to convert into a digital format. The code was already digitally available.

3) "The downloaded pirated copies used to build a central library" where the court finds clearly against fair use.

4) "And, as for any copies made from central library copies but not used for training", where, as you note, Judge Alsup declined to rule. But notice particularly that this is referring to copies made FROM the central library AND NOT for the purposes of training an LLM. The copies made from purchased materials to build the central library in the first place were already deemed fair use. And making copies from the central library to train an LLM from those copies was also determined to be fair use. The copies obtained by piracy were not. But for uses not pertaining to the training of an LLM, the judge is declining to make a ruling here because there was not enough evidence about what books from the central library were copied for what purposes and what the source of those copies was. As he says in the ruling:

> Anthropic is not entitled to an order blessing all copying “that Anthropic has ever made after obtaining the data,” to use its words

This declination applies both to the purchased and pirated sources, because it's about whether making additional copies from your central library copies (which themselves may or may not have been fair use) automatically qualifies as fair use. And this is perfectly reasonable. You have a right as part of fair use to make a copy of a TV broadcast to watch at a later time on your DVR. But having a right to make that copy does not inherently mean that you also have a right to make a copy from that copy for any other purposes. You may (and almost certainly do) have a right to make a copy to move it from your DVR to some other storage medium. You may not (and almost certainly do not) have a right to make a copy and give it to your friend.

At best, an argument that GPL software wouldn't be covered under the same considerations of fair use that this case considers would require arguing that the copies of GPL code obtained by Anthropic were not obtained legally. But that's likely going to be a very hard argument to make given that GPL code is freely distributed all over the place with no attempts made to restrict who can access that code. In fact, GPL code demands that if you distribute the software derived from that code, you MUST make copies of the code available to anyone you distribute the software to. Any AI trainer would simply need to download Linux or emacs and the GPL requires the person they downloaded that software from to provide them with the source code. How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?


> How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?

By the license and terms such copies are under.

> For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.

You _must_ show the terms. If you copy the GPL code, and it inherits the license, as the terms say it does, then you must also copy the license.

The GPL does not give you an unfettered right to copy; it comes with terms and conditions protecting it under contract law. Thus, you must follow the contract.

The GPL goes to some lengths to define its terms.

> A "covered work" means either the unmodified Program or a work based on the Program.

> Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.

It is not just the source code that you must convey.


>Again, if you aren't sure, the answer is likely NO.

Likely no, I agree. But I think there are probably a lot of companies selling enterprise software that later attempt to solicit a FedRAMP authorization that would benefit from planning ahead and building a compliant version from the jump. Worth considering and having a conversation internally.


Steam doesn’t really attempt to gatekeep submitted content the same way that Apple or Google do, so I would expect those companies to have much larger teams supporting that review, mostly in non-development roles. Steam support has also historically been kind of a joke (not sure if it’s improved in the last 5 years), though I don’t know if Google/Apple provide a better experience.

Most oil benchmarks are futures contracts, so they don’t reflect the spot or current price. Commodities traders are therefore trying to predict the situation 2-4 weeks from now.

Obviously they have boats. The question is, do they still have boats which are capable of serving as a launch platform for ballistic missiles? And could those boats meaningfully close the distance between Iran and its adversaries?

This launch demonstrates that even if the answer to both of those questions is no, they can still put those adversaries under threat.


The question is do they have a launcher that fits in a shipping container...

Identifying a biomarker for a psychological condition, particularly one like schizophrenia which is hugely disruptive for individuals affected and has a seemingly random onset around early adulthood, is significant in its own right even if it doesn’t lead to a pharmaceutical intervention. It could help identify new risk factors and potential non-pharmaceutical interventions like lifestyle changes, and maybe even identify people who are at risk of developing schizophrenia and prepare them for its onset before their first hallucination, avoiding a downward spiral.

>The problem is that in places like Seattle and the Bay Area, there are hard geographic limits to construction, even if you turn them into endless high-rises

Over three quarters of all residential land in Seattle is zoned single family and the population density of the city is less than a third that of NYC. The geography is not the hard constraint in this city.


It’s obviously worse for your privacy to have third parties handle full images of your driver’s license or video of your entire face, which can then be leaked, rather than using a zero-knowledge proof that only sends e.g. a birth year. And no, it’s not spite, it’s incoherence. Lawmakers are, to a first-order approximation, single-minded seekers of re-election and will do things to get votes, even if those things don’t logically make sense together, such as requiring age verification without providing the tools for companies to abide by the law themselves.

US lawmakers are single-minded seekers of lobbying and insider trading money; they will sign and trade on whatever ALEC hands them so they receive more money.

Your examples do kind of reinforce the point being made.

Mathematics and (theoretical) physics are capital-light research sectors. Weapons platforms and space technology were state-managed (i.e., didn’t require private sector capital financing).


The Department of War is an “alternate title”. Department of Defense continues to be correct.


> The Department of War is an “alternate title”.

Like "alternative facts"?

