Hacker News | sarchertech's comments

Most of the time taken during this process is spent getting feedback, processing it, and learning that it's not it. So even if LLMs drive the build time to zero, they won't speed up the process very much at all. Think 10% improvement not 10x improvement.

> A messy codebase is still cheaper to send ten agents through than to staff a team around. And even if the agents need ten days to reason through an unfamiliar system, that is still faster and cheaper than most development teams operating today. The liability argument holds in a human-to-human or agent-to-human world. In an agent-to-agent world, it largely dissolves.

What experience is this guy basing this on? My guess is absolutely none at all.

Maybe this will be the case in the future, but as of right now if I cut 10 agents loose for 10 days on one of our repos at work and tell them to clean it up but keep the tests passing, we’d be drowning in support tickets.

Tests don’t cover all observable behavior. Every single production bug we’ve had made it through the test suite.

Also this guy has only a vague idea of how platform engineering teams work in large organizations.

Platform teams are the engineering org’s immune system. They’re how we fight back against the tech debt accumulated by the relentless march of features of the week.

If anything the extra code people are cranking out with AI makes them more necessary.


This does nothing to shield Linux from responsibility for infringing code.

This is essentially like a retail store saying the supplier is responsible for eliminating all traces of THC from their hemp when they know that isn’t a reasonable request to make.

It’s a foreseeable consequence. You don’t get to grant yourself immunity from liability like this.


Shield from what exactly? The Linux kernel is not a legal entity. It's a collection of contributions from various contributors. There is the Linux Foundation but they do not own Linux.

If Linux were to contain 3rd party copyrighted code, the legal entities at risk of being sued would be... Linux users, which, given how widely deployed Linux is, is basically everyone on Earth, including all large companies.

Linux development is funded by large companies with big legal departments. It's safe to say that nobody is going to be picking this legal fight any time soon.


The Linux DCO system was designed to shield Linus and the Linux Foundation from copyright and patent infringement liability, so they were certainly worried that it was a possibility.

However, there is no legal precedent that says that because contributors sign a DCO and retain copyright, the Linux Foundation is not liable. The entire concept is unproven.

Large company legal departments aren’t a shield against this kind of thing. Patent trolls routinely go after huge companies and smaller companies routinely sue much larger ones over copyright infringement.


Quite a lot of companies use and release AI written code, are they all liable?

1. Almost definitely if discovered

2. Infringement in closed source code isn’t as likely to be discovered

3. OpenAI and Anthropic enterprise agreements include indemnification (essentially paying for damages) for copyright issues.


What would be "discovered" exactly? You can't patent a basic CRUD application.

There has to be an analogy to music or something here - except that code is even less copyrightable than melodies.

Yes, there might be some specific algorithms that are patented, but the average programmer won't be implementing any of those from scratch, they'll use libraries anyway.


I’m not talking patents. Code is 100% copyrightable.

Code being copyrightable is the entire basis for open source licenses.


s/patent/copyright/ in my comment then.

What part of a bog-standard HTTP API can be copyrighted? Parsing the POST request or processing it or shoving it to storage? I'm genuinely confused here and not just being an ass.

There are unique algorithms for things like media compression etc, I understand copyrighting those.

But for the vast majority of software, is there any realistic threat of hitting any copyrighted code that's so unique it has been copyrighted and can be determined as such? There are only so many ways you can do a specific common thing.

I kinda think of it like music, without ever hearing a specific song you might hit the same chord progressions by accident because in reality there are only so many combinations you can make with notes that sound good.


Unlike patents, independent creation is a valid defense to copyright infringement.

Copyright is the literal expression of the idea. The identifier names, how the functions are broken up, which libraries are used etc…

Given more than a dozen lines or so, 2 people aren’t going to write the exact same code to solve the same problem. It might be equivalent code, but it’s not going to be the exact same.

  def copyright_warning(times) do
    for _ <- 1..times do
      IO.puts("hey man this code is copyrighted. Don't copy it pretty please")
    end
  end
That code is copyright protected. I don’t have to do anything. I automatically own the copyright once I create it.

If you copy that you are infringing.

You could do something similar if you wanted. But if you copy that directly, you are infringing on my copyright.
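For illustration (hypothetical module and function names, a sketch only), an independently written, functionally equivalent version might look like this: same observable behavior, but a different literal expression of the idea.

```elixir
defmodule Notice do
  # Written from scratch: different module, function, and variable names,
  # a different looping construct, and a paraphrased message.
  # Equivalent code, not a copy of the literal expression above.
  def repeat_warning(count) do
    Enum.each(1..count, fn _ ->
      IO.puts("This code is under copyright; please don't copy it verbatim.")
    end)
  end
end
```

A court comparing the two would see shared behavior but little shared expression, which is exactly the distinction copyright turns on.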


But isn't it literally impossible to determine whether I copied those 5 lines or wrote them myself?

Especially in languages like Go where there's an Official Formatter that makes all code look identical as much as possible?

There are a multitude of reasons why I'm not a lawyer and vague crap like this is a big part :D


It’s literally impossible to prove anything outside of a formal system.

Courts would look at the preponderance of the evidence in a civil trial.

Did you have access to my code? Is the copy long enough that it’s statistically very unlikely you could have come up with it exactly on your own?

They’ll look at things like did you copy misspellings in variable names. Did you copy the missing period at the end of the output string etc…


Yep, and honestly it's going to come up with things other than lawsuits.

I've worked at a company that was asked as part of a merger to scan for code copied from open source. That ended up being a major issue for the merger. People had copied various C headers around in odd places, and indeed stolen an odd bit of telnet code. We had to go clean it up.


Headers are normally fine. GPL license recognises that you might need them to read binary files.

An open-source project receiving open-source contributions from (often anonymous) volunteers is not even close to analogous to a storefront selling products with a consumer guarantee they are backing on the basis of their supply chain.

Do you think that Goodwill should be able to offload all liability for everything they sell at their thrift shops to their often anonymous donors?

Linus makes $1.5 million per year from the Linux foundation. And the foundation itself pulls in $300 million a year in revenue.

They are directly benefiting from contributors and if they cause harm through their actions there’s a good chance they’ll be held liable.


> Do you think that Goodwill should be able to offload all liability for everything they sell at their thrift shops to their often anonymous donors?

I don't even think this is an appropriate analogy worth answering. Goodwill are selling products to consumers in a direct exchange of money-for-goods.

No one is buying Linux.


Ok so if Goodwill gives products away, they should be able to absolve themselves of all liability related to those products?

And any company selling a device or software that includes the Linux code should be liable then.

And they are according to the law btw. Along with anyone distributing the software, which includes the owner of the Linux repositories.


> This does nothing to shield Linux from responsibility for infringing code.

It’s no worse than non-AI assisted code.

I could easily copy-paste proprietary code, sign my name that it’s not and that it complies with the GPL and submit it.

At the end of the day, it just comes down to a lying human.


That’s the difference. In practice a human has to commit fraud to do this.

But a human just using an LLM to generate code can do it accidentally; regurgitation of training text is a documented failure mode of LLMs.

And there’s no way for the human using it to be aware it’s happening.


You cannot accidentally sign your name saying “this code is GPL compliant”

If you can’t be sure, don’t sign.


I’m not gonna. A lot of other people now will.

Well yes, people break the law and expose themselves to liability every day. Nothing new there.

As far as I know there’s one instance of a company asserting copyright infringement against the Linux kernel; even if I’ve missed a few, it doesn’t happen frequently. That will change with AI generated code, and it exposes every commercial entity that distributes Linux in any form to liability.


Yes, but if you do that manually you are acting in bad faith; if you ask an AI to do it you have no idea whether you are going to be liable for something or not.

> you have no idea if you are going to be liable for something or not

In life that is a very strong indicator you should not do <thing>


True, but everybody is doing it nonetheless.

People break laws and open themselves up to liability every day of the week.

That has no impact on what you should do with your life.



There’s no reasonable way for you to use AI generated code and guarantee it doesn’t infringe.

The whole “use it, but if it doesn’t behave as expected, it’s your fault” is a ridiculous stance.


If you think it's an unacceptable risk to use a tool you can't trust when your own head is on the line, you're right, and you shouldn't use it. You don't have to guarantee anything. You just have to accept punishment.

That’s just it though it’s not just your head. The liability could very likely also fall on the Linux foundation.

You can’t say “you can do this thing that we know will cause problems that you have no way to mitigate, but if it does we’re not liable”. The infringement was a foreseeable consequence of the policy.


This policy effectively punts on the question of what tools were used to create the contribution, and states that regardless of how the code was made, only humans may be considered authors.

From the foundation's point of view, humans are just as capable of submitting infringing code as AI is. If your argument is sound, then how can Linux accept contributors at all?

EDIT: To answer my own question:

    Instead of a signed legal contract, a DCO is an affirmation that a certain person confirms that it is (s)he who holds legal liability for the act of sending of the code, that makes it easier to shift liability to the sender of the code in the case of any legal litigation, which serves as a deterrent of sending any code that can cause legal issues.
This is how the Foundation protects itself, and the policy is that a contribution must have a human as the person who will accept the liability if the foundation comes under fire. The effectiveness of this policy (or not) doesn't depend on how the code was created.

Anyone distributing copyrighted material can be liable; that DCO isn’t going to stop anyone.

If that worked any corporation that wanted to use code they legally couldn’t could just use a fork from someone who assumed responsibility and worst case they’d have to stop using it if someone found out.


> liability could very likely also fall on the Linux foundation.

It’s just the same as if I copy-paste proprietary code into the kernel and lie about it being GPL.

Is the Linux foundation liable there?


Maybe. DCOs haven’t been tested. But you can at least say that the person who did this committed fraud and that you had no reasonable way to know they would do that.

LLMs can and do regurgitate code without the user’s knowledge. That’s the problem, the user has no way to mitigate against it. You’re telling contributors “use this thing that has a random chance of creating infringing code”. You should have foreseen that would result in infringing code making its way into the kernel.


If someone sent you some code and said “it’s all good bro, you can put it in the kernel with your name on it”, would you?

If you don’t feel comfortable about where some code has come from, don’t sign your name.

The fact LLMs exist and can generate code doesn’t change how you would behave and sign your name to guarantee something.


Are you being purposely obtuse?

Not at all.

Linus and the rules have always been very clear. If you don’t know where code came from, don’t submit it.


The only lawsuits so far have been over training on open source software. You're inventing a liability problem that essentially does not exist.

OpenAI and Anthropic added an indemnity clause to their enterprise contracts specifically to cover this scenario because companies wouldn’t adopt otherwise.

Yeah, but that's not a useful thing to do because not everybody thinks about that or considers it a problem. If somebody's careless and contributes copyrighted code, that's a problem for Linux too, not only the author.

For comparison, you wouldn't say, "you're free to use a pair of dice to decide what material to build the bridge out of, as long as you take responsibility if it falls down", because then of course somebody would be careless enough to build a bridge that falls down.

Preventing the problem from the beginning is better than ensuring you have somebody to blame for the problem when it happens.


It was already necessary to solve the problem of humans contributing infringing code. It was solved by having contributors assume liability with a DCO. The policy being discussed today asserts that, because AI may not be held legally liable for its contributions, AI may not sign a DCO. A human signature is required. This puts the situation back to what it was with human contributors. What you are proposing goes beyond maintaining the status quo.

It’s not solved. It hasn’t been tested in court to my knowledge and in my opinion is unlikely to hold up to serious challenge. You can be held liable for just distributing copyrighted code even if the whole “the Linux foundation doesn’t own anything” holds up.

> Preventing the problem from the beginning is better than ensuring you have somebody to blame for the problem when it happens.

that's assuming that the problems and incentives are the same for everyone. Someone whose uncle happens to own a bridge repair company would absolutely be incentivized to say

> "you're free to use a pair of dice to decide what material to build the bridge out of, as long as you take responsibility if it falls down"


Their position is probably that LLM technology itself does not require training on code with incompatible licenses, and they probably also tend to avoid engaging in the philosophical debate over whether LLM-generated output is a derivative copy or an original creation (like how humans produce similar code without copying after being exposed to code). I think that even if they view it as derivative, they're being pragmatic - they don't want to block LLM use across the board, since in principle you can train on properly licensed, GPL-compatible data.

>There’s no reasonable way for you to use AI generated code and guarantee it doesn’t infringe.

I guess we’ll need to reevaluate what copyright means when derivatives grow on trees?


How could you do that though? You can’t guarantee that there aren’t chunks of copied code that infringe.

Let me introduce you to the concept of submarine patents...

But the responsible party is still the human who added the code. Not the tool that helped do so.

The practical concern of Linux developers regarding responsibility is not being able to ban the author, it's that the author should take ongoing care for his contribution.

That's not going to shield the Linux organization.

A DCO bearing a claim of original authorship (or assertion of other permitted use) isn't going to shield them entirely, but it can mitigate liability and damages.

Can it though? As far as I know this hasn’t been tested.

In a court case the responsible party very well could be the Linux Foundation, because this is a foreseeable consequence of allowing AI contributions. There’s no reasonable way for a human to make such a guarantee while using AI generated code.

It’s not about the mechanism: responsibility is a social construct, it works the way people say that it works. If we all agree that a human can agree to bear the responsibility for AI outputs, and face any consequences resulting from those outputs, then that’s the whole shebang.

Sure we could change the law. It would be a stupid change to allow individuals, organizations, and companies to completely shield themselves from the consequences of risky behaviors (more than we already do) simply by assigning all liability to a fall guy.

In this case, the "fall guy" is the person who actually introduced the code in question into the codebase.

They wouldn't be some patsy that is around just to take blame, but the actual responsible party for the issue.


Imagine you’re a factory owner and you need a chemical delivered from across the country, but the chemical is dangerous, and if the tanker truck drives faster than 50 miles per hour it has a 0.001% chance per mile of exploding.

You hire an independent contractor and tell him that he can drive 60 miles per hour if he wants to but if it explodes he accepts responsibility.

He does and it explodes killing 10 people. If the family of those 10 people has evidence you created the conditions to cause the explosion in order to benefit your company, you're probably going to lose in civil court.

Linus benefits from the increased velocity of people using AI. He doesn't get to put all the liability on the people contributing.


Cool analogy! Which has nothing to do with the topic in hand.

Want to bring something meaningful to the conversation?

That is a nonsensical analogy on multiple levels, and doesn't even support your own argument.

Nice rebuttal.

Why would I put much effort into responding to a post like yours, which makes no sense and just shows that you don't understand what you're talking about?

Why would you put any effort into it at all?

What law exactly are you suggesting needs to be changed? How is this any different from what already happens right now, today?

Right now it's very easy not to infringe on copyrighted code if you write the code yourself. In the vast majority of cases if you infringed it's because you did something wrong that you could have prevented (in the case where you didn't do anything wrong, independent creation is an affirmative defense against copyright infringement).

That is not the case when using AI generated code. There is no way to use it without the chance of introducing infringing code.

Because of that if you tell a user they can use AI generated code, and they introduce infringing code, that was a foreseeable outcome of your action. In the case where you are the owner of a company, or the head of an organization that benefits from contributors using AI code, your company or organization could be liable.


So it's a bit as if Linux Organization told its contributors you can bring in infringing code but you must agree you are liable for any infringement?

But if a lawsuit was later brought, who would be sued? The individual author or the organization? In other words, can an organization reduce its liability if it tells its employees "You can break the law as long as you agree you are solely responsible for such illegal actions"?

It would seem to me that the employer would be liable if they "encourage" this way of working?


It’s a foreseeable outcome that humans might introduce copyrighted code into the kernel.

I think you’re looking for problems that don’t really exist here, you seem committed to an anti AI stance where none is justified.


A human has to willingly violate the law for that to happen though. There is no way for a human to use AI generated code that doesn't have a chance of producing copyrighted code. That's just expected.

If you don't think this is a problem take a look at the terms of the enterprise agreements from OpenAI and Anthropic. Companies recognize this is an issue, and so they were forced to add an indemnification clause, explicitly saying they'll pay for any damages resulting from infringement lawsuits.


> Right now it's very easy not to infringe on copyrighted code if you write the code yourself.

Humans routinely produce code similar to or identical to existing copyrighted code without direct copying.


And that's not an infringement. Actual copying is the infringement, not having the same code. The most likely way to have the same code is by copying, but it's not the only way.

They don’t produce enough similar code to infringe frequently. And if they did, independent creation is an affirmative defense to copyright infringement that likely doesn’t apply to LLMs, since they have the demonstrated capability to produce code directly from their training set.

You have shifted from "very easy not to infringe" to "don't infringe frequently", which concedes the original point that humans can and do produce infringing code without intent.

On independent creation: you are conflating the tool with the user. The defense applies to whether the developer had access to the copyrighted work, not whether their tools did. A developer using an LLM did not access the training set directly, they used a synthesis tool. By your logic, any developer who has read GPL code on GitHub should lose independent creation defense because they have "demonstrated capability to produce code directly from" their memory.

LLM memorization/regurgitation is a documented failure mode, not normal operation (nor typical case). Training set contamination happens, but it is rare and considered a bug. Humans also occasionally reproduce code from memory: we do not deny them independent creation defense wholesale because of that capability!

In any case, the legal question is not settled, but the argument that LLM-assisted code categorically cannot qualify for independent creation defense creates a double standard that human-written code does not face.


> You have shifted from "very easy not to infringe" to "don't infringe frequently", which concedes the original point that humans can and do produce infringing code without intent.

Practically speaking humans do not produce code that would be found in court to be infringing without intent.

It is theoretically possible, but it is not something that a reasonable person would foresee as a potential consequence.

That’s the difference.

> LLM memorization/regurgitation is a documented failure mode, not normal operation (nor typical case).

Exactly. It is a documented failure mode that you as a user have no capacity to mitigate or to even be aware is happening.

Double standards are perfectly fine. LLMs are not conscious beings that deserve protection under the law.

>not settled.

What appears to likely be settled is that human authorship is required, so there’s no way that an LLM could qualify for independent creation.


Responsibility is an objective fact, not just some arbitrary social convention. What we can agree or disagree about is where it rests, but that's a matter of inference, an inference can be more or less correct. We might assign certain people certain responsibilities before the fact, but that's to charge them with the care of some good, not to blame them for things before they were charged with their care.

Because contributions to Linux are meticulously attributed to, and remain property of, their authors, those authors bear ultimate responsibility. If Fred Foobar sends patches to the kernel that, as it turns out, contain copyrighted code, then provided upstream maintainers did reasonable due diligence the court will go after Fred Foobar for damages, and quite likely demand that the kernel organization no longer distribute copies of the kernel with Fred's code in it.

Anyone distributing infringing material can be liable, and it’s unlikely that this technicality would actually shield anyone.

Anyone who thinks they have a strong infringement case isn’t going to stop at the guy who authored the code, they’re going to go after anyone with deep pockets with a good chance of winning.


> Anyone distributing infringing material can be liable

There is still the "mens rea" principle. If you distribute infringing material unknowingly, it would very likely not result in any penalties.


Copyright is strict liability. There’s no mens rea required.

> Today's Spirit of Ecstasy, from the 2003 Phantom model onward, stands at 3 inches (7.6 cm) and, for the safety of any person being accidentally hit, is mounted on a spring-loaded mechanism designed to retract instantly into the radiator shell if struck from any direction.

I don't think that mentions it being motorized

The motor allows you to raise it after it has been retracted.

Then why is it spring-loaded?

> For the safety of any person being accidentally hit

(the spring handles the retraction)


But if a motor is needed to counter the retracting spring, doesn't that mean any impact must overcome the motor before the spring will engage?

Presumably the motor runs once to extend it and then it locks into position (with some kind of mechanical catch that's calibrated to come loose if you hit anything), rather than being constantly running.

This is the exact opposite of the theory of bullshit asymmetry. It’s much easier to come up with bullshit than it is to debunk it.

The real skill is knowing which ideas to shoot down or heavily rework, which is probably the most valuable thing a senior engineer brings to the table.


Yes, the author assumes that the idea was carefully vetted and cultivated. If, in fact, that was the case, then, as other commenters have pointed out, the idea's champion should have ready answers to most obvious critiques.

>founder of Yoast SEO,

>wrote about how he migrated his personal blog from WordPress to Astro

>he’s since migrated again to EmDash

Do you need to know anything more about this guy? If that's one of the article's sources, I think you can ignore anything it says.


Yoast SEO is a huge paid plugin for Wordpress, so the founder not using Wordpress is at least moderately interesting, in a "cobbler buys shoes at walmart" kind of way.

I had no idea it was a product. I thought it was an SEO firm. That is a bit more interesting. But I also wouldn't be very surprised when a cobbler that makes work boots buys sneakers from Walmart.

Based on the way subscriptions work for every other business, if you’re hitting the limits, you are not profitable for them.

My guess is a plan with double the limits would need to be 5-10x as expensive.


This is only an issue between 12pm and ~4pm ET. If I work at any other time of day, I never hit my usage limit.

No it’s worse for them. A person on an H-1B has a ticking time bomb to find a new job or leave the country.

Search: