Good lord. Reading all these comments makes me feel so much better for dumping Anthropic the first time their Opus started becoming dumber (circa a month ago). It feels like most people in this thread are somehow bound to Claude, even though it is already fully enshittified.
Given that they haven’t even gone public yet, doesn’t that seem like putting the cart before the horse a bit? And if they’re already enshittifying it won’t be long until the other players start doing so as well. Have we passed peak LLM intelligence and are we now watching it degrade as they fail to roll these new advanced models out to their increasing user base? Are the finances not adding up?
It's quite possible there's some tacit collusion going on - it benefits both OAI and Anthropic to make moves that benefit both if they both intend to go public.
Oof. I know of a startup that recently Show HN'd here, the agent mail.to, that is NOT having a good time right now. I don't know what all these new startups having moats thinner than Durex are thinking -- like, what's the plan if someone does what you do, faster and cheaper?
I'm building something similar (Dead Simple Email - same category, different pricing structure). The moat criticism is fair and worth being honest about.
The defensible part isn't the feature set, it's infrastructure and price. We run our own mail servers rather than reselling SES, which gives us direct control over deliverability and costs. That's what lets us charge $29/mo for 100 inboxes where AgentMail is at $200. Whether that's a real moat or just a head start is a legitimate question.
Email deliverability is genuinely hard to get right at scale, but I can't say with confidence they won't eventually just absorb this. Building fast and staying cheap is the only real answer I have to that.
> new startups having moats thinner than Durex are thinking
Haha, great visual. Really illustrative of what these AI startups and bootstrapped indie developers are dealing with (and, if I had to guess, why most of them don't go anywhere).
Well, that part was impressive. It looks like they focused on receiving emails, which is probably even worse, as I expect OpenAI/Anthropic to add such an ability directly to agents if it really is useful.
Classic "is this a feature or a product?" problem. You're going to have a bad time if you spend all your effort on a feature and have nothing to set it apart.
Write an angry blog post about how big business is using their power to kill their _totally_ unique original idea that nobody could possibly copy in an hour?
Forgive my senses, but this writing feels like a low effort Claude response. What's the point adding responses like this to a Show HN post? I don't think you are fooling anyone.
I swore to not be burned by Google ever again after TensorFlow. This looks cool, and I will give this to my Codex to chew on and explain if it fits (or could fit) what I am building right now -- the msx.dev -- and then move on. I don't trust Google with maintaining the tools I rely on.
I'm VERY curious about your case. What kind of switching costs do you guys have? I'm working at a very young startup that is still not locked into either AI provider harnesses -- what causes switching costs, just the subscription leftovers or something else?
What's the legal theory here? Is the argument that leaked source code loses copyright protection, or simply that Codeberg is outside US jurisdiction and therefore harder to enforce against?
They will file an appeal within 7 days and then it needs to be decided by a court. My guess is that since AI-generated stuff isn’t copyrightable, they will lose.
Doesn't seem relevant here. TurboQuant isn't a domain-specific technique like the BL is talking about, it's a general optimisation for transformers that helps leverage computation more effectively.
Models aren't just big bags of floats you imagine them to be. Those bags are there, but there's a whole layer of runtimes, caches, timers, load balancers, classifiers/sanitizers, etc. around them, all of which have tunable parameters that affect the user-perceptible output.
It's still engineering. Even magic alien tech from outer space would end up with an interface layer to manage it :).
ETA: reminds me of biology, too. In life, it turns out the simpler some functional component looks, the more stupidly overcomplicated it is if you look at it under a microscope.
There's this[1]. Model providers have a strong incentive to switch (a part of) their inference fleet to quantized models during peak loads. From a systems perspective, it's just another lever. Better to have slightly nerfed models than complete downtime.
That isn't true. The whole point is to quickly pick up statistically significant variations, and with the volume of tests they are doing there is plenty of data.
If you turn on the 95% CI bands you can see there is plenty of statistical significance.
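For anyone wondering what those bands amount to, here is a rough sketch of a 95% interval on a benchmark pass rate. The pass counts are made up for illustration, and the normal-approximation (Wald) formula is my assumption; I don't know which method the tracker actually uses.

```python
import math

def pass_rate_ci(passes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) interval for a pass rate; z=1.96 gives ~95%."""
    p = passes / trials
    half_width = z * math.sqrt(p * (1 - p) / trials)
    return max(0.0, p - half_width), min(1.0, p + half_width)

def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical numbers, not taken from any real tracker.
baseline = pass_rate_ci(passes=560, trials=1000)
today = pass_rate_ci(passes=480, trials=1000)

print(baseline, today, intervals_overlap(baseline, today))
```

If the two bands don't overlap, the drop is very unlikely to be noise. Overlapping bands don't prove there is no difference, so non-overlap is a conservative check.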
Anybody with more than five years in the tech industry has seen this done in all domains time and again. What evidence do you have that AI is different, which is the extraordinary claim in this case...
I'm very happy to say calculators are far better than me in calculations (to a given precision). I'm happy to admit computers are so much better than me in so many aspects. And I have no problem saying LLMs are very helpful tools able to generate output so much better than mine in almost every field of knowledge.
Yet, whenever I ask it to do something novel or creative, it falls very short. But humans are ingenious beasts and I'm sure sooner or later they will design an architecture able to be creative - I just doubt it will be Transformer-based, given the results so far.
But the question isn't whether you can get LLMs to do something novel, it's whether anyone can get them to do something novel. Apparently someone can, and the fact that you can't doesn't mean LLMs aren't good for that.
Novel is a tricky word. In this case, the LLM produced a Python program that was similar to other programs in its corpus, and this Python program generated examples of hypergraphs that hadn't been seen before.
That's a new result, but I don't know about novel. The technique was the same as earlier work in this vein. And it seems like not much computational power was needed at all. (The article mentions that an undergrad left a laptop running overnight to produce one of the previous results; that's absolute peanuts when compared to most computational research.)
If all art is derivative then the earlier statement is a tautology.
People still call things other people do novel. There's clear social proof that humans do things that other humans consider novel.
Otherwise the word would probably not exist.
Just today I wrote a python program that did not resemble anything I'd written before, nor had I seen anything similar.
I had to reason it out myself. That passes the test that the original comment set.
Your threshold for "resemble" is obviously quite high, which is fair, but assuming that you're an encultured programmer your Python code represents other people's Python code. It might be doing something novel, but that thing it's doing is interacting with, in response to, or otherwise relative to existing concepts you learned or saw elsewhere. All art is derivative; we can do things other people haven't done before, but all of it derives from the works of others in some way.
Anyway, I've coded all kinds of wacky shit with Claude that I guarantee nobody has implemented before, if only because they're stupid and tedious ideas. They can't all be winners, but they were novel, and yet Claude Code implemented them as confidently as if they were yet another note-taking app. They have no problem handling novel ideas, and although the novel ideas in this case were my own, it's easy to see how finding new ideas could be automated by exploring the combinatorial space of existing ideas.
I'm not talking about wacky. My barrier for novel is 1) new capabilities 2) useful, and 3) end-to-end tested.
For example, what I referred to that I've written is a dynamic storage solution for n-dimensional grids that can grow arbitrarily in any direction and is locally dense (organized into spatially indexed blocks of contiguous data).
I had never considered this problem before, and I certainly had never seen a solution before (even though there may well be one).
I worked it out on paper, considering how integer lattices can be partitioned and indexed, and then I transformed that into a design which I then implemented. Working purely from the design, not considering existing solutions.
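To give a feel for the shape of the problem, here is a minimal sketch of one way such a structure could look. This is not the implementation described above: the `BlockGrid` name, the `block_size` parameter, and the dict-of-blocks layout are all assumptions made for illustration.

```python
class BlockGrid:
    """Sparse n-dimensional grid stored as fixed-size dense blocks.

    Blocks are created lazily and keyed by integer block coordinates, so the
    grid can grow in any direction, including toward negative indices.
    """

    def __init__(self, ndim: int, block_size: int = 4, fill=0):
        self.ndim = ndim
        self.block_size = block_size
        self.fill = fill
        self.blocks = {}  # block coordinate tuple -> flat list of cells

    def _locate(self, index):
        if len(index) != self.ndim:
            raise ValueError(f"expected {self.ndim} coordinates, got {len(index)}")
        # Floor division maps negative coordinates to negative block indices.
        block = tuple(i // self.block_size for i in index)
        # Row-major offset of the cell inside its block.
        offset = 0
        for i in index:
            offset = offset * self.block_size + (i % self.block_size)
        return block, offset

    def __getitem__(self, index):
        block, offset = self._locate(index)
        data = self.blocks.get(block)
        return self.fill if data is None else data[offset]

    def __setitem__(self, index, value):
        block, offset = self._locate(index)
        data = self.blocks.get(block)
        if data is None:
            # Allocate one contiguous block on first write.
            data = [self.fill] * (self.block_size ** self.ndim)
            self.blocks[block] = data
        data[offset] = value


grid = BlockGrid(ndim=3)
grid[-5, 0, 1000] = 7
print(grid[-5, 0, 1000], grid[1, 2, 3], len(grid.blocks))
```

Reads of cells that were never written return the fill value without allocating anything, so memory scales with the number of touched blocks rather than the bounding box.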
When it comes to LLMs doing novel things, is it just the infinite monkey theorem[0] playing out at an accelerated rate, helped along by the key presses not being truly random?
Surely if we tell the LLM to do enough stuff, something will look novel, but how much confirmation bias is at play? Tens of millions of people are using AI and the biggest complaint is hallucinations. From the LLM's perspective, is there any difference between a novel solution and a hallucination, other than dumb luck of the hallucination being right?
This argument doesn't go the way you want it to go. Billions of people exist, but maybe a few tens of thousands produce novel knowledge. That's a much worse rate than LLMs.
I’m not sure how we equate the number of humans to AI to determine a success rate.
We also can’t ignore that it was humans who thought up this problem to give to the AI. Thinking has two parts, asking and answering questions. The AI needed the human to formulate and ask the question to start. AI isn’t just dropping random discoveries on us that we haven’t even thought of, at least not that I’ve seen.
To have a proper discussion we would have to define the word "novel" and that's a challenge in itself. In any case, millions of people tried to ask LLMs to do something creative and the results were bland. Hence my conclusion LLMs aren't good for that. But I'm also open to the idea that they can be an element of a longer chain that could demonstrate some creativity - we'll see.
This is objectively wrong. If that were the case, every scientist performing a test would have always had their expectations and beliefs proven true. Likewise, if you were trying to disprove something because you believe it to be wrong, you would never be proven wrong.