Hacker News: mysterydip's comments

There’s also a “video of this in action” on the linked page :)

I love raycasters! This is great, thanks for sharing. I noticed the “fisheye” effect; was the cosine correction too expensive an operation given the hardware limitations?
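For context, the correction being asked about is the standard fisheye fix in a raycaster: scale each ray's euclidean distance by the cosine of the angle between that ray and the player's view direction, so walls render with their perpendicular distance. A minimal sketch (function names are my own, not from the linked project):

```python
import math

def corrected_distance(ray_distance, ray_angle, player_angle):
    """Fisheye correction: project the euclidean ray distance onto the
    view plane by scaling with the cosine of the angle between the ray
    and the player's view direction."""
    return ray_distance * math.cos(ray_angle - player_angle)

# A ray straight ahead is unchanged; an off-axis ray's distance shrinks,
# which straightens out the curved-wall "fisheye" look.
ahead = corrected_distance(10.0, 0.0, 0.0)                    # 10.0
off_axis = corrected_distance(10.0, math.radians(30), 0.0)
```

On constrained hardware the per-ray cosine is often replaced with a precomputed lookup table (one entry per screen column), which may be why some implementations skip the correction entirely.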

That sounds exactly like the kind of thing chatGPT would say to hide the fact it’s chatGPT… :)

Is the reason it can show steps for solving an integral because the training set contained webpages or books showing how to do it?


if we have steps for understanding any author's English and creative process (generally, not specific to one author), would you agree it is then possible for an llm to do it?


The real sticking point for me is I don't even believe that authors themselves FULLY understand their process. The idea that anybody could achieve such full introspection as to understand and articulate every little thing that influences their output seems astoundingly improbable.


Repeating a process, yes for sure, even (pseudorandom?) variations on a process. Understanding a process is a different question, and I’m not sure how you would measure that.

In school we would have a test with various questions to show you understand the concept of addition, for example. But while my calculator can perfectly add any numbers up to its memory limit, it has no understanding of addition.


> while my calculator can perfectly add any numbers up to its memory limit, it has no understanding of addition.

"my calculator can perfectly add any numbers up to its memory limit" This kind of anthropomorphic language is misleading in these conversations. Your calculator isn't an agent so it should not be expected to be capable of any cognition.


It’s the degree of generalisability. And LLMs do have understanding. You can ask it how it came up with the process in natural language and it can help - something a calculator can’t do.


> And LLMs do have understanding.

They absolutely do not. If you "ask it how it came up with the process in natural language" with some input, it will produce an output that follows, because of the statistics encoded in the model. That output may or may not be helpful, but it is likely to be stylistically plausible. An LLM does not think or understand; it is merely a statistical model (that's what the M stands for!)
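To make "statistical model" concrete: at each step the model defines a probability distribution over the next token given the context, and output is sampled from it. A toy sketch, where a hand-written table stands in for the neural network that a real LLM would use to compute the probabilities:

```python
import random

# Illustration only: a real LLM computes next-token probabilities with a
# neural network over a huge vocabulary. This table is a stand-in.
NEXT_TOKEN_PROBS = {
    ("how", "it"): {"works": 0.6, "came": 0.3, "fails": 0.1},
}

def sample_next(context, rng):
    """Sample the next token from the distribution for this context."""
    probs = NEXT_TOKEN_PROBS[context]
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)
token = sample_next(("how", "it"), rng)
```

Whatever one concludes about understanding, this is the mechanism: the output follows the encoded statistics, which is why it is reliably plausible-sounding whether or not it is correct.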


how would you empirically show that it doesn't have understanding?

i can prove that it does have understanding because it behaves exactly like a human with understanding does. if i ask it to solve an integral and ask it questions about it - it replies exactly as if it has understood.

give me a specific example so that we can stress test this argument.

for example: what if we come up with a new board game with a completely new set of rules and see if it can reason about it and beat humans (or come close)?


> how would you empirically show that it doesn't have understanding?

The complete failure of Claude to play Pokemon, something a small child can do with zero prior instruction. The "how many r's are in strawberry" question. The "should I drive or walk to the car wash" question. The fact that right now, today all models are very frequently turning out code that uses APIs that don't exist, syntax that doesn't exist, or basic logic failures.
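The letter-counting task mentioned above is trivial as code, which is part of why the failure is so often cited. A one-function sketch (the tokenization point in the comment is a commonly offered explanation, not something I can verify here):

```python
def count_letter(word, letter):
    """Count occurrences of a letter -- the task the 'strawberry'
    question tests. Trivial over characters; LLMs operate on subword
    tokens rather than letters, one common explanation for the failure."""
    return sum(1 for ch in word if ch == letter)

count_letter("strawberry", "r")  # 3
```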

The cold hard reality is that LLMs have been constantly showing us they don't understand a thing since... forever. Anyone who thinks they do have understanding hasn't been paying attention.

> i can prove that it does have understanding because it behaves exactly like a human with understanding does.

First, no it doesn't. See my previous examples that wouldn't have posed a challenge for any human with a pulse (or a pulse and basic programming knowledge, in the case of the programming examples). But even if it were true, it would prove nothing. There's a reason that in math class, teachers make kids show their work. It's actually fairly common to generate a correct result by incorrect means.


> The complete failure of Claude to play Pokemon, something a small child can do with zero prior instruction

cherry picking because gemini and gpt have beaten it. claude doesn't have a good vision setup

> The "how many r's are in strawberry" question

it could do this since 2024

> The "should I drive or walk to the car wash" question

the SOTA models get it right with reasoning

> fact that right now, today all models are very frequently turning out code that uses APIs that don't exist, syntax that doesn't exist, or basic logic failures.

not when you use a harness. even humans can't write code that works on the first attempt.


We don't need to come up with a new board game. How about a board game that has been written about extensively for hundreds of years?

LLMs can't consistently win at chess https://www.nicowesterdale.com/blog/why-llms-cant-play-chess

Now, some of the best chess engines in the world are Neural Networks, but general purpose LLMs are consistently bad at chess.

As far as "LLMs don't have understanding", that is axiomatically true by the nature of how they're implemented. A bunch of matrix multiplies resulting in a high-dimensional array of tokens does not think; this has been written about extensively. They are really good for generating language that looks plausible; some of that plausible-looking language happens to be true.


false, chess ELO is pretty good

https://maxim-saplin.github.io/llm_chess/

let's not cherry-pick; let's actually look at the benchmarks. i would say even ~1000 elo means that it can reason better than the average human.


If you look at the "workflow" section of that page, they had to add a bunch of scaffolding around telling the model what moves are legal -- an llm can't keep enough context to know how to play chess; only to choose an advantageous move from a given list. But feel free to "cherry pick".
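The scaffolding pattern being described can be sketched in a few lines: the harness, not the model, knows the rules, hands the model the current legal moves, and rejects anything outside that list. All names here are hypothetical; the actual benchmark's harness may differ:

```python
def ask_model_for_move(prompt):
    """Stand-in for an LLM call; a real harness would query an API here."""
    return "Nf3"

def harness_turn(legal_moves):
    """One turn of the scaffolded loop: prompt with the legal moves,
    retry a few times if the model picks an illegal one."""
    prompt = "Choose one move from: " + ", ".join(legal_moves)
    for _ in range(3):
        move = ask_model_for_move(prompt)
        if move in legal_moves:
            return move
    raise ValueError("model never produced a legal move")

chosen = harness_turn(["e4", "d4", "Nf3", "c4"])  # "Nf3"
```

Whether choosing from a supplied list still counts as "playing chess" is exactly the point of disagreement in this thread.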


why do you think this falsifies that it can't reason?


i ran the benchmark without the valid-moves tool as well as without the three-mistakes grace, and gpt-5.4 holds up well. it can achieve 1000 ELO, which is much higher than my own.

this clearly tells me that GPT is good at chess, at least better than a normal person who has played ~30-40 games in their life.


I loved Stratego as a kid but couldn’t find time to play as an adult. I stumbled across a mini version recently, which is 10v10 and goes much faster while still giving the overall feel of the game: https://boardgamegeek.com/boardgame/197108/stratego-quick-ba...


How did you determine which answers were wrong or half-truths?


By looking at the code?

> And I asked LLM to show me the code for each answer, on all 5 engines.

The same way I would have done without AI, but it sped up finding the relevant parts enough to make it viable in my limited time.


I think the problem is that determining who is contributing, their intention, and other such nuances takes a human's time and effort. And at some point the number of contributions becomes too much to sort through.


I think building enough barriers, processes, and mechanisms might work. I don't think it needs to be human effort.


If it's not human effort, it costs tokens, lots of tokens, that need to be paid for by somebody.

The LLM providers will be laughing all the way to the bank because they get paid once by the people who are causing the problem and paid again by the person putting up the "barriers, processes, and mechanisms" to control the problem. Even better for them, the more the two sides escalate, the more they get paid.


So open source development should be more like job-hunting and hiring, where humans feed AI-generated resumes into AI resume filters which supposedly choose reasonable candidates to be considered by other humans? That sounds... not good.


Makes me wonder if at some point we’ll have bots that have forked every open source project, and every agent writing code will prioritize those forks over official ones, including showing up first in things like search results.


I genuinely believe that all open source projects with restrictive or commercially-unviable licenses will be cloned by LLM translation in the next few years. Since the courts are finding that it's OK for GenAI to interpret copyrighted works of art and fiction in their outputs, surely that means the end of legal protection for source code as well.

"Rewrite of this project in rust via HelperBot" also means you get a "clean room" version since no human mind was influenced in its creation.


I give it 4 weeks


North Carolina passed Senate Bill 266, changing how utilities can recover costs for projects under construction amid rising energy demand, particularly from data centers. Now Duke Energy wants a double-digit rate increase: https://starw1.ncuc.gov/NCUC/ViewFile.aspx?Id=0ac12377-99be-...


“But what if people used notepad without our permission?!” -dev/boss somewhere


"But I bought it!" - naïve customer somewhere

