
  > ...generate answers near the center of existing thought.
This is right in Wikipedia's article on the universal approximation theorem [1].

[1] https://en.wikipedia.org/wiki/Universal_approximation_theore...

"n the field of machine learning, the universal approximation theorems (UATs) state that neural networks with a certain structure can, in principle, approximate any continuous function to any desired degree of accuracy. These theorems provide a mathematical justification for using neural networks, assuring researchers that a sufficiently large or deep network can model the complex, non-linear relationships often found in real-world data."

And then: "Notice also that the neural network is only required to approximate within a compact set K {\displaystyle K}. The proof does not describe how the function would be extrapolated outside of the region."

NNs, LLMs included, are interpolators, not extrapolators.

And the region an NN approximates within can be quite complex and not easily characterized as "X: R^N drawn from N(c,s)^N", as SolidGoldMagikarp [2] clearly shows.

[2] https://github.com/NiluK/SolidGoldMagikarp
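To make the interpolation/extrapolation point concrete, here is a toy numpy sketch (my own illustration, not from the thread): a one-hidden-layer ReLU network with fixed random hidden weights and a least-squares readout, fit to sin(x) on the compact set [-3, 3] and then evaluated far outside it. Inside the training region the fit is tight; outside, the network degenerates into a piecewise-linear extension that has nothing to do with sin.

```python
import numpy as np

rng = np.random.default_rng(0)

# One hidden layer of fixed random ReLU features; only the linear readout is fit.
W = rng.normal(0, 2, size=100)      # hidden weights
b = rng.uniform(-6, 6, size=100)    # hidden biases (kinks spread over the domain)

def features(x):
    # shape (len(x), 100): ReLU of an affine map per hidden unit
    return np.maximum(0.0, np.outer(x, W) + b)

# Fit the readout by least squares on the compact set [-3, 3].
x_train = np.linspace(-3, 3, 400)
readout, *_ = np.linalg.lstsq(features(x_train), np.sin(x_train), rcond=None)

def net(x):
    return features(x) @ readout

in_err = np.abs(net(x_train) - np.sin(x_train)).max()   # inside the training region
x_out = np.linspace(5, 9, 100)
out_err = np.abs(net(x_out) - np.sin(x_out)).mean()     # well outside it
```

With this setup the in-region error stays small while the out-of-region error is large, which is exactly the "approximation on a compact set K, no guarantees outside" caveat from the theorem.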



It has been proven that recurrent neural networks are Turing complete [0]. So for every computable function, there is a neural network that computes it. That doesn't say anything about size or efficiency, but in principle this allows neural networks to simulate a wide range of intelligent and creative behavior, including the kind of extrapolation you're talking about.

[0] https://www.sciencedirect.com/science/article/pii/S002200008...


I don't think you can take the step from "any Turing machine is representable as a neural network" to any claim about the prowess of learned neural networks, as opposed to specifically crafted ones.

Good examples are arithmetic and counting letters: it's trivial to write Turing machines that do those correctly, so you could craft neural networks that do just that. Yet from LLMs we know they are bad at exactly those tasks.


> So for every computable function, there is a neural network that computes it. That doesn't say anything about size or efficiency

It also doesn't say anything about finding the desired function, rather than a different function which approximates it closely on some compact set but diverges from it outside that set. That's the trouble with extrapolation: you don't know how to compute the function you're looking for because you don't know anything about its behaviour outside of your sample.


Turing completeness is not associated with creativity or intelligence in any straightforward manner. One cannot unconditionally imply the other.



No, but unless you find evidence to suggest we exceed the Turing computable, Turing completeness is sufficient to show that such systems are not precluded from creativity or intelligence.


I believe that quantum oracles are more powerful than Turing oracles, because quantum oracles can be constructed, from what I understand, and Turing oracles need infinite tape.

Our brains use quantum computation within each neuron [1].

[1] https://www.nature.com/articles/s41598-024-62539-5


There's no evidence to suggest a quantum computer exceeds the Turing computable.


The difference is that quantum oracles can be constructed [1] and Turing oracles can't be [2]: "An oracle machine or o-machine is a Turing a-machine that pauses its computation at state "o" while, to complete its calculation, it "awaits the decision" of "the oracle"—an entity unspecified by Turing "apart from saying that it cannot be a machine" (Turing (1939))."

  [1] https://arxiv.org/abs/2303.14959
  [2] https://en.wikipedia.org/wiki/Turing_machine


This is meaningless. A Turing machine is defined in terms of state transitions. Between those state transitions, there is a pause in computation at any point where an operation takes time. Those pauses are just not part of the definition because they are irrelevant to the computational outcome.

And given we have no evidence that quantum oracles exceed the Turing computable, all the evidence we have suggests that they are Turing machines.


  > This is meaningless.
Turing machines grew out of constructive mathematics [1], where proofs are constructions of the objects or, in other words, algorithms to compute them.

  [1] https://en.wikipedia.org/wiki/Constructivism_(philosophy_of_mathematics)#Constructive_mathematics
Saying that there is no difference between things that can be constructed (quantum oracles) and things that are simply posited and cannot be constructed (Turing oracles - they are not even machines of any sort) is a direct refutation of the very foundation of Turing machine theory.


That's an irrelevant strawman. It tells us nothing about how to create such a system ... how to pluck it out of the infinity of TMs. It's like saying that bridges are necessarily built from atoms and adhere to the laws of physics--that's of no help to engineers trying to build a bridge.

And there's also the other side of the GP's point--Turing completeness not necessary for creativity--not by a long shot. (In fact, humans are not Turing complete.)


No, twisting it to be about how to create such a system is the strawman.

> Turing completeness not necessary for creativity--not by a long shot.

This is by far a more extreme claim than the others in this thread. A system that is not even Turing complete is extremely limited. It's near impossible to construct a system with the ability to loop and branch that isn't Turing complete, for example.

>(In fact, humans are not Turing complete.)

Humans are at least trivially Turing complete - to be Turing complete, all we need to be able to do is to read and write a tape or simulation of one, and use a lookup table with 6 entries (for the proven minimal (2,3) Turing machine) to choose which steps to follow.

Maybe you mean to suggest we exceed it. There is no evidence we can.
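Mechanically following such a lookup table really is trivial. A minimal sketch (using the classic 2-state, 2-symbol busy beaver rather than the (2,3) machine, purely for brevity; the tape is a dict defaulting to 0, so it is unbounded in both directions):

```python
# Transition table: (state, symbol) -> (write, move, next_state).
# This is the 2-state, 2-symbol busy beaver; 'H' is the halt state.
RULES = {
    ('A', 0): (1, +1, 'B'),
    ('A', 1): (1, -1, 'B'),
    ('B', 0): (1, -1, 'A'),
    ('B', 1): (1, +1, 'H'),
}

def run(rules, state='A', max_steps=1000):
    tape, head, steps = {}, 0, 0
    while state != 'H' and steps < max_steps:
        symbol = tape.get(head, 0)              # blank cells read as 0
        write, move, state = rules[(state, symbol)]
        tape[head] = write
        head += move
        steps += 1
    return steps, sum(tape.values())

steps, ones = run(RULES)  # halts after 6 steps with 4 ones on the tape
```

Each iteration is exactly one table lookup plus a read, a write, and a head move - the kind of steps a person can carry out by hand.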


  > A system that is not even Turing complete is extremely limited.
Agda is not Turing-complete, yet it is very useful.


P.S. everything in the response is wrong ... this person has no idea what it means to be Turing complete.

> all we need to be able to do is to read and write a tape or simulation of one

An infinite tape. And to be Turing complete we must "simulate" that tape--the tape head is not Turing complete, the whole UTM is.

> A system that is not even Turing complete is extremely limited.

PDAs are not "extremely limited", and we are more limited than PDAs because of our very finite nature.


> P.S. everything in the response is wrong ... this person has no idea what it means to be Turing complete.

I know very well what it means to be Turing complete. All the evidence so far, on the other hand, suggests you don't.

> An infinite tape. And to be Turing complete we must "simulate" that tape--the tape head is not Turing complete, the whole UTM is.

An IO port is logically equivalent to infinite tape.

> PDAs are not "extremely limited", and we are more limited than PDAs because of our very finite nature.

You can trivially execute every step in a Turing machine, hence you are Turing equivalent. It is clear you do not understand the subject at even a basic level.


> You can trivially execute every step in a Turing machine, hence you are Turing equivalent. It is clear you do not understand the subject at even a basic level.

LOL. Such projection. Humans are provably not Turing Complete because they are guaranteed to halt.


Judging from what I read, their work is subject to regular hardware constraints, such as limited stack size, because the paper describes a mapping from regular hardware circuits to continuous circuits.

As an example, I would like to ask how to parse the balanced-brackets grammar (S ::= B <EOS>; B ::= | BB | (B) | [B] | {B};) with that Turing-complete recurrent network, and how it will deal with precision loss for relatively short inputs.

The paper also does not address training (i.e., the automatic search for the processors' equations given inputs and outputs).
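For reference, deciding that balanced-brackets grammar only takes a single stack (a PDA); the interesting question is whether a continuous network can maintain that stack without precision loss. A sketch of the discrete version:

```python
def balanced(s):
    """Recognize B ::= | BB | (B) | [B] | {B} with an explicit stack."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)                      # push opener
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False                      # mismatched or spurious closer
        else:
            return False                          # symbol outside the alphabet
    return not stack                              # all openers must be closed
```

A recurrent network that claims Turing completeness has to encode this unbounded stack in its (finite-precision) state, which is where precision loss bites on nested inputs.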


No - the size of a network capable of that would be infeasible. That's a common fallacy. You hint at this but then dismiss it.

Mathematically possible != actually possible.


> approximate any continuous function

It wouldn't surprise me if many interesting functions we'd like to approximate aren't continuous at all.


This is one of the reasons current AI tech is so poor at learning physical world dynamics.

Relationships in the physical world are sparse, metastable graphs with non-linear dynamics at every resolution. And then we measure these dynamics using sparse, irregular sampling with a high noise floor. It is just about the worst possible data model for conventional AI stacks at a theoretical level.


This is what softmax [1] is for.

[1] https://en.wikipedia.org/wiki/Softmax_function
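For what it's worth, softmax is a smooth (continuous, differentiable) surrogate for the discontinuous argmax - a quick numpy sketch:

```python
import numpy as np

def softmax(z, temperature=1.0):
    z = np.asarray(z, dtype=float) / temperature
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [2.0, 1.0, 0.1]
p = softmax(logits)              # a smooth distribution summing to 1
p_cold = softmax(logits, 0.01)   # approaches a hard argmax as temperature -> 0
```

So it smooths over one specific kind of discontinuity (hard selection); it doesn't make arbitrary discontinuous target functions approximable.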



