People throw that proof around all the time; but all it does is show that a neural net is equivalent to a lookup table; and a lookup table with enough memory can approximate any function. It's miles away from explaining how real world, useful, neural nets, like conv-nets, transformers, LSTMs, etc. actually work.