'the underlying “rules” that produce any distribution of data...' is clearly meant to convince the reader that it can produce something we would describe as a "rule", that is, a coherent and comprehensible regulating principle. This isn't just because he isn't being precise enough; he quite clearly wants the reader to understand this as neural networks being able to create a mental model of anything, in a manner similar to how a biological neural network would.
It doesn't, it can't, and it won't in our lifetimes.
It doesn't, it can't, and it won't in our lifetimes.