Statistical Mechanics Analogy

The idea that simple rules can explain complex phenomena, drawing parallels between thermodynamics and a potential theory of deep learning.

While some view the analogy between deep learning and statistical mechanics as a plausible path toward understanding, skeptics argue that the inherent messiness of training data and the massive scale of models prevent the emergence of a clean, foundational theory. These critics suggest that without specific mathematical properties like concentration of measure, deep learning may remain as elusive to human comprehension as the nature of consciousness. Nevertheless, others remain optimistic, asserting that the primary goal of any scientific theory is to uncover the simple rules governing such complex systems, regardless of their current scale or apparent disorder.

There is an analogy with statistical mechanics. It's not crazy.

I’m in the skeptical camp. Whatever theory eventually emerges will not be as solid as:
1. Theory of pattern recognition (as developed in 80s and 90s)
2. Theory of thermodynamics
3. Theory of gravity
4. Theory of electromagnetism
5. Theory of relativity
etc., for two reasons:
1. While half of deep learning is how humans construct network architectures, the more important half relies on data. This data is a hodgepodge of scraped internet data (text and videos), books, user interactions, etc., with no coherent structure.
2. Extracting meaningful insights from this much data takes enormous models, on the order of 10B+ parameters. The thing about random systems (in the mathematical sense) is that it takes “something” an order of magnitude bigger to “understand” them, unless there is some concentration-of-measure-type mathematical nicety (as in thermodynamics), which I don’t think exists in these models and data. This is the same reason I don’t think humans will ever be able to “understand” human consciousness: it would take something an order of magnitude bigger than our own brains to do that.
Here is Terence Tao explaining this concentration stuff in another context: https://mathstodon.xyz/@tao/113873092369347147
I would love to be proven wrong though.
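
For readers unfamiliar with the "concentration of measure" idea the comment above leans on, here is a minimal sketch of its simplest form: an average over many independent random parts fluctuates less and less as the number of parts grows, which is roughly why thermodynamic quantities look deterministic. The coin-flip setup and the function name `spread_of_mean` are illustrative assumptions, not something taken from the thread, and the sketch says nothing about whether neural networks have this property.

```python
import random
import statistics


def spread_of_mean(n, trials=200, seed=0):
    """Std. dev., across `trials` repetitions, of the mean of n fair coin flips.

    Illustrative helper (hypothetical name): each flip is 0 or 1 with
    probability 1/2, so the mean of n flips should cluster around 0.5.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    means = [
        sum(rng.random() < 0.5 for _ in range(n)) / n
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


if __name__ == "__main__":
    # The spread should shrink roughly like 0.5 / sqrt(n).
    for n in (10, 100, 10_000):
        print(n, round(spread_of_mean(n), 4))
```

The printed spread should fall by roughly a factor of ten for every hundredfold increase in `n`. The commenter's doubt is about whether any comparable averaging effect exists for large models trained on unstructured data.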

The whole point about theory, though, is that simple rules can define complex phenomena. I don’t think anything you wrote fundamentally rules out the idea that we could find a theory of deep learning.
