Summarizer

Deterministic Computation Integration

Interest in incorporating deterministic calculations into normally non-deterministic model behavior, akin to having a calculator built into the brain.

← Back to Executing programs inside transformers with exponentially faster inference

Integrating deterministic computation directly into non-deterministic models presents a shift toward "pseudosymbolic" LLMs that could function like a brain with a built-in calculator. While internalizing these processes might improve latency and redefine our understanding of AI comprehension, skeptics point to the massive logistical hurdles of managing I/O unpredictability and GPU throughput within a synchronous inference context. Ultimately, the debate hinges on whether it is more efficient to treat models as traditional tool-users or to architect specialized, frozen layers capable of executing precise logic natively.

6 comments tagged with this topic

View on HN · Topics
Honestly, the most interesting thing here is definitely that just 2D heads are enough to do useful computation (at least they are enough to simulate an interpreter) and that there is an O(log n) algorithm to compute argmax attention with 2D heads. It seems that you could make an efficient pseudosymbolic LLM with some frozen layers that perform certain deterministic operations, but also other layers that are learned.
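The comment's idea of mixing frozen deterministic layers with learned ones can be made concrete with a minimal numpy sketch. The function name and shapes are hypothetical, and this naive version scans all keys rather than using the O(log n) argmax-attention algorithm the article describes; it only illustrates what a frozen "hard attention" head that copies values exactly could look like.

```python
import numpy as np

def frozen_argmax_head(queries, keys, values):
    """Deterministic 'hard attention' head: each query attends only to the
    single key with the highest score and copies that key's value exactly,
    with no softmax blur. Nothing here is learned, so the layer can stay
    frozen while surrounding layers are trained as usual.
    queries: (n_q, d), keys: (n_k, d), values: (n_k, d_v)."""
    scores = queries @ keys.T          # (n_q, n_k) similarity scores
    winners = scores.argmax(axis=1)    # exact winner per query, no sampling
    return values[winners]             # (n_q, d_v), fully deterministic
```

A pseudosymbolic model in this spirit would interleave heads like this (performing exact lookup or routing) with ordinary learned softmax heads in the same stack.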
View on HN · Topics
Why must models be analogous to humans using tools? Or, to take the analogy further: wouldn't it be better if humans had calculators built into their brains, provided they are deterministic and reduce latency?
View on HN · Topics
I really liked the article, but food for thought: is a transformer that offloads computation to python really that different from Python code being read and then executed by a compiler? Both examples are of a system we created to abstract most of the hard work. I think a more important concept here is that the term "AI" has a lot of built-in assumptions, one of which being that it is (or will be) super intelligent, and so folks like the author here think (correctly) that it's important for the AI to be actually doing the work itself.
View on HN · Topics
One of the most interesting pieces I've read recently. Not sure I agree with all the statements there (e.g. that without execution the system has no comprehension), but extremely cool.
View on HN · Topics
If you push tool execution into the model itself, you inherit all the I/O unpredictability and error handling baggage, but now inside a GPU context that's allergic to latency. Inference throughput tanks if external calls start blocking, and A100s make expensive waiters. Batching is fantasy unless you know up front exactly what gets executed, which is the opposite of dynamic tools. If you want "faster" here, the trade is reliable deterministic compute versus the usual Wild West of system calls and side effects.
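The batching objection in this comment can be demonstrated with a toy sketch (function names are hypothetical, and `time.sleep` stands in for an external system call): in a synchronous batched step, every sequence must wait for the slowest blocking call before any sequence can advance, so latency grows with batch size instead of being amortized by it.

```python
import time

def slow_tool(seq, delay=0.05):
    """Stand-in for an external call with unpredictable I/O latency."""
    time.sleep(delay)
    return seq + "!"

def batched_step(batch, tool):
    """Toy synchronous inference step: no sequence in the batch can emit
    its next token until every sequence's (possibly blocking) tool call
    has returned, so the whole batch waits on the slowest call."""
    return [tool(seq) for seq in batch]  # blocking I/O stalls the entire batch

start = time.monotonic()
out = batched_step(["a", "b", "c", "d"], slow_tool)
elapsed = time.monotonic() - start  # roughly 4 * delay: scales with batch size
```

This is why batching dynamic tool execution is hard: the GPU sits idle for the sum of the calls unless you know up front exactly what gets executed and can overlap it.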
View on HN · Topics
Very interesting read. Would love to learn more about incorporating deterministic calculations where it's normally non-deterministic.