Summarizer

LLM Input

llm/3fd5f01c-dce0-45f5-821d-a9c655fbe87c/batch-0-dc6a6574-4f08-4c4c-a2da-de8ada53a72e-input.json

prompt

The following is content for you to classify. Do not respond to the comments; only classify them.

<topics>
1. AI-Generated Writing Detection
   Related: Extensive debate about whether the article was written by AI, with discussion of telltale signs like repetitiveness, fluff language, lack of benchmarks, and 'schmoozing salesman feel'. Some defend calling out AI writing while others find accusations obnoxious.
2. Practical Benefits Unclear
   Related: Multiple commenters question what actual benefits this approach provides over external tool calling, asking for benchmarks, speed comparisons, and concrete use cases beyond elegance.
3. External Tools vs Internal Execution
   Related: Discussion of tradeoffs between having models call external tools versus executing computation internally, including security implications, latency concerns, and the overhead of process forking.
4. Neurosymbolic AI Approaches
   Related: References to traditional neurosymbolic computing debates, with some dismissing this as 'old neurosymbolic garbage restated' while others see potential in embedding computational primitives into LLMs.
5. Differentiability Advantage
   Related: The ability to backpropagate through the computation is highlighted as a key difference from external tools, making this a trainable computational substrate.
6. O(log n) Attention Scaling
   Related: Technical interest in the logarithmic scaling attention mechanism using 2D convex hull exploration, enabling rapid token generation in 'focus mode'.
7. Missing Benchmarks and Weights
   Related: Criticism that no model weights or compiler tools were released, and lack of performance benchmarks against baseline approaches limits reproducibility and evaluation.
8. Speculative Execution Architecture
   Related: Discussion of using these models for speculative token generation where a fast model proposes tokens and a slower model verifies, similar to CPU speculative execution.
9. Interpretability Implications
   Related: Interest in how pseudo-symbolic execution could improve model interpretability, especially if significant model behavior occurs through deterministic operations.
10. GPU vs CPU Execution Tradeoffs
   Related: Concerns about pushing tool execution into GPU context where I/O unpredictability and blocking calls cause latency issues, versus cheaper CPU execution.
11. MoE Integration Possibilities
   Related: Speculation about combining this approach with Mixture of Experts architectures, where routers could select deterministic solvers for appropriate problem subsets.
12. WebAssembly and VM Embedding
   Related: Discussion of why WebAssembly was chosen for the VM, with alternative suggestions like embedded Elixir or other lightweight interpreters.
13. Chain of Thought Enhancement
   Related: Potential for models to modify programs mid-execution similar to 'aha moments' observed in chain-of-thought reasoning, enabling on-the-fly debugging.
14. Human Brain Analogy
   Related: Comparisons to human cognition, noting brains can slowly simulate Turing machines but we use external computers for speed and reliability.
15. Reinforcement Learning Potential
   Related: Interest in combining this approach with RL to optimize models for computational thinking, generating and testing hypotheses in unified thought processes.
16. Article Presentation Quality
   Related: Praise for the animated figures and visual presentation while criticizing the text structure with too many small paragraphs not making cogent arguments.
17. Security Benefits
   Related: Suggestion that eliminating external tool calling improves security by avoiding potentially corrupted external tools.
18. Batching Feasibility
   Related: Questions about whether this approach can be batched efficiently, noting batching requires knowing execution paths upfront which contradicts dynamic tool use.
19. Comment Quality and AI Accusations
   Related: Meta-discussion about how accusations of AI-generated content could harm community discourse through paranoia, even without technical enforcement methods.
20. Deterministic Computation Integration
   Related: Interest in incorporating deterministic calculations into normally non-deterministic model behavior, potentially like having calculators built into brains.
0. Does not fit well in any category
</topics>

<comments_to_classify>
[
  
{
  "id": "47362409",
  "text": "This seems way cooler than just computation (which is easy to hand off to a tool, and arguably more predictable that way). The broader point here is that you can have your model switch dynamically to/from a kind of attention that scales with the log of the token count, by only exploring the convex hull in a 2D space. A less capable version of attention, to be sure, but one capable of tracing a program’s execution with text representations of registers and stack - which is a meaningful level of flexibility, and one many humans would find difficult to do reliably!\n\nWhat could you do with an LLM that can go into “focus mode” and generate tokens extremely rapidly? How much more powerful would a reasoning-token-generation phase be that can explore and cull large numbers of paths/hypotheses, so long as they are well defined? Does this have implications for multi-modal models and spatial reasoning?\n\nAs the paper suggests:\n\n> These models could be useful in several modes: as a dedicated fast path paired with a slower, more general model; as part of a fast/slow hybrid architecture inside a single system; or as a speculative execution model that proposes tokens quickly while a regular-attention model verifies and accepts them. Regardless of their eventual capability ceiling, they already suggest a powerful systems primitive for speeding up larger models."
}
,
  
{
  "id": "47363421",
  "text": "This seems like it has some potential, but is pretty much useless as it is.\n\nShame there are no weights released - let alone the \"compiler\" tool they used to actually synthesize computational primitives into model weights. It seems like a \"small model\" system that's amenable to low budget experiments, and I would love to see what this approach can be pushed towards.\n\nI disagree with the core premise, it's basically the old neurosymbolic garbage restated, but embedding predefined computational primitives into LLMs could have some uses nonetheless."
}
,
  
{
  "id": "47363681",
  "text": "If you want to experiment with hardcoding small programs into transformer weights, maybe try ALTA: https://arxiv.org/abs/2410.18077v2"
}
,
  
{
  "id": "47362071",
  "text": "This shows the downside of using AI to write up your project. I see the eloquent sentences, but don't get the message.\n\n> This works, but the actual execution happened outside the model. The model specified the computation, then waited for an external system to carry it out.\n> Our transformer also emits a program, but instead of pausing for an external tool, it executes that program itself, step by step, within the same transformer.\n\nWhat's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?\n\nWhy is it good that it's \"inside\" the model? Just making it more elegant and nice? The tool was already \"inside\" the overall hybrid system. What's the actual problem?"
}
,
  
{
  "id": "47362148",
  "text": ">This shows the downside of using AI to write up your project. I see the eloquent sentences, but don't get the message.\n\nNot really sure what this obsession with calling things you don't like AI generated is but it's poor form. If you have something to say about the text then say it. Otherwise leave baseless accusations out of it.\n\n>What's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?....\n\nIt's pretty clearly an ideological thing. Some people are firmly on the 'some sort of symbolic logic is necessary' camp. From the article, 'A system that cannot compute cannot truly internalize what computation is.'\n\nSome things are just interesting for the sake of it. This is one of those things. I don't agree with the authors on the above and I'm still glad they shared. It's a very interesting read regardless."
}
,
  
{
  "id": "47362393",
  "text": "> If you have something to say about the text then say it.\n\nI could point out the individual phrases and describe the overall impression in detail, or I can just compactly communicate that by using the phrase \"AI\". If it bothers you, read it as \"AI-like\", so there is a pretension.\n\nI have no problem with using AI for writing. I do it too, especially for documentation. But you need to read it and iterate with it and give it enough raw input context. If you don't give it info about your actual goals, intentions, judgments etc, the AI will substitute some washed-out, averaged-out no-meat-on-the-bone fluff that may sound good at first read and give you a warm wow-effect that makes you hit publish, but you read into it all the context that you have in your head, but readers don't have that.\n\nFormatting and language is cheap now. We need a new culture around calling out sloppy work. You would not have had a problem with calling out a badly composed rambling article 5 years ago. But today you can easily slap an AI filter on it that will make it look grammatical and feel narratively engaging, now it's all about deeper content. But if one points that out, replies can always say \"oh, you can't prove that, can you?\""
}
,
  
{
  "id": "47362491",
  "text": ">\"This shows the downside of using AI to write up your project.\"\n\nI just find phrases like this a bit obnoxious at times.\n\n>You would not have had a problem with calling out a badly composed rambling article 5 years ago.\n\nThen why not just say that? It's rambling bla bla bla. What's so hard about that? Why invent a reason for issues, as if rambling articles didn't get written 5 years ago.\n\nLike No, being written by an LLM or not is not the reason the article has no benchmarks or interpretability results. Those things would be there regardless if the author was interested in that, so again, it just seems there's little point in making such assertions."
}
,
  
{
  "id": "47362511",
  "text": "It's very hard to discuss this. To some people it's obvious, to some it isn't. To me, every single paragraphs is obvious fluff AI writing. One problem with it is the repetitiveness and the schmoozing salesman feel. The other is the lack of benchmarks and stuff. It's both. The two are connected because the AI has to lean in to its bullshitter persona when it's not given enough raw material to write up something strong. But whenever an AI writes in its default voice like this, it also indicates that the context was not well curated.\n\nBut anyway, yes, I can also just move on to the next article. Most of the time I indeed do that."
}
,
  
{
  "id": "47362750",
  "text": "For what it’s worth, I agree with you; the article is LLM written although not with the usual gotchas, so they’re more subtle.\n\nThe subtle ones like this I don’t mind too much, as long as they get the content correct, which in this case leaves quite a bit to be desired.\n\nI’m also noticing that some people around me appear to just be oblivious to some LLM signals that bother me a lot, so people consume media differently.\n\nI absolutely do believe that AI generated content needs to be called out, although at this point it’s safe to say that pretty much all online content is LLM written."
}
,
  
{
  "id": "47362500",
  "text": "I'm glad they shared too! Wish they shared without letting the LLM process it so heavily, it makes it too hard to read, it gives monotone importance to every piece of text. Mostly it does this by bringing everything up to a slight over-importance with tone and fluff language, and by turning everything into dry statements of fact.\n\nAs to why people call this out without going into great detail about the problems with the actual text, it's because this is happening all over the place and it's very disrespectful to readers, who dig into an article that looks very well written on the surface, only to discover it's a lot of labor to decode and often (but not always) a total waste of time. Asking for a critical report of the text is asking even more of a reader who already feels duped."
}
,
  
{
  "id": "47362237",
  "text": "I got the same impression as the parent post. Even if its not AI-generated, the text reads like a politician's speech at a lot of places. Talks a lot, says little.\n\nThe idea itself was very cool, so I endured it. But it was not a pleasant read."
}
,
  
{
  "id": "47362884",
  "text": "Agreeing first that it is genuinely interesting, let me make a constructive comment on the text: Early on, there are too many small paragraphs that don't on their own make a cogent argument. That important but easily overlooked structural work is pushed back to the reader. I felt rewarded in pushing past that though. Bravo."
}
,
  
{
  "id": "47362566",
  "text": "This is a nice case study of the downside of creating explicit policies of \"no AI comments\" without a technical method of enforcing it. I am sure the hacker news comment quality will suffer almost as much from an escalating culture of accusation and paranoia that it will from LLM comment themselves."
}
,
  
{
  "id": "47363106",
  "text": "Well, for one, by eliminating external tool calling, the model gains an amount of security. This occurs because the tools being called by an LLM can be corrupted, and this this scenario corrupted tools would not be called."
}
,
  
{
  "id": "47362814",
  "text": "> Is it speed?\n\n> Is it that you can backprop through this computation? Do you do so?\n\nWith respect, I feel that you may not have read the article.\n\n> Because the execution trace is part of the forward pass, the whole process remains differentiable: we can even propagate gradients through the computation itself. That makes this fundamentally different from an external tool. It becomes a trainable computational substrate that can be integrated directly into a larger model.\n\nand,\n\n> By storing points across nested convex hulls, this yields a decoding cost of O(k+log⁡ n).\n\nand,\n\n> Regardless of their eventual capability ceiling, they already suggest a powerful systems primitive for speeding up larger models.\n\nSo yes, and yes.\n\n> Where are the benchmarks?\n\nNot clear what they should benchmark it against. They do compare speed to a normal KV Cache. As for performance.. if it's actually executing a Sudoku solver with a 100% success rate, it seems pretty trivial to find any model doing < 100% success rate. Sure, it would be nice to see the data here, agree with you there.\n\nPersonally I think it would be really interesting to see if this method can be combined with a normal model MoE-style. It is likely possible, the router module should pick up quite quickly that it predicts the right tokens for some subset of problems deterministically. I like the idea of embed all sorts of general solvers directly into the model, like a prolog solver for example. In fact it never would have occurred to me to just go straight for WASM, pretty interesting choice to directly embed a VM. But it makes me wonder what \"smaller\" interpreters could be useful in this context."
}
,
  
{
  "id": "47362753",
  "text": "What are the AI tells? The only one I found is redundancy, but it makes sense because this is trying to be approachable to laymen.\n\nLike, you have a great point (the benefit of this approach isn't explained), but that's a mistake humans frequently make."
}
,
  
{
  "id": "47362181",
  "text": "Honestly, the most interesting thing here is definitely that just 2D heads are enough to do useful computation (at least they are enough to simulate an interpreter) and that there is an O(log n) algorithm to compute argmax attention with 2D heads. It seems that you could make an efficient pseudosymbolic LLM with some frozen layers that perform certain deterministic operations, but also other layers that are learned."
}
,
  
{
  "id": "47362763",
  "text": "The key difference is that the model is able to write the program as it’s executing it.\n\nBefore it needs to write the code and have an external program execute it. Here it can change its mind mid execution. Kinda like what was observed in the CoT’s ah ha moment"
}
,
  
{
  "id": "47349824",
  "text": "This seems a really interesting path for interpretability, specially if a big chunk of a model's behavior occurs pseudo-symbolically. This is an idea I had thought about, integrating tools into the main computation path of a model, but I never imagined that it could be done efficiently with just a vanilla transformer.\n\nTruly, attention is all you need (I guess)."
}
,
  
{
  "id": "47362076",
  "text": "Interesting... But why? What is the benefit, other than increasing our understanding of model architectures?\n\nOur brains can also simulate turing machines, slowly. We automated that with computers that are faster and more reliable. So why not allow a model to use external much faster and reliable tools, just as we do?"
}
,
  
{
  "id": "47362909",
  "text": "I spent the entire time reading it pondering the same thing.\n\n1. The article presents that calling out to a tool like python is \"expensive\" because of the overhead of forking a process, loading up the python env etc, but why not just eliminate that overhead and embed WebAssembly so this \"tool call\" is near zero? This feels very similar to the discussion in the 90's around the overhead of threads v.s. processes or kernel space v.s. user space. Could even go further and have a running beam vm so the LLM can write elixir which is ideal for LLM's that stream out code? Elixir programs will be a lot shorter than webassembly.\n\n2. The core argument stated is \"A system that cannot compute cannot truly internalize what computation is.\" The idea being that it could write a program, execute it and by seeing all of the steps maybe even part way through stop and change its mind or when writing new programs write them better, aka be able to debug on the fly?\n\n3. Not mentioned, but there is a 3rd x factor that LLM's will use this new found computation engine to do overall better at \"thinking\". Computing in very unexpected ways and to unexpected problems. Maybe it would do dramatically better at some benchmark because of this?\n\nUnfortunately these are not explored and it is just an execution engine even resulting in the conclusion stating \"arbitrary programs can be compiled directly into the transformer weights, bypassing the need to represent them as token sequences at all.\" which goes to point number 1 of if we are compiling to weights why not just optimize the tool calling?"
}
,
  
{
  "id": "47362540",
  "text": "Why must models be analogous to humans using tools? Or to take the analogy route further wouldn't it be better if humans had calculators built into their brains, provided they are determisitic and reduce latency"
}
,
  
{
  "id": "47361837",
  "text": "I'd like to see this combined with reinforcement learning to optimize models to think computationally. Generating ideas with hypothetical results and then running them in the same thought. Their solution sounded like a lot of tokens though."
}
,
  
{
  "id": "47362293",
  "text": "I really liked the article, but food for thought: is a transformer that offloads computation to python really that different from Python code being read and then executed by a compiler?\n\nBoth examples are of a system we created to abstract most of the hard work.\n\nI think a more important concept here is that the term \"AI\" has a lot of built-in assumptions, one of which being that it is (or will be) super intelligent, and so folks like the author here think (correctly) that it's important for the AI to be actually doing the work itself."
}
,
  
{
  "id": "47351426",
  "text": "It makes sense that a next token predictor could execute assembly code. This is fascinating work, especially with the memory implementation."
}
,
  
{
  "id": "47362002",
  "text": "This is brilliant, game changing level.\n\nHey, give it also access to the dump of its weights and way to propose updates so it can see and tinker its brain directly."
}
,
  
{
  "id": "47363005",
  "text": "Besides being a very interesting conceptual exercise, the animated figures in this article are absolutely stunning - best I’ve ever seen."
}
,
  
{
  "id": "47351279",
  "text": "one of the most interesting pieces I've read recently. Not sure I agree with all the statements there (e.g. without execution the system has no comprehension) - but extremely cool"
}
,
  
{
  "id": "47362815",
  "text": "This is really important work."
}
,
  
{
  "id": "47362761",
  "text": "The original title is \"Can LLMs be computers?\"\n\nBut the right question is, should they?"
}
,
  
{
  "id": "47363027",
  "text": "This looks like a hack. Yes, being able to interpret webassembly is a general oracle. Still falls short of solving the real problem directly."
}
,
  
{
  "id": "47362533",
  "text": "very cool idea. But, time savings are not true for every tool call, and it's not clear to me yet whether this is batch-able; also, intuitively, for most of the models that run on GPU, you'd still want to offload tool exec part to CPU since it's much cheaper..."
}
,
  
{
  "id": "47363274",
  "text": "If you push tool execution into the model itself, you inherit all the I/O unpredictability and error handling baggage, but now inside a GPU context that's allergic to latency. Inference throughput tanks if external calls start blocking, and A100s make expensive waiters. Batching is fantasy unless you know up front exactly what gets executed, which is the opposite of dynamic tools. If you want \"faster\" here, the trade is reliable deterministic compute versus the usual Wild West of system calls and side effects."
}
,
  
{
  "id": "47361953",
  "text": "Is this genius? Or just a new binary executable format? Can't tell."
}
,
  
{
  "id": "47363299",
  "text": "Very interesting read. Would love to learn more about incorporating deterministic calculations where it's normally non-deterministic."
}
,
  
{
  "id": "47362073",
  "text": "big question is how efficient is this compare to executing assembly on CPU"
}
,
  
{
  "id": "47363132",
  "text": "ooh"
}
,
  
{
  "id": "47362044",
  "text": "what!"
}

]
</comments_to_classify>

Assign each comment above to up to 3 relevant topics.

Return ONLY a JSON array with this exact structure (no other text):
[
  {"id": "comment_id_1", "topics": [1, 3, 5]},
  {"id": "comment_id_2", "topics": [2]},
  {"id": "comment_id_3", "topics": [0]},
  ...
]

Rules:
- Each comment can be assigned up to 3 topics
- Use 1-based topic indices for matching topics
- Use index 0, by itself, if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment
Remember: Output ONLY the JSON array, no other text.

commentCount

38
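
A model response to this prompt can be checked mechanically against the output rules above. The sketch below is a hypothetical validator, not part of the job input; it assumes topics are numbered 1-20 with index 0 used alone as the "does not fit" fallback:

```python
import json

NUM_TOPICS = 20  # topics 1-20, plus index 0 as the "does not fit" fallback


def validate_classification(raw, expected_ids):
    """Parse a model response and assert it follows the output rules."""
    result = json.loads(raw)
    assert isinstance(result, list), "top level must be a JSON array"
    seen = set()
    for entry in result:
        assert set(entry) == {"id", "topics"}, f"unexpected keys: {entry}"
        assert entry["id"] in expected_ids, f"unknown comment id {entry['id']}"
        topics = entry["topics"]
        # "up to 3 relevant topics" per comment
        assert len(topics) <= 3, "each comment gets at most 3 topic indices"
        assert all(isinstance(t, int) and 0 <= t <= NUM_TOPICS for t in topics)
        # assumption: the fallback index 0 is not mixed with real topics
        if 0 in topics:
            assert topics == [0], "index 0 means no topic fits; use it alone"
        seen.add(entry["id"])
    missing = expected_ids - seen
    assert not missing, f"comments left unclassified: {missing}"
    return result
```

For example, `validate_classification('[{"id": "47363132", "topics": [0]}]', {"47363132"})` would accept a response that sends the one-word comment to the fallback bucket, while a response with four topic indices on one comment would be rejected.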
