Summarizer

LLM Input

llm/e6f7e516-f0a0-4424-8f8f-157aae85c74e/topic-4-3500ed87-050b-4a7d-8987-ba42a0d03473-input.json

prompt

The following is content for you to summarize. Do not respond to the comments—summarize them.

<topic>
Definition of Agentic Success # Disagreement over whether AI "joined the workforce." Some argue failing to replace humans entirely (the "secretary" model) is a failure of 2025 predictions, while others claim deep integration as a tool (automating loops, drafting emails) constitutes a successful, albeit different, type of joining.
</topic>

<comments_about_topic>
1. What constitutes real "thinking" or "reasoning" is beside the point. What matters is what results we are getting.

And the challenge is rethinking how we do work: connecting all the data sources so agents can run and perform work across the various sources we work in. That will take ages. Not to mention having the controls in place to make sure the "thinking" was correct in the end.

2. a stellar piece, Cal, as always. short and straight to the point.

I believe that Codex and the likes took off (in comparison to e.g. "AI" browsers) because the bottleneck there was not reasoning about code, it was typing and processing walls of text. for a human, the interface of e.g. Google Calendar is ± intuitive. for an LLM, any graphical experience is an absolute hellscape from a performance standpoint.

CLI tools, which LLMs love to use, output text and only text, not images, not audio, not videos. LLMs excel at text, hence they are confined to what text can do. yes, multimodal is a thing, but you lose a lot of information and/or context window space + speed.

LLMs are a flawed technology for general, true agents. 99% of the time, outside code, you need eyes and ears. so far we have only created self-writing paper.

3. It was not a well thought out piece and it is discounting the agentic progress that has happened.

>The industry had reason to be optimistic that 2025 would prove pivotal. In previous years, AI agents like Claude Code and OpenAI’s Codex had become impressively adept at tackling multi-step computer programming problems.

It is easy to forget that Claude Code CAME OUT in 2025. The models and agents released in 2025 really DID prove how powerful and capable they are. The predictions were not really wrong. I AM using code agents in a literal fire and forget way.

Claude Code is a hugely capable agentic interface for solving almost any kind of problem or project you want to solve for personal use. I literally use it as the UX for many problems. It is essentially software that can modify itself on the fly.

Most people haven't really grasped the dramatic paradigm shift this creates. I haven't come up with a great analogy for it yet, but the term that I think best captures how it feels to work with Claude Code as a primary interface is "intelligence engine".

I'll use an example. I've created several systems harnessed around Claude Code, but the latest one I built is for stock portfolio management (this was primarily because it is a fun problem space and something I know a bit about). Essentially you just use Claude Code to build tools for itself in a domain. Let me show how this played out in this example.

Claude and I brainstorm a general flow for the process and roles, then figure out what data each role would need and research which providers have the data at a reasonable price.

I purchase the API keys and Claude wires up tools (in this case python scripts and documentation for the agents for about 140 api endpoints), then builds the agents and also creates an initial version of the "skill" that will invoke the process, which looks something like this:

Macro Economist/Strategist -> Fact Checker -> Securities Sourcers -> Analysts (like 4 kinds) -> Fact Checker/Consolidator -> Portfolio Manager

Obviously it isn't 100% great on the first pass and I have to lean on expertise I have in building LLM applications, but now I have a Claude Code instance that can orchestrate this whole research process and also handle ad-hoc changes on the fly.

Now I have evolved this system through about 5 significant iterations, but I can do it "in the app". If I don't like how part of it is working, I just have the main agent rewire stuff on the fly. This is a completely new way of working on problems.
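The role chain this commenter describes could be sketched, very roughly, as a sequence of stages passing a shared report along. Everything below is a hypothetical stand-in (stub roles, invented names), not the commenter's actual system, where each stage would be a Claude Code agent with its own tools:

```python
# Toy sketch of a role pipeline: each "role" is a stub function that
# appends its contribution to a shared report before handing it on.
def make_role(name):
    def role(report):
        # A real agent would do research here; the stub just signs off.
        return report + [f"{name}: ok"]
    return role

# Stage names loosely mirror the comment's flow, in order.
PIPELINE = [make_role(n) for n in (
    "macro_strategist", "fact_checker", "securities_sourcer",
    "analysts", "consolidator", "portfolio_manager")]

def run_pipeline(task):
    report = [task]
    for stage in PIPELINE:
        report = stage(report)
    return report
```

The "rewire stuff on the fly" step then amounts to editing which stages appear in `PIPELINE`, which is why iterating "in the app" is cheap.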

4. I think it depends on what "join" means. I see no reason why it has to be "replace a human". People used to have secretaries back in the day; we don't anymore, we all do our own thing, but in a way, LLMs are our secretaries of sorts now. Or our personal executive assistants, even if you're not an executive.

I don't know what else LLMs need to do. Get on the payroll? People are using them heavily. You can't even google things easily without triggering an LLM response.

I think millennials and older generations are too used to the pre-LLM way of things, so the resistance will be there for a long time to come. but kids doing homework with LLMs will rely on them heavily once they're in the workforce.

I don't know how people are not as fascinated and excited about this. I keep watching older scifi content, and LLMs are now doing for us what "futuristic computer persona" did in older scifi.

Easy example: You no longer need copywriters because of LLMs. You had spell/grammar checkers before, but they didn't "understand" context and recommend different phrasing, and check for things like continuity and rambling on.

5. "People using AI" had a meaningful change when they "joined the workforce" in 2025.

We may not have gotten fully-autonomous employees, but human employees using AI are doing way more than they could before, both in depth and scale.

Claude Code is basically a full-time "employee" on my (profitable) open source projects, but it's still a tool I use to do all the work. Claude Code is basically a full-time "employee" at my job, but it's still a tool I use to do all the work. My workload has shifted to high-level design decisions instead of writing the code, which is kind of exactly what would have happened if AI "joined the workforce" and I had a bunch of new hires under me.

I do recognize this article is largely targeted at non-dev workforces, though, where it _largely_ holds up. But most of my friends outside of the tech world have either gotten new jobs thanks to increased capability through AI or have heavily integrated AI into whatever workflows they're doing at work (again, as a tool) and are excelling compared to employees who don't utilize AI.

6. > Claude Code is basically a full-time "employee" on my (profitable) open source projects,

What fulltime employee works for 30 minutes and then stops working for the next 5 hours and 30 minutes like Claude does?

7. If you think about the real-world and the key bottleneck with most creative work projects (this includes software), it's usually context (in the broadest sense of the word).

Humans are good at this because they are truly multi-modal and can interact through many different channels to gather additional context to do the requisite task at hand. Given incomplete requirements or specs, they can talk to co-workers, look up old documents from a previous release, send a Slack or Teams message, setup a Zoom meeting with stakeholders, call customers, research competitors, buy a competitors product and try it out while taking notes of where it falls short, make a physical site visit to see the context in which the software is to be used and environmental considerations for operation.

Point is that humans doing work have all sorts of ways to gather and compile more context before acting or while they are acting that an LLM does not and in some cases cannot have without the assistance of a human. This process in the real world can unfurl over days or weeks or in response to new inputs and our expectation of how LLMs work doesn't align with this.

LLMs can sort of do this, but more often than not, the failure of LLMs is that we are still very bad at providing proper and sufficient context to the LLM and the LLMs are not very good at requesting more context or reacting to new context, changing plans, changing directions, etc. We also have different expectations of LLMs and we don't expect the LLM to ask "Can you provide a layout and photo of where the machine will be set up and the typical operating conditions?" and then wait a few days for us to gather that context for it before continuing.

8. Agents as staff replacements that can tackle tasks you would normally assign to a human employee didn't happen in 2025.

Agents as LLMs calling tools in a loop to perform tasks that can be handled by typing commands into a computer absolutely did.

Claude Code turns out to be misnamed: it's useful for way more than just writing code, once you figure out how to give it access to tools for other purposes.

I think the browser agents (like the horribly named "ChatGPT Agent" - way to burn a key namespace on a tech demo!) have acted as a distraction from this. Clicking links is still pretty hard. Running Bash commands on the other hand is practically a solved problem.
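Comment 8's framing of agents as "LLMs calling tools in a loop" is concrete enough to sketch. This is a hypothetical illustration only: the model here is a hard-coded stub standing in for a real LLM API call, and the single "bash" tool echoes the comment's point that running shell commands is the easy part:

```python
# Minimal agent loop: a model proposes actions, tools execute them,
# results are fed back into the conversation until the model finishes.
import subprocess

def run_bash(cmd: str) -> str:
    """Tool: run a shell command and return its stdout."""
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

TOOLS = {"bash": run_bash}

def fake_model(history):
    # Stub policy: request one tool call, then wrap up with its output.
    # A real agent would send `history` to an LLM and parse its reply.
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "bash", "args": "echo hello"}
    return {"final": "done: " + history[-1]["content"].strip()}

def agent_loop(task: str, model=fake_model, max_steps: int = 5):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = model(history)
        if "final" in action:
            return action["final"]
        result = TOOLS[action["tool"]](action["args"])
        history.append({"role": "tool", "content": result})
    return "step budget exhausted"
```

The whole pattern is the loop itself; swapping `fake_model` for a real model call and adding more entries to `TOOLS` is what turns this sketch into something like the agents the comments describe.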

9. We still sandbox, quarantine and restrict them though, because they can't really behave as agents, but they're effective in limited contexts. Like the way waymo cars kind of drive on a track I guess? Still very useful, but not the agents that were being sold, really.

Edit: should we call them "special agents"? ;-)

10. I really don’t agree with the author here. Perplexity has, for me, largely replaced Cal Newport’s job (read other journalists work and synthesize celebrity and pundit takes on topic X). I think the take that Claude isn’t literally a human so agents failed is silly and a sign of motivated reasoning. Business processes are going to lag the cutting edge by years in any conditions and by generations if there is no market pressure. But Codex isn’t capable of doing a substantial portion of what I would have had to pay a freelancer/consultant to do? Any LLM can’t replace a writer for a content mill? Nonsense. Newport needs to open his eyes and think harder about how a journalist can deliver value in the emerging market.

11. But it isn't joining the workforce. Your perspective is that it could, but the point that it hasn't is the one that's salient. Codex might be able to do a substantial portion of what a freelancer can do, but even you fell short of saying it can replace the freelancer. As long as every AI agent needs its hand held, the effect on the labor force is an increase in costs and an increase in outputs where quality doesn't matter. It's not a reduction of the labor force.

12. This article seems based in a poorly defined statement. What does "joining the workforce" actually mean?

There are plenty of jobs that have already been pretty much replaced by AI: certain forms of journalism, low-end photoshop work, logo generation, copywriting. What does the OP need to see in order to believe that AI has "joined the workforce"?

13. It was from Altman's blog:

> We are now confident we know how to build AGI as we have traditionally understood it. We believe that, in 2025, we may see the first AI agents “join the workforce” and materially change the output of companies...

"materially change the output of companies" seems fairly defined and didn't happen in most cases. I guess some kicked out more slop but I don't think that's what he meant.

14. "We are now confident we know how to build AGI as we have traditionally understood it. We believe that, in 2025, we may see the first AI agents 'join the workforce' and materially change the output of companies."

We know how to build it and it will be entering the workforce in 2025. Well, we're in 2026 now and we don't have it in the workforce or anywhere else because they haven't built it because they don't really know how to build it because they're hucksters selling vaporware built on dead end technologies they cannot admit to.

15. Cal Newport looked in the wrong places. He has no visibility into the usage of ChatGPT to do homework. The collapse of Chegg should tell you, with no other public information, that if 30% of students were already cheating somehow, somewhat weakly, they are now doing super-powerful cheating, and surely more than 30% of students at this stage.

It's also kind of stupid to hand-wave away programming. Programmers are where all the early adopters of software are. He's merely conflating an adoption curve with capabilities. Programmers, I'm sure, were also the first to use Google and smartphones. "It doesn't work for me" is missing the critical word "yet" at the end, and really, is it saying much that forecasts are hard to make when the metric is "years until Cal Newport's arbitrary criteria for what agents and adoption mean meet some threshold that exists only inside Cal Newport's head"?

There are 700m active weeklies for ChatGPT. It has joined the workforce! It just isn’t being paid the salaries.

16. Wow, homework is an insane example of a "workforce."

Homework is in some ways the opposite of actual economic labor. Students pay to attend school, and homework is (theoretically) part of that education; something designed to help students learn more effectively. They are most certainly not paid for it.

Having a LLM do that "work" is economically insane. The desired learning does not happen, and the labor of grading and giving feedback is entirely wasted.

Students use ChatGPT for it because of perverse incentives of the educational system. It has no bearing on economic production of value.

17. read it again. he criticizes the hype built around 2025 as the Year X for agents. many were thinking that "we'll carry PCs in our pockets" when Windows Mobile-powered devices came out. many predicted 2003 as the Year X for what we now call smartphones.

no, it was 2008, with the iPhone launch.

18. It pretty much did join the workforce. Listen to the Fed chair, listen to the related analysis: the unexpected overperformance of GDP isn't directly attributed to AI, but it is very much in the "how did that happen?" conversation. And there's plenty of softer, more anecdotal evidence in addition to that to respond to the headline with "It did." The fact that it has been gradual and subtle as the very first agent tools reach production readiness, gain awareness in the public, and start being used? That really doesn't seem at all unexpected as the path that "joining" would follow.

19. I get the point you are making, but the hypothetical question from your manager doesn't make sense to me.

It's obviously true that any particular coworker of yours would be less useful to you than an AI agent, since their goal is to perform their own obligations to the rest of the company, whereas the singular goal of the AI tool is to help you, the user.

Until these AI tools can completely replace a developer on its own, the decision to continue employing human developers or paying for AI tools will not be mutually exclusive.
</comments_about_topic>

Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.

topic

Definition of Agentic Success # Disagreement over whether AI "joined the workforce." Some argue failing to replace humans entirely (the "secretary" model) is a failure of 2025 predictions, while others claim deep integration as a tool (automating loops, drafting emails) constitutes a successful, albeit different, type of joining.

commentCount

19
