The following is content for you to summarize. Do not respond to the comments—summarize them. <topic> Integration into Legacy Systems # The challenge of applying AI to real-world, messy environments versus greenfield demos. Discussion includes the difficulty of getting agents to work with proprietary codebases, expensive dependencies, lack of documentation for obscure vendor tools, and the failure of browser agents on standard web forms. </topic> <comments_about_topic>
1. What constitutes real "thinking" or "reasoning" is beside the point. What matters is the results we're getting. And the challenge is rethinking how we do work, connecting all the data sources so agents can run and perform work across the various sources we work over. That will take ages. Not to mention having the controls in place to make sure the "thinking" was correct in the end.
2. > connecting all the data sources for agents to run
Copilot can't jump to definition in Visual Studio. Anthropic got a lot of mileage out of teaching Claude to grep, but LLM agents are a complete dead end for my code-base until they can use the semantic search tools that actually work on our code-base and hook into the docs for our expensive proprietary dependencies.
3. Any topic with little coverage in the training data. LLMs will keep circling around the small bits in the training data, unable to synthesize new connections. This is very obvious when trying to use LLMs to modify scripts in vendor-specific languages that have not been widely documented and don't have many examples available. A seasoned programmer will easily recognize common patterns like if-else blocks and loops, but LLMs will get stuck and output gibberish.
4. In the gap between reality and executive speak around LLMs, I'm wondering about motives.
Getting executives, junior devs, HR, and middle management hooked on an advice and document-template machine owned and operated by your corporation would seemingly have a huge upside for an entity like Microsoft. Their infatuation might be more about how profitable such arrangements would be than about any meaningful productivity improvement for developers. Like, in the ways that BizTalk, Dynamics, and SharePoint attempt to capture business processes onto a pay-for-play MS stack, and all benefit when pitched to non-technical customers, Copilot provides an ever-evolving, sycophantic, exec-centred channel to push and entangle it all as MS sees fit. Having every part of your business divulge, in real time through saveable chats, your strategy, tooling, and processes to MS servers and Azure services is itself a pretty stunning arrangement. Imagine those same services directly selling busy customers entangling integrations, or trendy Azure services, through freewheeling MCP-like glue, all inline in that customer's own business processes? It sounds like tech-exec nirvana: automated, self-directed sales. They don't need job-deleting sentience to make the share price go up and rationalize this LLM push. They are far more aware of the limitations than we…
5. I doubt this simply because of the inertia of medicine. The industry still does not have a standardized method for handling automated claims the way banking does. It gets worse for services that require prior authorization; they settle those over the phone! This might sound like irrelevant ranting, but my point is that they haven't even addressed the low-hanging fruit, let alone complex ailments like cancer.
6. I recall someone comparing stories of LLMs doing something useful to "I have a Canadian girlfriend" stories. Not trying to discredit anyone or be a pessimist, but can anyone elaborate on how exactly they use these agents while working on interdependent projects in multi-team settings in, e.g., regulated industries?
7.
Thanks for that. It's a really interesting data point. My takeaway, which I'd already felt and which I suspect anyone dealing with insurance would share, is that the industry is wildly outdated. Which I guess offers a lot of low-hanging fruit where AI could be useful. Other than the email drafting, it really seems like all of that should have been handled by normal software decades ago.
8. That sounds a lot like "LLMs are finally powerful enough technology to overcome our paper/PDF-based business". Solving problems that frankly had no business existing in 2020.
9. Seriously, I'm lucky if 10% of what I do in a week is writing code. I'm doubly lucky if, when I do, it doesn't involve touching awful corporate horse-shit like low-code products that are allergic to LLM aid, plus multiple git repos, plus needing knowledge from a bunch of "cloud" dashboard and SaaS product configs. By the time I prompt in all that external crap, I could have just written what I wanted to write. Writing code is already the easy and fast part.
10. If you think about the real world and the key bottleneck in most creative work projects (this includes software), it's usually context, in the broadest sense of the word. Humans are good at this because they are truly multi-modal and can interact through many different channels to gather additional context for the task at hand. Given incomplete requirements or specs, they can talk to co-workers, look up old documents from a previous release, send a Slack or Teams message, set up a Zoom meeting with stakeholders, call customers, research competitors, buy a competitor's product and try it out while taking notes on where it falls short, or make a physical site visit to see the context in which the software is to be used and the environmental considerations for operation.
Point is that humans doing work have all sorts of ways to gather and compile more context before or while acting that an LLM does not, and in some cases cannot, have without the assistance of a human. In the real world this process can unfurl over days or weeks, or in response to new inputs, and our expectations of how LLMs work don't align with this. LLMs can sort of do this, but more often than not the failure is that we are still very bad at providing proper and sufficient context to the LLM, and LLMs are not very good at requesting more context or reacting to new context, changing plans, changing direction, etc. We also have different expectations of LLMs: we don't expect the LLM to ask "Can you provide a layout and photo of where the machine will be set up and the typical operating conditions?" and then wait a few days for us to gather that context for it before continuing.
11. > In one example I cite in my article, ChatGPT Agent spends fourteen minutes futilely trying to select a value from a drop-down menu on a real estate website
Man, dude, don't automate the toil; add an API to the website. It's supposed to have one!
12. It probably has one that the web form is already using, but if agentic AI requires specialized APIs, it's going to be a while before reality meets the hype. </comments_about_topic>
Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.