Summarizer

Integration into Legacy Systems

The challenge of applying AI to real-world, messy environments versus greenfield demos. Discussion includes the difficulty of getting agents to work with proprietary codebases, expensive dependencies, lack of documentation for obscure vendor tools, and the failure of browser agents on standard web forms.


While AI thrives in sterile demos, its integration into the real world is currently hamstrung by the "messy middle" of proprietary codebases, undocumented vendor languages, and fragmented data sources that lack clean APIs. A significant hurdle is the "context gap," where LLMs struggle to emulate the human ability to proactively gather multi-modal information across social and physical channels to solve complex, interdependent problems. Beyond technical failures—such as agents fumbling with basic web forms—there is a growing skepticism that the current AI push serves primarily to entangle corporate processes into proprietary tech stacks rather than delivering meaningful productivity gains. Ultimately, the promise of agentic AI remains at odds with the inertia of outdated industries that still rely on manual toil and "paper-based" workflows that should have been digitized decades ago.

12 comments tagged with this topic

View on HN · Topics
What constitutes real "thinking" or "reasoning" is beside the point. What matters is the results we get. And the challenge is rethinking how we do work: connecting all the data sources so that agents can run and perform work over the same sources we work across. That will take ages. Not to mention having the controls in place to verify that the "thinking" was correct in the end.
View on HN · Topics
> connecting all the data sources for agents to run

Copilot can't jump to definition in Visual Studio. Anthropic got a lot of mileage out of teaching Claude to grep, but LLM agents are a complete dead end for my code-base until they can use the semantic search tools that actually work on it and hook into the docs for our expensive proprietary dependencies.
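One hedged illustration of the gap this comment describes: instead of grepping, an agent could query a symbol index. The sketch below parses a ctags-format index into a lookup table (the sample tags content and file paths are made up; real setups would generate the index with a tool like Universal Ctags):

```python
# Sketch: approximating "jump to definition" for an agent without an IDE,
# by parsing a ctags-format index into a symbol -> definition-sites table.
# The sample data below is hypothetical.

def parse_ctags(tags_text: str) -> dict[str, list[tuple[str, str]]]:
    """Map symbol name -> list of (file, address) definition sites."""
    index: dict[str, list[tuple[str, str]]] = {}
    for line in tags_text.splitlines():
        if not line or line.startswith("!_TAG_"):  # skip tags-file headers
            continue
        name, path, address = line.split("\t", 2)
        index.setdefault(name, []).append((path, address))
    return index

sample = "parse_config\tsrc/config.c\t42\nparse_config\tinclude/config.h\t7"
index = parse_ctags(sample)
print(index["parse_config"])  # both definition sites, no grep required
```

A real integration would expose this lookup as a tool the agent can call, rather than having it scan files line by line.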
View on HN · Topics
Any topic with little coverage in the training data. LLMs will keep circling around the small bits that are in the training data, unable to synthesize new connections. This is very obvious when trying to use LLMs to modify scripts in vendor-specific languages that are not widely documented and have few examples available. A seasoned programmer will easily recognize common patterns like if-else blocks and loops, but LLMs get stuck and output gibberish.
View on HN · Topics
In the gap between reality and executive-speak around LLMs, I'm wondering about motives. Getting executives, junior devs, HR, and middle management hooked on an advice and document-template machine owned and operated by your corporation would seemingly have a huge upside for an entity like Microsoft. Their infatuation might be more about how profitable such arrangements would be than about any meaningful productivity improvement for developers.

Like, in the ways that BizTalk, Dynamics, and SharePoint attempt to capture business processes onto a pay-for-play MS stack, and all benefit when pitched to non-technical customers, Copilot provides an ever-evolving, sycophantic, exec-centred channel to push and entangle it all as MS sees fit. Having every part of your business divulge its strategy, tooling, and processes in real time through saveable chats to MS servers and Azure services is itself a pretty stunning arrangement. Imagine those same services directly selling busy customers entangling integrations, or trendy Azure services, through freewheeling MCP-like glue, all inline in that customer's own business processes. It sounds like tech-exec nirvana: automated, self-directed sales. They don't need job-deleting sentience to make the share price go up and rationalize this LLM push. They are far more aware of the limitations than we…
View on HN · Topics
I doubt this simply because of the inertia of medicine. The industry still does not have a standardized method for handling automated claims the way banking does. It gets worse for services that require prior authorization; they settle those over the phone! This might sound like irrelevant ranting, but my point is that they haven't even addressed the low-hanging fruit, let alone complex ailments like cancer.
View on HN · Topics
I recall someone comparing stories of LLMs doing something useful to "I have a Canadian girlfriend" stories. Not trying to discredit anyone or be a pessimist, but can anyone elaborate on how exactly they use these agents while working on interdependent projects in multi-team settings, e.g. in regulated industries?
View on HN · Topics
Thanks for that. It's a really interesting data point. My takeaway, which I'd already felt and which I suspect anyone dealing with insurance would share, is that the industry is wildly outdated. Which, I guess, offers a lot of low-hanging fruit where AI could be useful. Other than the email drafting, it really seems like all of that should have been handled by just normal software decades ago.
View on HN · Topics
That sounds a lot like "LLMs are finally powerful enough technology to overcome our paper/PDF-based business". Solving problems that frankly had no business existing in 2020.
View on HN · Topics
Seriously, I’m lucky if 10% of what I do in a week is writing code. I’m doubly lucky if, when I do, it doesn’t involve touching awful corporate horse-shit like low-code products that are allergic to LLM aid, plus multiple git repos, plus knowledge scattered across a bunch of “cloud” dashboards and SaaS product configs. By the time I’ve prompted all that external crap in, I could have just written what I wanted to write. Writing code is already the easy and fast part.
View on HN · Topics
If you think about the real world and the key bottleneck in most creative work projects (this includes software), it's usually context, in the broadest sense of the word. Humans are good at this because they are truly multi-modal and can interact through many different channels to gather additional context for the task at hand. Given incomplete requirements or specs, they can talk to co-workers, look up old documents from a previous release, send a Slack or Teams message, set up a Zoom meeting with stakeholders, call customers, research competitors, buy a competitor's product and try it out while taking notes on where it falls short, or make a physical site visit to see the context in which the software is to be used and the environmental considerations for operation.

The point is that humans doing work have all sorts of ways to gather and compile more context, before acting or while acting, that an LLM does not and in some cases cannot have without the assistance of a human. In the real world this process can unfurl over days or weeks, or in response to new inputs, and our expectations of how LLMs work don't align with this. LLMs can sort of do this, but more often than not the failure is that we are still very bad at providing proper and sufficient context, and LLMs are not very good at requesting more context or reacting to new context, changing plans, changing direction, etc. We also have different expectations of LLMs: we don't expect an LLM to ask "Can you provide a layout and photo of where the machine will be set up and the typical operating conditions?" and then wait a few days for us to gather that context before continuing.
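The "ask and wait" pattern this comment describes can be sketched minimally: an agent step that checks whether required context is present and pauses with a question instead of guessing. Everything here (the task, the `required_context` keys, the return strings) is hypothetical; real systems would persist state and resume days later:

```python
# Minimal sketch of a "request more context" loop: refuse to proceed on
# incomplete inputs, and surface a concrete question instead.
# All names and fields below are illustrative assumptions.

def run_task(task: dict, provided: dict) -> str:
    required_context = ["site_photo", "operating_conditions"]
    missing = [k for k in required_context if k not in provided]
    if missing:
        # A production agent would checkpoint here and resume on reply.
        return f"PAUSED: please provide {', '.join(missing)}"
    return f"Proceeding with {task['name']} using {len(provided)} inputs"

print(run_task({"name": "machine layout"}, {}))
print(run_task({"name": "machine layout"},
               {"site_photo": "...", "operating_conditions": "dusty, 40C"}))
```

The hard part, as the comment notes, is not this control flow but everything around it: knowing what to ask for, and tolerating an answer that arrives days later.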
View on HN · Topics
> In one example I cite in my article, ChatGPT Agent spends fourteen minutes futilely trying to select a value from a drop-down menu on a real estate website

Man, don't automate the toil; add an API to the website. It's supposed to have one!
View on HN · Topics
It probably has one that the web form is already using, but if agentic AI requires specialized APIs, it's going to be a while before reality meets the hype.
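To make the point concrete: the dropdown the agent fought for fourteen minutes is typically populated from a JSON endpoint the page itself calls, which an agent could consume directly. The payload shape below is entirely hypothetical:

```python
# Sketch: reading the data behind a web form's dropdown from its JSON
# payload instead of driving the <select> widget in a browser.
# The payload structure here is a made-up example.
import json

payload = '[{"id": 1, "name": "Downtown"}, {"id": 2, "name": "Suburbs"}]'

def region_names(raw: str) -> list[str]:
    """Extract the option labels an agent would otherwise click through."""
    return [r["name"] for r in json.loads(raw)]

print(region_names(payload))  # ['Downtown', 'Suburbs']
```

Which is exactly the catch: this only works once someone has found or documented that endpoint, so "agents need specialized APIs" and "the API is already there" can both be true.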