Introduction
Everyone seems to be talking about automation, but no one seems to agree on what it actually means. For some, it’s code that replaces human labor. For others, it’s software that makes processes easier, faster, or cheaper. Somewhere along the way, “automation” became a catchall for anything slightly less manual than before. But automation without understanding is just a mess of buttons you hope does the right thing when you push it.
The truth is, there isn’t a tidy consulting framework for this kind of automation. There’s no grid or quadrant that tells you how to build it (sorry, consultants, the old playbook won’t work now), because it doesn’t come from theory. It comes from the actual work people do. And if you don’t understand the work, you’re not going to build anything that sticks.
In this piece, I dig into what automation really means when we start talking about agents. Not just scripts or workflows, but agentic AI systems that can operate with a sense of context and purpose. That requires more than a well-engineered prompt. It takes causal reasoning. It takes semantics. It takes a little humility about what software can and can't do.
Let’s talk about what’s actually powering this next era of automation, and why getting it right matters.
Agents as far as the eye can see…
Somewhere along the line, we decided the best way to get artificial intelligence to do useful things was to build agents. Not just static models that answer questions or write your marketing emails, but systems that act. They move through digital environments, make decisions, and ideally, they accomplish tasks that feel like work. That is the promise of agentic AI.
But here’s the problem. Most of the agents we’ve built so far aren’t actually thinking. They’re pattern matchers on a leash. They string together actions based on large language models that don’t really understand what’s happening. And when they fail, which they do often, it’s not because the task was too complex. It’s because the agent couldn’t reason its way through what needed to happen.
If you want an AI to be useful beyond a demo, it has to reason about cause and effect. It also has to understand the meaning of the things it’s dealing with…this is where causal and semantic reasoning come in.
Causal Reasoning: The “Why” That Makes Everything Work
Causal reasoning is not just a math problem. It’s the ability to say, “If I do X, then Y will happen,” and know that this isn’t just a correlation. It’s cause. It’s the backbone of every choice you’ve ever made that wasn’t a coin flip.
For agentic AI, this kind of thinking is non-negotiable. Let’s say you’re building an AI that manages supply chain logistics. If it doesn’t understand that a storm in Texas affects delivery times in Atlanta, it’s not an agent. It’s a bad spreadsheet with a personality.
Even in simpler tasks, like booking meetings or troubleshooting bugs, agents need to simulate possible outcomes, predict bottlenecks, and act accordingly. That requires more than a language model predicting the next best token. It requires a mental model of how the world works.
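To make that concrete, here is a minimal sketch of what a “mental model of how the world works” can look like at its crudest: an explicit causal graph the agent can walk before it acts. The event names are invented to match the supply chain example above; a real agent would be handed or would learn this structure rather than having it hard-coded.

```python
# A toy causal model: each edge reads "if the cause happens, the effect follows."
CAUSAL_GRAPH = {
    "storm_in_texas": ["trucks_delayed"],
    "trucks_delayed": ["late_delivery_atlanta"],
    "late_delivery_atlanta": ["customer_escalation"],
}

def downstream_effects(event, graph):
    """Return everything the event can cause, directly or indirectly."""
    effects, frontier = set(), [event]
    while frontier:
        current = frontier.pop()
        for effect in graph.get(current, []):
            if effect not in effects:
                effects.add(effect)
                frontier.append(effect)
    return effects

# "If X happens, Y will follow." Cause, not correlation.
print(downstream_effects("storm_in_texas", CAUSAL_GRAPH))
# {'trucks_delayed', 'late_delivery_atlanta', 'customer_escalation'} (set order may vary)
```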
I built an agent once to manage my calendar. Gave it a few prompts, told it to prep my schedule, and didn’t double-check its work. That’s how I ended up sitting in a Zoom room 30 minutes early, wondering if I was ghosted or just really ambitious. Turns out it confused time zones and stripped out the reminder.
Whoops.
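For what it’s worth, that particular failure is mundane to guard against. A minimal sketch, assuming Python’s standard zoneinfo module and an invented meeting, of the conversion my agent skipped:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Hypothetical meeting pulled from a calendar: the organizer is in New York, I'm in
# Los Angeles. Naive datetimes with no time zone are how you end up in an empty Zoom room.
meeting_start = datetime(2024, 5, 14, 14, 0, tzinfo=ZoneInfo("America/New_York"))

# Convert to the local zone before setting reminders or blocking the calendar.
local_start = meeting_start.astimezone(ZoneInfo("America/Los_Angeles"))
print(local_start.isoformat())  # 2024-05-14T11:00:00-07:00
```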
Semantic Reasoning: Understanding Isn’t Optional
On the other side, semantic reasoning is about understanding meaning. Not just definitions, but context. What does a user mean when they say, “Push this to production”? What’s the difference between a user, a customer, and a stakeholder in a given conversation?
Most current AI agents hallucinate or misinterpret because they don’t actually understand the concepts they're referencing. They parrot patterns. Semantic reasoning brings in the ability to link actions to the underlying meaning, not just words. Without it, agents are just guessing.
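What does “not guessing” look like in practice? Here is a toy sketch: a hand-written semantic layer that maps phrases to actions plus the context each action requires, and asks for what’s missing instead of bluffing. The vocabulary, actions, and required fields are all invented for illustration.

```python
# A tiny, hand-built semantic layer: phrases map to actions, and actions only fire
# when the context they depend on is actually present.
CONCEPTS = {
    "push this to production": {"action": "deploy", "requires": ["artifact", "environment", "approval"]},
    "push this to the repo": {"action": "git_push", "requires": ["branch"]},
}

def resolve(utterance, context):
    """Map an utterance to an action, or ask for what's missing instead of guessing."""
    concept = CONCEPTS.get(utterance)
    if concept is None:
        return ("clarify", f"I don't know what {utterance!r} means here.")
    missing = [field for field in concept["requires"] if field not in context]
    if missing:
        return ("clarify", f"Before I {concept['action']}, I need: {', '.join(missing)}")
    return ("act", concept["action"])

print(resolve("push this to production", {"artifact": "build-142", "environment": "prod"}))
# ('clarify', 'Before I deploy, I need: approval')
```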
Why It Matters for Automation
When companies talk about AI automation, they often focus on productivity. Cheaper, faster, scalable. But agents that lack causal and semantic reasoning aren’t productive. They are chaotic. They break workflows, generate errors at scale, and require constant human babysitting.
The entire point of automation is to remove friction. But when the agent doesn’t understand what it’s doing or why it’s doing it, you end up creating a new layer of complexity. It’s like hiring an intern who is really enthusiastic but can’t tell the difference between a backup file and a production database.
Causal and semantic reasoning are what allow humans to operate with intention. Without them, automation is just a series of disconnected tasks that look good in a product demo but collapse the moment reality changes.
Thinking in Trees, Not Just Chains
We’ve made progress with prompting techniques like “chain-of-thought,” where models are asked to show their work, reasoning step-by-step. But real agents need more than a breadcrumb trail. They need to branch out.
This is where “tree-of-thought” reasoning comes in. Instead of marching forward linearly, agents can explore multiple possible paths, discard bad ideas, and recover when things get weird (an ecosystem of sorts with solid feedback mechanisms). If causal reasoning is the “why,” tree-of-thought is the “how” of intelligent action. Most LLMs don’t do this well on their own, and it shows.
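Stripped of the model itself, the control loop is simple enough to sketch. This is a hedged, minimal version in which expand, score, and is_goal stand in for whatever the agent (in practice, usually an LLM call plus an evaluator) would supply:

```python
import heapq
from itertools import count

def tree_of_thought(start, expand, score, is_goal, beam_width=3, max_steps=50):
    """Branch, score, prune, repeat. This is only the control loop, not the intelligence."""
    tie = count()  # tie-breaker so candidate states never need to be comparable
    frontier = [(-score(start), next(tie), start)]
    for _ in range(max_steps):
        if not frontier:
            break  # every branch was a dead end
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            return state
        for child in expand(state):  # explore several continuations, not just one
            heapq.heappush(frontier, (-score(child), next(tie), child))
        # Keep only the most promising branches instead of marching down a single chain.
        frontier = heapq.nsmallest(beam_width, frontier)
        heapq.heapify(frontier)
    return None
```

Beam-style pruning is only one way to run the tree; the point is that the agent keeps alternatives alive and can back out of a bad branch instead of committing to the first chain it generates.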
Same goes for memory. Retrieval-Augmented Generation (RAG) is a useful shortcut. Give a model some context from a vector database, and it sounds smart. But this isn’t memory in any meaningful sense. It’s lookup. Real agents will need persistent semantic memory: something that understands relationships, updates over time, and reflects actual change in the environment. Right now, we’re mostly pasting facts into prompts and hoping it works.
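To sharpen the contrast: retrieval is a lookup, memory is something that changes when the world does. A crude sketch, with invented facts, of the minimum that “updates over time” implies:

```python
from datetime import datetime, timezone

class SemanticMemory:
    """A crude stand-in for persistent memory: relationships that get revised over time,
    rather than static chunks pasted into a prompt."""
    def __init__(self):
        self._facts = {}  # (subject, relation) -> (object, last_updated)

    def remember(self, subject, relation, obj):
        self._facts[(subject, relation)] = (obj, datetime.now(timezone.utc))

    def recall(self, subject, relation):
        entry = self._facts.get((subject, relation))
        return entry[0] if entry else None

memory = SemanticMemory()
memory.remember("vendor_acme", "ships_from", "Texas")
memory.remember("vendor_acme", "ships_from", "Ohio")  # the world changed; the memory follows
print(memory.recall("vendor_acme", "ships_from"))      # Ohio
```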
And then there’s counterfactual reasoning. Ask most current models to imagine what would happen if a key detail were different, and you get something that sounds plausible but often misses the point. Few-shot learning doesn’t fix this. Neither does fine-tuning. Without a grounded model of cause and consequence, even the most polished agent can still be confidently wrong.
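Grounded counterfactuals need a model you can intervene on, not just prompt. Reusing the toy causal idea from earlier, “what if a key detail were different” becomes “rerun the mechanism with one variable forced to another value.” The variables and equations below are invented:

```python
# A toy structural model: each variable is computed from its parents, and an
# intervention overrides a variable regardless of its usual cause.
def simulate(storm=False, intervene=None):
    world = {"storm": storm}
    world["trucks_delayed"] = world["storm"]
    if intervene:
        world.update(intervene)
    world["late_delivery"] = world["trucks_delayed"]
    return world

actual = simulate(storm=True)
counterfactual = simulate(storm=True, intervene={"trucks_delayed": False})
print(actual["late_delivery"], counterfactual["late_delivery"])  # True False
```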
When Agents Actually Work
There are places where agentic systems already provide real ROI. Customer support bots that follow strict flows. HR assistants that retrieve policy docs. Ticket triaging systems that shave hours off help desk queues. These aren’t AGI, and they aren’t really all that magical. They’re rule-followers with just enough language fluency to be helpful, and that’s okay.
When the domain is narrow, the stakes are low, and the workflow is repeatable, agents don’t need to be brilliant. They just need to be predictable. These are the success stories companies trot out, and they’re worth acknowledging. But that’s not the frontier. That’s the warm-up act.
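That predictability shows up in the code itself. A minimal sketch of a ticket triager in that spirit, with invented routing rules and a deliberate escalation path for anything it does not recognize:

```python
# A narrow, predictable "agent": fixed routing rules, language fluency only at the edges.
ROUTES = {
    "password": "it-helpdesk",
    "invoice": "billing",
    "refund": "billing",
    "crash": "engineering",
}

def triage(ticket_text):
    text = ticket_text.lower()
    for keyword, queue in ROUTES.items():
        if keyword in text:
            return queue
    return "human-review"  # when in doubt, escalate rather than guess

print(triage("App crash on login after the last update"))  # engineering
print(triage("Something feels off with my account"))       # human-review
```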
Borders, Bots, and Bureaucracy
There’s another wrinkle in this whole automation story. Global policy.
As companies race to deploy AI, they’re running headfirst into a thicket of data localization laws, GDPR-style privacy regulations, and international labor protections. You can’t just spin up an AI agent in the Philippines to process European health data. And you can’t offshore everything if regulators start asking where your automated decisions come from.
Cross-border automation introduces questions that go beyond code. Whose data is it? Who audits the decisions? What happens when an AI system built in San Francisco misclassifies a job applicant in Nairobi?
We’re entering a world where automation doesn’t just replace labor. It competes with policy, identity, and sovereignty. And no agent, no matter how well prompted, is going to navigate that without help.
The Path Forward
If we want agentic AI to work, we have to build in these reasoning capabilities. This doesn’t mean we need to make AI sentient. It means we need it to have better models of action and context. Some teams are working on this using knowledge graphs, program synthesis, and hybrid systems that combine language models with symbolic logic.
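One common shape for those hybrid systems: the language model proposes, something symbolic disposes. A hedged sketch, with the model call stubbed out and the rules invented, of an agent whose actions have to clear hard preconditions before anything runs:

```python
# Symbolic guardrails: hard preconditions an action must satisfy, no matter how
# confident the model sounds.
RULES = {
    "deploy": lambda ctx: ctx.get("tests_passed") and ctx.get("approved_by"),
    "delete_data": lambda ctx: ctx.get("backup_verified"),
}

def call_llm(task):
    # Stub: a real system would ask a model to plan the next action for the task.
    return {"action": "deploy", "target": "checkout-service"}

def run(task, context):
    proposal = call_llm(task)
    check = RULES.get(proposal["action"])
    if check is None or not check(context):
        return f"blocked: {proposal['action']} fails a precondition"
    return f"executing: {proposal['action']} on {proposal['target']}"

print(run("ship the release", {"tests_passed": True}))
# blocked: deploy fails a precondition
```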
We don’t need AI to be human. We need it to be smart enough to know when it doesn’t know something, and capable enough to figure it out when it matters.
The companies that get this right won’t just build better agents. They’ll build better systems for work. Because the future of automation isn’t about replacing people. It’s about building tools that actually understand the jobs they are doing.
Until then, we’ll keep pretending our agents are autonomous, while quietly staffing up the humans behind the scenes who make them look like they are.
A Note of Caution: Agents Are Just Tech Debt in a Hoodie
For all the excitement around agents, let’s not forget what they really are: software. And like all software, they get weird over time. They break. They depend on fragile chains of APIs and scripts. They come with edge cases, monitoring needs, error handling, and logs no one checks until something goes sideways. It’s not magic, it’s code. And the more “autonomous” your system becomes, the harder it is to untangle when it doesn’t work.
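If you are going to run one anyway, treat it like the software it is. Here is a minimal sketch of the unglamorous wrapper every agent step ends up needing: logs someone can actually read, a bounded retry, and a failure that escalates instead of disappearing. The step function itself is a placeholder.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def run_step(step_name, fn, *args, retries=2):
    """Run one agent step with the boring parts attached: logging, retries,
    and a loud failure. `fn` is whatever the step actually does."""
    for attempt in range(1, retries + 2):
        try:
            started = time.monotonic()
            result = fn(*args)
            log.info("%s succeeded in %.2fs (attempt %d)",
                     step_name, time.monotonic() - started, attempt)
            return result
        except Exception:
            log.exception("%s failed (attempt %d)", step_name, attempt)
    raise RuntimeError(f"{step_name} exhausted retries; escalate to a human")
```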
The worst part? Every agent that gets spun up is a new little island of logic that lives somewhere between your workflows and your infrastructure. Multiply that across teams, and suddenly you’ve got a jungle of agents with no clear ownership, limited documentation, and zero testing. Which means your AI “assistant” is just future tech debt wearing a clever disguise. Unless you are creating agent organizations…(hint: foreshadow).
Before you get too excited about giving every team their own agent, ask yourself: who’s going to maintain it? Who’s going to troubleshoot when it inevitably collides with a calendar sync, breaks a permissions model, or forgets to ask for approval? If the answer is “whoever built it,” you’ve just created the most expensive kind of internal tooling…the kind that no one admits exists until it costs you a deal or a client. So they might feel magical, but they break like everything else that runs on code.
Autonomy sounds great until it leads to an autonomous mess. Build thoughtfully, or prepare to spend your weekends unraveling the bots you once celebrated.
Oh, and for those who are excited to replace people with agents…I am so sorry to break this to you, but the engineer needed to maintain that agent is probably more expensive than Sharon in *nameless org*. Maybe we should be talking about how to train Sharon to maintain the agent, but more on that in another post…