When Small Errors Become Big Problems in AI
- aiomniconversation
- Mar 20
- 2 min read
One of the most striking insights from my recent conversation with Janusz Marecki was deceptively simple. AI systems do not just make mistakes. They accumulate them.

This is not a surface-level flaw. It sits at the core of how large language models operate. Each word, or token, is generated based on probability rather than certainty. That means every step introduces a small chance of error. On its own, that error is negligible. Over hundreds of tokens, it compounds.
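A back-of-the-envelope calculation makes the compounding visible. The per-token error rate below is an illustrative assumption, not a measured figure; the point is only how quickly even a tiny rate erodes over a long output:

```python
# Probability that an n-token generation contains no error,
# assuming an independent per-token error rate p (illustrative only).
def p_error_free(p: float, n: int) -> float:
    return (1 - p) ** n

# With a 0.1% per-token error rate, reliability falls off sharply:
for n in (10, 100, 500, 1000):
    print(n, round(p_error_free(0.001, n), 2))
# 10   -> 0.99
# 100  -> 0.90
# 500  -> 0.61
# 1000 -> 0.37
```

Real token errors are not independent and not uniform, so the numbers themselves are a sketch. But the shape of the curve is the insight: negligible per step, decisive over length.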
The result is AI systems that can begin with coherence and drift into inaccuracy without any clear signal that something has gone wrong.
This behaviour is well documented. The GPT-4 Technical Report notes that even advanced models ‘can produce content that is factually incorrect or nonsensical’ despite appearing confident (OpenAI, 2023). AI systems have been described as stochastic parrots (Bender et al., 2021), generating plausible language without grounding in truth. The implication is not just occasional hallucination. It is structural unreliability over longer outputs.
This accumulation effect helps explain a pattern many organisations are now encountering. AI performs well in bounded tasks such as summarisation or drafting. As soon as tasks require extended reasoning, synthesis, or multi-step outputs, performance begins to degrade.
Attempts to mitigate this, such as breaking tasks into smaller components through agentic AI workflows, are already emerging in practice. These approaches reduce error propagation by limiting how far any single model runs unchecked. They improve outcomes, but they do not eliminate the underlying issue.
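A toy model shows why chunking helps without solving the problem. Both rates below are assumptions chosen for illustration: a per-token success probability and the fraction of errors an intermediate check actually catches.

```python
# Toy model (illustrative assumptions, not measurements):
# a verification step between sub-tasks catches some fraction of
# errors, so shorter unchecked runs improve end-to-end reliability.
P_TOKEN_OK = 0.999   # assumed per-token success probability
CATCH_RATE = 0.9     # assumed fraction of errors a check catches

def p_unchecked(tokens: int) -> float:
    # One long run with no intermediate checks.
    return P_TOKEN_OK ** tokens

def p_checked(tokens: int, steps: int) -> float:
    # Split into equal sub-tasks, each followed by a check.
    per_step = P_TOKEN_OK ** (tokens // steps)
    step_ok = per_step + (1 - per_step) * CATCH_RATE
    return step_ok ** steps

print(round(p_unchecked(1000), 2))      # one 1000-token run  -> 0.37
print(round(p_checked(1000, 10), 2))    # ten checked chunks  -> 0.91
```

Even in this optimistic sketch, ten checked sub-tasks land at roughly 91%, not 100%: checks limit propagation, but uncaught errors still compound.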
For organisations, the consequence is that reliability is not a binary condition. It is a function of how long and how far a system is allowed to operate without constraint.
The challenge is not simply that AI makes mistakes. It is that those mistakes compound quietly, often beneath a layer of confidence that masks their presence.
That should reshape how we think about deploying these systems, particularly in environments where accuracy is non-negotiable.
