The Data You Trust Has a Past You Need to Examine

aiomniconversation
Apr 30
4 min read

The systems we build carry the history of the data we feed into them. That is not a design flaw waiting to be patched. It is a structural reality that every leader deploying AI in their organisation needs to understand. Facial recognition that fails to identify Black women accurately, a clinical tool derived from measurements of 2,000 white men, policing algorithms built on decades of racially skewed stop-and-search records: these are not isolated examples of technical imperfection. They are the consequence of building powerful systems on foundations that were never neutral to begin with.

I explored this in depth in my most recent conversation on The AI Adoption Podcast with Dawn Butler, Member of Parliament for Brent East and a member of the Speaker's AI Commission.

The History Embedded in Every Data Set

Dawn made a distinction that I think is worth sitting with. Most discussions of AI bias treat it as a statistical problem: skewed training data producing skewed outputs, amenable to correction through better sampling or algorithmic adjustment. Dawn's framing is more searching than that. She argues that every data set carries a history, and that if you trace the history of policing data, for instance, you find it is built on a foundation of structural racism.

In England and Wales, Black people are 8 to 12 times more likely to be stopped and searched than their white counterparts. Officers are statistically more likely to report smelling cannabis in a vehicle driven by a Black man, even when no cannabis is present. These are the records from which predictive policing tools learn. The system does not introduce the bias; it inherits it and then scales it. When facial recognition is deployed in predominantly Black crowds, Dawn noted, the technology becomes substantially less reliable. The error is not random. It falls on the same communities that have historically borne the burden of discriminatory policing.

The healthcare example she raised is equally instructive. Body mass index, used widely in clinical decision-making and increasingly embedded in digital health tools, was derived from measurements of roughly 2,000 white men by a mathematician who never intended it for medical application. It has been adapted and adjusted over time, but it remains a proxy built on a deeply unrepresentative foundation. The concern for leaders is not simply about race and fairness, though those concerns are serious enough. It is about the reliability of the tools on which consequential decisions rest.

Kate Jones https://open.spotify.com/episode/1LqAcRFxrA1uScZWOeNRdh?si=CfS_db5QQ6GWrTFVkYlbAQ

Regulation Is Not the Enemy of Progress

One of the most persistent arguments against AI regulation is that it constrains innovation and places domestic organisations at a disadvantage relative to less regulated competitors. Dawn dismissed this directly, and her reasoning is worth attending to.

She argued that a robust ethical and regulatory framework will, over time, become a mark of quality that organisations and citizens actively seek out. People will look for systems they can trust. Jurisdictions that establish clear standards will attract organisations that want to build on credible foundations. The analogy she offered is the seatbelt: the requirement to fit and wear seatbelts did not prevent the development of the motor car. It made the motor car safer and, ultimately, more widely trusted.

The counter-argument deserves acknowledgement. Regulatory frameworks that are poorly designed, too slow-moving, or jurisdictionally fragmented can impose costs without delivering protection. Dawn's response is not that all regulation is sufficient but that regulation with teeth, the kind that results in meaningful financial penalties for organisations that knowingly deploy biased or harmful systems, is a necessary condition for accountability. Without it, the incentive structure favours speed and market capture over care.

Practical Implications for Leaders

For the senior leaders who follow this podcast, the practical question is not abstract. It is: do you know the history of the data that powers your systems?

Most organisations cannot answer this fully, because the opacity of commercially supplied AI tools makes it genuinely difficult. But that opacity is itself a governance question. Demanding transparency from technology suppliers, asking what training data was used and what its provenance was, is no longer optional for organisations that take governance seriously. The regulatory environment will tighten. The reputational exposure from deploying systems that produce discriminatory outcomes is real and growing. Leaders who treat data provenance as a technical detail delegated to their technology teams are accepting a risk they may not have fully priced.

Dawn also raised a point about democratic accountability that goes beyond organisational governance. Organisations openly selling AI-powered election influence services represent a challenge to the foundations of democratic life. The Speaker's AI Commission is working to address this within Parliament, including protecting the content of parliamentary speeches from AI manipulation. But the business world is not separate from this. The tools, the data, and in some cases the commercial incentives behind AI-enabled disinformation run through private organisations. Leaders have a stake in this, and a responsibility.

Josh Tan https://open.spotify.com/episode/63tXIWUJBWH9luCVQrGcaR?si=NZR-sEVPRDWxlLPjXXZoHg

The Question Every Leader Should Be Asking

Dawn ended our conversation by asking what it means to be human in an age of artificial intelligence. It is a question that sounds philosophical but has very direct operational implications. If kindness, judgement, and genuine understanding cannot be replicated, then the organisations that invest in those distinctively human capabilities alongside their technology will be better placed than those that treat human input as a cost to be automated away.

The history inside your data is not someone else's problem. It is your accountability. Trace it.

Listen to the full conversation:

Spotify: open.spotify.com/show/296zibtjU3w4ANuUsPSu2D

Apple Podcasts: podcasts.apple.com/gb/podcast/the-ai-adoption-podcast/id1811897501
