If You Want AI to Work, Start by Redesigning Your Data
MUMBAI, IN / ACCESS Newswire / October 30, 2025 / Boards keep asking, "What's our AI strategy?" The better question is, "What's our data strategy for AI?" Most enterprises still capture information the way yesterday's BI tools wanted it - transactional rows tucked neatly into normalized tables. That is the wrong shape for modern AI. If you want reliable copilots, recommendations, and autonomous workflows, you must design the data with AI in mind. Data Science & AI leader Archana Acharya argues that machines learn best from stateful, event-driven signals and that companies should "shift information design left" so product and engineering define events before code ships. It's not a governance tax; it's an operating-model change.
Why the old data playbook fails AI
Large language models (LLMs) don't consume dashboards; they ingest tokens and embeddings - text chopped into units and mapped into dense vectors that capture semantic meaning. This is how a model "knows" that "cart abandoned" is close to "checkout dropped." If your enterprise signals are missing, late, or ambiguous, the model cannot reason about your business with any reliability.
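As a minimal illustration of that closeness, the sketch below embeds three phrases and compares them. It assumes the OpenAI Python client and the text-embedding-3-small model, but any embedding model would behave similarly.

    # Hedged sketch: embed two business phrases and measure how close they are.
    # Assumes the OpenAI Python client (pip install openai) and an API key in the environment.
    import numpy as np
    from openai import OpenAI

    client = OpenAI()
    phrases = ["cart abandoned", "checkout dropped", "invoice paid"]
    resp = client.embeddings.create(model="text-embedding-3-small", input=phrases)
    vectors = [np.array(d.embedding) for d in resp.data]

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # "cart abandoned" should score much closer to "checkout dropped" than to "invoice paid".
    print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))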
Relational schemas optimized for storage minimization and post-hoc reporting were great for BI. For AI, the bottleneck is no longer storage; it's compute, especially inference at scale. That flips the design goal from "store once, join later" to "capture events richly, compute efficiently." Analysts and operators increasingly report that ongoing inference (the model's day-to-day work) is the dominant cost driver once pilots become products. McKinsey's recent analysis, for example, highlights inference as a rising line item, particularly for reasoning-heavy models, while industry practitioners warn of runaway unit costs without deliberate design.
Design for events, not just tables
Event-driven data treats every meaningful change in your business as a first-class record: Session Started, Item Added to Shopping Cart, Loan Prequalified, Claim Submitted, Payment Failed. Instead of reconstructing state from a thicket of joins, AI systems read a clean, chronological story of what actually happened. This is the essence of event sourcing: persist changes as an append-only sequence and derive state on demand. It's simpler to reason about, easier to replay, and dramatically better for training and evaluating models.
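As a minimal illustration (event names and fields are hypothetical), the Python sketch below stores events in an append-only log and derives cart state by replaying it:

    # Minimal event-sourcing sketch: an append-only log plus a fold that derives state.
    # All event names and fields are hypothetical illustrations.
    from dataclasses import dataclass, field

    @dataclass(frozen=True)
    class Event:
        kind: str            # e.g. "ItemAddedToCart", "ItemRemovedFromCart"
        payload: dict = field(default_factory=dict)

    log: list[Event] = []    # the append-only event store

    def append(event: Event) -> None:
        log.append(event)    # events are immutable facts; never updated in place

    def cart_state(events: list[Event]) -> dict:
        """Derive current cart contents by replaying the chronological story."""
        cart: dict[str, int] = {}
        for e in events:
            if e.kind == "ItemAddedToCart":
                cart[e.payload["sku"]] = cart.get(e.payload["sku"], 0) + 1
            elif e.kind == "ItemRemovedFromCart":
                cart[e.payload["sku"]] = max(0, cart.get(e.payload["sku"], 0) - 1)
        return {sku: n for sku, n in cart.items() if n > 0}

    append(Event("ItemAddedToCart", {"sku": "A-1"}))
    append(Event("ItemAddedToCart", {"sku": "B-2"}))
    append(Event("ItemRemovedFromCart", {"sku": "B-2"}))
    print(cart_state(log))   # {'A-1': 1} - state reconstructed on demand, replayable any time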
Archana Acharya's guidance turns this into concrete design work: define meaningful events; standardize schemas and metadata (traceability, routing, lineage); assign unique identifiers; and build for idempotency and asynchronous flows so replays don't corrupt state. In short, make it easy for machines to consume your business as a sequence of unambiguous signals.
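In code, that guidance might look like the following sketch: a standardized envelope carrying identity, lineage, and versioning, plus a consumer that ignores replays. The field names are illustrative, not a prescribed schema.

    # Hedged sketch of a standardized event envelope and an idempotent consumer.
    import uuid
    from datetime import datetime, timezone

    def make_event(kind: str, payload: dict, source: str) -> dict:
        return {
            "event_id": str(uuid.uuid4()),            # unique identifier
            "kind": kind,
            "occurred_at": datetime.now(timezone.utc).isoformat(),
            "source": source,                         # lineage: which system emitted it
            "schema_version": "1.0",                  # contracted, versioned schema
            "payload": payload,
        }

    seen: set[str] = set()

    def handle(event: dict, apply) -> None:
        """Idempotent handler: replaying the same event never corrupts state."""
        if event["event_id"] in seen:
            return                                    # duplicate delivery or replay - safely ignored
        apply(event)
        seen.add(event["event_id"])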
Shift information design left
We've long "shifted left" for security and testing. Do the same for data. Instead of instrumenting after launch, product managers and engineers should collaborate on tracking plans during design: decide what events matter, who owns them, and how they'll be versioned. Archana Acharya calls this the shortest path to trustworthy AI and details how event design becomes the connective tissue between business questions and model inputs. Her Medium piece is explicit: machines thrive on event-driven data, and thoughtful tracking plans are the jet fuel.
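A tracking plan can be as simple as a reviewable artifact checked in alongside the feature spec. The sketch below is hypothetical; real plans often live in YAML or a schema registry rather than inline Python.

    # A hypothetical tracking-plan entry, defined at design time alongside the feature spec.
    TRACKING_PLAN = {
        "CheckoutStarted": {
            "owner": "payments-team",
            "version": "1.0",
            "required_fields": ["user_id", "cart_id", "total_cents", "currency"],
            "description": "User entered the checkout flow.",
        },
        "PaymentFailed": {
            "owner": "payments-team",
            "version": "1.2",
            "required_fields": ["user_id", "cart_id", "failure_code"],
            "description": "Payment attempt was declined or errored.",
        },
    }

    def conforms(event: dict) -> bool:
        """Reject events that were never designed - instrumentation drift shows up immediately."""
        spec = TRACKING_PLAN.get(event.get("kind", ""))
        return spec is not None and all(f in event.get("payload", {}) for f in spec["required_fields"])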
Compute economics demand compute-friendly data
Storage remains cheap while compute costs explode once AI usage scales. That means optimizing data so models can answer with minimal runtime work. Industry evidence shows why this matters: as organizations operationalize LLMs, inference unit costs become the daily lever on AI economics; models with advanced reasoning capabilities carry significantly higher inference costs; and academic work on energy use finds that inference dominates lifecycle costs in deployed systems because of the multiplicative effect of queries.
Archana Acharya's rule of thumb is blunt: storage cost is minimal and AI/ML compute is the largest cost driver, so store data in a compute-friendly way. Fewer ad-hoc cross-service joins at inference time; more pre-digested signals your models can use instantly.
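The difference is easy to see in miniature. In the hypothetical sketch below, the expensive join runs once in batch and writes pre-digested signals to a feature store, so the inference path is a single key lookup.

    # Hedged sketch: push joins to batch precomputation so inference is a single lookup.
    # Table and field names are hypothetical.
    orders   = [{"user_id": "u1", "total_cents": 4200}, {"user_id": "u1", "total_cents": 1800}]
    sessions = [{"user_id": "u1", "abandoned_carts": 3}]

    # Batch job (runs hourly/daily): join once, write pre-digested signals to a feature store.
    feature_store: dict[str, dict] = {}
    for s in sessions:
        uid = s["user_id"]
        spend = sum(o["total_cents"] for o in orders if o["user_id"] == uid)
        feature_store[uid] = {"lifetime_spend_cents": spend, "abandoned_carts": s["abandoned_carts"]}

    # Inference path: no fan-out joins, just one key lookup the model can use instantly.
    def features_for(user_id: str) -> dict:
        return feature_store.get(user_id, {})

    print(features_for("u1"))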
From data lakes to AI factories
Leaders converge on the same pattern: event-driven data + domain ownership (data mesh) + real-time pipelines. Data mesh reassigns accountability from a central team to domain teams that publish well-documented data products - exactly what AI consumers need. Zhamak Dehghani distills mesh into four principles: domain-oriented decentralized ownership, data as a product, self-serve data infrastructure, and federated computational governance. Treat events and features as products with contracts, not as exhaust to be cleaned later.
If that sounds abstract, consider a real-world benchmark. When the SARS-CoV-2 genome was released on January 10, 2020, Moderna had a vaccine design within days and shipped the first clinical batch around 42 days later. That speed owed to years of platform investment: standardized data, reusable pipelines, and automated decisioning - the hallmarks of an AI factory.
Archana Acharya's "AI factory blueprint" maps closely to this: event-driven architecture, mesh-style domain ownership, and governance embedded in schemas and lineage. Companies like Intuit, Amazon, Netflix, Moderna, and Fidelity illustrate the pattern.
What "AI-ready data" looks like (executive checklist)
Events > rows. Model the business as events with immutable records and clear semantics. Maintain an append-only event store.
Contracted schemas. Versioned, documented event schemas with owners, backward-compatibility rules, and validation (see the validation sketch after this checklist).
LLM semantics. Remember that models consume tokens and embeddings; design events that translate cleanly into semantic vectors (stable IDs, consistent labels, compact fields).
Rich metadata. Routing hints, provenance, and lineage baked into each event for traceability and governance.
Idempotency & replay. Design so reprocessing the same events never corrupts state; make replay a first-class operation.
State tracking. Use event sourcing to reconstruct state; avoid inference-time fan-out joins across a multitude of data systems.
Compute-aware storage. Format/partition for common reads; push heavy joins to batch precomputation or feature stores.
Mesh the domains. Treat event streams and features as data products owned by the teams who know them.
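To make the "contracted schemas" item concrete, here is a validation-on-write sketch using the jsonschema library; the schema itself is an illustrative example, not a standard.

    # Hedged sketch of contract validation on write, using the jsonschema library
    # (pip install jsonschema). The schema is an illustrative example.
    from jsonschema import validate, ValidationError

    PAYMENT_FAILED_V1 = {
        "type": "object",
        "required": ["event_id", "kind", "occurred_at", "payload"],
        "properties": {
            "event_id": {"type": "string"},
            "kind": {"const": "PaymentFailed"},
            "occurred_at": {"type": "string"},
            "payload": {
                "type": "object",
                "required": ["user_id", "failure_code"],
            },
        },
    }

    def accept(event: dict) -> bool:
        """Validate on write: malformed events never reach the store."""
        try:
            validate(instance=event, schema=PAYMENT_FAILED_V1)
            return True
        except ValidationError:
            return False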
A 90-day plan for CIOs and product leaders
Days 1-30: Inventory & intent. Pick 3-5 priority AI use cases (agentic support, personalization, fraud triage). For each, draft a tracking plan: 10-30 events, owners, schemas, and success metrics. Instrument one end-to-end user journey.
Days 31-60: Platform & contracts. Stand up an event bus and event store with a schema registry, validation on write, and dead-letter queues. Enforce idempotency keys and formal replay procedures (one way these pieces fit together is sketched below). Publish docs and SDKs so product teams ship events correctly the first time.
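An in-memory sketch of how validation, idempotency keys, and a dead-letter queue can combine on the write path; a real deployment would back these with Kafka, Pub/Sub, or a similar bus.

    # In-memory sketch of validation on write with idempotency keys and a dead-letter queue.
    event_store: list[dict] = []
    dead_letter: list[dict] = []
    seen_keys: set[str] = set()

    def write(event: dict) -> None:
        key = event.get("idempotency_key")
        if key and key in seen_keys:
            return                                  # replay: already applied, safely ignored
        if not event.get("event_id") or not event.get("kind"):
            dead_letter.append(event)               # quarantined for inspection, never silently dropped
            return
        event_store.append(event)
        if key:
            seen_keys.add(key)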
Days 61-90: Ship & measure. Build a thin feature layer or retrieval index that consumes your events; wire a pilot LLM/RAG or model to it. Instrument compute KPIs (tokens per request, p95 latency, inference $/1K interactions) and data quality KPIs (schema coverage, event timeliness, lineage completeness). Validate that AI outcomes improve as event quality improves.
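The compute KPIs are straightforward to instrument. In the sketch below, the per-token price is a made-up placeholder; substitute your provider's actual rate.

    # Hedged sketch of compute-KPI instrumentation; the price per token is a placeholder.
    import statistics

    requests = [
        {"tokens": 850, "latency_ms": 420},
        {"tokens": 1200, "latency_ms": 610},
        {"tokens": 400, "latency_ms": 230},
    ]
    PRICE_PER_1K_TOKENS = 0.002                     # hypothetical; use your provider's actual rate

    latencies = sorted(r["latency_ms"] for r in requests)
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    tokens_per_request = statistics.mean(r["tokens"] for r in requests)
    cost_per_request = tokens_per_request / 1000 * PRICE_PER_1K_TOKENS
    cost_per_1k_interactions = cost_per_request * 1000

    print(f"p95 latency: {p95} ms")
    print(f"tokens/request: {tokens_per_request:.0f}")
    print(f"inference $/1K interactions: {cost_per_1k_interactions:.2f}")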
What to stop doing
Treating the data warehouse as the single source of truth for operational AI.
Launching copilots without contracted events and lineage, then blaming the model when answers drift.
Measuring success solely on model accuracy; track data freshness, schema adherence, and compute efficiency alongside precision/recall.
The payoff
Designing data for AI is not a side quest. It's the fastest, least risky path from dazzling demos to dependable capability, and the most effective lever on unit economics. When events are clean and compute pathways are predictable, AI becomes less magic and more machinery. As Archana Acharya puts it, the hunt for "data quality" is really a hunt for better data design, and the place to start is with the business events you already own. Capture them well, and the models will do the rest.
Key sources
Archana Acharya, Designing AI-friendly data to fuel next-gen AI models & use-cases (Ai4 talk deck).
Archana Acharya, "Information design - Event-based tracking plan development & management is the jet fuel for our AI models," Medium, Jul 23, 2024. Medium
OpenAI documentation on embeddings (how models encode meaning). OpenAI Platform
Martin Fowler, "Event Sourcing" and "What do you mean by Event-Driven?" martinfowler.com+1
Zhamak Dehghani, "Data Mesh Principles and Logical Architecture." martinfowler.com
Evidence on inference economics and scale. McKinsey & Company+1
Media Contact:
Manish Bhattacharjee
Brown rich media
7002028840
SOURCE: Brown rich media

