Engineering Lead · Data & AI Platforms

Aaditya Gupta.

I build the unglamorous machinery that lets Data & AI products actually ship — at scale, on budget, in regulated environments.

Currently:
- Shipping AI products that survive their second year.
- Writing eval harnesses that tell the truth.
- Cutting LLM cost without cutting quality.
- Defending multi-tenancy boundaries.
- Arguing that data quality is still the bottleneck.
- Preferring boring infrastructure that doesn't break.
- Building the data foundations the AI era still needs.
- Shipping production AI inside regulated boundaries.
- Thinking about cost-aware AI architecture.
- Reading Designing Data-Intensive Applications.

Based: India · NCR
Now: Senior Associate L2, Publicis Sapient
Focus: Data platforms, multi-tenant AI
Open to: Leadership & consulting

Currently shipping a multi-tenant AI platform ✶ Reading Karpathy & Cherny threads on shipping with AI ✶ Thinking about cost-per-correct-answer, not cost-per-token ✶ Watching agentic eval methods mature ✶ Last shipped aggregated data product · May '26 ✶ Working on tenant-aware model routing across the LLM stack ✶ Open to data & AI leadership conversations

i. ✶ Preface

A short introduction.

✶ About

For seven years I've worked on the less-visible side of software — pipelines, schemas, warehouses, tenancy boundaries, and the architecture decisions that quietly determine whether a system survives its second year. That data engineering foundation is what I'm bringing to the current era of AI products: they need clean ingestion, sane evaluation, and predictable cost just as badly as any analytics platform ever did. Today I lead teams building these platforms for healthcare, pharma, and enterprise analytics.

That foundation came from years on sales analytics and payment-integrity programs measured in tens of millions of dollars — HIPAA-compliant pipelines, data quality systems, on-prem-to-cloud migrations. It's also what's let me co-architect a multi-tenant competitive-intelligence platform: multiple services & AI agents, a multi-quarter scope shipped in eight weeks of focused delivery. I care about clean data, evaluation harnesses that tell the truth, cost-aware AI, and engineers who feel trusted enough to ship.

7 years data engineering · 6 engineers led · ~90% data quality lift, DQMS on Databricks · AI PLATFORM

ii. ✶ Trajectory

Seven years, four chapters.

From building data pipelines to designing the platforms they live in — a chronological account of the work, the teams, and the wins worth remembering.

May 2025 — Present

Engineering Lead, Data & AI Platforms

Publicis Sapient · Senior Associate, Data Engineering L2
Leading data engineering and AI platform delivery across two enterprise client engagements in healthcare and pharma analytics — owning architecture decisions, sprint scoping, and cross-functional collaboration with product, compliance, and US-based client-partner teams.
- Multi-source patient analytics consolidation — unified five disparate clinical, pharmacy and spend sources into a single Snowflake-native data product, collapsing analyst time-to-insight from days to seconds.
- Streaming MDM crosswalk on Kafka + Databricks for insurance and member identity — established the publish-subscribe pattern as a reusable engagement asset.
- Two production AI services shipped on a 15-microservice multi-tenant SaaS platform, plus platform-wide cost & quality patterns (see project below).
June 2022 — March 2025

Data Engineering Lead, Analytics Programs

ZS Associates · Business Technology Solutions Consultant
Led an engineering team of six across two flagship programs — Sales Analytics Growth Engine and Payment Integrity Analytics — driving roadmap, sprint planning, mentorship, and end-to-end delivery against multi-million-dollar business outcomes.
- Architected scalable, HIPAA-compliant platforms on Azure / Databricks, including an end-to-end cloud migration of legacy on-prem systems with 100% data consistency via Control-M, Jenkins, and UrbanCode Deploy (UCD).
- Data Quality Management System (DQMS) on Databricks improved data accuracy by ~90% across multiple datasets and became the standard layer for downstream business reporting.
- Alternative-architecture proposal, aligned with the client's future roadmap, saved ~2 months of implementation time while preserving project scope and outcomes.
Nov 2021 — June 2022

Senior Data Engineer

Mindtree Limited · Senior Software Engineer
Built and optimized ETL pipelines and validation frameworks for production-grade ingestion across multiple sources — the kind of work that gets noticed only when it stops failing at 3am.
- ~60% faster processing — rebuilt pipelines in PySpark + Pandas on Azure Databricks, with consistency holding across ingestion sources.
- Automated scheduled loads via Python and Azure Data Factory, ensuring timely ingestion for downstream web applications.
- Cross-source validation framework improving data accuracy and reliability across the stack.
July 2019 — Oct 2021

Data Engineer, Cloud & Search

Tata Consultancy Services · System Engineer
The first chapter — Azure data solutions on Cosmos DB, ADF and Cognitive Search, with a steady migration from Pandas to PySpark as workloads grew.
- Pandas → PySpark migration — improved memory efficiency and scalability for large datasets.
- Multi-source real-time pipelines reduced manual intervention by ~80% and improved reliability through automated monitoring and alerting.

iii. ✶ Selected Work

A few things worth describing.

Three programs shipped end-to-end — each measured against business outcomes, each with its own architectural argument.

2026

Pharma & CPG · Multi-tenant SaaS

Multi-Tenant AI / Competitive Intelligence Platform

Co-architected a multi-tenant competitive media intelligence platform on Azure — fifteen FastAPI microservices, seven AI agents, forty-four ADRs — shipping a multi-quarter scope in eight weeks of focused, production-grade delivery. Owned the data ingestion service (medallion architecture, scale-to-zero Container Apps Jobs, 4-layer dedup) and the multimodal AI decomposition service (Video Indexer + ffmpeg + GPT-5.1 Vision) that replaced a proprietary vendor — the kind of work that only holds together when the data engineering underneath is right.

Azure Container Apps · KEDA · Event Hubs · FastAPI · GPT-5.1 · Claude · Bicep · Terraform · Redis · OPA

$1.5–2k monthly LLM savings per tenant, via tenant-aware ModelRouter.
60–80% LLM cost reduction from a Redis-cached Content Fragments layer.
99.6% creative-asset extraction (lifted from 10%); <0.1% duplicates on 1.6M+ daily records via a 4-layer dedup.
87% test coverage across 530+ tests, plus 16 LLM-as-judge eval scenarios.

2023 — 25

Healthcare benefits · Enterprise analytics

SAGE — Sales Analytics Growth Engine

Led a six-engineer team across an enterprise platform that contributed to multi-million-dollar account acquisitions and an estimated $30–50M in annual revenue growth for clients. Owned end-to-end cloud migration from on-prem to Azure with 100% data consistency, and built the dashboard surfacing Form 5500, Dun & Bradstreet, and proprietary datasets for targeted prospecting.

Databricks · ADLS · Control-M · Jenkins · UrbanCode Deploy · Power BI · Shell

$30–50M est. annual revenue growth for clients, attributed to SAGE-driven account acquisition.
~90% data accuracy lift via the Data Quality Management System on Databricks.
4h → 50m core pipeline runtime cut through Databricks Workflows + shell automation.
100% data consistency through legacy on-prem → Azure migration.

2024 — 25

Healthcare payments · HIPAA-compliant

Payment Integrity Analytics Engine

Led development of a HIPAA-compliant payment integrity analytics platform on Azure + Databricks — a unified view of pre- and post-payment savings across vendor partners, projected to deliver $7M in vendor-fee and commission savings. Built a standardized ingestion framework consolidating multiple disparate payment sources into a single decision-grade dashboard for executive stakeholders.

Databricks · ADLS · Azure · Power BI · HIPAA controls

$7M projected vendor-fee & commission savings.
1 unified executive view across previously disparate vendor sources.
PHI de-identification & re-identification controls integrated directly into pipeline architecture.
Audit-grade orchestration via Databricks Workflows for full HIPAA compliance.

iv. ✶ Toolkit

What I reach for.

A working list, in plain prose. Anything italicized is something I'd happily lead an architecture conversation about tomorrow morning.

✶ Skills

Programming

Python · PySpark · SQL & T-SQL · Shell scripting

Data Platforms

Databricks · Snowflake · Azure Data Lake (ADLS) · Azure SQL · Cosmos DB · Teradata · Hive

Streaming & Orchestration

Kafka · Azure Data Factory · Databricks Workflows · Logic Apps · Control-M · Event Hubs

AI / LLM Engineering

Multi-agent systems · RAG patterns · LLM-as-judge evaluation · Tenant-aware model routing · Prompt engineering · Azure AI Video Indexer · Azure AI Speech · GPT-5.1 · Claude APIs · AI-augmented engineering workflows

Cloud & Infrastructure

Azure (Container Apps, KEDA, Synapse, ADF) · AWS (Redshift, Glue, S3) · Bicep / Terraform · OPA

DevOps & Quality

GitHub · Azure DevOps · CI / CD · Jenkins · UrbanCode Deploy · Automated testing · AI-assisted code review

Reporting

Power BI · Tableau

Leadership

Engineering leadership & mentorship · Multi-tenant SaaS patterns · Data modelling & pipeline design · HIPAA-compliant data engineering · Stakeholder & client engagement

✶ Credentials

Education

B.Tech, Computer Science — Manav Rachna University (2015–2019)

Certifications & Publications

SQL Advanced Certification — TechGig
Databases & SQL for Data Science — IBM
Data Science in Python — Univ. of Michigan
Python Specialization — Univ. of Michigan
AI for Everyone — deeplearning.ai
Python & RDBMS Module — Infosys
“Ground Water Quality Monitoring Using Wireless Sensors and Machine Learning” — IEEE
“Role of Hybrid Neural Network in Bankruptcy Prediction” — IEEE, accepted

v. ✶ Coda

Let's talk.

Open to data engineering leadership roles, data & AI platform consulting, and the occasional deep-dive conversation about pipeline architecture, evaluation harnesses, or multi-tenancy boundaries.

aadi24.gupta@gmail.com

✶ Set in Fraunces & Manrope. Hand-built, no frameworks.