Jabirhusain K P jbrhsn

Senior Data Engineer at IBM · Bengaluru, India · Open to remote-first roles

4+ years designing and operating petabyte-scale Azure Databricks Lakehouse platforms at IBM for global enterprise clients. Fast-tracked from fresher to Senior Data Engineer in under 3 years by building infrastructure that creates lasting, measurable value.

I also build production-grade AI systems from scratch, not API wrappers. My self-built multi-agent platform (LangGraph, RAG, vector memory) runs 5 specialized autonomous agents on a config-driven prompt architecture designed to curb hallucinations and reduce token cost.

What I have shipped

DataOps Observability Platform — Designed and built a unified monitoring system from scratch for 200+ pipelines across SAP Data Intelligence, Azure Data Factory, and Databricks. Centralized into a Power BI dashboard with automated Logic App alerting. Cut daily manual monitoring by 80% (1.0 FTE to 0.2 FTE). Adopted as the account-wide standard.
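As a rough illustration of how one dashboard can cover three orchestrators, each tool's run record can be normalized into a shared shape before alerting. This is a minimal sketch, not the actual implementation: the field names, status vocabularies, and the `PipelineRun`, `normalize`, and `needs_alert` helpers are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical mapping from each tool's status vocabulary to a shared one.
STATUS_MAP = {
    "adf": {"Succeeded": "ok", "Failed": "failed", "InProgress": "running"},
    "databricks": {"SUCCESS": "ok", "FAILED": "failed", "RUNNING": "running"},
    "sap_di": {"completed": "ok", "dead": "failed", "running": "running"},
}

# Hypothetical (name field, status field) pair for each source system.
FIELDS = {
    "adf": ("pipelineName", "status"),
    "databricks": ("run_name", "result_state"),
    "sap_di": ("graph", "status"),
}


@dataclass(frozen=True)
class PipelineRun:
    source: str
    pipeline: str
    status: str  # one of: ok, failed, running, unknown


def normalize(source: str, record: dict) -> PipelineRun:
    """Map a source-specific run record onto the shared PipelineRun shape."""
    name_key, status_key = FIELDS[source]
    # Unmapped statuses become "unknown" so they surface instead of vanishing.
    status = STATUS_MAP[source].get(record.get(status_key), "unknown")
    return PipelineRun(source, record.get(name_key, "<unnamed>"), status)


def needs_alert(runs: list) -> list:
    """Return the runs that should trigger an alert."""
    return [r for r in runs if r.status in ("failed", "unknown")]


runs = [
    normalize("adf", {"pipelineName": "load_sales", "status": "Succeeded"}),
    normalize("databricks", {"run_name": "gold_refresh", "result_state": "FAILED"}),
    normalize("sap_di", {"graph": "extract_orders", "status": "dead"}),
]
for run in needs_alert(runs):
    print(f"ALERT {run.source}:{run.pipeline} -> {run.status}")
```

Treating unrecognized statuses as alert-worthy is a deliberate choice in this sketch: a new or misspelled status from any tool shows up on the dashboard rather than being silently dropped.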

FinOps Optimization — Migrated short-duration pipelines from always-on interactive compute to ephemeral Job Compute nodes, and applied strategic Z-Ordering on SHA256 key columns across Delta tables. Delivered ~50% reduction in Databricks compute cost and ~30% reduction in storage spend — yielding €1,000+/month in recurring savings.
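The two changes above can be sketched in a few lines. In the Databricks Jobs API, a task either points at a running cluster through `existing_cluster_id` or carries a `new_cluster` spec that is created for the run and torn down afterwards; Z-Ordering is applied with Delta's `OPTIMIZE ... ZORDER BY` statement. The helper names, the task and table names, and the cluster values below are placeholders chosen for illustration, not the real configuration.

```python
import copy


def to_job_compute(task: dict, cluster_spec: dict) -> dict:
    """Return a copy of a Jobs API task that runs on an ephemeral job cluster."""
    migrated = copy.deepcopy(task)
    # Stop pointing at the always-on interactive cluster.
    migrated.pop("existing_cluster_id", None)
    # Attach a cluster spec that exists only for the duration of the run.
    migrated["new_cluster"] = copy.deepcopy(cluster_spec)
    return migrated


def zorder_sql(table: str, columns: list) -> str:
    """Build a Delta OPTIMIZE statement that Z-Orders the given columns."""
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(columns)})"


# Placeholder task and cluster values (assumptions, not the real setup).
task = {
    "task_key": "short_ingest",
    "notebook_task": {"notebook_path": "/Jobs/short_ingest"},
    "existing_cluster_id": "0000-000000-example",
}
cluster_spec = {
    "spark_version": "14.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
}

migrated = to_job_compute(task, cluster_spec)
print(migrated)
print(zorder_sql("sales.orders", ["order_key_sha256"]))
```

The function returns a new task rather than mutating the input, so the original definition stays available for comparison or rollback.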

Enterprise AI Assistants — Independently built three domain-specialized AI assistants on the IBM Consulting Advantage platform (Databricks Transform Expert, User Story Generator, RAG-powered Operations Helper). Secured IBM Data Service Line-wide adoption. Recovered 50-60 engineering hours per month across the practice.

Aria — Multi-Agent AI Platform — Self-engineered production AI platform using LangGraph. YAML config-driven engine injects tailored prompts and tool sets per action at runtime. Async Plan-Execute-Evaluate-Respond ReAct graph with a self-evaluation node scoring output quality across 5 vectors. Hybrid sqlite-vec relational and semantic memory for long-horizon retrieval. Runs 5 specialized agents on cost-effective models via OpenRouter.
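The control flow described above can be illustrated without any framework. The sketch below is plain Python, not LangGraph and not Aria's code: a dictionary stands in for the parsed YAML config, `fake_model` and `fake_scorer` stand in for LLM calls, and the five dimension names are invented because the actual scoring vectors are not listed here.

```python
from statistics import mean

# Stand-in for a parsed YAML file: each action gets its own prompt and tools.
ACTION_CONFIG = {
    "summarize": {"prompt": "Summarize the input in one sentence.", "tools": []},
    "lookup": {"prompt": "Answer using retrieved notes only.", "tools": ["memory_search"]},
}

# Hypothetical quality dimensions (the real five are not named in this profile).
DIMENSIONS = ("relevance", "accuracy", "completeness", "clarity", "safety")


def plan(state: dict) -> dict:
    """Inject the prompt and tool set configured for the requested action."""
    cfg = ACTION_CONFIG[state["action"]]
    return {**state, "prompt": cfg["prompt"], "tools": cfg["tools"]}


def execute(state: dict, model) -> dict:
    """Produce a draft and count the attempt."""
    draft = model(state["prompt"], state["input"], state["tools"])
    return {**state, "draft": draft, "attempts": state.get("attempts", 0) + 1}


def evaluate(state: dict, scorer) -> dict:
    """Score the draft on every dimension and average the result."""
    scores = {d: scorer(d, state["draft"]) for d in DIMENSIONS}
    return {**state, "scores": scores, "quality": mean(scores.values())}


def run(action, user_input, model, scorer, threshold=0.7, max_attempts=3):
    """Plan once, then execute and evaluate until the draft is good enough."""
    state = plan({"action": action, "input": user_input})
    while True:
        state = evaluate(execute(state, model), scorer)
        if state["quality"] >= threshold or state["attempts"] >= max_attempts:
            return state["draft"], state  # respond step


def fake_model(prompt, text, tools):
    return f"{prompt} :: {text}"


def fake_scorer(dimension, draft):
    return 0.8


answer, final_state = run(
    "summarize", "Delta Lake keeps a transaction log.", fake_model, fake_scorer
)
print(answer, final_state["attempts"])
```

Keeping prompts and tool lists in config means a new action is a data change rather than a code change, and the attempt cap bounds how many model calls a low-scoring draft can trigger.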

Data Corruption Recovery — Recovered 200M to 2B row production tables using Delta Lake time-travel and targeted partition-level reprocessing. Saved 70% compute cost and 30% engineering effort vs. a full-table rerun while keeping downstream SLA timelines intact.
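One possible shape for a partition-level repair is to overwrite only the affected partitions with their contents from a known-good table version, using Delta time travel (`VERSION AS OF`) together with Databricks SQL's `INSERT INTO ... REPLACE WHERE`. This is a sketch under assumptions, not the procedure actually used: the table, partition column, version number, and the `partition_restore_sql` helper are hypothetical, and the real recovery may have re-run upstream logic for those partitions instead of copying old data.

```python
def partition_restore_sql(table, partition_col, bad_partitions, good_version):
    """Build one statement per corrupted partition.

    Each statement replaces a single partition with its rows from an
    earlier table version, leaving every other partition untouched.
    Values are interpolated without escaping, so pass trusted input only.
    """
    statements = []
    for value in bad_partitions:
        predicate = f"{partition_col} = '{value}'"
        statements.append(
            f"INSERT INTO {table} REPLACE WHERE {predicate} "
            f"SELECT * FROM {table} VERSION AS OF {good_version} WHERE {predicate}"
        )
    return statements


# Hypothetical table, partitions, and version number.
stmts = partition_restore_sql(
    "sales.orders", "load_date", ["2024-01-05", "2024-01-06"], 41
)
for stmt in stmts:
    print(stmt)
```

The good version would normally be found by inspecting `DESCRIBE HISTORY` for the table. Scoping each statement to one partition is what keeps the compute bill far below a full-table rerun.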

Stack

Data platform: Azure Databricks, PySpark, Delta Lake, Unity Catalog, Azure Data Factory, SAP Data Intelligence, Medallion Architecture, Azure DevOps

Languages: Python, SQL, PySpark

AI / GenAI: LangGraph, RAG, Multi-Agent Systems, Prompt Engineering, sqlite-vec, OpenRouter, Docker

Analytics: Microsoft Fabric, Power BI, Azure Monitor, Azure Logic Apps

Certifications

Credential	Issuer	Valid until
Databricks Certified Data Engineer Professional	Databricks	Jan 2027
Microsoft Certified: Azure AI Engineer Associate (AI-102)	Microsoft	Jun 2027
Microsoft Certified: Fabric Analytics Engineer Associate (DP-600)	Microsoft	Apr 2027
Microsoft Certified: Azure Data Engineer Associate (DP-203)	Microsoft	Retired

Currently building

Kafka + Spark Structured Streaming end-to-end pipeline with exactly-once semantics and schema evolution
dbt transformation layer over Lakehouse Gold
Terraform provisioning for full-stack data infrastructure

Target roles: Senior Data Engineer · Data Platform Engineer · AI Infrastructure Engineer · Bengaluru hybrid / Kochi hybrid / Remote
