I am a Data Engineer who designs large-scale data infrastructure, streaming pipelines, and AI-integrated systems. I focus on building fault-tolerant platforms with Rust, Python, and Google Cloud.
- Currently: Architecting multi-tenant clinical data platforms with end-to-end ETL ownership, processing 1M+ weekly records at Centauri Health Solutions.
- Recently: Won 2nd Place at MLSys 2026 (Track B) for building a neurosymbolic multi-agent pipeline for tensor DAG scheduling, achieving a 7.77x speedup over baselines.
- Background: MS in Data Science from the University of Arizona | 2x Google Cloud Certified.
| Domain | Key Technologies | Active Projects |
|---|---|---|
| Distributed Systems | Rust, gRPC, Tokio | WAL Consensus, GitCortex |
| AI Data Platforms | Python, LLMs, GCP | Agentic Scheduler, MCP Server |
| Streaming Pipelines | Apache Beam, Spark | Clinical Data Platform (CLARION) |
A branch-aware code knowledge graph built with Rust, tree-sitter, KuzuDB, and gRPC. It uses an MCP server so that AI assistants can perform blast-radius analysis, that is, identify the code affected by a change.
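As a rough illustration of what blast-radius analysis means, here is a minimal Python sketch that treats it as reverse reachability over a dependency graph. It is not the project's implementation (which is described as Rust with KuzuDB). The function name `blast_radius`, the edge format, and the sample symbols in `EDGES` are all hypothetical.

```python
from collections import defaultdict, deque


def blast_radius(edges, changed):
    """Return every symbol that transitively depends on `changed`.

    `edges` is an iterable of (dependent, dependency) pairs. This format is
    an assumption made for the sketch, not a real schema.
    """
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = defaultdict(set)
    for dependent, dependency in edges:
        dependents[dependency].add(dependent)

    affected = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dep in dependents[node]:
            if dep != changed and dep not in affected:
                affected.add(dep)
                queue.append(dep)
    return affected


# Hypothetical call graph: each pair reads "left depends on right".
EDGES = [
    ("api.handler", "service.run"),
    ("cli.main", "service.run"),
    ("service.run", "db.query"),
    ("report.build", "db.query"),
    ("db.query", "db.connect"),
]

if __name__ == "__main__":
    print(sorted(blast_radius(EDGES, "db.query")))
```

In this toy graph, changing `db.query` affects its direct callers and everything that calls them, but not `db.connect`, which sits below it.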
Neurosymbolic multi-agent pipeline for tensor DAG scheduling on memory-limited hardware. An LLM proposes op fusion and traversal structures, while a deterministic solver enforces constraints to achieve a 7.77x aggregate speedup.
Engineered a write-ahead log (WAL) with full Raft consensus in Rust and Tokio. It features a crash-safe segment engine running at 429 MiB/s, log compaction, and Prometheus metrics.
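One common way to make a log segment crash-safe is to frame each record with its length and a checksum, then stop recovery at the first torn or corrupted record. The Python sketch below shows that general idea only. The header layout (`<II`, little-endian length plus CRC32) and the names `encode_record` and `recover` are assumptions for illustration and do not describe the project's actual on-disk format.

```python
import struct
import zlib

# Assumed header layout for this sketch: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Frame a payload as header + payload, ready to append to a segment."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def recover(segment: bytes) -> list:
    """Return the valid prefix of records found in a segment.

    Recovery stops at the first record that is truncated (a torn write)
    or whose checksum does not match (corruption).
    """
    records = []
    offset = 0
    while offset + HEADER.size <= len(segment):
        length, crc = HEADER.unpack_from(segment, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(segment):
            break  # payload was cut off mid-write
        payload = segment[start:end]
        if zlib.crc32(payload) != crc:
            break  # payload bytes do not match the stored checksum
        records.append(payload)
        offset = end
    return records


if __name__ == "__main__":
    segment = encode_record(b"put k1 v1") + encode_record(b"put k2 v2")
    torn = segment[:-3]  # simulate a crash partway through the last append
    print(recover(segment))
    print(recover(torn))
```

Recovering only the valid prefix means a crash during an append loses at most the record that was in flight, while every earlier record stays readable.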
An MCP server and CLI that let AI assistants generate ready-to-extend Apache Beam pipeline templates with a single command, standardizing CI/CD and deployment for GCP Dataflow.



