project_case_study
dataGenie - Multi-Agent Data Analytics Platform
**Resume (3 bullets):**
- Non-technical users shouldn't need to write SQL to explore their own data. dataGenie lets you upload a CSV, ask questions in plain English, and get back query results, charts, and written explanations - making data analysis as easy as having a conversation.
- Designed a hybrid query architecture that routes simple questions directly to SQL and sends complex, multi-step questions through an agentic ReAct loop that decomposes them into sub-tasks. This was the hardest design decision - a single path either over-engineered simple queries or couldn't handle complex ones.
- Built the full backend: FastAPI with DuckDB for fast analytical queries, a multi-provider LLM layer with automatic fallback (Claude > OpenAI > Ollama), async task processing via Celery/Redis, and a profiling engine that auto-generates column-level data quality scores before any query runs.

Tech: Python, FastAPI, DuckDB, Redis, Docker.

**LinkedIn (longer form):**

dataGenie started from a real frustration: every time a non-technical teammate needed data from a CSV, they'd either ask an engineer to write a query or struggle with Excel pivot tables. I wanted to build something where you could just ask "what were the top 5 products by revenue last quarter?" and get back a proper answer with a chart.

The interesting engineering challenge was query routing. Simple questions ("how many rows?") don't need an agent - they map directly to SQL. But complex questions ("compare Q3 vs Q4 trends, break down by region, and explain what changed") need to be decomposed into sub-tasks. I built a hybrid architecture: an intent classifier decides the path, and complex queries go through a ReAct loop that plans, executes, and synthesizes across multiple SQL calls.

The LLM layer supports Claude, OpenAI, and Ollama with automatic fallback chains - so if one provider is down or rate-limited, queries still work. Before any question is answered, a profiling engine auto-generates column-level stats (null…
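
To make the provider fallback idea concrete, here is a minimal sketch of a priority-ordered chain (Claude, then OpenAI, then Ollama). It is an illustration under assumptions, not dataGenie's actual code: `Provider`, `ProviderError`, `complete_with_fallback`, and the three stub functions are hypothetical names, and the stubs stand in for real SDK calls so the example runs without network access.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error: the provider is down or rate-limited."""


@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # prompt -> completion text


def complete_with_fallback(
    prompt: str, providers: Sequence[Provider]
) -> tuple[str, str]:
    """Try providers in priority order; return (provider_name, text)."""
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next provider.
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in callables (assumptions); real ones would wrap each vendor's SDK.
def claude_stub(prompt: str) -> str:
    raise ProviderError("rate limited")


def openai_stub(prompt: str) -> str:
    return f"answer from openai for: {prompt}"


def ollama_stub(prompt: str) -> str:
    return f"answer from ollama for: {prompt}"


CHAIN = [
    Provider("claude", claude_stub),
    Provider("openai", openai_stub),
    Provider("ollama", ollama_stub),
]

if __name__ == "__main__":
    # The first provider fails here, so the second one answers.
    print(complete_with_fallback("how many rows?", CHAIN))
```

Returning the provider name alongside the text makes it easy to log which provider actually served a query. Catching only `ProviderError` is a deliberate choice in this sketch: programming errors still surface instead of silently triggering a fallback.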
2026-03-16 04:35 PM EDT