Build With Moenu buildwithmoenu.com
Case Study

dataGenie — Multi-Agent Data Analytics Platform

MVP complete · local

Upload a CSV, ask in plain English, get back a chart, a written answer, and the SQL you can verify.

Overview

Project framing

A platform for the question 'what does this data say?' that doesn't require the asker to know SQL or even the column names. Upload a file, ask in plain English, get an interactive chart with a written answer and the underlying query you can verify. The agents handle the work, the user reads the answer.

Problem

Every non-technical teammate who needs data from a CSV either asks an engineer or fights Excel pivot tables. The engineering loop is slow; the Excel loop is error-prone and gets abandoned. The non-technical user ends up making decisions on intuition instead of data — not because they want to, but because the tooling demanded too much to use.

Role

Designer, engineer, evaluator. Solo build.

Why now

Two shifts made this newly possible: LLMs got good enough at writing correct SQL against schemas they were just shown, and in-process analytical databases (DuckDB) got fast enough that you don't need a data warehouse to query a CSV. Combine those, and a real user can go from 'I have a file' to 'I have an answer' without ever seeing a query.

Topline

Constraints

  • Shipped as a single 154-file monolithic commit — feature-complete MVP, zero production hours.
  • Multi-provider fallback only earns its keep if the simple path stays cheap; the router has to be honest about complexity.

Outcomes

  • Runs locally via `docker compose up`; full stack (FastAPI + Next.js 16 + DuckDB + Redis / Celery) wired and tested end-to-end.
  • 448 tests passing.
  • 10+ Plotly chart types; PNG / SVG / CSV / Excel / PDF export.
  • 1,193-line conversation engine tracking current dataset, columns, and prior results for follow-up queries.
Architecture

From file to answer

A non-technical user goes from 'I have a CSV' to 'I have an answer' without ever seeing SQL.

01
Step 1

Ingest

Upload CSV / Excel / PDF / connect a database. Parquet on disk, DuckDB columnar SQL ready immediately.

02
Step 2

Route

The complexity router classifies the question as simple, moderate, or complex — the routing decision is what makes the cost curve work.

03
Step 3

Execute (simple)

Direct DuckDB SQL — sub-second answers for counts, top-N, basic filters. No agents, no tokens spent.

04
Step 4

Execute (complex)

ReAct loop with five agents: Data Quality, Statistical, Visualization, Explanation, and Verification. Orchestrator coordinates.

05
Step 5

Verify

Verification agent cross-checks the others' work. Hallucinations get caught before they surface.

06
Step 6

Answer

Interactive Plotly chart + written explanation + confidence score + the SQL you can verify. Export to PNG / SVG / CSV / Excel / PDF.

Multi-turn conversation

1,193-line engine tracks current dataset, columns mentioned, and prior results so 'compare those regions' actually works.

OutcomeOne question, one chart, one written answer, one verifiable query.

Architecture narrative

A complexity router classifies the incoming question as simple, moderate, or complex. Simple goes straight to DuckDB SQL — sub-second answers for 'how many rows', 'top 10 by revenue', basic filters. Complex enters a ReAct loop with five specialist agents (Data Quality, Statistical, Visualization, Explanation, Verification) coordinated by an orchestrator. The Verification agent is the load-bearing one — it cross-checks the others' work before surfacing anything to the user. A multi-provider LLM layer (Claude / OpenAI / Ollama) with per-model token counting picks the cheapest provider that hits the quality bar, and a benchmark harness keeps that decision honest. Storage is DuckDB columnar SQL + Parquet on disk + SQLite metadata; analytics is pandas / scipy / statsmodels; visualization is 10+ Plotly chart types. A 1,193-line conversation engine tracks the current dataset, columns mentioned, and prior results so 'compare those regions' actually works.

System shape

Local-first analytics stack with a hybrid SQL / agentic execution layer and a multi-provider LLM fallback.

Input

File ingest

CSV / Excel / PDF / DB connection. Parsed to Parquet on disk.

Conversation engine

Tracks dataset, columns referenced, prior results, follow-up context.

Routing

Complexity classifier

Simple / moderate / complex — decides whether to spend tokens at all.

Execution

Direct DuckDB

Columnar SQL on Parquet. Sub-second for the simple 80%.

ReAct loop

Data Quality, Statistical, Visualization, Explanation, Verification — five specialist agents.

Multi-provider LLM

Claude / OpenAI / Ollama. Per-model token counting; benchmark harness picks cost-vs-quality.

Output

Plotly visualization

10+ chart types with interactive UI.

Written explanation

Plain-English answer plus confidence score and the underlying SQL.

Export

PNG / SVG / CSV / Excel / PDF.

Verification is load-bearing

The fifth agent's only job is to catch the other four. Multi-agent systems without a verification layer hallucinate confidently.

Hybrid routing

The simple path doesn't spend a single token. The complex path is allowed to spend many, because the routing decision earns that budget.

Key Decisions
1 / 3
Hybrid query routing instead of pure-SQL or pure-agent

Build two paths — direct DuckDB for simple queries, ReAct loop with five agents for complex ones — and a router that decides between them.

A single path is wrong both ways. Pure SQL can't handle 'compare Q3 vs Q4 trends, break down by region, explain what changed.' A ReAct loop for 'how many rows?' is absurd. The split keeps simple sub-second and spends tokens only where they earn ROI.

Two paths to maintain. Worth it because the router decision is also what makes the multi-provider fallback meaningful — cheap providers handle the easy 80%, expensive ones get reserved for work that needs them.
A dedicated Verification agent

Add a fifth agent whose only job is to cross-check the other four.

Multi-agent systems that don't verify their own outputs hallucinate confidently. Catching that has to be part of the system, not a hope.

Extra tokens on every complex query. Worth it because the alternative is silently wrong answers.
Local-first analytics stack

DuckDB + Parquet + pandas / scipy / statsmodels — all in-process, no data warehouse.

The platform has to work for someone who uploaded a CSV ten seconds ago. No setup, no auth to a warehouse, no waiting for a job to finish.

Won't scale past low-tens-of-GB datasets without rearchitecture. Fine for the MVP target user.
Struggles

Scope. Shipped as a single 154-file monolithic commit with 448 tests passing on my laptop and zero production hours. Feature-complete MVP with no real users.

No clean fix — this is a habit problem, not a bug. The recovery is visible in how I'm now shipping duSraBheja: smaller PRs, earlier deploys, letting real usage drive the next round of agents rather than designing them up front.
Learnings
Multi-agent systems need a verification layer before they need a sixth agent.
Hybrid routing is a product decision, not a technical one — it's about when to spend tokens.
Feature-complete-but-unreleased is a worse state than partially-shipped.
Demo