Upload a CSV and ask a question in plain English
Non-technical users shouldn't need to write SQL to explore their own data. dataGenie lets you upload a CSV, ask questions in plain English, and get back query results, charts, and written explanations, so that data analysis feels as easy as having a conversation.
dataGenie is about lowering the skill barrier to data analysis without flattening the work into toy answers. I wanted a system where a non-technical person could ask a question in plain English and still get something that feels analytically serious.
Most internal analytics tools assume one of two extremes: either the user writes SQL or the product gives them a shallow dashboard that cannot answer new questions. I wanted a middle path: conversational access to real data with enough structure and guardrails to stay useful.
I built the backend, query-routing logic, LLM provider layer, data profiling flow, and the product framing for how non-technical users ask analytical questions.
The recent wave of LLM tooling made the interface side easier, but it also made me care more about routing, fallbacks, and quality boundaries. The interesting work was not 'chat with CSVs'; it was deciding when not to use an agent.
A conversational analytics prototype that routes between direct SQL and agentic reasoning.
A user brings tabular data, asks in plain English, and the product chooses the lightest reliable path to an answer.
The product starts from raw tabular data and the exact question the user wants answered.
Schema, nulls, and shape context are captured so the system knows what kind of data it is handling.
Easy asks should go straight to SQL; harder asks earn a multi-step reasoning loop.
The user gets an answer that feels analytical rather than like a generic chat response.
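The profiling step in the flow above can be illustrated with a small sketch. This is not dataGenie's actual profiler: the names `ColumnProfile`, `infer_type`, and `profile_csv` are hypothetical, and the type inference is deliberately naive (integer, float, or text). It only shows the kind of schema, null, and shape context that could be captured before any question is answered.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    name: str
    inferred_type: str  # "integer", "float", or "text"
    null_count: int     # number of empty cells in the column


def infer_type(values):
    """Return the narrowest type that every non-empty value parses as."""
    for label, cast in (("integer", int), ("float", float)):
        try:
            for value in values:
                cast(value)
        except ValueError:
            continue  # this type does not fit, try the next one
        if values:
            return label
    return "text"


def profile_csv(text):
    """Return (row_count, column_profiles) for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    profiles = []
    for name in reader.fieldnames or []:
        cells = [(row.get(name) or "").strip() for row in rows]
        present = [cell for cell in cells if cell]
        profiles.append(
            ColumnProfile(name, infer_type(present), len(cells) - len(present))
        )
    return len(rows), profiles


# Illustrative data only, not taken from the project.
sample = "product,revenue,units\nWidget,10.5,3\nGadget,,4\n"
row_count, columns = profile_csv(sample)
for column in columns:
    print(column)
```

A profile like this can be handed to the later steps as context, for example so that generated SQL uses real column names and treats empty cells as nulls.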
Direct SQL in DuckDB handles fast counts, filters, and aggregations.
A ReAct loop plans, queries, and synthesizes when the question needs more reasoning.
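To make the plan, query, synthesize cycle concrete, here is a minimal sketch of a ReAct-style loop. Several things are assumptions: the `decide` callable stands in for an LLM call and is scripted here, the step dictionary format is invented for illustration, and `sqlite3` from the standard library replaces DuckDB only so the example runs without extra dependencies.

```python
import sqlite3


def react_loop(question, decide, run_sql, max_steps=5):
    """Alternate between model decisions and SQL observations until an answer."""
    history = []  # (sql, rows) pairs observed so far
    for _ in range(max_steps):
        step = decide(question, history)  # hypothetical LLM call
        if step["action"] == "answer":
            return step["text"], history
        rows = run_sql(step["sql"])
        history.append((step["sql"], rows))
    return "Step budget exhausted before reaching an answer.", history


def scripted_decide(question, history):
    """Stand-in for the model: query once, then synthesize from the result."""
    if not history:
        return {
            "action": "query",
            "sql": "SELECT product, SUM(revenue) AS total FROM sales "
                   "GROUP BY product ORDER BY total DESC",
        }
    top_product, total = history[-1][1][0]
    return {"action": "answer",
            "text": f"{top_product} leads with {total} in revenue."}


# Illustrative in-memory table; sqlite3 is a stand-in for DuckDB here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Widget", 10.0), ("Gadget", 25.0), ("Widget", 5.0)],
)

answer, trace = react_loop(
    "Which product earns the most?",
    scripted_decide,
    lambda sql: conn.execute(sql).fetchall(),
)
print(answer)
```

The `max_steps` budget is one simple way to keep the loop from running indefinitely when the model never commits to an answer.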
The core architecture is hybrid by design. Uploaded data lands in DuckDB after profiling. A lightweight intent layer decides whether the request is simple enough for direct SQL or complex enough to go through an agentic reasoning loop. The LLM layer sits behind a provider abstraction with fallbacks so the product can keep answering even when one provider degrades.
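One way such an intent layer could work is a cheap keyword heuristic that runs before any model call. The sketch below is an assumption, not the project's routing logic: the `Route` enum, the cue lists, and the choice to send ambiguous questions to the agent path are all hypothetical.

```python
import re
from enum import Enum


class Route(Enum):
    DIRECT_SQL = "direct_sql"
    AGENT = "agent"


# Hypothetical cues; a real intent layer would likely be tuned on real questions.
COMPLEX_CUES = ("why", "explain", "compare", "trend", "correlat", "forecast", "anomal")
SIMPLE_PATTERN = re.compile(
    r"\b(how many|count|sum|total|average|avg|min|max|top \d+|list|show)\b"
)


def route_question(question):
    """Pick the lightest path that looks reliable for this question."""
    q = question.lower()
    if any(cue in q for cue in COMPLEX_CUES):
        return Route.AGENT       # needs decomposition or explanation
    if SIMPLE_PATTERN.search(q):
        return Route.DIRECT_SQL  # looks like a count, filter, or aggregation
    return Route.AGENT           # ambiguous: assume the careful path


for q in ("How many orders were placed in March?",
          "Why did revenue drop last quarter?"):
    print(q, "->", route_question(q).value)
```

Checking the complex cues first means a question such as "explain the total" still goes to the agent path even though it contains an aggregation keyword.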
This is the secondary view: the system shape behind the flow above. It exists to explain the moving parts, not to substitute for the product story.
Simple questions should move fast. Complex questions should decompose before they answer.
[Architecture diagram. Node captions: plain-English analytics; tabular source data; schema, nulls, distributions; simple vs complex path; fast answers in DuckDB; plan, query, synthesize; Claude, OpenAI, Ollama; query result, explanation, viz.]
One path for every question either slows simple queries down or underpowers complex ones.
The model layer is abstracted so failures or rate limits do not collapse the product.
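A provider abstraction with fallbacks can be sketched as an ordered chain of callables. The names here (`ProviderError`, `AllProvidersFailed`, `complete_with_fallback`) are hypothetical, and the two providers are local stubs rather than real Claude, OpenAI, or Ollama clients. The point is only the control flow: try each provider in order and move on when one fails.

```python
class ProviderError(Exception):
    """Raised by a provider callable when it fails or is rate limited."""


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # remember why it failed, keep going
    raise AllProvidersFailed("; ".join(errors))


# Stub providers for illustration; real ones would wrap vendor SDK calls.
def flaky_primary(prompt):
    raise ProviderError("rate limited")


def local_backup(prompt):
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "summarize sales",
    [("primary", flaky_primary), ("backup", local_backup)],
)
print(used, reply)
```

Returning the provider name alongside the reply makes it possible to log which provider actually served a request, which helps when diagnosing degraded responses.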
Started from the user problem: helping non-technical people query data without writing SQL.
The architecture matured when I stopped pretending all questions deserved the same execution path.
Fallback logic turned the system from a prototype into something that could survive real provider instability.
dataGenie started from a real frustration: every time a non-technical teammate needed data from a CSV, they'd either ask an engineer to write a query or struggle with Excel pivot tables. I wanted to build something where you could just ask "what were the top 5 products by revenue last quarter?" and get back a proper answer with a chart.…
An ambitious, comprehensive AI-powered data analytics platform enabling users to upload datasets (CSV, Excel, PDF, databases) and explore them through natural language conversation. Ask questions like "Show me revenue t…
| Layer | Technology |
|-------|-----------|
| Frontend | Next.js 16, React 19, TypeScript, Zustand (state), shadcn/ui, Tailwind CSS 4 |