AI Data Science Platform
8+ specialised AI agents for end-to-end data science
I built this solo at Fetch.ai. The idea: instead of writing data science code yourself, describe what you want in natural language and let a team of AI agents handle the pipeline. Each agent is specialised. A supervisor agent orchestrates them. Humans stay in the loop at every decision point.
It's live with data scientists at Bosch and Fetch.ai, and with non-technical users on HR teams. People who'd never written a line of Python are getting Kaggle-level results in minutes.
Harness Engineering
This project is fundamentally about harness engineering: designing the environment, constraints, and feedback loops that let AI agents do reliable work. The human's job isn't to write code. It's to specify intent, set architectural boundaries, and build the scaffolding that keeps agents on track.
I built this before the term existed. The problems were the same: agents hallucinate, lose context, and violate structure. I had to engineer 10+ guardrails, prompt-injection protection, agent logging, enforced determinism, and orchestration patterns from scratch. What OpenAI now calls harness engineering, I was figuring out on my own.
The Agents
Six specialised agents handle every stage of the pipeline. A LangGraph supervisor decomposes high-level goals into multi-agent workflows, decides which agents to invoke and in what order, and manages state between steps.
Data Loader
Ingests CSV files, infers schemas, prepares datasets for downstream agents.
Data Cleaning
Generates Python/pandas code to fix missing values, handle outliers, and normalise types.
Visualisation
Generates interactive Plotly charts from natural language descriptions.
Feature Engineering
Writes scikit-learn transformation code based on statistical properties of the data.
ML Training
Orchestrates H2O AutoML runs and interprets leaderboard results.
Prediction
Runs inference on trained models and explains predictions.
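Each agent follows the same basic loop: prompt the LLM for code, run the result through guardrails, execute it deterministically, and log everything. A minimal sketch of that loop for the cleaning agent, with a stubbed LLM and invented names (the real system's prompts, guardrails, and sandboxing are far more involved):

```python
import pandas as pd

def cleaning_agent(df: pd.DataFrame, llm_generate) -> pd.DataFrame:
    """Illustrative agent loop: LLM writes pandas code, guardrails
    screen it, then it runs in a restricted namespace."""
    prompt = f"Write pandas code that cleans `df`. Schema: {dict(df.dtypes.astype(str))}"
    code = llm_generate(prompt)
    # Guardrail (simplified): reject code that touches the filesystem or network.
    banned = ("import os", "open(", "requests", "subprocess")
    if any(tok in code for tok in banned):
        raise ValueError("generated code failed safety check")
    scope = {"df": df.copy(), "pd": pd}
    exec(code, scope)  # deterministic and fully logged in the real system
    return scope["df"]

# Stub LLM for demonstration: fill numeric NaNs with the column median.
stub = lambda prompt: "df = df.fillna(df.median(numeric_only=True))"
frame = pd.DataFrame({"x": [1.0, None, 3.0]})
cleaned = cleaning_agent(frame, stub)
```

The key design point is that generated code never runs raw: it is screened, executed in a controlled scope, and its effect on the dataframe is inspectable before the next agent runs.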
How the LLM is Used
The LLM isn't a chat wrapper. It's the execution engine behind every agent. Three modes:
Coding
Writes executable code
Python/pandas to fix missing values, handle outliers, normalise types
scikit-learn transformation pipelines for encoding, scaling, derived features
Plotly chart specs from natural language, rendered in the browser
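The scikit-learn code the feature-engineering agent emits typically looks like this kind of pipeline. A small sketch with invented column names (the real agent chooses transformers from the statistical properties of the actual data):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative input; the agent infers numeric vs categorical columns
# from the dataframe's dtypes and cardinality.
df = pd.DataFrame({"age": [25, 40, 31], "city": ["NY", "SF", "NY"]})

pre = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),                        # scale numerics
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]), # encode categoricals
])
X = pre.fit_transform(df)  # ML-ready matrix: scaled age + one-hot city
```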
Reasoning
Analyses and decides
Detects data quality issues before any code runs
Picks chart types based on distribution, column types, cardinality
Interprets AutoML leaderboard and recommends the best model with reasoning
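To make the chart-type decision concrete, here is the kind of rule the reasoning mode applies. In the actual system this reasoning is delegated to the LLM, not hard-coded; the function and thresholds below are purely illustrative:

```python
import pandas as pd

def pick_chart(df: pd.DataFrame, col: str) -> str:
    """Illustrative decision rule: chart type from dtype and cardinality."""
    s = df[col]
    if pd.api.types.is_numeric_dtype(s):
        return "histogram"            # continuous -> distribution view
    return "bar" if s.nunique() <= 20 else "treemap"  # low vs high cardinality

frame = pd.DataFrame({"price": [1.0, 2.0, 3.5], "region": ["N", "S", "N"]})
```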
Orchestration
Coordinates agents
LangGraph supervisor decomposes goals into multi-agent workflows
Decides which agents to invoke, in what order, manages state
"Analyse this sales data" โ load โ clean โ visualise โ engineer โ train
Architecture
User (Natural Language)
|
v
Next.js Frontend (React 18, Tailwind CSS)
|
v
FastAPI Backend
|
v
LangGraph Supervisor (LLM orchestration)
|
+----+----+----+----+----+
| | | | | |
v v v v v v
Load Clean Viz Feat Train Predict
Agent Agent Agent Agent Agent Agent
| | | | | |
+----+----+----+----+----+
|
v
LLM → coding + reasoning
|
H2O AutoML / MLflow / PostgreSQL / Redis

Tech Stack
Next.js 14, React 18, TypeScript, Tailwind CSS, Plotly.js
Python 3.10, FastAPI, LangChain, LangGraph
OpenAI-compatible API
H2O AutoML, scikit-learn, XGBoost, MLflow
PostgreSQL, Redis, Celery, Docker (7 services)
How It Works
A full data science workflow in six steps, from raw CSV to trained model and predictions.
Upload
Drop a CSV. The data loader agent ingests it and infers the schema.
Clean
The cleaning agent generates Python code that fixes missing values, removes outliers, and normalises types.
Visualise
Describe what you want to see. The viz agent generates interactive Plotly charts.
Engineer
Features are transformed into ML-ready representations using generated scikit-learn code.
Train
H2O AutoML trains and evaluates models. The LLM interprets the leaderboard and recommends the best one.
Predict
Run inference on new data. Get predictions and explanations.
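The six steps in miniature, as a single script. To keep the sketch self-contained, scikit-learn stands in for H2O AutoML (which needs a running H2O cluster), and the column names are invented; the shape of the pipeline is what matters:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Upload: raw CSV stands in as a dataframe with a missing value.
raw = pd.DataFrame({
    "amount": [10.0, None, 30.0, 25.0, 5.0, 40.0],
    "churned": [0, 0, 1, 1, 0, 1],
})

clean = raw.fillna(raw.median())                 # Clean: impute missing values
X, y = clean[["amount"]], clean["churned"]       # Engineer: select/transform features
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.33, random_state=0)
model = LogisticRegression().fit(Xtr, ytr)       # Train: sklearn in place of AutoML
preds = model.predict(Xte)                       # Predict: inference on held-out rows
```

In the platform, every one of these lines is generated by an agent rather than written by hand, and the train step is an H2O AutoML run whose leaderboard the LLM then interprets.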
Links
Full source code and documentation