Machine learning
Foresee turns a CSV into an ML report in sixty seconds.
Upload a CSV. The system runs a full exploratory analysis on Snowflake while Gemini 2.5 reads the schema and ranks the five columns most worth predicting. You pick one. Three models then train, one after the other: logistic regression, a decision tree, and XGBoost. You get back a PDF with metrics, confusion matrices, feature importances, and a short business rationale.
The pitch is a minute. The hard part was making the minute feel honest.
How it actually works
The frontend is a React app that streams progress from a Flask API. The first thing the API does on upload is push the file into Snowflake and start a parallel pair of jobs: an EDA pass that walks every column and computes the usual statistics, and a Gemini call that summarizes the schema and proposes ranked targets with importance scores from one to one hundred.
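A minimal sketch of that parallel pair, using only the standard library. The names run_eda, rank_targets, and analyze are hypothetical, the EDA pass runs in memory rather than on Snowflake, and the ranking function returns placeholder scores where the real system would call Gemini. What it shows is the shape: two independent jobs submitted together, with the ranked proposals clamped to the one-to-one-hundred range and cut to the top five.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor

# Tiny inline sample so the sketch runs without any external service.
SAMPLE_CSV = "age,plan,churned\n34,basic,0\n51,pro,1\n,basic,0\n"


def run_eda(rows):
    """Count missing and distinct values per column (stand-in for the Snowflake pass)."""
    stats = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]
        stats[column] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return stats


def rank_targets(columns, top_n=5):
    """Stand-in for the LLM call: placeholder scores, not a real ranking."""
    proposals = [
        {"column": name, "score": 100 - 10 * i} for i, name in enumerate(columns)
    ]
    # Clamp every score into 1..100 before trusting it, then keep the best few.
    cleaned = [
        {"column": p["column"], "score": max(1, min(100, int(p["score"])))}
        for p in proposals
    ]
    return sorted(cleaned, key=lambda p: p["score"], reverse=True)[:top_n]


def analyze(csv_text):
    """Start both jobs at once and wait for the pair to finish."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    with ThreadPoolExecutor(max_workers=2) as pool:
        eda_future = pool.submit(run_eda, rows)
        ranking_future = pool.submit(rank_targets, columns)
        return {"eda": eda_future.result(), "targets": ranking_future.result()}


report = analyze(SAMPLE_CSV)
```

Threads are a reasonable fit when both jobs mostly wait on a remote service, which is the assumption here. With a thread pool the total wait is roughly the slower of the two jobs rather than their sum.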
The two jobs land at roughly the same time, which is the whole point. Running EDA and ranking back to back would be fine for a demo. It would not be fine for a user, who would wait through both before seeing a single suggested target. The model training itself runs in sequence once a target is picked, since the three models share preprocessing and the cost of training them serially is small compared to the analysis phase.
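The sequential training step can be sketched like this, assuming scikit-learn is available. The function name train_in_sequence, the synthetic dataset, the scaler, and the hyperparameters are all illustrative assumptions. scikit-learn's GradientBoostingClassifier stands in for XGBoost here so the sketch needs only one library. The preprocessing is fitted once and reused by all three models, which is why running them one after another costs little.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def train_in_sequence(X, y, seed=0):
    """Fit shared preprocessing once, then train three models one by one."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )

    # Shared preprocessing: fitted on the training split only, reused by every model.
    scaler = StandardScaler().fit(X_train)
    X_train_s = scaler.transform(X_train)
    X_test_s = scaler.transform(X_test)

    models = [
        ("logistic_regression", LogisticRegression(max_iter=1000)),
        ("decision_tree", DecisionTreeClassifier(max_depth=5, random_state=seed)),
        # Stand-in for XGBoost, chosen to keep the sketch to a single dependency.
        ("boosted_ensemble", GradientBoostingClassifier(random_state=seed)),
    ]

    results = {}
    for name, model in models:
        model.fit(X_train_s, y_train)
        preds = model.predict(X_test_s)
        results[name] = {
            "accuracy": accuracy_score(y_test, preds),
            "confusion_matrix": confusion_matrix(y_test, preds).tolist(),
        }
    return results


# Synthetic binary-classification data, used only to make the sketch runnable.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
results = train_in_sequence(X, y)
```

Each entry in results carries the numbers a report page would need: a held-out accuracy and a confusion matrix as nested lists. SHAP values and PDF rendering are left out of this sketch.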
The model menu
Three is a deliberate choice. One model feels like a guess. Five would push the report past one screen. Three covers the three things people usually want: a baseline they understand, a tree they can interpret, and a boosted ensemble for the score.
Each one trains, validates, and produces SHAP values for the top features. The PDF is rendered on the server with a templated layout so every report looks like it came from the same publication.
The whatBroke field below is a placeholder until I get the real text from Daniel.