Half Marathon - Time Prediction and Statistics

An interactive Streamlit application that analyzes a runner's free-form description with the GPT-4o-mini model and automatically extracts name, age, gender, and 5K time. It validates the input and asks for any missing information with friendly AI prompts. It then predicts an individual half marathon time using an ML model trained on real Wrocław Half Marathon data from 2023-2024 (over 20,000 participants across both years).

The result is presented in a readable format (hh:mm:ss) together with statistics: pace per kilometer, average speed, and runner level category. The app also generates a personalized, motivating summary and training tips using OpenAI, and anonymously logs data to Langfuse for model quality monitoring. Users get an immediate, engaging forecast of their result and specific training advice, all in one web interface.

The model can be loaded from multiple sources in order (S3 → Supabase Storage → local), and the application is fully containerized with Docker. It currently runs on Streamlit Cloud with Supabase Storage; it was originally hosted on DigitalOcean.
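The result statistics mentioned above (hh:mm:ss formatting, pace per kilometer, average speed) come down to simple arithmetic on the predicted time. The following is a minimal sketch, not the application's actual code: the function names are hypothetical, and it assumes the model outputs the predicted time in seconds.

```python
# Hypothetical helpers; the real app's names and units may differ.
HALF_MARATHON_KM = 21.0975  # official half marathon distance


def format_hms(total_seconds: float) -> str:
    """Format a duration in seconds as hh:mm:ss."""
    total = int(round(total_seconds))
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def pace_per_km(total_seconds: float, distance_km: float = HALF_MARATHON_KM) -> str:
    """Return the average pace as m:ss per kilometer."""
    pace_seconds = int(round(total_seconds / distance_km))
    minutes, seconds = divmod(pace_seconds, 60)
    return f"{minutes}:{seconds:02d} min/km"


def average_speed_kmh(total_seconds: float, distance_km: float = HALF_MARATHON_KM) -> float:
    """Return the average speed in km/h."""
    return distance_km / (total_seconds / 3600)
```

For example, a predicted time of 6330 seconds formats as `01:45:30`, which corresponds to a pace of about 5:00 min/km and an average speed of roughly 12 km/h.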

Technologies and Libraries Used
* Python
* Streamlit
* Scikit-learn
* PyCaret
* CatBoost
* Joblib
* Pandas
* NumPy
* OpenAI GPT
* Langfuse
* AWS S3
* Supabase Storage
* DigitalOcean
* Docker
* GitHub
* boto3
* python-dotenv
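
The multi-level model source lookup from the description (S3 → Supabase Storage → local) can be sketched as an ordered fallback chain. This is an illustrative sketch only: `load_with_fallback` and the loader callables are hypothetical names, and the actual download logic (for example via boto3) is deliberately left out.

```python
from typing import Any, Callable, Iterable, Tuple


def load_with_fallback(
    sources: Iterable[Tuple[str, Callable[[], Any]]],
) -> Tuple[str, Any]:
    """Try each (name, loader) pair in order and return the first success.

    Each loader is a zero-argument callable that returns the model or
    raises an exception if that source is unavailable.
    """
    errors = []
    for name, loader in sources:
        try:
            return name, loader()
        except Exception as exc:  # any failure means: try the next source
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No model source available: " + "; ".join(errors))
```

In use, the caller would pass loaders in priority order, such as `[("s3", load_from_s3), ("supabase", load_from_supabase), ("local", load_from_disk)]`, where each of those three functions is a placeholder for the real source-specific code. The returned name makes it easy to log which source actually served the model.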