Speech-to-Task on Mobile-Friendly Neural Networks

Client: AI startup · August 2025

Python · Whisper · Gradio · Hugging Face

Live speech-to-task prototype built on Whisper Tiny, Deepgram, and Gemma 3n, with free-tier Hugging Face hosting and competition-ready technical documentation.

The client needed a working demonstration that small, mobile-friendly neural networks could turn free-form speech into structured tasks, and it had to be presentable in time for a Kaggle competition deadline.

I built a live prototype with a Gradio front end that pipes audio through Whisper Tiny and Deepgram for transcription and uses Gemma 3n for task extraction. The demo is configured for Hugging Face Spaces free-tier hosting, so it stays online long-term at no cost.
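As an illustration only (this is not the prototype's actual code), the transcribe-then-extract flow described above could be sketched as follows. The `Task` fields, the JSON-array reply format, and the names `parse_tasks` and `speech_to_tasks` are all assumptions made for the example. The transcription and extraction steps are passed in as plain callables, so a Whisper Tiny, Deepgram, or Gemma 3n wrapper could be plugged in without changing the pipeline.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    """Hypothetical task schema; the real prototype's fields are not documented here."""
    title: str
    due: Optional[str] = None  # free-form due-date text, if the model supplied one


def parse_tasks(model_output: str) -> List[Task]:
    """Parse an extractor reply that is assumed to contain a JSON array of task objects."""
    # Small models often wrap JSON in extra prose, so keep only the outermost [...] span.
    start, end = model_output.find("["), model_output.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(model_output[start:end + 1])
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    tasks: List[Task] = []
    for item in items:
        # Skip anything that is not an object with a non-empty string title.
        if not isinstance(item, dict):
            continue
        title = item.get("title")
        if not isinstance(title, str) or not title.strip():
            continue
        due = item.get("due")
        tasks.append(Task(title.strip(), due if isinstance(due, str) else None))
    return tasks


def speech_to_tasks(
    audio_path: str,
    transcribe: Callable[[str], str],  # e.g. a Whisper Tiny or Deepgram wrapper (assumed)
    extract: Callable[[str], str],     # e.g. a Gemma 3n prompt wrapper (assumed)
) -> List[Task]:
    """Run audio through transcription, then task extraction, then parsing."""
    transcript = transcribe(audio_path)
    if not transcript.strip():
        return []  # nothing was said, so there is nothing to extract
    return parse_tasks(extract(transcript))
```

With stub callables in place of the real models, `speech_to_tasks("note.wav", lambda p: "call mom", lambda t: '[{"title": "Call mom"}]')` returns a single `Task` titled "Call mom". Keeping the parser tolerant of surrounding prose and malformed entries is a deliberate choice in this sketch, since compact on-device models do not always return clean JSON.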

Alongside the prototype I prepared the technical documentation the team used for the competition, covering model selection trade-offs and the on-device constraints that drove them.

View live site ↗