
DATASET #speech #asr #odia #creative-commons
Odia Speech Corpus v2.1 — 2,000 hours
Annotated speech from 30 Odisha districts. Balanced for gender, age, dialect. Ready for ASR fine-tuning.
Bhubaneswar | 2,840 views | Published 12 Apr 2026
About this listing
A community-curated speech dataset spanning Ganjam, Sambalpur, Cuttack, Mayurbhanj and 26 more districts. Includes time-aligned transcripts in Odia script and IPA, speaker metadata, and noise-condition tags. Released under CC-BY-SA 4.0 for non-commercial research; commercial license available.
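For readers planning a loader, the description above could map onto a record layout roughly like the following. This is a minimal sketch only: the listing does not publish a schema, so every class name, field name, file path, and sample value here is a hypothetical stand-in, not the corpus's actual format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One time-aligned transcript span (hypothetical field names)."""
    start_s: float   # segment start, in seconds
    end_s: float     # segment end, in seconds
    text_odia: str   # transcript in Odia script
    text_ipa: str    # same span transcribed in IPA


@dataclass
class Utterance:
    """One audio clip with speaker metadata and a noise tag (assumed layout)."""
    audio_path: str
    district: str
    speaker_gender: str
    speaker_age: int
    noise_condition: str
    segments: List[Segment] = field(default_factory=list)

    def duration_s(self) -> float:
        # Total transcribed time: sum of the segment lengths.
        return sum(seg.end_s - seg.start_s for seg in self.segments)


def select(utterances: List[Utterance], district: str, noise_condition: str) -> List[Utterance]:
    """Keep clips from one district recorded under one noise condition."""
    return [
        u for u in utterances
        if u.district == district and u.noise_condition == noise_condition
    ]


# Placeholder records for illustration; not real corpus entries.
sample = [
    Utterance(
        "clips/0001.wav", "Ganjam", "female", 34, "quiet",
        [
            Segment(0.0, 1.5, "<Odia text>", "<IPA text>"),
            Segment(1.5, 4.0, "<Odia text>", "<IPA text>"),
        ],
    ),
    Utterance(
        "clips/0002.wav", "Sambalpur", "male", 52, "street",
        [Segment(0.0, 2.0, "<Odia text>", "<IPA text>")],
    ),
]

quiet_ganjam = select(sample, "Ganjam", "quiet")
total_seconds = sum(u.duration_s() for u in quiet_ganjam)
```

Filtering on district and noise condition before training is one way to check how balanced a subset is; the real tag values would need to be taken from the corpus documentation.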