MUCS 2021 (Marathi)

MH Specific

Multilingual and code-switching ASR challenge dataset with Marathi speech from diverse speaker groups (college students, rural/urban workers)

Build a competition-grade Marathi ASR model using MUCS 2021 challenge data for benchmarking against other systems.
Homepage: https://navana-tech.github.io/MUCS2021/

Quick Start

# MUCS 2021 Marathi ASR challenge data
# Download from https://navana-tech.github.io/MUCS2021/
print("Download MUCS 2021 Marathi data from:")
print("https://navana-tech.github.io/MUCS2021/")
Modality: Speech + Text
Size: ~99 hrs (93.9 hrs train + 5 hrs test); 31 speakers
License:
Format: WAV
Language: mr
Update Frequency: static
Organization: MediaEval / MUCS Challenge Organizers

Schema

Field   Type     Description
audio   audio    Marathi speech recording
text    string   Transcription text
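
The two schema fields can be paired up in code once the data is downloaded. The sketch below is a minimal illustration, not the official loader. It assumes a hypothetical layout in which transcripts come as lines of the form `utt_id text` and each recording is stored as `<utt_id>.wav` in one directory; the function names and the `audio_dir` argument are placeholders, so check the actual archive layout before relying on it.

```python
from pathlib import Path


def parse_transcript_lines(lines):
    """Parse 'utt_id text' lines into a {utt_id: text} dict.

    Assumption: the first whitespace-separated token is the utterance ID
    and the rest of the line is the transcription.
    """
    transcripts = {}
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        utt_id = parts[0]
        text = parts[1] if len(parts) > 1 else ""
        transcripts[utt_id] = text
    return transcripts


def build_records(transcripts, audio_dir):
    """Pair each transcript with a WAV path, mirroring the audio/text schema.

    Assumption: recordings are named '<utt_id>.wav' inside audio_dir.
    """
    audio_dir = Path(audio_dir)
    return [
        {"audio": str(audio_dir / f"{utt_id}.wav"), "text": text}
        for utt_id, text in sorted(transcripts.items())
    ]


if __name__ == "__main__":
    sample = ["utt1 नमस्कार जग", "utt2 शुभ सकाळ"]
    for record in build_records(parse_transcript_lines(sample), "audio"):
        print(record)
```

Keeping parsing separate from path construction means only `build_records` needs to change if the real directory layout differs.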

Build With This

Create a Marathi ASR model comparison framework using MUCS evaluation protocols and metrics
Develop an ensemble ASR system combining multiple models trained on MUCS data for improved accuracy
Build an ASR error analysis tool specifically for Marathi speech recognition challenges identified in MUCS
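
A model comparison framework like the first idea above needs a shared metric. The sketch below assumes word error rate (WER) computed with a word-level Levenshtein distance on whitespace-split text; it does not reproduce the official MUCS scoring scripts, and `word_edit_distance`, `corpus_wer`, and `rank_systems` are illustrative names rather than part of any challenge toolkit.

```python
def word_edit_distance(ref_words, hyp_words):
    """Minimum substitutions, insertions, and deletions between word lists."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Corpus-level WER: total word errors divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        errors += word_edit_distance(ref_words, hyp.split())
        words += len(ref_words)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words


def rank_systems(references, system_outputs):
    """Return (system_name, wer) pairs sorted from lowest to highest WER."""
    scores = {name: corpus_wer(references, hyps)
              for name, hyps in system_outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    refs = ["मी शाळेत जातो", "आज पाऊस पडतो"]
    outputs = {
        "system_a": ["मी शाळेत जातो", "आज पाऊस पडतो"],
        "system_b": ["मी शाळेत जाते", "आज पाऊस"],
    }
    for name, wer in rank_systems(refs, outputs):
        print(f"{name}: WER = {wer:.3f}")
```

Scoring at the corpus level weights long utterances more heavily than averaging per-utterance WER would. For error analysis, the same distance table can be extended to record which words were substituted, inserted, or deleted.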

AI Use Cases

ASR
Accent-Robust Speech Recognition
Last verified: 2026-03-07