OpenSLR-64 (Marathi)

MH Specific

Crowdsourced high-quality multi-speaker Marathi speech corpus for TTS; female speakers only

Build a baseline Marathi ASR model using the OpenSLR-64 corpus for comparison with larger training sets.

Quick Start

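# Download and extract the archive from https://openslr.org/64/ first.
import torchaudio

# Placeholder path: point it at a WAV file from the extracted archive, then uncomment.
# audio, sr = torchaudio.load('openslr64/mr/audio_sample.wav')
print("Download OpenSLR-64 Marathi: https://openslr.org/64/")
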
Modality: Speech + Text (TTS)
Size: ~3 hrs; 712 MB archive
License: (not listed)
Format: WAV
Language: mr
Update Frequency: static
Organization: OpenSLR / Google

Schema

Field | Type   | Description
audio | audio  | Marathi speech audio file (WAV)
text  | string | Transcription of the utterance
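
The two schema fields map naturally onto a list of (audio path, transcript) records. The sketch below assumes the extracted archive ships a tab-separated index whose first column is a WAV file name without extension and whose last column is the transcript. The file layout, the column order, and the `Utterance` and `load_index` names are assumptions for illustration, so check them against the actual download.

```python
import csv
from pathlib import Path
from typing import NamedTuple


class Utterance(NamedTuple):
    audio: Path  # path to the WAV file (schema field "audio")
    text: str    # transcription (schema field "text")


def load_index(index_path, audio_dir):
    """Read a tab-separated index into Utterance records.

    Assumed layout (not confirmed by the listing): first column is the
    WAV file name without extension, last column is the transcript.
    """
    audio_dir = Path(audio_dir)
    records = []
    with open(index_path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2 or not row[0].strip():
                continue  # skip blank or malformed lines
            stem, text = row[0].strip(), row[-1].strip()
            records.append(Utterance(audio_dir / f"{stem}.wav", text))
    return records
```

Each record can then be passed to an audio loader such as `torchaudio.load(str(record.audio))`, with `record.text` used as the training target.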

Build With This

Create a data augmentation pipeline that expands OpenSLR-64 with speed, pitch, and noise perturbations for robust ASR
Develop a Marathi phoneme-level acoustic model from the OpenSLR recordings for pronunciation research
Run a transfer-learning study that compares ASR models pre-trained on OpenSLR-64 and then fine-tuned on domain-specific data
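
As a starting point for the augmentation idea above, here is a minimal sketch of two of the perturbations, speed and additive noise, written against plain lists of float samples using only the standard library. The function names `speed_perturb` and `add_noise` are hypothetical, and the noise is white Gaussian rather than recorded background noise. Resampling by interpolation shifts pitch together with tempo; a pitch-only shift needs a different technique (for example a phase vocoder) and is not shown.

```python
import math
import random


def speed_perturb(samples, factor):
    """Resample by linear interpolation; factor > 1 gives faster, shorter audio."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, int(round(len(samples) / factor)))
    out = []
    for i in range(n_out):
        pos = i * factor                # fractional read position in the input
        lo = min(int(pos), last)        # clamp so we never read past the end
        hi = min(lo + 1, last)
        frac = pos - int(pos)
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def add_noise(samples, snr_db, rng=None):
    """Add white Gaussian noise scaled to the requested signal-to-noise ratio (dB)."""
    if not samples:
        return []
    rng = rng or random.Random()
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]
```

For real training runs the same logic would normally be applied to NumPy arrays or tensors for speed; the list-based version is only meant to make the arithmetic explicit.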

AI Use Cases

TTS: Multi-Speaker Voice Synthesis
Last verified: 2026-03-07