Part of the largest publicly available dialect-rich read-speech corpus for Indian languages, comprising 10,000+ hours of validated audio across 9 languages. The Marathi subset covers the agriculture and finance domains and includes dialect-aware phonetic lexicons and speaker metadata. It captures rural speech patterns that urban-centric datasets miss.

```python
# RESPIN Marathi dialect speech corpus
# Access from https://respin.iisc.ac.in/
print("RESPIN Marathi dialect speech corpus")
print("Access from: https://respin.iisc.ac.in/")
```

| Field | Type | Description |
|---|---|---|
| audio | audio | Dialectal Marathi speech recording |
| text | string | Transcription in standard Marathi |
| dialect | string | Dialect or regional variety identifier |
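
As a rough illustration of how records with these three fields might be handled once the data has been obtained, the sketch below counts utterances per dialect. It uses only the Python standard library. The `Utterance` type, the `count_by_dialect` helper, the file paths, the sample text, and the dialect labels `D1` and `D2` are all hypothetical placeholders, and representing `audio` as a file path is an assumption rather than the corpus's documented format.

```python
from collections import Counter
from typing import Iterable, TypedDict


class Utterance(TypedDict):
    """One record mirroring the three fields in the table (hypothetical type)."""
    audio: str    # assumption: a path to the recording; the real type may differ
    text: str     # transcription in standard Marathi
    dialect: str  # dialect or regional variety identifier


def count_by_dialect(records: Iterable[Utterance]) -> Counter:
    """Count how many utterances each dialect identifier contributes."""
    return Counter(record["dialect"] for record in records)


# Invented placeholder records, not real corpus entries.
samples = [
    {"audio": "clips/0001.wav", "text": "नमस्कार", "dialect": "D1"},
    {"audio": "clips/0002.wav", "text": "नमस्कार", "dialect": "D2"},
    {"audio": "clips/0003.wav", "text": "नमस्कार", "dialect": "D1"},
]

for dialect, n in sorted(count_by_dialect(samples).items()):
    print(f"{dialect}: {n} utterance(s)")
```

A per-dialect count like this is a quick way to check how balanced a split is before training or evaluating on it.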