A 450-hour annotated dataset of Hindi-Marathi code-switched speech, covering tag-switching, intra-sentential, and inter-sentential switching patterns. It is designed for automatic speech recognition (ASR) in multilingual settings such as Maharashtra, where Hindi-Marathi mixing is prevalent.
```python
# Hindi-Marathi code-switching ASR dataset
# Access from respective research paper/repository
print("Hindi-Marathi code-switching ASR dataset")
print("Check paper references for download instructions")
```

| Field | Type | Description |
|---|---|---|
| audio | audio | Code-switched Hindi-Marathi speech audio |
| transcription | string | Transcription with language tags |
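
The schema above says the transcription carries language tags but does not specify their format. The sketch below shows one way a record could be split into per-language segments, assuming a hypothetical inline scheme of `<hi>…</hi>` for Hindi and `<mr>…</mr>` for Marathi. The tag names, the helper functions `split_by_language` and `count_switch_points`, the file name, and the romanized example sentence are all invented for illustration and are not taken from the dataset.

```python
import re
from typing import List, Tuple

# ASSUMPTION: tags look like <hi>...</hi> and <mr>...</mr>.
# The dataset's real tagging scheme may differ; adjust the pattern to match it.
TAG_PATTERN = re.compile(r"<(hi|mr)>(.*?)</\1>", re.DOTALL)


def split_by_language(transcription: str) -> List[Tuple[str, str]]:
    """Return (language, text) segments in the order they appear."""
    return [(lang, text.strip()) for lang, text in TAG_PATTERN.findall(transcription)]


def count_switch_points(segments: List[Tuple[str, str]]) -> int:
    """Count boundaries where the language of adjacent segments changes."""
    return sum(1 for prev, cur in zip(segments, segments[1:]) if prev[0] != cur[0])


# Invented record following the two documented fields (audio, transcription).
example = {
    "audio": "utt_0001.wav",  # placeholder path, not a real file
    "transcription": "<hi>main kal</hi> <mr>bajarat gelo</mr> <hi>aur sabzi li</hi>",
}

segments = split_by_language(example["transcription"])
print(segments)
print(count_switch_points(segments))  # -> 2
```

Counting switch points per utterance is a simple way to check how densely mixed a sample is before it goes into ASR training or evaluation.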