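Marathi (`mr`) subset of the Google Dakshina dataset, which pairs text in Devanagari script with its Latin-script romanization for NLP tasks such as transliteration.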
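
    from datasets import load_dataset

    # Load the Marathi ('mr') training split
    ds = load_dataset('google/dakshina', 'mr', split='train')

    # ds[:5] returns a dict of columns, so select the first five rows instead
    for ex in ds.select(range(5)):
        print(f"Devanagari: {ex['native']}")
        print(f"Roman: {ex['romanized']}\n")

| Field | Type | Description |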
|---|---|---|
| native | string | Text in Devanagari script |
| romanized | string | Text in Latin/Roman script transliteration |
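
The two fields above map naturally onto source/target pairs for a transliteration model. The following is a minimal sketch of that step, not part of the dataset's own tooling: the helper name `to_pairs`, its `roman_to_native` flag, and the sample records are all hypothetical and hand-written for illustration. It only assumes that each record is a mapping with `native` and `romanized` string fields.

```python
from typing import Iterable, Iterator, Mapping, Tuple


def to_pairs(
    records: Iterable[Mapping[str, str]],
    roman_to_native: bool = True,
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs from records with 'native' and 'romanized' fields.

    Hypothetical helper: rows where either field is empty are skipped.
    """
    for rec in records:
        native = rec["native"].strip()
        romanized = rec["romanized"].strip()
        if not native or not romanized:
            continue  # skip incomplete rows
        yield (romanized, native) if roman_to_native else (native, romanized)


# Hand-written illustrative records, NOT taken from the dataset.
sample = [
    {"native": "नमस्कार", "romanized": "namaskar"},
    {"native": "पाणी", "romanized": "pani"},
    {"native": "", "romanized": "orphan"},  # incomplete row, dropped
]

pairs = list(to_pairs(sample))
print(pairs)
```

Because the helper accepts any iterable of mappings, a loaded dataset object that yields such rows could be passed in place of `sample`. Setting `roman_to_native=False` flips the direction to Devanagari-to-Latin.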