Marathi: human-translated evaluation benchmark for machine translation covering 200+ languages, including Marathi, with 3,001 sentences drawn from diverse web articles.
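
    from datasets import load_dataset

    # Load the Marathi (Devanagari script) configuration.
    ds = load_dataset('facebook/flores', 'mar_Deva')
    print(f'Dev: {len(ds["dev"])}, DevTest: {len(ds["devtest"])}')

    # Preview the first three devtest sentences, truncated to 80 characters.
    for ex in ds['devtest'].select(range(3)):
        print(f'[{ex["id"]}] {ex["sentence"][:80]}...')

| Field | Type | Description |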
|---|---|---|
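| sentence | string | Sentence text in the language of the loaded configuration (Marathi for `mar_Deva`) |
| id | int | Sentence ID aligned across languages, so the same `id` refers to the same sentence in every language |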
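
Because `id` is aligned across languages, parallel sentence pairs can be built by joining two language configurations on that field. The following is a minimal sketch rather than part of the dataset's own tooling: `align_by_id` is a hypothetical helper, and the rows are placeholder dictionaries that only mimic the `id` and `sentence` fields so the example runs without downloading anything.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def align_by_id(src_rows: Iterable[Row], tgt_rows: Iterable[Row]) -> List[Tuple[int, str, str]]:
    """Pair sentences from two language splits that share the same `id`.

    Hypothetical helper: rows only need `id` and `sentence` keys.
    """
    # Index the target side by id for constant-time lookup.
    tgt_by_id = {row["id"]: row["sentence"] for row in tgt_rows}
    pairs = []
    for row in src_rows:
        tgt = tgt_by_id.get(row["id"])
        if tgt is not None:  # skip ids missing on the target side
            pairs.append((row["id"], row["sentence"], tgt))
    return pairs


# Placeholder rows (not real dataset content), deliberately in different orders.
eng_rows = [
    {"id": 1, "sentence": "english sentence 1"},
    {"id": 2, "sentence": "english sentence 2"},
]
mar_rows = [
    {"id": 2, "sentence": "marathi sentence 2"},
    {"id": 1, "sentence": "marathi sentence 1"},
]

pairs = align_by_id(eng_rows, mar_rows)
for sid, src, tgt in pairs:
    print(f"[{sid}] {src} -> {tgt}")
```

With real data, the same helper could presumably be called on the `devtest` split of two loaded configurations, for example an English configuration as the source and `mar_Deva` as the target; the English configuration name is an assumption to check against the dataset's list of configurations.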