Meta FLORES-200 (mr)

MH Specific

The Marathi (mr) portion of Meta's FLORES-200 benchmark, for use in NLP.

Evaluate and benchmark Marathi machine translation models using FLORES-200 as a standardized test set.

Quick Start

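from datasets import load_dataset

# Load the Marathi (Devanagari script) devtest split of FLORES-200
ds = load_dataset('facebook/flores', 'mar_Deva', split='devtest')

# Slicing a Dataset (ds[:5]) returns a dict of columns, so select rows instead
for ex in ds.select(range(5)):
    print(f"[{ex['id']}] {ex['sentence'][:80]}...")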

Modality

parallel-text

Size

3,001 sentences in total, parallel across 200 languages; the public release covers the dev (997) and devtest (1,012) splits

License

CC-BY-SA-4.0

Format

CSV/JSON

Language

Marathi (mr)

Update Frequency

static

Organization

Meta AI

Schema

Field	Type	Description
sentence	string	Text sentence in Marathi (or source language)
id	int	Sentence identifier aligned across languages
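
Because the id field is aligned across languages, parallel pairs can be built by joining two language configs on id. The following is a minimal sketch under that assumption; the rows are invented placeholders shaped like the schema above, not actual FLORES-200 sentences, and align_by_id is a hypothetical helper rather than part of any library.

```python
# Toy rows shaped like the schema above (placeholders, not real FLORES-200 data).
marathi_rows = [
    {"id": 1, "sentence": "हे एक उदाहरण वाक्य आहे."},
    {"id": 2, "sentence": "हे दुसरे वाक्य आहे."},
]
english_rows = [
    {"id": 2, "sentence": "This is the second sentence."},
    {"id": 1, "sentence": "This is an example sentence."},
]


def align_by_id(source_rows, target_rows):
    """Pair sentences that share an id; ids missing on either side are skipped."""
    target_by_id = {row["id"]: row["sentence"] for row in target_rows}
    return [
        (row["id"], row["sentence"], target_by_id[row["id"]])
        for row in source_rows
        if row["id"] in target_by_id
    ]


pairs = align_by_id(english_rows, marathi_rows)
for sent_id, src, tgt in sorted(pairs):
    print(sent_id, src, "->", tgt)
```

Joining through a dictionary keeps the pairing correct even when the two row lists arrive in different orders.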

Build With This

Create a Marathi translation quality leaderboard comparing commercial and open-source MT systems on FLORES-200

Develop a fine-grained translation error analysis tool that identifies specific error types in Marathi MT output

Build a parallel corpus augmentation pipeline using FLORES-200 as seed data for mining additional Marathi translation pairs
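
As a starting point for the error analysis idea above, a few coarse error types can be flagged with simple heuristics before any finer-grained annotation. This is only a sketch: the flag names, the 0.5 script threshold, and the length-ratio bounds are all assumptions chosen for illustration, and flag_errors is a hypothetical helper.

```python
def devanagari_ratio(text):
    """Share of alphabetic characters in the Devanagari block (U+0900 to U+097F)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum("\u0900" <= ch <= "\u097f" for ch in letters) / len(letters)


def flag_errors(source, hypothesis, reference):
    """Return a list of coarse error flags for one Marathi MT output."""
    hyp = hypothesis.strip()
    if not hyp:
        return ["empty_output"]

    flags = []
    # Assumed threshold: most letters in a Marathi output should be Devanagari.
    if devanagari_ratio(hyp) < 0.5:
        flags.append("wrong_script")
    # Output identical to the source usually means nothing was translated.
    if hyp == source.strip():
        flags.append("copied_source")
    # Assumed bounds: outputs far shorter or longer than the reference are suspect.
    ref_len = len(reference.strip())
    if ref_len and not 0.5 <= len(hyp) / ref_len <= 2.0:
        flags.append("length_mismatch")
    return flags


print(flag_errors("This is a test.", "This is a test.", "ही एक चाचणी आहे."))
print(flag_errors("This is a test.", "ही एक चाचणी आहे.", "ही एक चाचणी आहे."))
```

Heuristics like these only catch gross failures; mistranslations and grammatical errors would still need human or model-based review.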

AI Use Cases

MT evaluation benchmark
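
To illustrate how a benchmark run might rank systems, the sketch below scores each system's outputs against references with a simplified character n-gram F1 and sorts by the mean. It is loosely inspired by chrF but is not the official metric; a real evaluation would more likely use an established scoring tool. The system names and sentences are invented placeholders.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams after collapsing runs of whitespace."""
    text = " ".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def char_ngram_f1(hypothesis, reference, max_n=3):
    """Mean F1 over n-gram orders 1..max_n; orders with no n-grams on either side are skipped."""
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
        scores.append(f1)
    return sum(scores) / len(scores) if scores else 0.0


# Placeholder references and outputs (not real FLORES-200 sentences).
references = ["ही एक चाचणी आहे.", "हे दुसरे वाक्य आहे."]
system_outputs = {
    "system_a": ["ही एक चाचणी आहे.", "हे दुसरे वाक्य आहे."],
    "system_b": ["ही चाचणी आहे.", "This is the second sentence."],
}

ranked = sorted(
    (
        (name, sum(char_ngram_f1(h, r) for h, r in zip(outputs, references)) / len(references))
        for name, outputs in system_outputs.items()
    ),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Character-level matching is used here because it is less sensitive to Marathi's rich morphology than exact word matching; note that Devanagari combining marks count as separate characters in this sketch.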

Related Datasets

AI4Bharat BPCC (mr)

parallel-text

AI4Bharat IndicCorp v1 (mr)

text

AI4Bharat IndicCorp v2 (Marathi)

text

AI4Bharat IndicGLUE (mr)

text

Last verified: 2026-03-07