Benchmarks, Tools & Dialects

Evaluation benchmarks, NLP toolkits, dialect resources, and fairness datasets for Marathi.

11 datasets

728 stereotypes, each paired with a contrast, in parallel across 16 languages including Marathi. Annotated with regional and demographic features for evaluating bias in LLMs. Listed here as the only bias and fairness evaluation dataset available for Marathi, which makes it important for responsible AI development.

Build a fairness auditing tool for Marathi NLP models that measures bias across caste, religion, and gender dimensions.
Text
LanguageShades
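
A minimal sketch of how such an audit could score a model, assuming each record pairs a stereotype with its contrast and names a bias dimension. The record layout, the `score` callable (for example a sentence log-likelihood from the model under test), and the `toy_score` stand-in are illustrative assumptions, not the dataset's actual schema.

```python
from collections import defaultdict
from typing import Callable, Iterable


def stereotype_preference(
    pairs: Iterable[dict], score: Callable[[str], float]
) -> dict[str, float]:
    """Per dimension, the fraction of pairs where the stereotype outscores its contrast."""
    wins: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for pair in pairs:
        dim = pair["dimension"]
        totals[dim] += 1
        if score(pair["stereotype"]) > score(pair["contrast"]):
            wins[dim] += 1
    return {dim: wins[dim] / totals[dim] for dim in totals}


# Hypothetical placeholder records; real sentences would come from the dataset.
pairs = [
    {"dimension": "gender", "stereotype": "aaaa", "contrast": "aa"},
    {"dimension": "gender", "stereotype": "b", "contrast": "bbb"},
    {"dimension": "religion", "stereotype": "cccc", "contrast": "c"},
]

# Toy scorer (string length) standing in for a model's log-likelihood.
toy_score = len
rates = stereotype_preference(pairs, toy_score)
```

Under this measure a rate near 0.5 suggests no systematic preference for the stereotype, while a rate near 1.0 suggests the model consistently favours it.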

Marathi — Human-translated evaluation benchmark for machine translation covering 200+ languages including Marathi, with 3,001 sentences from diverse web articles

Benchmark Marathi machine translation quality to and from the 200+ languages covered by the FLORES-200 evaluation set.
Text (parallel, Marathi)
Meta AI
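
Translation quality on a parallel evaluation set is usually reported with an automatic metric. The sketch below is a simplified character n-gram F-score in the spirit of chrF, written with the standard library only; it is not the reference implementation, and `simple_chrf` and the example sentences are illustrative assumptions rather than benchmark content.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing spaces (code points, not grapheme clusters)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Average n-gram precision and recall over orders 1..max_n, then combine as F-beta."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)


# Illustrative sentences only, not taken from the benchmark.
score = simple_chrf("मी घरी जातो", "मी घरी जाते")
```

For reported results, an established metric implementation is preferable so that scores are comparable with published numbers.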

Python library for Indian language text processing including tokenisation, normalisation, script conversion, and transliteration with full support for Devanagari/Marathi

Build a Marathi text preprocessing pipeline using the Indic NLP Library for tokenisation, normalisation, and script conversion.
Tools (Python)
AI4Bharat / Anuvaad
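
To show the shape of such a pipeline, here is a standard-library stand-in for the normalisation and tokenisation steps. It deliberately does not call the Indic NLP Library, whose own normaliser and tokeniser would replace these functions in practice; `normalize`, `tokenize`, and `TOKEN_RE` are hypothetical names, and script conversion is not covered.

```python
import re
import unicodedata

# Devanagari letters, signs, and digits (U+0900 to U+097F, minus the danda and
# double danda at U+0964/U+0965) plus ZWNJ/ZWJ form one word; Latin words and
# single punctuation marks are matched by the remaining alternatives.
TOKEN_RE = re.compile(r"[\u0900-\u0963\u0966-\u097F\u200C\u200D]+|\w+|[^\w\s]")


def normalize(text: str) -> str:
    """Apply Unicode NFC and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens."""
    return TOKEN_RE.findall(text)


tokens = tokenize(normalize("  मी   घरी जातो। "))
```

The explicit Devanagari range is there because Python's `\w` does not match dependent vowel signs, so a plain `\w+` pattern would split Marathi words at every matra.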

Marathi Subset — Natural language understanding benchmark for 11 Indian languages including Marathi, covering tasks like news categorisation, headline prediction, and paraphrase detection

Run comprehensive NLU benchmarks on Marathi models using IndicGLUE to identify areas needing improvement.
Text (Marathi)
AI4Bharat, IIT Madras
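
Classification tasks such as news categorisation are commonly summarised with accuracy and macro-averaged F1, which weights rare classes equally and so helps expose weak spots. A minimal sketch follows; the label names and predictions are made-up placeholders, not benchmark data.

```python
def accuracy(gold: list[str], pred: list[str]) -> float:
    """Share of examples where the prediction equals the gold label."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: list[str], pred: list[str]) -> float:
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    f1_scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        # Every label occurs in gold or pred, so the denominator is never zero.
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores)


# Placeholder labels for a hypothetical two-class news categorisation run.
gold = ["sports", "sports", "politics", "politics"]
pred = ["sports", "politics", "politics", "politics"]
acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
```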

Natural Language Inference (NLI) dataset for 11 Indic languages including Marathi, created by high-quality machine translation of the English XNLI dataset. Contains premise-hypothesis pairs with entailment, contradiction, and neutral labels for evaluating Marathi language understanding.

Build a Marathi fact-checking assistant that uses natural language inference to verify claims against known facts.
Text
AI4Bharat
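
One way a fact-checking assistant could turn the three NLI labels into a verdict is sketched below. The `nli` callable stands for a trained Marathi NLI classifier, `toy_nli` is a hypothetical stand-in so the example runs, and letting a contradiction outrank an entailment is a design assumption, not something prescribed by the dataset.

```python
from typing import Callable, Sequence


def verify_claim(
    claim: str, facts: Sequence[str], nli: Callable[[str, str], str]
) -> str:
    """Treat each known fact as a premise and the claim as the hypothesis."""
    labels = [nli(fact, claim) for fact in facts]
    if "contradiction" in labels:
        return "refuted"
    if "entailment" in labels:
        return "supported"
    return "not enough evidence"


def toy_nli(premise: str, hypothesis: str) -> str:
    """Hypothetical stand-in for a model; uses string rules on placeholder text."""
    if premise == hypothesis:
        return "entailment"
    if premise == "not " + hypothesis or hypothesis == "not " + premise:
        return "contradiction"
    return "neutral"


verdict = verify_claim("A", ["B", "A"], toy_nli)
```

In a real system the fact list would come from a retrieval step, and the verdict rule would likely use model confidence rather than a single hard label.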

Marathi — Comprehensive NLU benchmark of 9 tasks across 20 Indian languages including Marathi, covering classification, structure prediction, QA, and sentence retrieval

Evaluate Marathi language models on IndicXTREME's diverse task suite for comprehensive performance assessment.
Text (Marathi)
AI4Bharat, IIT Madras

Deep learning-based NLP library supporting Marathi with pre-trained language models, text generation, tokenisation, sentence embeddings, and data augmentation

Build a Marathi NLP application using iNLTK's pre-trained models for text generation and classification.
Tools (Python)
iNLTK Community

Comprehensive Marathi NLP library including the MahaBERT, MahaALBERT, and MahaRoBERTa language models, MahaFT word embeddings, and tools for tokenisation, sentiment analysis, NER, and hate speech detection

Build an end-to-end Marathi NLP pipeline using L3Cube models for text classification, NER, and sentiment analysis.
Models, Tools (Python)
L3Cube, Pune

Evaluation results and benchmark scores for MahaBERT (L3Cube) and IndicBERT (AI4Bharat) models on Marathi NLU tasks including sentiment, NER, and text classification

Build a Marathi model comparison framework using MahaBERT/IndicBERT benchmarks to guide model selection.
Benchmarks (tables)
L3Cube, Pune

Regional dialect data and linguistic documentation for major Marathi dialect varieties including Varhadi (Vidarbha), Malvani (Konkan coast), and Deshi (Western Maharashtra)

Build a Marathi dialect identification system that classifies text by regional dialect for sociolinguistic research.
Text (Marathi dialects)
Various Research Institutions

Translated MMLU (Massive Multitask Language Understanding) benchmark in 10 Indian languages including Marathi. Contains multiple-choice questions spanning science, humanities, social sciences, and more. Commonly used to compare how well LLMs perform in Marathi relative to English.

Benchmark Marathi language models on MMLU-Indic to measure knowledge and reasoning capabilities in Marathi.
Text
Sarvam AI
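
A multiple-choice benchmark of this kind is typically scored by prompting the model with the question and lettered options, then comparing the first letter of its reply with the answer key. The sketch below assumes four options per question and hypothetical field names (`subject`, `question`, `choices`, `answer` as a zero-based index); the released dataset's schema may differ, and `always_a` is a stand-in for a real model call.

```python
from collections import defaultdict
from typing import Callable, Iterable

LETTERS = "ABCD"


def format_prompt(question: str, choices: list[str]) -> str:
    """Render one question with lettered options and an answer cue."""
    lines = [question] + [f"{LETTERS[i]}. {c}" for i, c in enumerate(choices)]
    return "\n".join(lines) + "\nAnswer:"


def accuracy_by_subject(
    items: Iterable[dict], predict: Callable[[str], str]
) -> dict[str, float]:
    """Per-subject share of questions where the reply's first letter matches the key."""
    correct: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for item in items:
        total[item["subject"]] += 1
        reply = predict(format_prompt(item["question"], item["choices"]))
        if reply.strip().upper()[:1] == LETTERS[item["answer"]]:
            correct[item["subject"]] += 1
    return {subject: correct[subject] / total[subject] for subject in total}


# Placeholder items; real questions would be loaded from the benchmark.
items = [
    {"subject": "science", "question": "Q1?", "choices": ["w", "x", "y", "z"], "answer": 0},
    {"subject": "science", "question": "Q2?", "choices": ["w", "x", "y", "z"], "answer": 1},
    {"subject": "history", "question": "Q3?", "choices": ["w", "x", "y", "z"], "answer": 0},
]


def always_a(prompt: str) -> str:
    """Hypothetical model stand-in that always answers with option A."""
    return " a"


results = accuracy_by_subject(items, always_a)
```

Running the same items in English and in Marathi and comparing the per-subject numbers gives the cross-language comparison the benchmark is meant to support.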