IL-TUR Indian Legal NLP Benchmark (Marathi)

Comprehensive Indian legal NLP benchmark with 8 tasks in Marathi: Legal NER, Rhetorical Role Prediction, Court Judgment Prediction, Bail Prediction, Legal Statute Identification, Prior Case Retrieval, Summarization, and Legal Machine Translation. Because each record carries a task-specific annotation or label, the data is far more structured than raw court document text.

Build a legal document search engine for Maharashtra courts that enables semantic search across case law in Marathi.


Quick Start

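# Access from https://github.com/Legal-NLP-EkStep
import json

# Open with UTF-8 explicitly so Devanagari text decodes correctly on every platform
with open('legal_marathi_ner.json', encoding='utf-8') as f:
    data = json.load(f)

# Assumes the top-level JSON value is a list of documents
print(f"Total documents: {len(data)}")
for doc in data[:3]:
    # Preview the first 80 characters of each record
    print(f"Text: {str(doc)[:80]}...")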

Modality

text

Size

Multi-task benchmark; 8 legal NLP tasks in Marathi

License

Research

Format

JSON

Language

mr, en

Update Frequency

static

Organization

Exploration Lab (ACL 2024)

Schema

Field	Type	Description
text	string	Legal text passage in Marathi or English
task_type	string	NLP task type (e.g., NER, classification, summarization)
label	string	Task-specific annotation or label
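
As a rough illustration of how records shaped like the schema above might be handled, the sketch below groups records by `task_type` and skips any that lack the three documented fields. The `group_by_task` helper and the sample records are hypothetical and are not taken from the dataset; the actual file layout may differ.

```python
from collections import defaultdict

# Fields listed in the schema table above.
REQUIRED_FIELDS = ("text", "task_type", "label")


def group_by_task(records):
    """Group schema-conforming records by their task_type value.

    Records that are not dicts, or whose required fields are missing
    or not strings, are skipped rather than raising an error.
    """
    grouped = defaultdict(list)
    for record in records:
        if not isinstance(record, dict):
            continue
        if all(isinstance(record.get(field), str) for field in REQUIRED_FIELDS):
            grouped[record["task_type"]].append(record)
    return dict(grouped)


# Hypothetical sample records, for illustration only.
sample_records = [
    {"text": "Example passage one", "task_type": "NER", "label": "JUDGE"},
    {"text": "Example passage two", "task_type": "summarization", "label": "Short brief"},
    {"text": "Record without a label", "task_type": "NER"},
]

grouped = group_by_task(sample_records)
for task, items in grouped.items():
    print(f"{task}: {len(items)} record(s)")
```

Skipping malformed records instead of failing is a design choice for exploratory work; a stricter pipeline might prefer to raise an error so that schema drift is noticed early.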

Build With This

Create an automated case similarity finder for Maharashtra lawyers that identifies relevant precedents from legal text

Develop a legal document summarizer that generates concise case briefs from lengthy Maharashtra court judgments

Build a legal entity extractor that identifies parties, judges, dates, and legal provisions from Marathi court documents
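
A case similarity finder like the first idea above could start from a simple lexical baseline before moving to learned embeddings. The sketch below ranks passages by cosine similarity over character trigram counts, which needs no tokenizer and therefore also works on Devanagari text. It is a baseline under stated assumptions, not semantic search: the function names (`char_ngrams`, `cosine_similarity`, `rank_similar`) and the sample passages are hypothetical.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query, passages, top_k=3):
    """Return (index, score) pairs for the top_k passages most similar to the query."""
    query_vec = char_ngrams(query)
    scores = [
        (i, cosine_similarity(query_vec, char_ngrams(passage)))
        for i, passage in enumerate(passages)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical passages, for illustration only.
passages = [
    "The applicant seeks bail in a theft case",
    "Appeal against a property tax assessment",
    "Bail application rejected in a theft matter",
]

ranked = rank_similar("bail for theft", passages, top_k=2)
for index, score in ranked:
    print(f"{score:.3f}  {passages[index]}")
```

Character n-grams only capture surface overlap, so two judgments that discuss the same legal issue in different wording will score low. Swapping `char_ngrams` for a multilingual sentence-embedding model would be the natural next step toward semantic matching.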

AI Use Cases

Marathi legal judgment prediction

Court document summarization

Legal named entity recognition

Bail prediction for Maharashtra courts

Legal statute identification

Related Datasets

Census 2011 Maharashtra

Tabular (Excel, CSV, PDF)

Census 2011 Village Amenities - Maharashtra

tabular

Crime in India (NCRB)

Tabular (PDF, Excel)

Election Commission of India

Tabular (web, PDF)

Last verified: 2026-03-09