UD Marathi-UFAL Treebank

MH Specific

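UD Marathi-UFAL Treebank: a Universal Dependencies treebank of Marathi sentences annotated with part-of-speech tags and dependency relations, intended for NLP work.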

Build a Marathi grammar checker using dependency parsing from the Universal Dependencies treebank.

Quick Start

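from datasets import load_dataset

# Load the training split of the Marathi-UFAL configuration.
ds = load_dataset('universal_dependencies', 'mr_ufal', split='train')

# ds[:5] returns a dict of columns, so select the first five rows before iterating.
for ex in ds.select(range(5)):
    print(f"Text: {ex['text'][:60]}...")
    print(f"POS: {ex['upos'][:8]}...\n")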

Modality: text (CoNLL-U)
Size: 466 sentences, 3,506 tokens
License:
Format: CSV/JSON
Language: mr
Update Frequency: static
Organization: UFAL, Charles University

Schema

Field    Type          Description
text     string        Marathi sentence text
tokens   list[string]  Word tokens
upos     list[string]  Universal POS tags for each token
deprel   list[string]  Dependency relation labels
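
Because the modality is CoNLL-U, it may help to see how one sentence block maps onto the fields listed above. The following is a minimal sketch, not the dataset's own loader: `parse_sentence` is a hypothetical helper, and the sample sentence is illustrative rather than taken from the treebank. It relies only on the standard ten-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).

```python
# Hypothetical three-token sentence in CoNLL-U layout (illustrative, not from the treebank).
SAMPLE = "\n".join([
    "# text = राम घरी गेला",
    "1\tराम\tराम\tPROPN\t_\t_\t3\tnsubj\t_\t_",
    "2\tघरी\tघर\tNOUN\t_\t_\t3\tobl\t_\t_",
    "3\tगेला\tजा\tVERB\t_\t_\t0\troot\t_\t_",
])


def parse_sentence(block):
    """Convert one CoNLL-U sentence block into a record with the schema's fields."""
    text, tokens, upos, deprel = "", [], [], []
    for line in block.splitlines():
        if line.startswith("# text = "):
            text = line[len("# text = "):]
        elif line and not line.startswith("#"):
            cols = line.split("\t")
            # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
            if not cols[0].isdigit():
                continue
            tokens.append(cols[1])  # FORM column
            upos.append(cols[3])    # UPOS column
            deprel.append(cols[7])  # DEPREL column
    return {"text": text, "tokens": tokens, "upos": upos, "deprel": deprel}


record = parse_sentence(SAMPLE)
# The three lists are parallel: position i describes the same token in each.
for token, tag, rel in zip(record["tokens"], record["upos"], record["deprel"]):
    print(token, tag, rel)
```

The lists stay aligned by position, so `zip` is enough to walk a sentence token by token.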

Build With This

Create a Marathi syntactic complexity analyzer for readability scoring of educational and government texts
Develop a Marathi POS tagger and parser for integration into downstream NLP pipelines
Build a Marathi sentence simplification tool that restructures complex sentences based on dependency parse analysis
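
For the syntactic complexity idea, two commonly used measures are the depth of the dependency tree and the mean dependency distance. The sketch below assumes a per-token list of head indices (1-based, with 0 marking the root), as in the CoNLL-U HEAD column. That field is not part of the schema listed on this page, so treat its availability as an assumption. The function names are hypothetical, and the input is assumed to be a well-formed tree with no cycles.

```python
from typing import List


def tree_depth(heads: List[int]) -> int:
    """Longest path, counted in arcs, from the root down to any token."""
    def depth(i: int) -> int:
        steps = 0
        # Follow head links upward until the root (head index 0) is reached.
        while heads[i - 1] != 0:
            i = heads[i - 1]
            steps += 1
        return steps
    return max((depth(i) for i in range(1, len(heads) + 1)), default=0)


def mean_dependency_distance(heads: List[int]) -> float:
    """Average linear distance between each non-root token and its head."""
    distances = [abs(i - h) for i, h in enumerate(heads, start=1) if h != 0]
    return sum(distances) / len(distances) if distances else 0.0


# Hypothetical heads for a three-token sentence: tokens 1 and 2 attach to token 3, the root.
heads = [3, 3, 0]
print(tree_depth(heads), mean_dependency_distance(heads))
```

Higher values on either measure suggest deeper nesting or longer-range attachments, which a readability scorer could combine with sentence length.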

AI Use Cases

POS tagging, dependency parsing
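
As a starting point for the POS tagging use case, a most-frequent-tag baseline is a common reference against which to compare a trained tagger. The sketch below is an assumption-laden illustration: `train_baseline` and `tag` are hypothetical names, the training pairs are made up rather than drawn from the treebank, and the `NOUN` fallback for unseen words is an arbitrary choice.

```python
from collections import Counter, defaultdict


def train_baseline(sentences):
    """Map each word form to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for tokens, tags in sentences:
        for token, tag_label in zip(tokens, tags):
            counts[token][tag_label] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def tag(tokens, model, fallback="NOUN"):
    """Tag known forms from the lookup table; unseen forms get the fallback tag."""
    return [model.get(token, fallback) for token in tokens]


# Hypothetical (tokens, UPOS tags) training pairs; not taken from the treebank.
train = [
    (["राम", "घरी", "गेला"], ["PROPN", "NOUN", "VERB"]),
    (["सीता", "घरी", "आली"], ["PROPN", "NOUN", "VERB"]),
]
model = train_baseline(train)
print(tag(["राम", "घरी", "आली"], model))
```

With only 466 sentences available, many test words will be unseen, so the fallback choice matters more here than it would on a larger corpus.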
Last verified: 2026-03-07