The UD Marathi-UFAL Treebank, a Universal Dependencies dataset for Marathi natural language processing.
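
    from datasets import load_dataset

    # Load the training split of the Marathi-UFAL treebank
    ds = load_dataset('universal_dependencies', 'mr_ufal', split='train')

    # ds[:5] returns a dict of columns, so select the first five rows instead
    for ex in ds.select(range(5)):
        print(f"Text: {ex['text'][:60]}...")
        print(f"POS: {ex['upos'][:8]}...\n")

| Field | Type | Description |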
|---|---|---|
| text | string | Marathi sentence text |
| tokens | list[string] | Word tokens |
| upos | list[string] | Universal POS tags for each token |
| deprel | list[string] | Dependency relation labels |
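
Because `tokens`, `upos`, and `deprel` are parallel lists, one entry per token, a record can be read token by token by zipping them together. The sketch below is one possible way to do that. The helper name `token_rows` and the `sample` record are hypothetical: the record is hand-written for illustration and is not taken from the treebank. Values are converted with `str` in case a loader returns tags in a non-string form.

```python
from typing import Dict, List, Tuple


def token_rows(example: Dict[str, list]) -> List[Tuple[str, str, str]]:
    """Pair each token with its POS tag and dependency relation label."""
    tokens = example["tokens"]
    upos = example["upos"]
    deprel = example["deprel"]
    # The three lists are expected to be aligned one-to-one.
    if not (len(tokens) == len(upos) == len(deprel)):
        raise ValueError("tokens, upos and deprel must have the same length")
    return [(t, str(p), str(d)) for t, p, d in zip(tokens, upos, deprel)]


# Hand-written record for illustration only; NOT taken from the treebank.
sample = {
    "text": "मी घरी जातो",
    "tokens": ["मी", "घरी", "जातो"],
    "upos": ["PRON", "NOUN", "VERB"],
    "deprel": ["nsubj", "obl", "root"],
}

rows = token_rows(sample)
for token, pos, rel in rows:
    print(f"{token}\t{pos}\t{rel}")
```

The same helper should work on a record yielded by iterating over the loaded dataset, provided the record exposes the fields listed in the table.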