Marathi (`mr`) configuration of the AI4Bharat IndicSentenceSummarization dataset, for natural language processing (NLP) summarization tasks.

    from datasets import load_dataset

    # Stream the Marathi ('mr') train split instead of downloading it in full.
    ds = load_dataset('ai4bharat/IndicSentenceSummarization', 'mr', split='train', streaming=True)

    # Print the first five examples, truncated to 80 characters each.
    for i, ex in enumerate(ds):
        print(f"Original: {ex['text'][:80]}...")
        print(f"Summary: {ex['summary'][:80]}...\n")
        if i >= 4:
            break

| Field | Type | Description |
|---|---|---|
| text | string | Source Marathi text to be summarized |
| summary | string | Condensed summary of the text |
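
As a rough illustration of working with the two fields above, the sketch below estimates how much shorter the summaries are than their source texts. The helper names `length_ratio` and `mean_length_ratio` are hypothetical, not part of the dataset or the `datasets` library, and the sample records are ASCII placeholders rather than real rows. It assumes only that each example is a dict with string `text` and `summary` keys.

```python
from itertools import islice
from typing import Dict, Iterable


def length_ratio(example: Dict[str, str]) -> float:
    """Return len(summary) / len(text) in characters, or 0.0 for empty text."""
    text = example["text"]
    summary = example["summary"]
    return len(summary) / len(text) if text else 0.0


def mean_length_ratio(examples: Iterable[Dict[str, str]], limit: int = 100) -> float:
    """Average the length ratio over at most `limit` examples."""
    ratios = [length_ratio(ex) for ex in islice(examples, limit)]
    return sum(ratios) / len(ratios) if ratios else 0.0


# Placeholder records standing in for dataset rows (not real data).
sample = [
    {"text": "abcdefghij", "summary": "abcde"},
    {"text": "abcdefgh", "summary": "ab"},
]
print(mean_length_ratio(sample))
```

Because `mean_length_ratio` accepts any iterable of dicts, a streamed split could in principle be passed in place of `sample`, with `limit` bounding how many examples are read.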