L3Cube-MahaParaphrase is a Marathi paraphrase detection dataset for NLP.
```python
from datasets import load_dataset

ds = load_dataset('l3cube-pune/MahaParaphrase')

# Slicing a Dataset returns a dict of columns, so select rows to iterate over examples.
for ex in ds['train'].select(range(5)):
    print(f"S1: {ex['sentence1'][:60]}...")
    print(f"S2: {ex['sentence2'][:60]}...")
    print(f"Paraphrase: {bool(ex['label'])}\n")
```

| Field | Type | Description |
|---|---|---|
| sentence1 | string | First Marathi sentence |
| sentence2 | string | Second Marathi sentence |
| label | int | Whether the sentences are paraphrases (1) or not (0) |
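
Because `label` is a binary integer, checking the class balance is a natural first step before training. The sketch below counts paraphrase and non-paraphrase pairs in any iterable of rows that follows the schema above. The helper name `label_distribution` and the placeholder rows are assumptions made for illustration; they are not part of the dataset or of the `datasets` library.

```python
from collections import Counter


def label_distribution(rows):
    """Count paraphrase (1) and non-paraphrase (0) pairs in an iterable of rows."""
    counts = Counter(int(row["label"]) for row in rows)
    return {"paraphrase": counts[1], "not_paraphrase": counts[0]}


# Hypothetical placeholder rows that follow the schema above (not real dataset entries).
sample_rows = [
    {"sentence1": "<sentence A>", "sentence2": "<rewording of A>", "label": 1},
    {"sentence1": "<sentence B>", "sentence2": "<unrelated sentence>", "label": 0},
    {"sentence1": "<sentence C>", "sentence2": "<rewording of C>", "label": 1},
]

print(label_distribution(sample_rows))
```

Iterating over a loaded split yields one dict per example, so the same helper should also work as `label_distribution(ds['train'])`, assuming the split exposes the `label` field described in the table.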