Marathi Alpaca Instruction Dataset

MH Specific

A Marathi translation of the Stanford Alpaca instruction-tuning dataset, intended for fine-tuning instruction-following capabilities in Marathi language models.

Fine-tune a Marathi instruction-following LLM using this Alpaca-format dataset for building a Marathi AI assistant.
Homepage: HuggingFace

Quick Start

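from datasets import load_dataset

# Load the dataset from the Hugging Face Hub
ds = load_dataset('ravithejads/marathi-alpaca')

# Slicing a split (ds['train'][:5]) returns a dict of columns, so select rows instead
for ex in ds['train'].select(range(5)):
    print(f"Instruction: {ex['instruction'][:60]}...")
    print(f"Output: {ex['output'][:60]}...\n")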

Modality: Text (Marathi)
Size: ~52K instructions
License:
Format: CSV/JSON
Language: mr
Update Frequency: static
Organization: Open-Source Community (Translated from Stanford Alpaca)

Schema

Field        Type    Description
instruction  string  Task instruction in Marathi
input        string  Optional input context for the task
output       string  Expected response in Marathi
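
Because the input field is optional, a fine-tuning pipeline usually needs two prompt layouts: one for records with input context and one for records without. The following is a minimal sketch of that step, not a method prescribed by the dataset. It assumes the English prompt wording from the original Stanford Alpaca project; the names PROMPT_WITH_INPUT, PROMPT_NO_INPUT, and format_example are hypothetical, and the sample record is a placeholder rather than real data.

```python
# Prompt templates follow the English Stanford Alpaca wording (an assumption;
# the dataset page does not prescribe a template).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def format_example(example: dict) -> str:
    """Render one record (instruction, input, output) as a training string."""
    # The input field is optional, so treat missing or blank values as absent.
    input_text = (example.get("input") or "").strip()
    template = PROMPT_WITH_INPUT if input_text else PROMPT_NO_INPUT
    prompt = template.format(instruction=example["instruction"], input=input_text)
    return prompt + example["output"]


# Hypothetical placeholder record, not taken from the dataset.
sample = {
    "instruction": "<instruction in Marathi>",
    "input": "",
    "output": "<response in Marathi>",
}
print(format_example(sample))
```

With the datasets library, a function like this could be applied to every row via ds['train'].map(lambda ex: {"text": format_example(ex)}). Whether a Marathi-language template works better than the English one is a design choice worth testing.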

Build With This

Create a Marathi chatbot for Maharashtra government services that answers citizen queries about schemes and procedures
Develop a Marathi writing assistant that helps with grammar correction, paraphrasing, and style improvement
Build a Marathi educational tutor that generates explanations, quiz questions, and study materials on demand

AI Use Cases

Marathi instruction fine-tuning
Conversational AI training
Task-following model development
Last verified: 2026-03-07