Marathi literature, media archives, tourism statistics, and heritage site datasets.
9 datasets
Maharashtra: List of 285 centrally protected and 244 state-protected archaeological monuments and heritage sites in Maharashtra maintained by the Archaeological Survey of India
Movie and TV subtitle dataset in 10 Indic languages, sourced from OpenSubtitles.org, containing pre-processed dialogues in JSONL format. Described as the only publicly available Marathi conversational/dialogue dataset, which makes it useful for training chatbots and conversational AI in Marathi.
Maharashtra: Indian National Trust for Art and Cultural Heritage (INTACH) listings of unprotected heritage sites, buildings, and cultural landscapes in Maharashtra
Official tourism data from the Maharashtra Tourism Development Corporation covering domestic and international visitor numbers, tourist destinations, and accommodation statistics
Historical and contemporary Marathi newspaper collections from major publications (Loksatta, Sakal, Maharashtra Times) available through digital archives
Full dump of Marathi Wikipedia articles, providing encyclopaedic coverage of diverse topics in the Marathi language
Marathi Collection: Digital library providing access to Marathi books, manuscripts, lecture videos, and research articles across multiple disciplines, with a Marathi-language interface
Maharashtra: Detailed documentation for Maharashtra's UNESCO World Heritage Sites, including the Ajanta Caves, Ellora Caves, Elephanta Caves, Chhatrapati Shivaji Terminus, and the Victorian Gothic and Art Deco Ensembles of Mumbai
Public-domain Marathi literature, including more than 1,000 books from Maharashtra Granthottejak Sanstha, classical texts (Dnyaneshwari, Dasbodh, Haripath), and historical documents
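
The subtitle dataset listed above is described as shipping pre-processed dialogues in JSONL format, meaning one JSON object per line. The listing does not document the field names, so the sketch below makes no assumption about them: it only reads the file and counts which top-level fields appear, as a first step before using the data. The file name `marathi_subtitles.jsonl` is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield each JSON object found in a JSONL file, one per non-empty line.

    Lines that parse to something other than an object (e.g. a list) are skipped.
    """
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc
            if isinstance(record, dict):
                yield record


def field_counts(path: Path) -> Counter:
    """Count how often each top-level field occurs, to discover the schema."""
    counts: Counter = Counter()
    for record in iter_jsonl(path):
        counts.update(record.keys())
    return counts


if __name__ == "__main__":
    # Hypothetical file name; substitute the path of the downloaded file.
    sample = Path("marathi_subtitles.jsonl")
    if sample.exists():
        for field, count in field_counts(sample).most_common():
            print(f"{field}: {count}")
```

Reading with an explicit `encoding="utf-8"` matters here because Marathi text is written in Devanagari, and a platform-default encoding may fail on it.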