Bharat Scene Text Dataset (BSTD)

MH Specific

Large-scale scene text dataset for 11 Indian languages plus English, sourced from Wikimedia images of Indian signboards and street scenes. Includes 5,113 Marathi word annotations with polygon bounding boxes.

Build a Devanagari scene text recognition system for reading Marathi shop signs and street nameplates in urban Maharashtra.


Quick Start

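# Bharat Scene Text Dataset: load the annotation file and inspect a few entries
import json

# Explicit UTF-8 so Devanagari transcriptions decode correctly on any platform
with open('bstd_annotations.json', encoding='utf-8') as f:
    data = json.load(f)

print(f'Total images: {len(data)}')
# Assumes a dict keyed by image, each value holding an 'annotations' list
for img in list(data.values())[:3]:
    print(f"Text regions: {len(img.get('annotations', []))}")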

Modality

Image (scene text)

Size

6,582 images; 106K+ word instances

License

Apache-2.0 (images: CC BY-SA 4.0)

Format

PNG/JPEG

Language

Update Frequency

static

Organization

IIIT Hyderabad

Schema

Field	Type	Description
image	image	Street scene image containing text
bounding_boxes	list[object]	Text region bounding box coordinates
text	string	Recognized text content
script	string	Script type (Devanagari, Latin, etc.)
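To illustrate how the schema fields might be used together, here is a minimal sketch that selects word transcriptions by script. The sample records are hypothetical and assume one record per annotated word with the `text`, `script`, and `bounding_boxes` fields from the table; the actual file layout may differ. Note that Devanagari is also used for Hindi, so a script filter alone does not isolate Marathi.

```python
# Hypothetical records shaped like the schema table; real BSTD files may differ.
records = [
    {"text": "औषधालय", "script": "Devanagari",
     "bounding_boxes": [{"points": [[10, 12], [90, 12], [90, 40], [10, 40]]}]},
    {"text": "PHARMACY", "script": "Latin",
     "bounding_boxes": [{"points": [[10, 50], [120, 50], [120, 78], [10, 78]]}]},
    {"text": "पुणे", "script": "Devanagari",
     "bounding_boxes": [{"points": [[15, 90], [60, 90], [60, 115], [15, 115]]}]},
]


def words_in_script(records, script):
    """Return the text of every record whose 'script' field equals `script`."""
    return [r["text"] for r in records if r.get("script") == script]


devanagari_words = words_in_script(records, "Devanagari")
print(devanagari_words)
```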

Build With This

Create a real-time Marathi signboard translator app that captures and translates Devanagari text from camera feeds.

Develop an automated address reader for delivery services that extracts Marathi text from building photographs.

Build a heritage site signage digitizer that reads and catalogs inscriptions from Maharashtra historical monuments.
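A step these project ideas are likely to share is cropping each annotated word before recognition. The sketch below converts a polygon annotation into an axis-aligned box; the function name `polygon_to_bbox` and the sample coordinates are illustrative assumptions, not part of the dataset tooling.

```python
def polygon_to_bbox(polygon):
    """Return (x_min, y_min, x_max, y_max) enclosing a list of (x, y) points."""
    if not polygon:
        raise ValueError("polygon must contain at least one point")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical four-point polygon around a slightly rotated word.
word_polygon = [(34, 20), (118, 26), (116, 58), (32, 52)]
bbox = polygon_to_bbox(word_polygon)
print(bbox)  # (32, 20, 118, 58)
```

The resulting tuple follows the (left, upper, right, lower) order that image libraries such as Pillow accept for cropping. An axis-aligned crop keeps some background around rotated text, so a perspective warp may work better for strongly skewed signs.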

AI Use Cases

Scene text detection, script identification, end-to-end OCR
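For script identification, the label text can be sanity-checked against the Devanagari Unicode block (U+0900 to U+097F). This sketch checks transcriptions only, not image content, and the helper name `devanagari_ratio` is hypothetical.

```python
def devanagari_ratio(text):
    """Fraction of non-whitespace characters in the Devanagari block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(1 for c in chars if "\u0900" <= c <= "\u097F")
    return hits / len(chars)


print(devanagari_ratio("पुणे"))  # 1.0
print(devanagari_ratio("PUNE"))  # 0.0
```

A ratio close to 1.0 suggests a Devanagari label; mixed signboards would fall in between and may need a threshold chosen for the task.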

Related Datasets

AIKOSH IIT Bombay Indic Datasets (IndiaAI)

multimodal

CHIPS - Corpus of Handwritten Indic Scripts (Page-Level OCR)

Image (full-page handwritten documents with detection + recognition annotations)

CMATERdb - Devanagari-Roman Mixed-Script Handwritten Documents

Image (handwritten mixed-script document pages with word-level annotations)

COCO Captions Marathi

Text (caption pairs)

Last verified: 2026-03-07