ISIDCHAR - ISI Kolkata Devanagari Character Database

MH Subset Needed

Handwritten Devanagari character dataset from the CVPR Unit at Indian Statistical Institute (ISI), Kolkata. Contains 36,172 grayscale character images across 47 character classes covering all basic Devanagari consonants, vowels, and numerals. Collected from multiple writers with natural handwriting variation. One of the earliest and most cited Indian script character recognition benchmark datasets. Also includes a separate ISI Devanagari Numeral Database with 22,556 numeral images from 1,049 writers.

Benchmark modern deep learning character classifiers against this classic ISI Kolkata dataset.
Maharashtra subset not yet extracted. The dataset is not specific to Maharashtra. The schema lists no geographic fields, so extracting a regional subset would depend on writer-location metadata, which is not documented here.

Quick Start

# Request access from ISI Kolkata CVPR Unit
# https://www.isical.ac.in/~ujjwal/download/database.html
from PIL import Image
import numpy as np

# Load character images (47 class directories)
print("ISIDCHAR: 36,172 Devanagari character images, 47 classes")
print("ISI Numeral DB: 22,556 numeral images from 1,049 writers")
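
The snippet above only prints the dataset sizes. The following is a minimal sketch of how the images might be indexed and decoded once access is granted. It assumes the archive unpacks into one subdirectory per class, as the "47 class directories" comment suggests; the actual layout of the ISI distribution is not confirmed here. The function names index_images and load_grayscale are hypothetical helpers, not part of any ISI tooling.

```python
from pathlib import Path

# Formats listed for this dataset (PNG/BMP).
IMAGE_SUFFIXES = {".png", ".bmp"}


def index_images(root):
    """Return sorted (path, label) pairs.

    Assumption: each subdirectory of `root` holds the images of one class,
    and the subdirectory name serves as the class label.
    """
    samples = []
    class_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_dir.name))
    return samples


def load_grayscale(path):
    """Decode one image file into a 2-D uint8 NumPy array."""
    # Imported lazily so that indexing works without Pillow or NumPy installed.
    import numpy as np
    from PIL import Image

    with Image.open(path) as img:
        return np.asarray(img.convert("L"), dtype=np.uint8)
```

Indexing is kept separate from decoding so that the file list can be inspected, counted, or split into train and test sets before any image is read into memory.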
Modality: Image (handwritten character crops)
Size: 36,172 character images (47 classes) + 22,556 numeral images (10 classes, 1,049 writers)
License:
Format: PNG/BMP
Language: mr, hi, ne, sa
Update Frequency: static
Organization: CVPR Unit, Indian Statistical Institute (ISI), Kolkata

Schema

Field            Type    Description
image            image   Grayscale handwritten Devanagari character image
character_class  string  Devanagari character or numeral label
class_id         int     Numeric class identifier
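
One way to carry the three schema fields through a pipeline is a small record type with a consistency check. This is a sketch only: the name CharacterSample, the validate helper, and the zero-based class_id range are assumptions made for illustration, since the source does not say how class identifiers are numbered.

```python
from dataclasses import dataclass
from typing import Any

# Class count stated for the character set; the numeral database has 10.
NUM_CHARACTER_CLASSES = 47


@dataclass(frozen=True)
class CharacterSample:
    image: Any            # grayscale image, e.g. a 2-D uint8 array (assumed)
    character_class: str  # Devanagari character or numeral label
    class_id: int         # numeric class identifier


def validate(sample, num_classes=NUM_CHARACTER_CLASSES):
    """Return a list of problems; an empty list means the record looks consistent.

    Assumption: class_id is zero-based, i.e. in the range 0..num_classes-1.
    """
    problems = []
    if not isinstance(sample.character_class, str) or not sample.character_class:
        problems.append("character_class must be a non-empty string")
    if not isinstance(sample.class_id, int) or not 0 <= sample.class_id < num_classes:
        problems.append(f"class_id must be an int in [0, {num_classes})")
    return problems
```

For the numeral database, the same check can be reused by passing num_classes=10.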

Build With This

Create a comprehensive Devanagari character recognizer combining ISI data with DHCD for maximum writer diversity
Develop a confusable character pair analyzer identifying which Devanagari characters are most often misrecognized
Build a character-level pre-training pipeline that transfers ISI character knowledge to word-level Marathi OCR
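
The confusable character pair analyzer in the list above can be sketched as a count over misclassified samples. This is an illustrative starting point, not a prescribed method: the function name confusable_pairs is hypothetical, and it assumes the true and predicted labels are already available as two equal-length sequences from some classifier.

```python
from collections import Counter


def confusable_pairs(y_true, y_pred, top_k=5):
    """Rank unordered label pairs by how often one is mistaken for the other.

    Errors in both directions (a predicted as b, b predicted as a) are
    pooled into one count per pair.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter()
    for true_label, predicted_label in zip(y_true, y_pred):
        if true_label != predicted_label:
            pair = tuple(sorted((true_label, predicted_label)))
            counts[pair] += 1
    return counts.most_common(top_k)
```

Pooling both directions keeps the ranking short. If it matters which character is mistaken for which, the sorted call can be dropped so that each ordered (true, predicted) pair is counted separately.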

AI Use Cases

Devanagari character classification benchmark
Handwritten numeral recognition
Character-level OCR pre-training
Writer variability analysis
Last verified: 2026-03-12