Roboflow Indian Identity Document Detection Datasets

A collection of annotated Indian identity document datasets on Roboflow Universe, covering Aadhaar cards (2,645 images with field-level bounding boxes), Voter ID cards (1,274 images), PAN cards, and driving licenses. The annotations are object detection bounding boxes for key fields (name, number, date of birth, address, photo, gender). The datasets are not Marathi-specific, but many of the documents contain Devanagari text fields. They are among the few publicly available annotated Indian government document datasets suitable for training field detection and extraction models.

Build a multi-document Indian KYC processor that detects document type and extracts key fields from identity documents.

Quick Start

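# Download from Roboflow Universe (requires `pip install roboflow` and an API key)
# Aadhaar: https://universe.roboflow.com/cutm-iwh4a/aadhaar-card-details
# Voter ID: https://universe.roboflow.com/ocr-aadhar/voter_id-qiygw
import os

from roboflow import Roboflow

# Read the API key from the environment (the variable name is our choice)
rf = Roboflow(api_key=os.environ["ROBOFLOW_API_KEY"])
# Workspace and project slugs are taken from the Aadhaar URL above
project = rf.workspace("cutm-iwh4a").project("aadhaar-card-details")
# Version 1 is assumed; check the project page for the versions available
dataset = project.version(1).download("yolov8")
print("Downloaded to:", dataset.location)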

Modality: Image (identity documents with bounding box annotations)
Size: 2,645 Aadhaar + 1,274 Voter ID + smaller PAN/DL sets
License:
Format: JPEG/PNG with YOLO/COCO/VOC annotations
Language: en, hi, mr
Update Frequency: static
Organization: Community (Roboflow Universe)

Schema

Field          Type    Description
image          image   Identity document image
document_type  string  Document type (Aadhaar, PAN, Voter ID, Driving License)
field_bbox     json    Bounding box coordinates for each detected field
field_label    string  Field label (name, number, DOB, address, photo, etc.)
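
To illustrate how a YOLO-format export could map onto the schema above, here is a minimal sketch that turns one label line into a record with document_type, field_label, and field_bbox. It relies on the standard YOLO text layout (class id, then normalized center x, center y, width, height). The CLASS_NAMES list, the function name yolo_line_to_record, and the bbox key names are hypothetical; the real class order comes from the data.yaml shipped with each download.

```python
import json

# Hypothetical class order; read the real one from the dataset's data.yaml
CLASS_NAMES = ["name", "number", "dob", "address", "photo", "gender"]


def yolo_line_to_record(line, img_w, img_h, document_type):
    """Convert one YOLO label line into a schema-style record."""
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = float(xc), float(yc), float(w), float(h)
    # YOLO stores a normalized center and size; convert to pixel corners
    bbox = {
        "x_min": round((xc - w / 2) * img_w),
        "y_min": round((yc - h / 2) * img_h),
        "x_max": round((xc + w / 2) * img_w),
        "y_max": round((yc + h / 2) * img_h),
    }
    return {
        "document_type": document_type,
        "field_label": CLASS_NAMES[int(class_id)],
        "field_bbox": json.dumps(bbox),
    }


# Made-up label line for a 1000x600 image
record = yolo_line_to_record(
    "1 0.5 0.25 0.4 0.1", img_w=1000, img_h=600, document_type="Aadhaar"
)
print(record)
```

Keeping field_bbox as a JSON string mirrors the json type in the schema; a pipeline that works in memory could keep the dictionary instead.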

Build With This

Create a Marathi-aware Aadhaar field extractor combining detection with Devanagari OCR for address fields
Develop a document verification system comparing extracted fields across Aadhaar, PAN, and Voter ID for consistency
Build a privacy-preserving document anonymizer that detects and redacts sensitive fields in Indian identity documents
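
As a rough sketch of the cross-document consistency idea, the snippet below normalizes name and date-of-birth strings and reports the fields that disagree. It assumes the text has already been extracted by a detector plus OCR, and that names are in Latin script. All function names, the accepted date layouts, and the sample values are illustrative assumptions, not part of the datasets.

```python
import re
import unicodedata
from datetime import datetime

# Assumed date layouts; extend to match what the OCR step actually returns
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%Y-%m-%d")


def normalize_name(name):
    """Uppercase, drop common punctuation, and collapse whitespace."""
    name = unicodedata.normalize("NFKC", name).upper()
    name = re.sub(r"[.,'\-]", " ", name)
    return " ".join(name.split())


def normalize_dob(text):
    """Return the date as YYYY-MM-DD, or None if no layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def compare_documents(docs):
    """Return the fields whose normalized values differ across documents."""
    normalizers = {"name": normalize_name, "dob": normalize_dob}
    mismatches = {}
    for field, normalize in normalizers.items():
        values = {
            doc_type: normalize(fields[field])
            for doc_type, fields in docs.items()
            if field in fields
        }
        if len(set(values.values())) > 1:
            mismatches[field] = values
    return mismatches


# Made-up sample values for illustration only
docs = {
    "Aadhaar": {"name": "Asha R. Patil", "dob": "05/08/1990"},
    "PAN": {"name": "ASHA R PATIL", "dob": "05-08-1990"},
    "Voter ID": {"name": "Asha R Patil", "dob": "1990-08-06"},
}
mismatches = compare_documents(docs)
print(mismatches)
```

Exact matching after normalization is deliberately strict. A production system would likely need fuzzy matching to tolerate OCR errors and transliteration variants.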

AI Use Cases

Indian identity document field detection
Aadhaar/PAN/Voter ID automated extraction
Document classification (ID type identification)
KYC automation for Indian documents
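
For ID type identification, a trained detector or classifier is the natural route with these datasets. As a lightweight complement, the sketch below guesses the document type from the layout of an already extracted number field. This is a rule-based sanity check, not a detection model. The patterns reflect commonly documented layouts (12 digits for Aadhaar, five letters then four digits then one letter for PAN, three letters then seven digits for Voter ID) and should be verified before use; driving licence numbers are omitted because their layout varies. The function name guess_document_type is hypothetical.

```python
import re

# Commonly documented number layouts; treat them as assumptions to verify
ID_PATTERNS = {
    "Aadhaar": re.compile(r"[0-9]{12}"),
    "PAN": re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]"),
    "Voter ID": re.compile(r"[A-Z]{3}[0-9]{7}"),
}


def guess_document_type(number_text):
    """Guess the document type from an extracted ID number string."""
    # Remove spaces and hyphens that OCR or printing may introduce
    cleaned = re.sub(r"[\s-]", "", number_text).upper()
    for doc_type, pattern in ID_PATTERNS.items():
        if pattern.fullmatch(cleaned):
            return doc_type
    return None


# Made-up numbers for illustration only
print(guess_document_type("1234 5678 9012"))
print(guess_document_type("ABCDE1234F"))
```

A match only says the string has the right shape. It does not validate checksums or confirm that the number was issued.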
Last verified: 2026-03-12