A collection of annotated Indian identity document datasets hosted on Roboflow Universe, covering Aadhaar cards (2,645 images with field-level bounding boxes), Voter ID cards (1,274 images), PAN cards, and Driving Licenses. Annotations are object detection bounding boxes for key fields: name, number, date of birth, address, photo, and gender. The datasets are not Marathi-specific, but many documents contain Devanagari text fields. They are among the few publicly available annotated Indian government document datasets suitable for training field detection and extraction models.

```python
# Download from Roboflow Universe (requires a Roboflow API key)
# Aadhaar:  https://universe.roboflow.com/cutm-iwh4a/aadhaar-card-details
# Voter ID: https://universe.roboflow.com/ocr-aadhar/voter_id-qiygw
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_KEY")  # replace with your own key
# Workspace and project IDs are taken from the Aadhaar URL above
project = rf.workspace("cutm-iwh4a").project("aadhaar-card-details")
dataset = project.version(1).download("yolov8")  # check the project page for the current version
print("Downloaded to:", dataset.location)
```

| Field | Type | Description |
|---|---|---|
| image | image | Identity document image |
| document_type | string | Document type (Aadhaar, PAN, Voter ID, Driving License) |
| field_bbox | json | Bounding box coordinates for each detected field |
| field_label | string | Field label (name, number, DOB, address, photo, etc.) |
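
A YOLOv8-format export stores each annotation as one text line, `class_id x_center y_center width height`, with coordinates normalized to the image size. The sketch below shows one possible way to map such a line onto the `field_label` and `field_bbox` columns above. The class list, the helper name `yolo_line_to_field`, and the `x_min`/`y_min`/`x_max`/`y_max` key names are assumptions made for illustration; the actual class names come from the `data.yaml` file shipped with each export.

```python
import json

# Hypothetical class list for illustration only; read the real one
# from the `names` entry of the export's data.yaml.
CLASS_NAMES = ["name", "number", "dob", "address", "photo", "gender"]


def yolo_line_to_field(line, img_w, img_h, class_names=CLASS_NAMES):
    """Convert one YOLO label line into a field_label / field_bbox record."""
    class_id, cx, cy, w, h = line.split()
    # Scale normalized centre/size values to pixels.
    cx, cy = float(cx) * img_w, float(cy) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # Convert centre/size to corner coordinates.
    bbox = {
        "x_min": round(cx - w / 2),
        "y_min": round(cy - h / 2),
        "x_max": round(cx + w / 2),
        "y_max": round(cy + h / 2),
    }
    return {
        "field_label": class_names[int(class_id)],
        "field_bbox": json.dumps(bbox),  # stored as a JSON string
    }


# Example: a box centred at (0.5, 0.25) on a 1000x600 image.
record = yolo_line_to_field("0 0.5 0.25 0.4 0.1", img_w=1000, img_h=600)
print(record)
```

Storing the box as pixel corner coordinates makes it straightforward to crop each field for a downstream OCR step, while the normalized values remain available in the original label files.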