Python library for processing text in Indian languages: tokenisation, normalisation, script conversion, and transliteration, including support for Marathi in the Devanagari script.
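from indicnlp.tokenize import indic_tokenize
from indicnlp.normalize import indic_normalize  # imported for normalisation; not used in this snippet

# Sample Marathi text in the Devanagari script
text = 'मराठी भाषा प्रक्रिया'

# Split on whitespace and punctuation; 'mr' is the language code for Marathi
tokens = indic_tokenize.trivial_tokenize(text, 'mr')
print(f'Tokens: {tokens}')

| Field | Type | Description |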
|---|---|---|
| function | string | NLP function to run: `tokenize`, `normalize`, or `transliterate` |
| language | string | Language code of the input text, for example `mr` for Marathi |
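
The two fields in the table above can be checked before any library call is made. The following is a minimal sketch, not part of the library: `NlpRequest`, `parse_request`, and `ASSUMED_LANGUAGES` are hypothetical names introduced for illustration, and the language set is an assumed subset, since the full list of supported codes is not given here.

```python
from dataclasses import dataclass

# Function names taken from the table above.
ALLOWED_FUNCTIONS = {"tokenize", "normalize", "transliterate"}

# Assumption: a small illustrative subset of language codes.
# The codes actually supported depend on the installed library version.
ASSUMED_LANGUAGES = {"mr", "hi"}


@dataclass(frozen=True)
class NlpRequest:
    """Hypothetical container for the two fields described in the table."""
    function: str
    language: str


def parse_request(raw: dict) -> NlpRequest:
    """Validate a raw mapping and return a normalised NlpRequest."""
    # Both fields are documented as strings.
    for key in ("function", "language"):
        if not isinstance(raw.get(key), str):
            raise ValueError(f"'{key}' must be a string")

    # Accept input regardless of case or surrounding whitespace.
    function = raw["function"].strip().lower()
    language = raw["language"].strip().lower()

    if function not in ALLOWED_FUNCTIONS:
        raise ValueError(f"unknown function: {function!r}")
    if language not in ASSUMED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")

    return NlpRequest(function=function, language=language)


if __name__ == "__main__":
    print(parse_request({"function": "tokenize", "language": "mr"}))
```

Rejecting unknown values up front keeps errors close to the caller instead of surfacing later inside a tokeniser or normaliser call.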