#Natural Language Processing# Natural Language Toolkit for Indic Languages aims to provide out-of-the-box support for the various NLP tasks that an application developer might need.
Open-source speech-to-text models for Indic languages.
#Android# A privacy-aware, versatile keyboard for Android, supporting 23 languages and 60 layouts. Mirror of the original GitLab repository - https://gitlab.com/indicproject/indic-keyboard
Repository containing an experimentation platform for training wav2vec2 models and running inference with them.
Anek is a variable type family that supports nine Indian scripts plus Latin along two axes (weight and width).
Anuvaad - Open-Source Document Translation Platform for Indic Languages
#Natural Language Processing# OCR Tamil is a tool that detects and recognizes Tamil text in images of natural scenes with high accuracy.
#Natural Language Processing# A directory of Indic (Indian) language computing resources.
#Natural Language Processing# State-of-the-art, ready-to-use mini NLP models for Indian languages.
#Natural Language Processing# Software and resources for mitigating online gender-based violence in India.
Finite-state script normalization and processing utilities
#Natural Language Processing# A pipeline for transliteration, spell correction, POS tagging, and word sense disambiguation of code-mixed Hinglish data, converting it to Hindi in Devanagari script.
Web interface for transliteration of Indic languages.
Code for the ACL 2020 Paper on Schwa Deletion in Hindi and Punjabi
MILU (Multi-task Indic Language Understanding Benchmark) is a comprehensive evaluation dataset designed to assess the performance of LLMs across 11 Indic languages.
Machine translation from English to the Odia language.
Arima TTS is a product of the base project Arima. It is a text-to-speech synthesizer for Tamil, one of the world's oldest languages.