Shillong, Nov 24: MWire Labs, a technology startup based in Shillong, has announced the release of NE-BERT, an AI model built specifically for the languages of Northeast India.
The new model supports nine languages: Khasi, Garo, Pnar, Assamese, Manipuri (Meitei), Mizo, Nyishi, Kokborok, and Nagamese.
A key feature is the inclusion of Pnar. Until now, Pnar was not included in any major multilingual AI model. NE-BERT closes this gap, making Pnar accessible for developers to build translation apps, spell-checkers, and educational software for the first time.
“Unlike global AI models, which primarily focus on English or Hindi, NE-BERT is designed to help digital tools and apps understand the unique languages of the Northeast,” said B. Nyalang, founder of MWire Labs.
NE-BERT is among the few startup-built language models listed on AI Kosh, the Government of India’s national AI repository under the IndiaAI mission.
The model is free for public use, allowing local students, developers, and businesses to build technology in their own languages without waiting for outside companies to do it for them.
According to Nyalang, the model was tested against two major comparable models, mBERT (by Google) and IndicBERT (by AI4Bharat, a research lab at IIT Madras), and was found to understand the language clearly.
Citing its “Confusion Score” (perplexity) results for Pnar, the company stated that NE-BERT showed the lowest confusion of the models compared, ahead of those from the global tech giants.
The release of NE-BERT follows the startup’s earlier launch of Kren-M. While NE-BERT is an engine for understanding text, Kren-M is a Large Language Model (LLM), essentially a compact, specialised version that speaks both Khasi and English.
Nyalang said that by combining Kren-M’s conversational ability with NE-BERT’s deep understanding of regional dialects, MWire Labs is working to build a complete ecosystem for Northeast Indian languages, starting with Meghalaya.
“For too long, the diverse languages of Northeast India have been overlooked in AI due to scarce data and resources. With NE-BERT and Kren-M, we’re breaking that barrier, creating digital tools that give our languages the representation they deserve,” he added.