Foreign Word Detection, Malayalam Writing Learning Portal and More: SMC Monthly Updates October 2020
Foreign Word Detection in mlmorph
Python library for Malayalam morphological analyzer - mlmorph released version 1.
A detailed note by Santhosh Thottingal.
Named Entity Recognition (NER), the task of identifying and classifying real-world objects such as persons, places, and organizations in a given text, is a well-known NLP problem. For Malayalam, several research papers have been published on this topic, but none of them provide a functional system or reproducible research.
The morphological characteristics of Malayalam have always made this problem challenging. When named entities appear inside an inflected or agglutinated complex word, the first step is to analyse such words and arrive at the root words.
As the Malayalam morphological analyser is progressing well, I attempted to build a first version of Malayalam NER on top of it. Since mlmorph already provides the POS tagging and analysis, there is not much left to do for NER: we just need to look for the tags that correspond to proper nouns and report them.
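The tag lookup described above can be sketched roughly as follows. This is a minimal illustration, not the actual NER code: it assumes an analysis is a string of root words each followed by tags in angle brackets, and that proper nouns carry an `np` tag. The function name `proper_nouns` and the sample strings are hypothetical and hand-written for illustration; they are not real analyser output.

```python
import re

# Assumption: proper nouns are marked with this tag in the analysis string.
PROPER_NOUN_TAG = "np"

# A segment is a root (text without angle brackets) followed by one or more tags.
SEGMENT = re.compile(r"([^<>]+)((?:<[^<>]+>)+)")
TAG = re.compile(r"<([^<>]+)>")


def proper_nouns(analysis):
    """Return the roots in an analysis string that carry the proper noun tag."""
    found = []
    for root, tags in SEGMENT.findall(analysis):
        if PROPER_NOUN_TAG in TAG.findall(tags):
            found.append(root)
    return found


# Hand-written sample analyses (illustrative only).
samples = ["കേരളം<np><locative>", "മഴ<n>"]
for sample in samples:
    print(sample, proper_nouns(sample))
```

In a real pipeline, the analysis strings would come from the morphological analyser for each word of the input text, and the reported roots would then be classified as persons, places, or organizations.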
You can try the system at https://morph.smc.org.in/ner
This was originally written by Santhosh Thottingal and published at thottingal.in