March 11, 2019 · documentation ner mlmorph

Malayalam Named Entity Recognition using morphology analyser

A detailed note by Santhosh Thottingal.

Named Entity Recognition (NER), the task of identifying and classifying real-world objects such as persons, places, and organizations in a given text, is a well-known NLP problem. Several research papers have been published on this topic for Malayalam, but none of them offers a functional or reproducible system.

The morphological characteristics of Malayalam have always made this problem challenging. When a named entity appears in an inflected or agglutinated complex word, the first step is to analyse the word and arrive at its root.

As the Malayalam morphology analyser is progressing well, I attempted to build a first version of Malayalam NER on top of it. Since mlmorph already provides POS tagging and morphological analysis, there is not much left to do for NER: we just need to look for the tags corresponding to proper nouns and report those words.
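The tag lookup described above can be sketched roughly as follows. This is only an illustration, not the actual code behind the demo. It assumes each analysis is a string of the form `root<tag><tag>` with a single root, and that `np` is the tag marking a proper noun. The helper names and the sample analyses are hypothetical and hand-written; the real tag set and output format should be checked against the analyser.

```python
import re

# Assumed analysis format: a root followed by angle-bracket tags,
# for example "root<np><genitive>". This is an assumption for illustration.
TAG_PATTERN = re.compile(r"<([^<>]+)>")

# Assumption: "np" marks a proper noun. Adjust to the analyser's real tag set.
PROPER_NOUN_TAGS = {"np"}


def parse_analysis(analysis):
    """Split 'root<tag1><tag2>' into (root, [tag1, tag2])."""
    root = analysis.split("<", 1)[0]
    tags = TAG_PATTERN.findall(analysis)
    return root, tags


def find_named_entities(analysed_words):
    """Return (surface word, root) pairs whose analysis has a proper-noun tag."""
    entities = []
    for word, analysis in analysed_words:
        root, tags = parse_analysis(analysis)
        if PROPER_NOUN_TAGS.intersection(tags):
            entities.append((word, root))
    return entities


# Hand-written, illustrative analyses; not real analyser output.
sample = [
    ("കേരളത്തിന്റെ", "കേരളം<np><genitive>"),
    ("തലസ്ഥാനം", "തലസ്ഥാനം<n>"),
]
print(find_named_entities(sample))
```

Because the analysis already reduces an inflected word to its root, the filter reports the root form of the entity along with the word as it appeared in the text. Compound words whose analysis contains several roots would need a more careful parser than this single-root sketch.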

You can try the system at https://morph.smc.org.in/ner

Malayalam named entity recognition example using https://morph.smc.org.in/ner

Known Limitations

This was originally written by Santhosh Thottingal and published at thottingal.in