The test corpus for Malayalam Morphological analysis has many foreign words. They are either written in a non-Malayalam script or written in Malayalam. For example, “ഇലക്ട്രിസിറ്റി”, “ഡോക്
This was originally written by Santhosh Thottingal and published at Thottingal.in. I have been trying to generate a Markov chain for Malayalam content. A Markov chain is a stochastic model describing a
A detailed note by Santhosh Thottingal. A few months back, I wrote about the spellchecker based on the Malayalam morphology analyser. I was also trying to integrate that spellchecker with LibreOffice. It is not
A detailed note by Santhosh Thottingal. Named Entity Recognition, the task of identifying and classifying real-world objects such as persons, places, and organizations in a given text, is a well-known NLP problem.