SMC Monthly Newsletter: January 2024
We're announcing the relaunch of our monthly newsletter starting January 2024. Stay tuned for updates on our ongoing projects, noteworthy developments in Free and Open Source Software (FOSS) and Language Computing, and other news of interest.
Deccan Herald Changemaker 2024: Anivar Aravind
Anivar Aravind has been recognized by Deccan Herald as a Changemaker for 2024 for his digital rights work. The acknowledgment reflects his commitment to creating a more equitable digital landscape in India. Read more about his work here.
National Seminar: Malayalam in AI Models
At a recent national seminar hosted by the Tirur regional centre of Sree Sankaracharya University of Sanskrit, Kalady, Kerala, Santhosh Thottingal spoke on the current status of Malayalam in Artificial Intelligence models. The talk covered issues such as the scarcity of high-quality training data and the linguistic complexity of Malayalam.
Ongoing Projects
Nupuram and Malini Typefaces
The latest additions to the family of fonts maintained by SMC are in the final phases of development. They are built on a distinctive development stack that goes from MetaPost drawings to automatically generated OpenType rules. These fonts and their development approach are expected to bring about a major shift in font development.
Nupuram is a Malayalam variable typeface inspired by the title designs of early Malayalam movies. It is a superfamily of 5 related typefaces and supports color font technology. The alpha version of Nupuram is now available for testing.
Malini is a versatile variable typeface designed to meet a wide range of needs, from regular body text to blurbs and titles, as showcased in the specimen page. The alpha version of Malini is now available for testing.
Revamped Website
The official website of SMC is undergoing a comprehensive revamp to give it a more modern appearance. The updated design is expected to be rolled out soon.
Learn to Write Malayalam
A web application for learning to write Malayalam is being enhanced with new features. You can explore the updated version by visiting this link.
Speech Recognition Demo
A demo of an open source speech recognition system for Malayalam is now live here. This speech-to-text system is trained exclusively on openly available Malayalam speech and text corpora using the Kaldi toolkit, and it works well on clearly articulated speech.
In the News
New Malayalam Font: Chingam
Rachana Institute of Typography has released Chingam, a new traditional orthography font. It is available for download here.
LLMs in Malayalam and other Indic Languages
Large Language Models (LLMs) have become a prominent focus in the field of AI. Following the release of openly licensed models like Llama and Mistral, there has been a series of efforts to fine-tune these models for Indian languages. Sarvam AI has notably contributed by introducing the OpenHathi model and providing detailed procedures for fine-tuning Llama for Hindi. Subsequently, AI4Bharat released Airavata, an instruction-tuned version of OpenHathi.
Abhinand has also fine-tuned Llama for Dravidian languages, including Tamil, Kannada, and Malayalam, as announced here. Despite these commendable efforts, all of these creators share a concern about the lack of high-quality training data, evaluation benchmarks, and datasets in Indian languages.
Wiki Conference Kerala
Wiki Conference Kerala was held on 23 December 2023 at St. Thomas College, Thrissur, to mark the 21st birthday of Malayalam Wikipedia. The event followed an unconference format, providing a platform for active participation and discussion. Many contributors from the SMC community took part in the sessions and shared their expertise and perspectives.
Shutdown of Coqui
The popular open source speech technology startup Coqui has shut down its operations. The source code repository is still open and active.
Thanks for reading!!