SMC Monthly Report: October 2024
Software: Drawing SMC Logo Using Metapost. Santhosh Thottingal shares an experiment to draw the SMC logo.
We're announcing the relaunch of our monthly newsletter starting January 2024. Stay tuned for updates on our ongoing projects, noteworthy developments in Free and Open Source Software (FOSS) and Language Computing, and other engaging updates.
Anivar Aravind has been recognized by Deccan Herald as a Changemaker for 2024 in the field of digital rights. The recognition acknowledges his commitment to creating a more equitable digital landscape in India. Read more about his work here.
At a recent national seminar hosted by the Tirur regional centre of Sree Sankaracharya University of Sanskrit, Kalady, Kerala, Santhosh Thottingal delivered a talk on the current status of Malayalam in Artificial Intelligence models. The presentation discussed important aspects such as the scarcity of high-quality training data and the linguistic complexity of Malayalam.
The latest additions to the family of fonts maintained by SMC are currently in the final phases of development. Built on a unique development stack that goes from MetaPost drawings to automated OpenType rules, these fonts and their development approach are expected to bring about a revolution in font development.
Nupuram is a Malayalam variable typeface inspired by the title designs of early Malayalam movies. It is a superfamily of 5 related typefaces and supports color font technology. The alpha version of Nupuram is now available for testing.
Malini is a versatile variable typeface designed to meet a wide range of needs, from regular body text to blurbs and titles, as showcased in the specimen page. The alpha version of Malini is now available for testing.
The official website of SMC is undergoing a comprehensive revamp to give it a more contemporary appearance. The updated design is expected to be rolled out soon.
A web application designed for learning Malayalam writing is being enhanced with new features. You can explore the updated version by visiting this link.
A demo of an open source speech recognition system for Malayalam has been launched here. This speech-to-text system is trained exclusively on openly available Malayalam speech and text corpora using the Kaldi toolkit, and it works well on properly articulated speech.
Rachana Institute of Typography has released a new traditional orthography font, Chingam. It is available for download here.
Large Language Models (LLMs) have become a prominent focus in the field of AI. Following the release of openly licensed models like Llama and Mistral, there has been a series of efforts to fine-tune these models for Indian languages. Sarvam AI has notably contributed by introducing the OpenHathi model and providing detailed procedures for fine-tuning Llama specifically for Hindi. Subsequently, AI4Bharat released Airavata, an instruction-tuned version of OpenHathi.
Abhinand has also fine-tuned Llama for Dravidian languages, including Tamil, Kannada, and Malayalam, as announced here. Despite these commendable efforts, all of these creators share a concern about the lack of high-quality training data and of evaluation benchmarks and datasets for Indian languages.
Marking the 21st Birthday Celebration of Malayalam Wikipedia, the Wiki Conference Kerala was held on 23 Dec 2023 at St. Thomas College, Thrissur. The event followed an unconference format, providing a platform for active participation and insightful discussions. Numerous contributors from the SMC community actively engaged in various sessions, sharing their expertise and perspectives during this occasion.
The popular open source speech technology startup Coqui has shut down its operations. The source code repository is still open and active.
Thanks for reading!!