August 3, 2024

SMC Monthly Newsletter: June and July 2024

Santhosh Thottingal to attend /gʁafematik/ 2024

Santhosh Thottingal will present his work “Parametric type design in the era of variable and color fonts” at G21C (Grapholinguistics in the 21st Century, also called /gʁafematik/) in Venice, Italy, in October 2024. Read more on the design process here.

He will present his design experiments using METAPOST for the Nupuram and Malini typefaces, both of which are variable fonts.

Malayalam Pronunciation Dictionary on Huggingface

The Malayalam Pronunciation Dictionary, also known as the Malayalam Phonetic Lexicon, curated by Kavya Manohar as part of her PhD, is now available on the Huggingface hub as a dataset. It gives phonemic transcriptions of Malayalam words in IPA format.

This is a collection of Malayalam words and their pronunciations described in IPA format. The pronunciations have been automatically generated using the Mlphon Python library. The Malayalam words in this dataset are categorized into: common words, verbs, nouns, English words, nouns of Sanskrit origin, proper nouns, pronouns, person names and place names. The common words are ordered by frequency of occurrence in Indic-NLP-Corpus. All other categories of words were derived from the curated collection of words in Mlmorph.

Day in History Dataset

Santhosh Thottingal published a new dataset, 'A Day in History', on Huggingface. This is a dataset prepared out of Wikipedia pages like

You can try out a demo here. This is available in Malayalam and English.

Source code:

In News

  1. The Dummy Text Generator in Malayalam gets some UI updates and an interesting കാലകേയൻ -KiLiKi language.
  2. The Chilanka Malayalam font has been merged into NixOS/nixpkgs unstable. NixOS unstable users can now use the font by adding fonts.packages = [ pkgs.smc-chilanka ]; to their configuration.nix.
  3. Santhosh Thottingal published a detailed study on the introduction of Artificial Intelligence in the school curriculum. The same has been published in Malayalam by the LUCA portal.
  4. Behdad Esfahbod, the lead developer of the HarfBuzz text shaping engine, has published an article, State of text rendering 2024.
  5. IISc is to open-source 16,000 hours of speech data in collaboration with Bhashini, according to news reports.
  6. A bird photography exhibition titled "Padipparakkunna Malayalam", paying homage to Induchoodan in his birth centenary year, is being organised by Njattuveala and the Induchoodan Foundation at Durbar Hall Art Centre, Ernakulam, Kerala. The catalogue, typeset in Manjari, is available here. Manoj Karingamadathil presented a talk on the efforts to document the bird life of Kerala.
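
For the Chilanka item above, here is a minimal sketch of how the font could be enabled in configuration.nix. It assumes a NixOS system tracking the unstable channel, where pkgs.smc-chilanka is available. The surrounding module skeleton is illustrative only; an existing configuration would just need the fonts.packages line merged in.

```nix
# /etc/nixos/configuration.nix (illustrative fragment, not a complete system config)
{ pkgs, ... }:

{
  # Install the Chilanka Malayalam font system-wide.
  # If fonts.packages is already defined, add pkgs.smc-chilanka to the existing list.
  fonts.packages = [ pkgs.smc-chilanka ];
}
```

The change takes effect after rebuilding the system, typically with sudo nixos-rebuild switch.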