Genuary Results & February Project

Two months ago I wrote this about hobby projects:

  • Ship or abandon something monthly
  • Write a simple specification at the start to list things that I want to achieve (use case/problem to solve, what technologies and why I chose)
  • Write one starter blog post and one concluding blog post for a month
  • Keep up a backlog of ideas and things to learn, and review it often. Be conscious about adding stuff to the backlog. Also, let go of ideas that never seem to get done, write a blog post about the idea and move on.

How I Think about Hobby Projects

This is the concluding post for month of January which I did some Genuary prompts (wrote two posts about them), but eventually got too busy with new work-work project that needed to abandon them and I am moving on with a new project that has been brewing with me for long. I even have some things already written for it and running in Google cloud.

February Project

I am interested in writing a library that can form declinated Finnish noun forms from a base form. For instance, one could ask for "plural genitive of 'koira'" and it would return you 'koirien'. But I am getting ahead of myself, first we need a lot of words to test the eventual library. You may have heard that Finnish noun declination is a bit complicated but also it is pretty well documented and there are existing tools and a pretty comprehensive list of Finnish words and what their declination is.

Supporting Material

  • Kotus - The Institute for Languages of Finland publishes a list of Finnish words associated with the rule of their declination that is available here Kaino - Kotimaisten kielten keskuksen nykysuomen sanalista. This is a primary source for building this things
  • Voikko is a Finnish language toolset that includes a function to detect which form a word is in Voikko – Free linguistic software for Finnish. This tool is will be a big part of the toolchain for generating the needed test set.
  • Project Gutenberg has been by my source for word forms for my initial tests but I will expand this to more modern open sources of text and add a tool to add test words personally. Eventually, I want to have an editor for adding individual missing words and to use as a visual aid for testing the library.

The February project goals

In February I want a solid, running tool chain that generates me Finnish word forms as a dataset. It also can recognize most urgently needed missing word forms so when the editor is in place, the editor UI can prompt for user to add new word forms.

The secondary goal is to prepare for GCP certification test I will have to renew before end of April. I will build using GCP tools such as Cloud Functions, Big Query (excellent for making queries to decent size data set) and perhaps others. I am planning to use Django for the editor, but that will be coming perhaps as the April project. At the end of this month I want to have:

  • Automated constantly running "word form collector" that reports the size of the dataset and request for some missing words daily. This will be either a Python or a Go script running as one or more Google Cloud Function

Upcoming months

I have not set up my plans to far ahead but I have planned for March and April. In March I will have a break from serious stuff and the course Three.js Journey — Learn WebGL with Three.js and perhaps think of something I can share as and end result. In April I want to come back to the word forms and come up with the editor. For May, I have no plans but I want it to be another fun month like Three.js March.