Lessons from Our Voice Engine: CLMs
April 7, 2021
Welcome back to “Lessons from Our Voice Engine,” where members of our Engineering and Speech Tech teams offer high-level insights into how our voice engine works.
Lesson 2 is from Lora Lynn Asvos, a Computational Linguist on our Speech Tech team.
What are CLMs?
CLM stands for “custom language model.” As mentioned in Lesson 1, language models are statistical models of language that can predict the next word based on the context.
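To make "predict the next word based on the context" concrete, here is a minimal sketch of a bigram model, which uses only the previous word as context. It is a toy illustration, not how our voice engine is built: the corpus and the function names `train_bigram` and `predict_next` are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if it was never seen."""
    followers = counts.get(prev.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus, invented for this example.
toy_corpus = [
    "the train went choo-choo",
    "the train went fast",
    "the train went choo-choo again",
]
model = train_bigram(toy_corpus)
print(predict_next(model, "went"))
```

Real language models use far more context and far more data, but the idea is the same: what came before shapes the guess about what comes next.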
CLMs are language models, as the name implies, but they have a little something extra. Instead of focusing generically on a given language, a CLM focuses on a specific domain of that language. This domain could be fairy tales, fables, scientific texts, cooking recipes, knitting patterns, you name it.
Even though CLMs specialize in a particular domain, they are still bolstered by general language knowledge. This allows CLMs to cope if the user goes outside the intended domain, which is particularly useful with children — they excel at saying the unexpected!
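One common way to let a domain model lean on general language knowledge is linear interpolation, where each word's probability is a weighted blend of the two models. This post does not say how our CLMs combine the two, so treat the sketch below as an assumption: the `interpolate` function, the weight of 0.7, and the probabilities are all invented for illustration.

```python
def interpolate(domain_probs, general_probs, weight=0.7):
    """Blend two next-word distributions; `weight` favors the domain model."""
    vocab = set(domain_probs) | set(general_probs)
    return {
        word: weight * domain_probs.get(word, 0.0)
        + (1 - weight) * general_probs.get(word, 0.0)
        for word in vocab
    }

# Hypothetical probabilities for the word after "the train went ...".
domain = {"choo-choo": 0.6, "fast": 0.4}   # kid-centric model
general = {"fast": 0.5, "home": 0.5}       # general-language model

mixed = interpolate(domain, general)
for word, prob in sorted(mixed.items()):
    print(f"{word}: {prob:.2f}")
```

In this toy setup, "home" never appears in the domain data, yet it still gets a nonzero probability in the blend, so an out-of-domain word is not ruled out.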
Why are CLMs important for our kid-specific voice engine?
We often get this question from clients, usually alongside “Why is a CLM better than a generic LM?” Generic LMs cover many topics and are trained on lots of data, which makes them useful for general knowledge applications. However, generic LMs are trained on adult words, use cases, and sentence structures, and their breadth is also their weakness. As the old adage goes, a jack-of-all-trades is a master of none. Or in this case, a jack-of-all-domains.
When a child says “the train went choo-choo,” a generic LM might interpret “choo-choo” as “to you” or “chew chew,” which sound similar but are more common words. Children’s texts are also full of fun and unique character names, places, and objects. With a generic LM, these unique words won’t be recognized, leading to a disappointing reading experience.
Since our focus is children’s speech, our CLMs are trained on kid-centric data, which means words like “choo-choo” are correctly understood. Our CLMs also allow for phrases with unique words like “the alien smork of planet Terratow” to be recognized with exceptional accuracy. This keeps the experience of reading engaging, educational, and enjoyable.
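The “choo-choo” example can be sketched as a choice between two similar-sounding transcripts, where the language model breaks the tie. This is a simplified illustration under stated assumptions: it ignores the acoustic side entirely, scores each word on its own, and uses invented probabilities. The names `pick_transcript`, `generic_lm`, and `kid_lm` are hypothetical.

```python
import math

def pick_transcript(candidates, word_probs, floor=1e-6):
    """Pick the candidate whose words the model finds most probable overall."""
    def score(text):
        # Sum of log probabilities; unseen words get a small floor value.
        return sum(math.log(word_probs.get(w, floor)) for w in text.split())
    return max(candidates, key=score)

candidates = ["the train went choo-choo", "the train went to you"]

# Hypothetical word probabilities, invented for this example.
generic_lm = {"the": 0.05, "train": 0.001, "went": 0.005,
              "to": 0.03, "you": 0.02}
kid_lm = {"the": 0.05, "train": 0.004, "went": 0.006,
          "choo-choo": 0.002, "to": 0.02, "you": 0.01}

print(pick_transcript(candidates, generic_lm))
print(pick_transcript(candidates, kid_lm))
```

With these toy numbers, the generic model has never seen “choo-choo” and settles on “to you,” while the kid-centric model picks the transcript the child actually said.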