See how Deepgram stacks up. Check out our ASR Comparison Tool. ๐ŸŽ๐ŸŠ

All Posts

Transfer Learning from Spanish to Portuguese: How Neighbors on the Map Also Share Vectors

Transfer learning is one of the hottest topics of natural language processingโ€”and, indeed, machine learning in generalโ€”in recent years. In this post, I want to share with you what transfer learning is, why itโ€™s so helpful when thinking about language-related tasks, and how weโ€™ve used it to create a high-accuracy model for Portuguese based on the work that weโ€™d already done for Spanish.ย 

In this blog post, weโ€™ll discuss some of our specific logic here, including the intuition of picking Spanish for helping Portuguese model training and the similarities between these languages on many levels. But to get started, letโ€™s talk about what transfer learning is and why itโ€™s so valued at Deepgram before diving into the specifics of Spanish and Portuguese.

What is Transfer Learning? A Very Brief History

In short, we can say that transfer learning is taking a model that has been trained on one task, and using it for a different one by changing its training data. Transfer learning started its journey with word vectors, which are static vectors for each word in the corpus. For our purposes, you can think of a vector as a way of describing the relationship between words in a large, abstract space.

To get an idea of how this works, you can see an example of word vectors in Figure 1, below. If you look at the 1st box for the words man and woman, they’re the same, but the word king is different because he’s not an ordinary human; heโ€™s the king. Also man, woman, and king share the same third box, because they’re all human. The fourth box of king is identical to man (baby blue), but woman has a different fourth box, pink. So king is more similar to man in this regard.

Figure 1.ย Word2vec, the ancestor of transfer learning, showing word similarities and differences (from https://jalammar.github.io/illustrated-word2vec/).

 

Next, contextual word vectors came with ELMo, or Embeddings from Language Models, which is โ€œa type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy)โ€ (source). This provided a way to look at the relationship between words at a greater depth and with more .

Finally transformersโ€”a type of deep learning model that differentially weights input dataโ€”were developed to generate contextual word vectors as well as a sentence-level vector, which is even better for linguistic analysis.

At each step of this process, though, the goal remained the sameโ€”to look for good representations for our corpus words, which happen to be vectors. Both pretrained word vectors and transformers are trained on giant corpora, hence they know a lot about the target languageโ€™s syntax and semantics. We feed pretrained vectors to our downstream models and the vectors bring what they know about the language, semantics of the words, and many surprising features to our models.ย 

You can probably already see how this could be useful for language-related tasks. Speech recognition is the task of converting speech to text. Though speech recognition models are more sophisticated algorithms than text oriented statistical models, a neural network is still a neural network and weights are certainly used.ย 

Hence, some weight re-using techniques are applicable to speech recognition, along with more sophisticated acoustic tricks. That is to say, once you have a model that works well for one language, itโ€™s relatively easy to give that model data for another languageโ€”especially one thatโ€™s closely relatedโ€”and see better results than if you started from scratch.

Why We Use Transfer Learning

At Deepgram, transfer learning is highly valued. For a specific language, when we want to train a new version of a specific model, we donโ€™t want to start from scratch. Instead, when we want to train a model for a brand new language, we want to transfer some knowledge from a similar languageโ€™s model when possible.

To illustrate the power of these processes, weโ€™ll look at a specific case of transfer learningโ€”going from Spanish to Portugueseโ€”to show how you can train a model for a new language from scratch by the help of a similar languageโ€™s model.ย 

How Deepgram Works

How Deepgram Works

Want to get more value out of your call center data, build the next game-changing voice feature, or save a lot of money on speech transcription? Learn why Deepgram is the platform to get you there.

Download Now

Why Spanish to Portuguese

Letโ€™s get started with the basic question of why we picked Spanish to help train our Portuguese model from scratch. Spanish and Portuguese come from the same language family and are closely related. Italian, French, Spanish, and Portuguese all belong to this language familyโ€”the Romance family. However, even if you didn’t know that, a quick glance is enough for one to see the similarities between Spanish and Portuguese just by looking at a piece of text. Hereโ€™s an example text pair, taken from Alice in Wonderland, to explain what we mean:

Spanish

No habรญa nada tan notable en eso; Alicia tampoco pensรณ que fuera tan extraรฑo escuchar al Conejo decirse a sรญ mismo: ‘ยกDios mรญo! ยกOh querido! ยกLlegarรฉ tarde!’ (cuando lo pensรณ despuรฉs, se le ocurriรณ que deberรญa haberse preguntado por esto, pero en ese momento todo parecรญa bastante natural); pero cuando el Conejo sacรณ un reloj del bolsillo de su chaleco, lo mirรณ y siguiรณ corriendo, Alicia se puso de pie, porque le pasรณ por la mente que nunca antes habรญa visto un conejo con chaleco… bolsillo, o un reloj para sacar de รฉl, y ardiendo de curiosidad, corriรณ por el campo tras รฉl, y afortunadamente llegรณ justo a tiempo para verlo caer por una gran madriguera debajo del seto.

Portuguese

Nรฃo havia nada tรฃo notรกvel nisso; Alice tambรฉm nรฃo achou tรฃo estranho ouvir Rabbit dizer para si mesmo: ‘Meu Deus! Oh querida! Chegarei tarde!’ (Quando ele pensou sobre isso depois, ocorreu-lhe que deveria ter se perguntado sobre isso, mas na รฉpoca tudo parecia bastante natural); mas quando o Coelho tirou um relรณgio do bolso do colete, olhou para ele e correu, Alice levantou-se, porque lhe passou pela cabeรงa que nunca tinha visto um coelho com um colete… bolso, ou um relรณgio para levar. longe dele, e queimando de curiosidade, ela correu pelo campo atrรกs dele, felizmente chegando bem a tempo de vรช-lo cair em uma grande toca sob a cerca viva.

For non-speakers of Spanish and Portuguese, itโ€™s totally reasonable to identify the above languages as the sameโ€”and you can quickly and easily find a lot of similarities between the two texts above, even if you donโ€™t know what they mean.

Similarities between Spanish and Portuguese

In this section, weโ€™ll compare Spanish and Portuguese phonetically, vocabulary-wise, and syntax-wise to better understand the ways that these languages are so similar in more depth.

Phonetic Similarities

The first major similarity between Spanish and Portuguese is acoustic similarity, which plays a key role for our transfer learning purposes. Here, acoustic or phonetic similarity means that the two languages use a set of sounds in their words that are very similar to one another.

Consonants are almost identical in both languages, although Spanish has three extra affricates that Portuguese does not. Other than that, the consonants look the same on paper and they sound the same. Below, in Figure 2, we can see the phonetic alphabet for consonants of both languages, taken from the SAMPA website.

Figure 2.ย Spanish and Portuguese consonants, represented in SAMPA.

 

Vowels are a bit different. The Portuguese phonetic system has more vowels, and perceptually, Portuguese vowels are described as sounding different by Spanish speakers. In Figure 3 below we can see the Spanish and Pt vowels side by side, again taken from SAMPA website.

Figure 3.ย Spanish and Portuguese vowels in SAMPA.

 

Here, Spanish vowels look more minimalistic and the Portuguese side looks more crowded, since it has nasal vowels, like French. Still, thereโ€™s a certain level of similarity.

Vocabulary Similarities

The final similarity between Spanish and Portuguese we want to discover is vocabulary similarity. Here, similar vocabulary means vocabulary strings that are either common or differ by a small edit distanceโ€”that is to say, one or two different letters or sounds. Compare the vocabulary words below, in Table 1, with Spanish on the left and Portuguese given on the right.

Spanish Portuguese
casa casa
seรฑora senhora
para para
ahora ahora
cabeza cabeรงa
toma toma
vamos vamos
cama cama
gano gano
bonita bonita
cara cara
centro centro
estados estados
libros livros
opiniรณn opiniรฃo

Table 1. Comparison of Spanish and Portuguese vocabulary items.

 

As we see, some words are literally the same and some words differ by an edit distance of one or two. This is great for transferring the word vectors from the Spanish model into the Portuguese model.ย 

Syntactic Similarity

The final aspect of similarity between the two languages that weโ€™ll look at here is syntactic similarityโ€”that is, how similar is the way that sentences and phrases are constructed? This is also related to transferring weights, as syntactic constructions are one of the things learned by the model. Since weโ€™d like to uncover how these two languages relate to each other, letโ€™s compare the dependency trees for the phrases un perro pequeรฑo and um cachorro pequeno meaning โ€˜a small dogโ€™ in each language. These trees show us how words within a phrase or sentence are related to each other. I generated both dependency trees with displaCy, shown below in Figure 4.

ย Figure 4.ย Dependency trees of โ€˜a small dogโ€™ (literally, โ€œa dog smallโ€) in Spanish (top) and Portuguese (bottom).

 

In both dependency trees, we notice that the adjective comes after the noun. In both trees, the head is the noun perro/cachorro โ€˜dogโ€™ and the adjective is attached to the head by the dependency relation adjective modifier, or amod. A determiner un/um โ€˜aโ€™ is attached to the noun by the determiner, or det relation. We see that the above trees are identical; hence, if one learns the syntax of one Spanish/Portuguese pair, one can easily figure out the other languageโ€™s syntax.ย 

Putting together this information, now weโ€™re ready to understand how we make use of transfer learning for training the very first speech recognition model of Portuguese from scratch.

Transferring our Modelโ€™s Learning from Spanish to Portuguese

As weโ€™ve seen, Spanish and Portuguese are very similar to one another in terms of their sounds, their vocabulary, and their sentence structures. If we go back to our discussion of vectors and weighting at the beginning, you can see why starting with a Spanish model would be an effective choice for creating a Portuguese modelโ€”the weights and vectors are likely to be extremely similar, with only a few small points of difference that the model will learn when it sees Portuguese data, making small adjustments as it goes.

We can see how useful this is if we imagine instead starting with an English model to create one for Portuguese. Although this would better than starting from zero (Portuguese and English are, after all, both languages, and they are related to one other, albeit more distantly), the number and magnitude of the adjustments would be much greater than when starting with Spanish, requiring a longer training process and more data from Portuguese to get the model to the same level of accuracy.

Wrapping Up

I hope this article gives you a good sense of what transfer learning is and why it can be so impactful in a case like Spanish and Portuguese where the languages are very similar. Weโ€™ve got more transfer learning material coming in the next few months, so stay tuned to learn moreโ€”you can sign up for our newsletter below to keep up-to-date on whatโ€™s happening at Deepgram.

Want to see the power of transfer learning first hand? Sign up for a free API key or contact us to get started using end-to-end deep learning for your speech recognition projects.

How to Evaluate a Deep Learning ASR Platform

How to Evaluate a Deep Learning ASR Platform

Get the information you need about 1st generation, 2nd generation, and modern-day automatic speech recognition (ASR) solutions to ensure your evaluation experience is efficient and yields the data you need to make your purchasing decision.

Download Now

Related Resources

Embracing the diversity of Spanish
This National Hispanic Heritage Month, ยกtenemos el espaรฑol en cuenta! There are about 61 million people of Hispanic heritage in the United States today, of which at least 43 million speak Spanish at home. Spanish...
5 Reasons Deep Learning for Speech Recognition is Business-Ready Now
Iโ€™m frustrated. When I read from tech authors, advisors, and our competitorโ€™s blogs that End-to-End Deep Learning (E2EDL) Speech Recognition software is only being researched or not production-ready, I want to scream… โ€œListen people! End-to-end...
Why Deep Learning is the Best Approach for Speech Recognition
Automatic speech recognition isnโ€™t new. It has its origins in Cold War-era research with narrow military implementations, which was followed in the 1960s, 70s, and 80s by developments from leaders like Marvin Minsky and research...

Apply Now

Receive up to $100,000 to use over 12 months.

Become a Partner

When you become a partner youโ€™re in good company.

Talk to Customer Success