This National Hispanic Heritage Month, ¡tenemos el español en cuenta!
There are about 61 million people of Hispanic heritage in the United States today, of which at least 43 million speak Spanish at home. Spanish is also widely spoken as a second language throughout the US. Of course, Spanish is also a major world language. According to Ethnologue, it has the second largest population of native speakers in the world at 471 million, after Mandarin Chinese at 971 million and ahead of English at 370 million.
Like all languages, Spanish has a high degree of internal variation. I am Chilean, and when I read statistics citing the 471 million native Spanish speakers, I have to ask myself: which Spanish do they mean? I know that people from many parts of the Spanish-speaking world have a hard time with the Chilean dialect I grew up around. Though some of my Spanish-speaking friends might smirk and disagree, Chilean Spanish is still Spanish. At least, that’s what I have always thought. But why is Chilean Spanish considered Spanish while the Asturiano “language” spoken in Spain is not considered Spanish (or Portuguese)?
That question fascinates me. At its core, it boils down to: what is the difference between a language and a dialect? It may surprise you that there is no one definition of either concept in linguistics. There are no signposts that tell you when you’ve left Spanish and entered Portuguese or French or Catalan. There are only conventions for how we talk about language boundaries, which are usually loaded with political history. The language names we know hide a vast amount of diversity under the surface.
As the colonial Spanish invaded much of what today is North and South America, disease and oppression made it so that the primary language of business in the cities was Spanish. However, neither the Spaniards who invaded nor the indigenous people who lost their lands were one homogenous people. Rather they represented a countless variety of traditions and cultures. Moreover, Spanish usually did not fully replace indigenous languages in the countryside. Thousands of indigenous languages survived and are still in use throughout Latin America, influencing the local dialects of Spanish in vocabulary and pronunciation.
Evidence of this diversity survives in both Spanish and English today. Chocolate is a word from Nahuatl, the language of the Aztec Empire, as are coyote, avocado, and of course mezcal! But in Chile we don’t use the Mexican word for avocado, aguacate. Instead, we say palta. This kind of difference confused me as a Spanish-speaking child growing up in the US, where the Spanish taught in schools was full of odd phrases and words that simply came from other dialects of Spanish. In other instances, we see the reverse situation, where one word comes to be used to refer to different things across dialects. For example, an empanada is a nearly universal Spanish word (and used in the Philippines, too, of course) for delicious foodstuffs that have barely anything in common.
This mixing of languages takes place within the US as well, where millions of Spanish speakers are bilingual with English. These speakers combine the languages in many ways. Sometimes they speak fully in one language or another. Sometimes they alternate speaking entire phrases or sentences in one language within a single conversation. Sometimes they pepper individual words from one language into speech that is predominantly in the other. Their accent and pronunciation in either language might shift depending on the social context. All these possibilities are real, legitimate uses of language — and they dramatically increase the complexity of teaching technology how to understand human speech. Capturing this kind of bilingual speech is an exciting and important goal for Deepgram as a speech technology company.
Anyone who does business in the United States knows that serving a large customer base means embracing the diversity of languages spoken in our society, including a high frequency of Spanish-English bilingualism. If you are going to build speech recognition models meant to serve a multilingual customer base, you have to consider that there is no single dialect of Spanish; the differences in accent, grammar, and vocabulary are large. Furthermore, you’ll have to be prepared to tackle a range of bilingual speech patterns. At Deepgram, we built extensive English vocabulary in our Spanish general model to boost understanding in bilingual situations. Furthermore, our end-to-end deep learning approach enables us to create tailored speech models in a matter of days after receiving representative audio data. Because end-to-end deep learning approaches allow us to build models with your actual data, it is easier to train and deploy a mixed English-Spanish model that fits your precise business needs. By starting with your data, we are not constrained by debatable linguistic abstractions and forced language boundaries. Rather the data from your use cases build a tailor-made model that serves the accents and most common bilingual speech situations that you actually encounter in your business.
One joke that I have often heard is that the Americans and Brits are divided by a common tongue. It is my opinion that those of us who hail from Latin-American culture are united by a common tongue! The richness of Spanish as a language is a reflection of the diversity of its people and as such, means those language products meant to serve Spanish speakers should be built to account for that cultural diversity. At Deepgram, we don’t shoehorn all people who speak “Spanish” into one ASR product. Rather, we start with a base concept of Spanish and build custom-trained models that reflect regional vocabulary, accents, bilingualism, and culturally relevant business use-cases. As a linguist, I am sure this is the right way to serve the needs of the millions of Spanish speakers in the US and abroad — and ¡espero que tengas el español en cuenta this National Hispanic Heritage Month too!
Me gustaría agradecer a mi editor, Sam Zegas, Deepgram’s Chief of Staff, who made major contributions to this article. Spanish (like English) is not just spoken by “natives” and “heritage speakers” but also by people who learned Spanish later in life. Their voices will show up in the data and should be recognized. This is especially true if you want to build language learning apps like the kind Sam and I discuss and dream about.