A Yoruba Text-to-Speech App Is Being Brought to Life Through This New Tech Initiative
We talk to Kola Tubosun of Yorubaname.com about their plan to create a Siri-like application that speaks Yoruba.
I'd never admit this to my parents, but I blame them for the fact that I barely speak Yoruba (if you're reading this, mom and dad, know that I still love you). When aunts and uncles complain about how Yoruba is in danger of becoming extinct, I wonder why, then, their generation didn't try harder to pass the language down to us.
Really, it's not all their fault. But while we can't go back in time and change things, Yoruba isn't necessarily doomed. New technology is giving us the tools to keep our heritage alive.
This is the task that the good folks over at Yorubaname.com have taken on with their Yoruba text-to-speech initiative. Their immediate goal is to create a Siri-like application that will serve millions of Yoruba-speaking people in Nigeria and elsewhere but, ultimately, their creation will help ensure the language's longevity. Besides, a Yoruba Siri (maybe we'd call her Simi instead) is bound to have a lot of personality.
I got a chance to speak with the website's curator, Kola Tubosun. He spoke about the group's plans to build the software, offsetting "Western-centrism" in the tech industry and empowering others through technology. Read our conversation below and find out how you can get involved in bringing the app to life.
For those who don’t know much about speech synthesis, can you elaborate on it some more and tell us how it’ll be utilized for the Yoruba text-to-speech application?
Speech synthesis is the process of creating human speech using software and audio segments. It’s called text-to-speech because the end product needs written text to put into action. Like those Bibles that read the words to you, or like those GPS systems that talk, or even these Word applications that can read to you what you have typed, the system picks out already written text and converts it to synthetic audio. It is created, usually, by a process of training the computer to string together segments of audio into comprehensible speech. Watch this video to see it in action.
What we’re trying to create for Yoruba is similar, and the uses of the application are many. For instance, most artificial intelligence software uses spoken language as a means of activation. Siri, on the iPhone, for instance, can be spoken to and “she” speaks back. That voice is a manufactured voice. But because it can take commands and respond to them, it is useful in many other ways. Blind people, for instance, will be able to operate their phones if they can just talk to them and tell them what they want. You can use it at ATMs to help people who don't speak English, etc.
Why is it so important that we have this software in Yoruba in particular?
Well, Yoruba has over 30 million speakers. That is already a huge population that can benefit from this kind of innovation. Many of those 30 million do not speak English at all, which means that they are shut out of a number of things involving technology. If a market woman can use an ATM in her local language, I think that empowers her. If she can speak to her phone in Yoruba and it does what she wants, that's a leap forward.
But more importantly, African languages have been left out of global conversations in technology for too long, and that has always bothered me. Siri exists in Danish, Finnish, and Norwegian, three languages which, combined and multiplied by two, still aren’t as widely spoken as Yoruba, yet there is Siri in those languages. Why? Because we don’t care?
So, I’m working on Yoruba because that’s the language I speak and the one in which I have the competence, as a linguist, to create anything. My overarching aim, however, is to show that more can be done for any African language, and more should be done. One of the ways to keep a language from being endangered is not only to speak it to our children, but also to have them capable of adapting to changing times, in this case with technology.
Can you speak more about the issue of “anglonormativity” in the tech world? How has it affected your experience while trying to build this software?
I used the word in this essay to refer to the accepted convention that everything must cater first (and sometimes primarily) to the English-speaking world. But then I realized that it’s not so much Anglonormativity as it is Euronormativity or anything-but-Africa-normativity. Nigeria has 170 million people and has its own version of English which is spoken by almost everyone. Yet, the only types of English you see on Siri or Google are British, Australian, and American. For some reason, we just regularly seem to be invisible, low on the priority list.
Now for this I blame not just the people who create these products and applications, but also African tech stakeholders who haven’t held the big companies to a higher standard, and who haven’t demanded more of them. In any case, if we don’t build applications like this that cater to our own languages, then we shouldn’t complain that no one cares and no one takes us seriously.
Is it your hope that text-to-speech software will expand to include other African languages as well? How do you think developers can make this happen?
Oh, sure! We’re open to collaborations.
Can you tell us more about your fundraising initiative and how people can get involved?
We are trying to raise $4,000 to create this TTS-Yoruba application. You can donate to us here. No amount is too small. But we are also interested in partnering with anyone with other capabilities that can be useful either in creating this particular application, or numerous others that serve the African language experience. Grants? Sponsorships? Investments? Yes! Send us an email at email@example.com