Abena, a voice assistant, wants to preserve African languages
Like many kids who grew up with an elderly relative, Nana Ghartey was used to helping his grandmother out around the house. He was 9 years old, and they lived together in Winneba, a small town in Ghana about 65 kilometers west of the capital city of Accra. Ghartey told TechCabal that because his grandmother didn’t have the best eyesight and had trouble walking around, he regularly had to support her in completing basic tasks: turning the lights in her home on and off, operating the radio and TV, dialing numbers on her mobile phone, and helping her check her phone credit balance. Another reason Ghartey needed to assist his grandmother was that she primarily speaks Fante, a tonal language native to the Akan tribe in Ghana, and had trouble understanding the Western languages mainly used by tech devices.
But in addition to being a dutiful young boy, Ghartey was also a budding computer geek. His late mother was a programmer at Wang Computers, so even as a toddler, Ghartey was used to being surrounded by computers. With time, he slowly began to teach himself how to operate them; once he learned how to type, he was able to help his mum out with small tasks on Microsoft Word.
Ghartey, who is now 32, told TechCabal that he credits his grandmother’s needs during his adolescence and his early exposure to computers through his mother’s profession as the inspiration and catalyst, respectively, for the two artificial intelligence products he created: Kofi, which he initially built to help his grandmother, and Abena, which he made for the wider public. They are the first voice assistants offered in Fante and Twi, respectively.
For all the opportunities technology affords to customize experiences by catering to users’ needs with hyper-specificity, millions of Africans like Ghartey’s grandmother, who either don’t speak English or other Western languages or simply prefer not to, are often excluded from these benefits. The linguistic dominance of Western languages also threatens indigenous African languages; as big, Western-based tech companies spread their influence across Africa, both as service and product providers and as job creators, indigenous languages are often deprioritised and nudged into the background.
The invention of the first voice assistant, Radio Rex, in 1922 marked the start of an era in which machines developed the ability to respond to spoken words and, eventually, to translate them into text. By 2010, the wealthiest American tech companies had begun launching language technologies: Siri was released that year, Google Now in 2012, and Alexa in 2014. But these voice technologies support only Western languages and are unavailable in African languages because of the low revenue opportunities in African markets.
It’s not as though these behemoth tech companies lack the means to build products that support indigenous African languages. And, to be fair, some have made efforts to catch up, slowly but surely, like Google Translate’s addition of 10 African languages to the platform last month. Some language experts argue that these companies are the only entities, along with robust AI labs (though often the two are linked), that can solve what they characterize as a “low resources” problem.
Ghartey has his own theories about that.
“I wouldn’t say they can’t build such systems for African languages. Surely they can,” he told TechCabal from Accra, where he now lives. “They’ve prioritized the languages that they want to focus on first.”
So rather than wait for big tech to do all of the problem solving, he’s been working towards a solution for nearly a decade—with a much smaller team and significantly fewer resources—and has already made headway.
“Working on voice tech for African languages has been a completely different ball game. Data sets out there are based on religious text, so you end up building a system that has a bias for biblical words.”
By the time Ghartey reached his teenage years, he was consumed by a desire to learn as much about computers as he could. In high school, he was the designated tech guy, often helping his classmates do things like connect to the internet on their phones. (This was 2005, when such functions weren’t yet second nature.) Sometimes, because he wasn’t allowed to use the campus computer lab, he skipped class and sneaked off to internet cafés just to spend more time tinkering with computers, he recalled to TechCabal. After years spent assisting his grandmother, he had become accustomed to using his skills to help anyone who needed it.
He still graduated, though, and went on to study Information Technology at the Ghana Telecom University in Accra. There, Ghartey taught himself mobile app development and built desktop applications, websites, and eventually mobile games, none of which were part of his school curriculum, by reading the programming textbooks that an uncle visiting from the US had left behind. Ghartey graduated top of his class and secured a scholarship to do a master’s in software development at Coventry University in the UK.
His departure meant that he could no longer be around to help his grandmother. For a while, his younger sister stepped in to assist her, but eventually she moved out, too. Even after Ghartey received his master’s and moved back home, he settled in Accra, where he would eventually build his mobile app development company, Mobobi LLC. He knew he needed to come up with a solution to assist his grandmother in operating her tech devices and performing little chores, and quickly. After years of obsessive learning about computers, he knew the solution would be AI-driven, but at the time, he knew nothing about AI development.
“It wasn’t easy. I remember the sleepless nights reading research papers and benchmarking frameworks and tools to find the right solution for my grandmother. I knew that if I intended to bootstrap and ship a production build to the masses, I’d need to further hone my skills in AI, so I set on a mission to educate myself on what was required,” Ghartey told TechCabal.
First, he addressed the most immediate need: that of his grandmother. Using her voice and that of his sister as data, he built Kofi, a voice assistant that supported Fante, his grandmother’s preferred language and one spoken by 6 million Ghanaians. The product enabled her to use mobile apps without having to scroll through the home screen, which was difficult for her due to her impaired vision. All she had to do was tap the side of the button to launch the Kofi app, and she could instantly receive news updates, play music, stream radio programs, and follow Twitter trends (curated by the AI with human moderators to prevent misinformation) without having a Twitter account.
Ghartey said that when he introduced her to Kofi and asked her to try it out, she initially thought she was interacting with a real person and became frustrated at times when the voice assistant didn’t comply with some of her commands. Ghartey had to explain to her some of Kofi’s limitations, but that process helped him recognize how useful a native voice assistant could be for the wider public. In order to serve them, he would need to build a voice assistant in Ghana’s most widely spoken language: Twi.
This would prove to be a much larger undertaking than Kofi. First of all, Ghartey didn’t have nearly as much data to work with initially, and he already knew the next iteration would require a large data set so that he could offer the product in several different dialects in order to avoid bias toward certain intonations. Secondly, Twi, like many other languages south of the Sahara, is tonal; accent marks dictate tones and make a difference in communication, which demanded even more research.
Another important factor to consider was access. In Winneba, where he lived with his grandmother, the internet connection was patchy and inconsistent. Ghartey didn’t want his users to worry about constantly having to purchase data in order to use the app, which meant that Abena would need to function offline. (For context, with Apple’s Siri, this offline feature is only available to users with iOS 15.) Abena’s offline operation was also an effort to cut costs on his end; online operation would have required a server that processed thousands of user requests in real time, and thus enormous computing power from costly cloud servers.
But Ghartey knew his efforts would be worth it. He spent 6 years building Abena AI, and released it in the Google Play Store this April as a Mobobi product. (At that point, Mobobi had already gained some traction for Real Piano Teacher, a popular game that has amassed more than 20 million downloads and is the first Ghanaian-made game on Google Play Pass.) As with Kofi, users can check trending news on Twitter; follow weather reports; check, transfer, and recharge their airtime balances; and transfer money with their preferred mobile payment app, all using voice commands on Abena AI. The offline mode also supports an in-app community feature that allows users to share text messages, images, and videos with each other.
The reception was immediately enthusiastic. Just a few days after Abena AI launched, a health influencer shared a demo of the app on Twitter, which catapulted it to trending-topic popularity and demonstrated the need for a voice assistant offered in a language that was the second most spoken in the country after English.
Similar to the way his grandmother benefitted from Kofi, Ghartey realized that in addition to being more linguistically inclusive, Abena AI also served people with visual impairments who speak or prefer to speak Twi, many of whom were elsewhere. They became fast fans of Abena AI, he said, because it made it easier for them to check and recharge their credits from their phones, which otherwise required tech-savviness or Western language literacy.
Once Abena AI was out in the world, it was time to refine it, a task that turned out to be just as challenging as building the voice assistant in the first place. He couldn’t afford to pay professional linguists to help him with the training data for Twi, so he crowdsourced translation data from users by using third-party pop-up apps that requested translations. That proved unsuccessful when he discovered that the responses users provided were either inaccurate or complete departures from the words he asked them to translate. He tried incentivising users by offering to pay 5 cedis ($0.60) per translation, which worked for a while, but eventually he ran out of money. He remained tenacious, though, and eventually hired religious translators who worked on church-related translations in Twi. Still, he said he is always on the lookout for more ways to gather information about the language.
“It’s still a work in progress,” he said, adding that he accounts for the context of every word and crowdsources information about dialects from Abena AI app users. “You need lots of data to build these models.”
Like Google Nest and Amazon Echo—both hardware using their respective company’s voice recognition software—Abena also functions as a home automation system. Abena AI works with smart lights, smart plugs, and other smart home devices.
In the near future, Ghartey plans to ship an update that will see Abena support 5 more Ghanaian languages. He also hopes to hire professional natural language processing (NLP) and machine learning engineers to speed up his work and help him extend Abena to other African languages like Swahili, Kinyarwanda, Afrikaans, Sesotho, and isiZulu.
Ghartey works on Abena AI for 3 hours on workdays, while working full time at his company Mobobi, to allow him to keep bootstrapping it.
“The feedback from a few people who tried Abena AI out before it was released kept us going. That smile on a 60-year-old’s face as he listened to his airtime being read out without stress, it was paving the way for people to perform tasks easily, tasks they’d never been able to do in the past,” Ghartey said.