SayHi Learn: Voice-Driven Language Learning
In 2018, I led the design of SayHi Learn—a voice-driven language learning app built on top of SayHi Translate. The goal was to create a new experience that delivered meaningful value to users while capturing high-value utterances to support Amazon's growing investment in speech AI.
SayHi Translate had recently joined Amazon, and this work was part of an early effort to extend its reach and unlock additional long-term value from its speech pipeline.
This work included:
Research into second-language acquisition and user motivation
Mobile app design: user journeys, wireframes, mockups
Interaction and animation design for pronunciation feedback
Alignment of UX goals with model training and data quality
How Do We Expand Customer Value While Training Smarter AI?
SayHi Translate was already helping millions of users bridge language gaps in real time—and generating a steady stream of voice data. But most of that data reflected quick, transactional interactions. Useful for translation, but not always ideal for training.
One key gap stood out: we lacked consistent, repeatable utterances that could help improve model precision. Phrases that were domain-specific, phonetically diverse, or intent-rich were especially valuable for downstream applications like Alexa.
We needed a way to collect that kind of data on purpose—ethically, at scale, and in a way that would create meaningful value for users.
Just Give The People What They Want
As we explored ways to bring more value to users, our research uncovered a surprising behavior. Our customers were using SayHi Translate to teach themselves new languages: repeating translated phrases out loud, mimicking pronunciation, and revisiting conversations for practice.
What if we leaned into that behavior and built a language-learning experience that felt natural to users while quietly gathering the high-quality, repeatable utterances our models needed?

Building a Voice-First Language Learning Experience
We created SayHi Learn—a standalone, voice-first language learning app built on the same foundation as SayHi Translate. Its core experience included:
Speech-based UX, focused on short, high-frequency utterances
Pronunciation feedback powered by lightweight speech scoring
Spaced repetition and visual breakdowns to reinforce learning over time
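The spaced-repetition piece can be illustrated with a minimal sketch. The actual scheduler isn't described in this write-up; the version below assumes a simple Leitner-style system, where each phrase sits in a "box," moves up a box when the learner gets it right, drops back to the first box when they miss, and is scheduled for review at box-dependent intervals.

```python
from datetime import date, timedelta

# Review intervals in days for each Leitner box. A phrase in a higher box
# is reviewed less often; a miss sends it back to box 0. (Illustrative
# values only, not the intervals the shipped app used.)
INTERVALS = [1, 2, 4, 8, 16]

class Card:
    def __init__(self, phrase, today=None):
        self.phrase = phrase
        self.box = 0
        self.due = today or date.today()

    def review(self, correct, today=None):
        """Record one review and reschedule the card."""
        today = today or date.today()
        if correct:
            # Promote, capped at the top box.
            self.box = min(self.box + 1, len(INTERVALS) - 1)
        else:
            # Any miss restarts the phrase at the shortest interval.
            self.box = 0
        self.due = today + timedelta(days=INTERVALS[self.box])

def due_cards(cards, today=None):
    """Phrases the learner should practice in this session."""
    today = today or date.today()
    return [c for c in cards if c.due <= today]
```

A real scheduler would also weight phrases by how valuable their utterances are for training, which is where the curriculum design below comes in.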
I collaborated closely with our linguists and content leads to design curricula that were useful for language learners while intentionally including words and phrases that directly improved model performance.
Crafting Feedback That Builds Confidence
Precise pronunciation scoring wasn’t available yet—but by comparing spoken input to the expected phrase, we could estimate recognition accuracy and design feedback that felt supportive.
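The exact scoring mechanism isn't specified here, but the comparison it describes can be sketched with one common approach: word-level edit distance between the expected phrase and the recognizer's transcript, normalized to a 0–1 match score. The function names are illustrative, not from the product.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance over two sequences.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def recognition_score(expected, recognized):
    """Rough 0..1 estimate of how closely the transcript matches the phrase."""
    a = expected.lower().split()
    b = recognized.lower().split()
    if not a and not b:
        return 1.0
    # Normalize by the longer sequence so the score stays in [0, 1].
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))
```

A score near 1.0 suggests the phrase was recognized as spoken; a low score triggers the gentler, adaptive feedback described next rather than an explicit error.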
Our goal was to keep feedback nearly invisible. Instead of flagging errors, we quietly adapted the curriculum to each user, adding repetition and breaking phrases into smaller parts, so there was never a sense of failure.
I designed a signature interaction that made the app feel curious, not critical. When recognition was critically low, the phrase bubble (styled like a chat message) tilted its head like a confused dog and popped up animated question marks. This gentle motion, inspired by message reaction UX patterns, let us show uncertainty without discouragement. It turned a missed phrase into a moment of engagement.

From Learning Moments to Language Models
SayHi Learn launched on Android as a small but intentional app—one that delivered value to learners while feeding back into Amazon’s broader voice AI strategy. The utterances it collected, combined with general voice data from SayHi Translate, played a pivotal role in the global expansion of Alexa, bootstrapping 33 new speech recognition languages.
It was a clear example of how a well-designed user experience—built on user behavior, motivation, and trust—can serve dual purposes. We didn’t just create a learning app. We created a strategic pipeline for responsible AI training.