- Seasalt is building customizable speech recognition tech for enterprise call centers.
- The founders sold their last startup to Baidu in 2017.
- The company partners with cloud communications giant Twilio.
After selling their last startup to Baidu, a pair of tech vets are jumping back into the crowded space of voice and speech recognition with a company called Seasalt.AI.
The startup sells a software platform to enterprise companies with contact centers. Developers can use Seasalt to build apps, devices and services that communicate conversationally with users.
The company was founded by Guoguo Chen and Xuchen Yao, experts in the field of voice and speech recognition software. Chen created the “OK Google” hotword for Android and co-authored a speech recognition project called Kaldi, which Nvidia eventually integrated into its graphics cards. Yao, meanwhile, is a Johns Hopkins University PhD graduate who previously worked at the Allen Institute for AI (AI2) incubator in Seattle.
In 2015, Chen and Yao co-founded KITT.AI, a startup that spun out of AI2. One of the company’s most popular products was a customizable wake word engine called Snowboy, a software toolkit that allowed developers to add verbal hotwords to their own hardware. The startup also launched ChatFlow, a framework for developers to build chatbots.
Baidu acquired KITT.AI in 2017. Chen and Yao worked for the Chinese tech giant for two years, leaving in 2019.
Seasalt, which has 22 employees, provides a customizable speech recognition engine. The startup describes its tech as the “next generation” of conversational AI. It raised a small seed round when it initially launched in January 2020.
The company sells six applications that work in tandem as part of a full suite of services called “SeaSuite”:
- SeaChat lets users create a framework for automated chatbot responses.
- SeaCode is a software development studio for conversational AI. The platform can be used to build tools such as chatbots.
- SeaVoice is a speech-to-text (STT) transcription feature that can be customized to understand different languages and nuanced speech, among other uses. This tool also has a text-to-speech (TTS) feature, which can be customized to sound like Tom Hanks or David Attenborough.
- SeaMeet’s secretary-like features can be used in conferences and meetings. It can identify up to 12 unique speakers in the room. Users can train the model to provide automatic meeting minutes and follow-up notes, among other actions.
- SeaWord can be customized to scan text and extract meaningful information. The tool can also be used to highlight and redact sensitive terms, such as identifiable information.
- SeaX is a tool designed for contact centers. It can automate responses to incoming messages, calls and social media posts, among other channels. The software also features a tool that call center agents can use to transcribe and categorize incoming calls from customers.
Seasalt aims to offer tools that are capable of understanding nuance in both speech and text. Its main use case is in enterprise contact centers. These companies use the software not only to monitor and evaluate their agents, but also to aggregate voice data and extract insights.
Global corporations need to operate call centers in many countries, meaning they will inevitably encounter low-resource languages and regional accents wherever they operate. In the United States, for example, there are at least 24 dialects of English.
Seasalt has about 50 customers, including large enterprise companies in Southeast Asia.
“For any enterprise company, if you have some really weird spelling or some technical jargon, we can take care of that,” Yao told GeekWire.
Spoken Communication, a Seattle startup that sold speech recognition tech to call centers, was acquired by Avaya in 2018.
Yao said that due to the pandemic, many cross-border e-commerce sites are popping up in North and Southeast Asia, selling their products overseas. This is creating a tailwind for Seasalt, he said. He added that the region is currently underserved by competitors.
The call-center technology market includes many existing players. Tech giant Google sells its natural language capabilities in a packaged solution called Contact Center Artificial Intelligence. Amazon and Microsoft also have their own services: AWS Contact Center Intelligence and Azure Cognitive Services. Other notable players include Deepgram, Five9, Avaya and 8x8.
Yao said Seasalt would not have the “muscle” to compete against Five9 or other publicly traded companies if it did not have its partnership with Twilio, which bolsters its sales chain. He explained that a lesson he learned from KITT.AI was that software alone will not create a moat for the startup. Instead, he added, the moat comes from the company’s distribution, commercialization and customer base.