Every minute, thousands of hours of audio are created across the globe: TV, radio, telephony, meetings, conferences and more. The big challenge is to make the information contained within searchable and interpretable. Disseminating this information is currently costly and slow, which impedes business response and applicability. Speechmatics is increasing the accessibility of this information by spearheading innovation in automatic speech recognition. Given trends towards greater audiovisual content hosting and laws requiring transaction transparency, Speechmatics is poised for rapid growth in the coming years.
KITE Invest: Speechmatics uses deep neural network algorithms with unrivalled accuracy and performance in transcription. US and British English are already supported, with an additional 24 languages in the pipeline. How did Speechmatics evaluate which languages to expand into?
Benedikt von Thüngen: Customer interviews & demands. Size of the respective markets & number of speakers.
KITE: What challenges does the service face when adapting to a new culture and language?
BvT: Very few from a technical perspective, as almost all languages are composed of the same basic building blocks (phonemes). We have a very international team covering 15 languages, three continents and six nationalities.
KITE: The experienced team of engineers has already been awarded grants to develop a “faster than real-time” system, which will be of particular interest for “live” transcription of TV/radio content. How has Speechmatics found the process of receiving funding to support R&D?
BvT: Exhilarating, as it is an initial validation of some of the technical ideas we have.
KITE: Speechmatics is currently concentrating its efforts on B2B activities with TV and audio transcript archiving, and in governmental, medical and legal transcriptions where accuracy is crucial. Given the wealth of audiovisual content, there is a clear opportunity for growth. How has Speechmatics found the process of raising awareness of its product?
BvT: Business plan competitions, pitching and networking events have been a very good platform to raise awareness. Furthermore, we are very lucky to have some of the world’s leading figures in speech recognition on the team, who have an extensive market network.
KITE: What differentiates Speechmatics from other players in the field?
BvT: Simply – we have the best speech recognition system on the market, especially in terms of accuracy. One particular aspect is the broad training data we used to train our models, which facilitates a very broad application of our technology. In addition, we will be able to offer the fastest turn-around times at the lowest cost with the highest accuracy compared to any competitor on the market within the next 3-6 months.
KITE: Direct use via the website and the provision of an integrative REST API allow the technology to be applied in a wide variety of applications. To date, Speechmatics has secured a few major product partners. What have you learnt in the process that you would take forward to new partnerships?
Speechmatics is completely cash-flow funded and is looking to facilitate a potential exit in 2-3 years or, through the right partnerships, a potential IPO, with very strong interest already.