DHAHRAN: In a world racing toward automation, Klemen Simonic believes the most natural interface is also the most enduring: the human voice.
As founder and CEO of Soniox — a cutting-edge speech-to-text platform — Simonic is betting that voice-powered technology will drive the next wave of digital innovation.
And in a country like Saudi Arabia, where smartphones dominate daily life and a young population is hungry for digital solutions, the potential is hard to ignore.
Soniox, which Simonic launched five years ago, offers speech recognition, transcription and real-time multilingual translation in more than 60 languages.
Unlike many competitors, it delivers ultra-fast, token-level outputs in milliseconds — a critical advantage for live assistants, wearables, bots and smart speakers.
But Simonic’s journey toward building the company began long before the rise of generative AI.
“I started in programming development right after high school, and I was invited to join the Jozef Stefan Institute in Slovenia, one of the best institutes in this part of Europe,” he told Arab News.
“I was working there with Ph.D. students and postdocs on machine learning, natural language processing, dependency parsing, tokenization, tagging and entity extraction.”

Klemen Simonic (2nd right) and his Soniox team. (Supplied)
That early exposure led him to two internships at Stanford University in 2009 and 2011, where he worked alongside top researchers in AI. “I wanted to join Google to work on these cool things,” he said.
After an internship there in 2014, Simonic was courted by both Google and Facebook — ultimately joining the latter in 2015 to help build speech recognition systems now used across Facebook, Instagram and WhatsApp.
Today, his company is focused entirely on voice AI, and its promise goes beyond convenience.
With privacy and compliance built in — including SOC 2 Type II certification and HIPAA readiness — Soniox is already being used in hospitals, call centers and emergency rooms where clear, accurate transcription can be a life-saving tool.
“We have many healthcare customers using our API in emergency rooms where real-time AI interpretation can bridge communication gaps that human translators sometimes cannot, especially with complex medical terminology,” said Simonic.
Saudi Arabia represents a particularly compelling market for the company’s ambitions. With more than 90 percent smartphone penetration and a population where 70 percent of people are aged under 35, the Kingdom is fertile ground for voice-enabled technologies.
The widespread adoption of government-developed platforms like Tawakkalna during the COVID-19 pandemic only accelerated the Kingdom’s reliance on mobile-first services.
“Data and artificial intelligence contribute to achieving Saudi Arabia’s Vision 2030; this is because, out of 96, 66 of the direct and indirect goals of the vision are related to data and AI,” according to the Saudi Data & AI Authority.
The Kingdom’s communications and IT sector is now worth more than $44 billion — 4.1 percent of gross domestic product — and expanding quickly with strategic investments in cloud computing, automation and smart infrastructure.
Although Soniox does not yet have a team on the ground in the region, the company sees significant interest from Saudi organizations exploring AI-powered transcription and customer service tools.
Simonic said there are pilot programs in countries like Portugal and interest from companies in Saudi Arabia looking to improve call center and transcription services.
And while Arabic remains one of the more complex languages for voice AI, Simonic sees both the challenge and the opportunity. Many of Saudi Arabia’s rural communities speak dialects rich in cultural nuance, varieties that are often excluded from mainstream datasets.
This environment offers fertile ground for Soniox’s technology, which strives to “enable all languages, so everyone in the world can speak and be understood by AI.”
Simonic’s team, primarily based in Slovenia, is committed to expanding language support to make the technology more inclusive, even in markets where none of the developers speak the local tongue.
Soniox is also designed with flexibility in mind. Businesses can integrate its API without storing any audio or transcripts, ensuring tight data control. For individual users, features like encrypted transcripts and a summarizing tool enhance productivity — even for the tech-averse.
“My mom is not very tech-savvy, but she uses our app to build her grocery shopping list,” Simonic said. “That was not the original purpose, but it shows how technology can evolve in ways we didn’t expect.”
In July, Soniox launched a new comparison tool that allows developers and businesses to benchmark different speech AI providers using their own voice samples and real-world data.
It is another step toward transparency and broader adoption — especially in regions like the Gulf, where choosing the right solution can hinge on performance in diverse linguistic contexts.
“The tech morphs, but the human voice remains the most intimate and effective way we communicate,” Simonic said.
As Saudi Arabia pushes forward with its digital transformation under Vision 2030, technologies like Soniox may find their voice amplified, not just as a tool for productivity, but also as a bridge between language, innovation and access in a rapidly changing world.