Speech Engineer

Are you ready to take on large responsibilities that will impact all aspects of our work?

At Storytel we believe that powerful stories add an extra dimension to life. We offer hundreds of thousands of audiobooks and ebooks to customers in more than 20 markets, with several new markets launching in the coming year. As we continue to accelerate our development of speech technology, and in particular Text-to-Speech, we are hiring for three new key roles in our Speech team to help us build some of the best automatic narration technology in the world.

About the Team

The role is in the Speech team, part of the larger Intelligence group, which houses our machine learning and data science teams. In the Speech team we build services that enable Storytel to efficiently generate new content and understand existing content. In particular, our team owns the entire Text-to-Speech stack at Storytel, from data curation to modelling decisions, training and deployment infrastructure. To accelerate development and get the system into production, we are growing our team. Since the team is new, each position we're hiring for is considered essential. Our new team members will be expected to take on large responsibilities and will impact all aspects of our work. While each role's main responsibilities are different, we will all work closely together to achieve our big ambition of highly automated and prosodically rich speech synthesis.

We are an international company, with colleagues in the larger Intelligence team in Stockholm, Barcelona and Copenhagen. The Speech team is currently based in Stockholm, and while we hope to keep building the team in the Stockholm offices, we are open to working with the right candidates to find a solution that is great for both parties.

About the Role

As a Speech Engineer you will have a large responsibility for the components closest to the generated audio, from audio preprocessing to the architecture of the spectrogram prediction and vocoder models. You will work with the team to evaluate our stack using a combination of quality assessment methods and live evaluation, and tune our components to improve our results. To rapidly scale our work, you will find ways of using our existing data more effectively. Finally, since the field is evolving rapidly, you will also need to keep our stack up to date with developments in speech technology and interact with the research and open source communities.

About You

We believe that you are passionate about speech technology, see its potential, and are eager to use it to build something impactful. You’re interested in staying in touch with the field as it evolves and eager to expand your knowledge of how to put your work into production.

To be successful in this role we believe that you have:

🧑‍🎓 PhD or MSc degree, or equivalent industry experience, in Speech technology, Machine Learning, Computer Science, Mathematics, Physics, or a related field.

⭐️ Research or industry experience in audio and speech processing using neural networks, e.g. text-to-speech, speech recognition, or similar topics.

✅ Extensive knowledge and understanding of modern deep learning.

🧠 Strong understanding of the current state of text-to-speech, including developing trends and current state-of-the-art models.

📣 Familiarity with common audio processing methods and their corresponding terminology.

☁️ Comfortable working in Python and one of the frameworks TensorFlow, PyTorch or JAX.

🇬🇧 Excellent written and verbal communication in English.


While not required, we would also love to hear about any of:

  • Experience in building qualitative listening tests for either online crowds or experts.
  • Experience packaging models and audio processing pipelines for deployment.
  • Experience with distributed training of DL models.
  • Experience with both RNN and Transformer based TTS.
  • Contributions to open source frameworks or tools for speech processing.
  • Experience using tools like Google Cloud, Docker, Kubernetes or Kubeflow.
  • Publications in top-tier ML conferences such as ICASSP, Interspeech, ACL, NeurIPS, ICLR, ICML, AAAI.

What we offer

  • Participate in developing a top-notch streaming entertainment platform used by over a million users worldwide
  • Plenty of autonomy and responsibility
  • Your own yearly education budget
  • A workplace that values creativity and personal initiative
  • Unlimited audiobooks and ebooks from our own service
  • An international team of super-talented colleagues
  • Explore, work with, and implement some of the newest and hottest technologies
  • A company full of book lovers

Does this sound like you? If you feel like Storytel is a place where you could thrive, let us know and we will contact you as soon as possible.

Additional information

  • Remote status

    Flexible remote



Tryckerigatan 4
111 28 Stockholm


The Storytellers in 4 words?
Friendly, Welcoming, Helpful, Innovative.

Number of coffee and tea cups per day?
Impossible to know; we like our hot beverages.

Number of orange headphones?

Times we celebrate?
Whenever we have something to celebrate, which is quite often. We like to celebrate, preferably with cake.
