Machine Learning Engineer

You will have an essential role in developing TTS and other speech technology applications!

At Storytel we believe that powerful stories add an extra dimension to life. We offer hundreds of thousands of audiobooks and ebooks to customers in more than 20 markets, with several new markets launching in the coming year. As we continue to accelerate our development of speech technology, and in particular Text-to-Speech (TTS), we are hiring for three new key roles in our Speech team to help us build some of the best automatic narration technology in the world.

About the Team

The role is in the Speech team, part of the larger Intelligence group, which houses our machine learning and data science teams. In the Speech team we build services that enable Storytel to efficiently generate new content and understand existing content. In particular, our team owns the entire Text-to-Speech stack at Storytel, from data curation to modelling decisions, training and deployment infrastructure. To accelerate development and get the system into production, we are growing our team. Since the team is new, each position we're hiring for is considered essential. Our new team members will be expected to take on large responsibilities and will impact all aspects of our work. While each role's main responsibilities are different, we will all work closely together to achieve our big ambition of highly automated and prosodically rich speech synthesis.

We are an international company, with colleagues in the larger Intelligence team based in Stockholm, Barcelona and Copenhagen. The Speech team is currently based in Stockholm, and while we hope to keep building the team in the Stockholm offices, we are open to working with the right candidates to find a solution that is great for both parties.

About the Role

As a Machine Learning Engineer focusing on Deep Learning, you will have a large responsibility for model implementation, our training infrastructure and the way we package our artifacts for serving. You will work on improving how we handle systems of interacting models, from NLP problems like text normalization and grapheme-to-phoneme (g2p) conversion all the way to audio. We also expect you to help improve how we use and build datasets for TTS, e.g. by setting up self-/semi-supervised learning for various models in our stack or working with active learning to improve the annotated part of our data. To do this well, we believe that you will also need to stay up to date with developments in deep learning and interact with the research and open source communities.

About You

You understand that, in practice, deep learning requires more engineering work than advertised, and that a significant amount of that work goes into everything from data preparation and versioning to deployment. Your strong understanding of neural networks helps you debug networks and understand various test-time trade-offs.

To be successful in this role we believe that you have:

🧑‍🎓 MSc degree, or similar, in Speech technology, Machine Learning, Computer Science, Mathematics, Physics, or a related field.

⭐️ Research or industry experience with Deep Learning.

🧠 Good knowledge of recent developments in at least one of: Self-/Semi-Supervised Learning, Active Learning, Distributed Training, Generative Models, Flow-based models, GANs, Autoregressive models, Transformers, Audio, Speech.

✅ Extensive knowledge of deep learning and experience getting deep neural networks into production.

☁️ In-depth understanding of one of the frameworks: TensorFlow, PyTorch or JAX.

👩‍💻 Strong Python development knowledge.

🇬🇧 Excellent written and verbal communication in English.


While not required, we would also love to hear about any of:

  • Experience working with deep learning in any of the following applications: natural language processing, speech processing, audio recognition, audio generation, content generation.
  • Experience with large scale training with multi-GPU / TPU setups and setting up distributed training (towered or multi-node).
  • Familiarity with tools for serving deep learning artifacts, such as TensorFlow Serving, TorchServe, TensorRT (Triton) etc.
  • Contributions to open source frameworks or tools for Deep Learning.
  • Experience using tools like Google Cloud, Docker, Kubernetes or Kubeflow.
  • Publications at top-tier ML conferences such as ICASSP, Interspeech, ACL, NeurIPS, ICLR, ICML or AAAI.

What we offer

  • Participate in developing a top-notch streaming entertainment platform used by over a million users worldwide
  • Plenty of autonomy and responsibility
  • Your own yearly education budget
  • A workplace that values creativity and personal initiative
  • Unlimited audiobooks and ebooks from our own service
  • An international team of super-talented colleagues
  • Explore, work with, and implement some of the newest and hottest technologies
  • A company full of book lovers

Does this sound like you? If you feel like Storytel is a place where you could thrive, let us know and we will contact you as soon as possible.

Additional information

  • Remote status

    Flexible remote



Tryckerigatan 4
111 28 Stockholm


The Storytellers in 4 words?
Friendly, Welcoming, Helpful, Innovative.

Number of cups of coffee and tea per day?
Impossible to know, we like our hot beverages.

Times we celebrate?
Whenever we have something to celebrate, which is quite often. We like to celebrate, preferably with cake.
