Company
DeepL SE
About this role
About DeepL SE
DeepL is a global communications platform powered by Language AI. Since 2017, we’ve been on a mission to break down language barriers. Our human-sounding translations and intelligent writing suggestions are designed with enterprise security in mind. Today, they enable over 100,000 businesses to transform communications, reach new markets, and improve productivity. They also empower millions of individuals worldwide to make sense of the world and express their ideas.
Our goal is to become the global leader in Language AI, building products that drive better communication, foster connections, and make a real-life impact. To achieve this, we need talented individuals like you to join our exciting journey. If you're ready to work with a dynamic team and build your career in the fast-moving AI space, DeepL is your next destination.
What sets us apart
What sets us apart is our blend of modern technology, competitive benefits, and an open, welcoming work culture that enables our people to thrive. When we share what it's like to work at DeepL, the reactions are overwhelmingly positive. This may be because of our products, which have helped countless people worldwide, or our shared mission to improve communication for individuals and businesses, bringing cultures closer together. What we know for sure is this: being part of DeepL means joining a team dedicated to innovation and employee well-being. Discover what our teams have to say about life at DeepL on LinkedIn, Instagram and our Blog.
Meet the team behind this journey
The HPC Engineering team plays a pivotal role within DeepL Research, striving to achieve the best possible computational performance with our AI technology. At DeepL, High Performance Computing means working across a software stack that we control end-to-end. This ranges from targeted optimizations for latest-generation GPUs, through efficiently parallelized LLM training at scale, to high-throughput, low-latency inference solutions serving millions of users.
We build and improve the core components of our deep learning software and iterate on model design, closely collaborating with research scientists in both the foundational model group and the language AI teams. As a member of our team, you will work in a highly dynamic environment with ample opportunities to make a significant impact across our research and product portfolio.
Responsibilities
- Enable efficient parallelization on highly interconnected GPU clusters and design ideal strategies for large-scale model training
- Build robust solutions for multimodal LLM inference in high-throughput and low-latency scenarios
- Dive deep into our tech stack: Profile, debug, and optimize GPU kernels and low-level systems configuration
- Together with our researchers, ensure that state-of-the-art models are implemented correctly and efficiently
- Evaluate the latest technologies and develop innovative ideas that push the boundaries of what is computationally possible
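To illustrate the kind of parallelization work the list above describes, here is a minimal pure-Python sketch of the ring all-reduce collective, the pattern that libraries such as NCCL implement on GPU interconnects to sum gradients across ranks. The function and variable names are hypothetical, and the serial double loop only simulates what real implementations do concurrently across devices.

```python
# Illustrative sketch of a ring all-reduce over n ranks, each holding
# n chunks of data. Real libraries (e.g. NCCL) run the sends and
# receives concurrently; here we simulate the steps serially.

def ring_allreduce(data):
    """data[r][c] is chunk c (a list of numbers) held by rank r.

    After the call, every rank holds the element-wise sum of all
    ranks' chunks, as with a sum all-reduce.
    """
    n = len(data)  # number of ranks == number of chunks per rank
    # Phase 1: reduce-scatter. After n - 1 steps, each rank r holds
    # the fully reduced chunk (r + 1) % n.
    for step in range(n - 1):
        for r in range(n):
            src = (r - 1) % n
            c = (r - step - 1) % n  # chunk arriving from the left neighbour
            data[r][c] = [a + b for a, b in zip(data[r][c], data[src][c])]
    # Phase 2: all-gather. The completed chunks travel once around the ring.
    for step in range(n - 1):
        for r in range(n):
            src = (r - 1) % n
            c = (r - step) % n  # completed chunk arriving from the left
            data[r][c] = list(data[src][c])
    return data

# Example: 3 ranks, each holding 3 chunks of 2 "gradient" values
# (rank r holds the value r everywhere, so every sum is 0 + 1 + 2 = 3).
grads = [[[float(r)] * 2 for _ in range(3)] for r in range(3)]
result = ring_allreduce(grads)
```

The appeal of the ring topology is that each rank only ever talks to its neighbours, so total bytes sent per rank stay near-optimal regardless of cluster size.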
Skills
- You have strong analytic skills and a scientific approach to technical challenges, as demonstrated, for example, by a master’s or doctoral degree in computer science, mathematics, physics, or a related field
- You are an accomplished software engineer, enjoy solving tough problems, and thrive when building solutions independently
- You have a track record of taking responsibility for projects from conception through implementation to production use
- You bring strong experience in Python, including its bindings to native code in C++ or Rust
- You are deeply familiar with PyTorch and the AI software ecosystem. Key technologies include the PyTorch Distributed library, large-scale training frameworks such as Megatron-LM or TorchTitan, and highly optimized inference libraries such as TensorRT-LLM, vLLM, or SGLang
- You have a good understanding of the high-performance programming model for GPUs and of the collective communication primitives used in parallel computing with MPI or NCCL
- High proficiency in English; knowledge of German or other languages is a plus
- Nice-to-haves:
- Experience in optimizing and troubleshooting workloads on GPU compute clusters at scale
- A thorough understanding of today’s state-of-the-art Transformer architectures, which allows you to reason about their computational envelopes and bottlenecks
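The last bullet, reasoning about computational envelopes and bottlenecks, can be made concrete with a back-of-envelope check: is single-sequence autoregressive decoding compute-bound or memory-bandwidth-bound? All hardware numbers and model sizes below are illustrative assumptions, not a specific GPU's specification.

```python
# Roofline-style sketch: for batch-1 decoding, each generated token does
# roughly 2 FLOPs per weight (one multiply-add) while every weight must
# be streamed from GPU memory once. Comparing the two times tells us
# which resource limits throughput.

def decode_step_bound(n_params, bytes_per_param, peak_flops, mem_bw):
    """Classify one-token decode for one sequence on given hardware."""
    flops = 2 * n_params                      # multiply-add per weight
    bytes_moved = n_params * bytes_per_param  # weights read from HBM
    compute_time = flops / peak_flops
    memory_time = bytes_moved / mem_bw
    return "memory-bound" if memory_time > compute_time else "compute-bound"

# Hypothetical 7B-parameter model in fp16 (2 bytes/param) on a GPU with
# ~1e15 FLOP/s of dense fp16 throughput and ~3e12 B/s of HBM bandwidth:
verdict = decode_step_bound(7e9, 2, 1e15, 3e12)  # → "memory-bound"
```

With these assumed numbers, reading the weights takes two orders of magnitude longer than the arithmetic, which is why batching and KV-cache-aware schedulers matter so much for inference throughput.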
Location
Address
Cologne, Germany