Company
NVIDIA GmbH
About this position
About NVIDIA GmbH
Our work at NVIDIA is dedicated to a computing model focused on visual and AI computing. For two decades, NVIDIA has pioneered visual computing, the art and science of computer graphics, with our invention of the GPU. The GPU has also proven to be spectacularly effective at solving some of the most complex problems in computer science. Today, NVIDIA's GPU simulates human intelligence, running deep learning algorithms and acting as the brain of computers, robots, and self-driving cars that can perceive and understand the world. We are looking to grow our company and teams with the smartest people in the world, and there has never been a more exciting time to join our team!
NVIDIA's accelerated computing platform is the foundation of modern HPC and AI. At the core of this platform are the CUDA Core Libraries: C++ and Python libraries that enable developers to write fast, reliable, scalable GPU-accelerated software. We are looking for outstanding interns to contribute to the CUDA Core Libraries that power GPU computing for both C++ and Python developers. This includes projects like CCCL (Thrust, CUB, libcudacxx), cuda-python, and numba-cuda. Join the team building the foundational libraries, algorithms, language, and compiler infrastructure that make CUDA a speed-of-light delight for developers across a wide range of workloads, including deep learning, scientific computing, and data analytics.
Responsibilities
- Contribute to the design and implementation of CUDA Core Libraries in C++ and/or Python, including parallel algorithms and language-idiomatic exposure of core CUDA concepts.
- Design and optimize GPU algorithms and APIs, from high-level interfaces down to low-level performance tuning involving memory, parallelism, and synchronization.
- Improve developer experience: tests, benchmarks, CI, packaging, documentation, and examples.
- Collaborate with experienced CUDA engineers; participate in design reviews, code reviews, and open-source-style workflows.
Qualifications
- Currently pursuing a BS, MS, or PhD in Computer Science, Computer Engineering, or a related field.
- Strong programming skills in C++, Python, or both, with interest in systems-level development (performance, memory, concurrency, API design).
- Familiarity with modern C++ (templates, generics, standard library) and/or Python library development and packaging.
- Experience with parallel or heterogeneous programming (CUDA, OpenMP, GPU-accelerated Python, or similar) through coursework, projects, or research.
- Experience with software libraries or open-source projects, including testing, performance profiling, and code reviews.
- Ability to work independently and drive a project from exploration to completion.
- Clear written communication for design discussions and documentation.
Ways to stand out from the crowd
- Knowledge of CPU/GPU architecture and how hardware details impact algorithmic performance.
- Hands-on experience with CUDA C++, CUDA Python, PyTorch, JAX, Numba, CuPy, or related GPU-accelerated Python stacks.
- Familiarity with libraries such as Thrust, CUB, libcudacxx, or similar modern C++/GPU libraries.
- Familiarity with compiler infrastructure and tooling such as LLVM, Clang, or MLIR.
- Comfort navigating and debugging large, multi-language codebases (C++, Python, CMake, GitHub Actions CI), and a demonstrated interest in developer tools, library design, and making other developers faster and more productive.
Location
Address
Munich, Germany