Bringing real world currency to the blockchain.
Research Engineer Intern, Video/Multimodal LLM
Location
Hungary
Posted
78 days ago
Salary
0
Seniority
Entry Level
Job Description
Tether.to
- Research and develop state-of-the-art LLM and/or video or multimodality models
- Implement the application of large models in business scenarios
- Continuously optimize the training framework for large models on GPUs
- Collaborate with broader teams across Tether
Job Requirements
- MSc/PhD candidate in computer science or a related technical discipline
- Related research experience in areas such as LLMs, computer vision, or multimodality
- Proficient with the PyTorch deep learning framework and related libraries
- Excellent analytical and problem-solving skills
- Publications in top AI conferences are a nice-to-have
Benefits
- Innovation and hands-on learning experience
- Community-building and development events
- Collaboration with industry experts
- Opportunities for a dynamic internship experience
More Research Engineer Jobs
- Research and improve open-source video and multimodal video generation foundation models.
- Focus on one or more areas such as pre-training, supervised fine-tuning, post-training, inference, architecture design, or evaluation.
- Benchmark models against current state-of-the-art, identify bottlenecks, and propose novel improvements.
- Work with large-scale video datasets and distributed training systems.
- Collaborate with researchers and engineers on projects with clear research and publication potential.
Senior Research Engineer – Multimodal, Video Foundation Model
Tether.to
- Pioneer multimodal and video-centric research that moves fast and breaks ground, contributing directly to usable prototypes and scalable systems
- Design and implement novel AI architectures for multimodal language models, integrating text, visual, and audio modalities
- Engineer scalable training and inference pipelines optimized for large-scale multimodal datasets and distributed GPU systems across thousands of GPUs
- Optimize systems and algorithms for efficient data processing, model execution, and pipeline throughput
- Build modular tools for preprocessing, analyzing, and managing multimodal data assets (e.g., images, video, text)
- Collaborate cross-functionally with research and engineering teams to translate cutting-edge model innovations into production-grade solutions
- Prototype generative AI applications showcasing new capabilities of multimodal foundation models in real-world products
- Develop benchmarking tools to rigorously evaluate model performance across diverse multimodal tasks
