Centific

Unlock the Value of AI and Unleash the Possibilities

AI Research Engineer: Vision AI / VLM / Physical AI-2

AI Research Scientist | Machine Learning Engineer | Full Time | Remote | Mid Level | Team 5,001-10,000 | H1B No Sponsor

Location

United States

Posted

5 days ago

Salary

$140K - $150K / year

Seniority

Mid Level

Job Description

Role Description

Are you pushing the frontier of computer vision, multimodal large models, and embodied/physical AI, and do you have the publications to show it? Join us to translate cutting-edge research into production systems that perceive, reason, and act in the real world. We are building state-of-the-art Vision AI across 2D/3D perception, egocentric/360° understanding, and multimodal reasoning. As an AI Research Engineer, you will own high-leverage experiments from paper → prototype → deployable module in our platform.

We are seeking passionate engineers to join our cutting-edge labs. You could be part of:

- Computer Vision team: Dive into the world of 3D reconstruction, scene understanding, and visual AI. Explore innovative techniques like those used to transform real-world spaces into immersive 3D models.
- Physical AI Robotics team: Work at the intersection of simulation, robotics, and AI. Leverage NVIDIA’s Omniverse for advanced 3D simulation and collaboration.

What You’ll Do:

- Advance Visual Perception: Build and fine-tune models for detection, tracking, segmentation (2D/3D), pose and activity recognition, and scene understanding (including 360° and multi-view).
- Multimodal Reasoning with VLMs: Train and evaluate vision-language models (VLMs) for grounding, dense captioning, temporal QA, and tool use.
- Physical AI & Embodiment: Prototype perception-in-the-loop policies that close the gap from pixels to actions.
- Data & Evaluation at Scale: Curate datasets, author high-signal evaluation protocols/KPIs, and run ablations.
- Systems & Deployment: Package research into reliable services on a modern stack (Kubernetes, Docker, Ray, FastAPI).
- Agentic Workflows: Orchestrate multi-agent pipelines that combine perception, reasoning, simulation, and code generation.

Example Problems You Might Tackle:

- Long-horizon video understanding from egocentric or 360° video.
- 3D scene grounding: linking language queries to objects, affordances, and trajectories.
- Fast, privacy-preserving perception for on-device or edge inference.
- Robust multimodal evaluation: temporal consistency, open-set detection, uncertainty.
- Vision-conditioned policy evaluation in simulation with sim2real stress tests.

Qualifications

- Master's/Ph.D. in CS/EE/Robotics (or related), actively publishing in CV/ML/Robotics.
- Strong PyTorch (or JAX) and Python; comfort with CUDA profiling and mixed-precision training.
- Demonstrated research in computer vision and at least one of: VLMs, embodied/physical AI, 3D perception.
- Proven ability to move from paper → code → ablation → result with rigorous experiment tracking.

Requirements

- Experience with video models (e.g., TimeSformer/MViT/VideoMAE), diffusion or 3D GS/NeRF pipelines, or SLAM/scene reconstruction.
- Prior work on multimodal grounding or temporal reasoning.
- Familiarity with ROS2, DeepStream/TAO, or edge inference optimizations.
- Scalable training: Ray, distributed data loaders, sharded checkpoints.
- Strong software craft: testing, linting, profiling, containers, and reproducibility.
- Public code artifacts (GitHub) and first-author publications or strong open-source impact.

Benefits

- Real impact: Your research ships, powering core features in our MVPs and products.
- Mentorship: Work closely with our Principal Architect and senior engineers/researchers.
- Velocity + Rigor: We balance top-tier research practices with pragmatic product focus.
- Salary: $140K - $150K

Related Job Pages

More AI Research Scientist Jobs

Intern – Agentic AI Research

Cotiviti

Enabling a high-quality and viable healthcare system

Internship | Remote | Team 5,001-10,000 | H1B Sponsor

• Research and develop advanced healthcare informatics solutions with a specialization in Agentic AI.
• Explore applications of Agentic and Generative AI in healthcare.
• Develop, analyze, and collaborate on agentic AI projects.
• Work closely with cross-functional teams to drive innovative research and practical implementation in healthcare environments.

United States
$32 - $40 / hour

AI Scientist

Proxima

Decode and design the interfaces of life

Full Time | Remote | Team 11-50 | Since 2019 | H1B Sponsor

• Scientifically direct the design and training of large-scale, state-of-the-art deep learning systems.
• Develop novel architectures and training paradigms to lead the industry in unsolved scientific problems.
• Collaborate with content experts from other domains (e.g., chemistry, physics, biology) to enable innovative feature engineering and novel cross-disciplinary approaches.
• Actively contribute to top-tier ML conferences and journals and attend core ML conferences to stay connected with the community and current trends.

South Korea

AI Research Engineer – Pre-training, LLM, Multi-Modal

Tether.to

Bringing real world currency to the blockchain.

Full Time | Remote | Team 11-50 | Since 2014 | H1B No Sponsor

• Conduct foundational pre-training for LLMs and multi-modal models (integrating text, vision, audio, or other modalities) on large, distributed clusters with multiple nodes and thousands of NVIDIA GPUs.
• Design, prototype, and scale innovative architectures, tokenizers, and cross-modal alignment layers to enhance model intelligence and multi-modal understanding.
• Source, filter, and curate massive-scale textual and multi-modal datasets, establishing robust data pipelines for efficient pre-training.
• Independently and collaboratively execute experiments, analyze results, and refine training methodologies for optimal performance and token efficiency.
• Investigate, debug, and eliminate bottlenecks in model efficiency, computational performance, and multi-modal alignment stability during long training runs.
• Contribute to the advancement of distributed training systems to ensure seamless scalability and hardware efficiency on target platforms.

United Arab Emirates

AI Research Engineer – Pre-training, LLM, Multi-Modal

Tether.to

Bringing real world currency to the blockchain.

Full Time | Remote | Team 11-50 | Since 2014 | H1B No Sponsor

• Conduct foundational pre-training for LLMs and multi-modal models on large, distributed servers.
• Design, prototype, and scale innovative architectures, tokenizers, and cross-modal alignment layers.
• Source, filter, and curate massive-scale textual and multi-modal datasets.
• Independently and collaboratively execute experiments, analyze results, and refine training methodologies.
• Investigate, debug, and eliminate bottlenecks in model efficiency.
• Contribute to the advancement of distributed training systems.

Italy