Bringing real world currency to the blockchain.

AI Research Engineer – Kernel, Inference Optimization

AI Research Scientist | Machine Learning Engineer | Full Time Remote | Senior | Team 11-50 | Since 2014 | H1B No Sponsor

Company Site LinkedIn

Location

United Kingdom

Posted

8 days ago

Salary

Seniority

Senior

Postgraduate Degree | English | Flash

Job Description

• Drive innovation in model serving and inference architectures for advanced AI systems.
• Focus on optimizing model deployment and inference strategies to deliver highly responsive, efficient, and scalable performance across real-world applications.
• Work on a wide spectrum of systems, ranging from resource-efficient models designed for limited hardware environments to complex, multi-modal architectures that integrate data such as text, images, and audio.
• Develop, test, and implement novel serving strategies and inference algorithms.
• Engineer robust inference pipelines, establish comprehensive performance metrics, and identify and resolve bottlenecks in production environments.
• Enable high-throughput, low-latency, low-memory footprint, and scalable AI performance that delivers tangible value in dynamic, real-world scenarios.

Job Requirements

A degree in Computer Science or related field.
Ideally a PhD in NLP, Machine Learning, or a related field, complemented by a solid track record in AI R&D (with strong publications at A* conferences).
Must have knowledge of Metal Shading Language (MSL).
Proven experience in low-level kernel optimizations and inference optimization on mobile devices is essential.
Your contributions should have led to measurable improvements in inference latency, throughput, and memory footprint for domain-specific applications, particularly on resource-constrained devices and edge platforms.
A deep understanding of modern model serving architectures and inference optimization techniques is required.
Must have strong expertise in writing GPU kernels for mobile devices (e.g., smartphones) as well as a deep understanding of model serving frameworks and engines.
Practical experience in developing and deploying end-to-end inference pipelines, from optimizing models for efficient serving to integrating these solutions on resource-constrained devices, is required.
Demonstrated ability to apply empirical research to overcome challenges in model serving, such as latency optimization, computational bottlenecks, and memory constraints.
You should be proficient in designing robust evaluation frameworks and iterating on optimization strategies to continuously push the boundaries of inference performance and system efficiency.
Distributed Inference Systems: Designing and optimizing high-performance inference engines using techniques like Tensor Parallelism, Pipeline Parallelism, and Expert Parallelism to handle massive models on GPU clusters.
Deep understanding of the math and structure behind Diffusion Models and Vision Transformers.
Understanding of Pruning, Quantization, FlashAttention, KV Cache, Speculative Decoding (e.g., EAGLE), etc.

Benefits

Flexible working hours
Professional development opportunities

Related Categories

AI Research Scientist AI Engineer Machine Learning Engineer LLM Engineer Computer Vision Engineer NLP Engineer

Related Job Pages

Remote Full-time Jobs (US) | More Remote Jobs

More AI Research Scientist Jobs

AI Research Engineer – Applied AI

Bright Vision Technologies

AI Research Scientist | 9 days ago

Full Time Remote

• Design, prototype, and evaluate applied AI solutions across natural language, vision, recommendation, and structured data domains.
• Translate ambiguous business problems into well-scoped ML formulations with clear success metrics and evaluation strategies.
• Stay current with the latest research in deep learning, large language models, and adjacent areas, and assess applicability to internal use cases.
• Implement rigorous experimentation workflows including baselines, ablations, and statistically sound evaluation methodology.
• Build production-quality training and inference pipelines using modern ML frameworks and orchestration tools.
• Collaborate with ML platform engineers to ensure efficient use of compute, storage, and accelerator resources.
• Optimize models for accuracy, latency, throughput, and cost based on production requirements.
• Develop tooling for dataset construction, labeling, validation, and ongoing monitoring of data quality.
• Partner with product, design, and domain experts to ensure model behavior aligns with user needs and policy requirements.
• Implement safety, fairness, and reliability evaluations and incorporate findings into model selection decisions.
• Document research findings, design decisions, and operational characteristics clearly for both technical and non-technical audiences.
• Mentor engineers on applied ML methodology, evaluation rigor, and responsible deployment.
• Contribute to internal knowledge sharing, reading groups, and prototype-to-production playbooks.
• Influence the broader AI roadmap based on research insight, capability gaps, and emerging opportunities.

Python PyTorch

View details: AI Research Engineer – Applied AI

United States

Lead Bioinformatics AI Scientist

Baylor Genetics

Baylor Genetics pioneered the history of genetic testing. Now, we’re leading the way in precision diagnostics.

AI Research Scientist | 10 days ago

Full Time Remote | Team 501-1,000 | Since 1978 | H1B No Sponsor

Company Site LinkedIn

• Serves as the visionary leader in Bioinformatics AI application development in a clinical genetic testing setting.
• Provides technical guidance and hands-on support towards building the company’s next-generation bioinformatics AI platform.
• Identifies, prototypes, and develops state-of-the-art AI applications to revolutionize clinical testing and genomic analysis workflows.
• Designs, develops, evaluates, and deploys novel AI solutions to gain valuable data insights from genetic, phenotypic, and clinical datasets.
• Evaluates, adopts, and customizes GenAI models based on both internal and external datasets to build next-generation clinical genetic testing platforms.
• Supports both internal and external data requirements by leveraging AI and GenAI capabilities to keep up with the increasing demands of the business.
• Collaborates in a multidisciplinary and regulated clinical diagnostics environment with geneticists, bioinformaticians, software engineers, and IT infrastructure professionals.

AWS Azure Cloud Google Cloud Platform Python PyTorch SQL Tensorflow

View details: Lead Bioinformatics AI Scientist

United States

Legal Expert – AI Research Insights

Terac

Democratizing the future of market research with AI

AI Research Scientist | 10 days ago

Contract Remote | Team 1-10 | Since 2025 | H1B No Sponsor

Company Site LinkedIn

• Conduct a research study with Legal professionals
• Collect real-world professional scenarios to help train AI models
• Submit scenarios resolved through conversation or consultation
• Follow validation criteria for scenario submissions

View details: Legal Expert – AI Research Insights

United States

$12 - $220 / hour

Job Closed

AI Researcher Intern

GenScript

Make People and Nature Healthier through Biotechnology

AI Research Scientist | 10 days ago

Internship Remote | Team 5,001-10,000 | Since 2002 | H1B Sponsor

Company Site LinkedIn

• Research and design Agent execution framework, providing standardized runtime environment for intelligent agents
• Implement tool call orchestration mechanism, supporting unified abstraction for function calling, API integration, and external system interaction
• Build execution sandbox environment to ensure safety and controllability of Agent operations
• Design task decomposition and planning engine, supporting automatic breakdown of complex goals and execution path optimization
• Implement execution state tracking and anomaly recovery mechanisms to ensure reliability of long-running tasks
• Design hierarchical memory architecture, covering storage and retrieval mechanisms for working memory, short-term memory, and long-term memory
• Research memory compression and summarization techniques, enabling efficient storage of massive interaction history while preserving key information
• Build context-aware memory system, supporting multi-dimensional memory association based on time, task, and user
• Develop memory retrieval augmentation mechanisms, achieving deep integration of RAG and Agent memory
• Explore memory forgetting and update strategies, balancing memory capacity with information timeliness
• Research multi-Agent system architecture, design communication protocols and collaboration mechanisms between Agents
• Implement role specialization and task allocation algorithms, supporting orchestration of expert Agents, coordinator Agents, executor Agents, and other roles
• Build consensus achievement and conflict resolution mechanisms to handle decision disagreements among multiple Agents
• Design Agent social behavior norms, simulating communication, negotiation, and feedback patterns in human team collaboration
• Explore emergent behavior and collective intelligence, researching self-organization and adaptive capabilities in multi-Agent systems
• Design Agent evaluation and benchmarking system, establishing quantitative capability metrics
• Build Agent behavior interpretability framework, supporting decision process tracing and attribution analysis
• Research Agent safety alignment mechanisms to prevent risks such as unauthorized operations, harmful outputs, and goal drift
• Track cutting-edge Agentic AI research and translate academic achievements into engineering practice.

Docker Kubernetes Python

View details: AI Researcher Intern

New Jersey

$30 / hour
