Machine Learning Engineer – AI Architecture Research
Location
Worldwide
Posted
142 days ago
Salary
Not specified
Seniority
Senior
Job Description
Featherless AI
- Research and develop new neural network architectures (e.g. alternatives or extensions to Transformers, recurrent / hybrid models, long-context systems)
- Design and run architecture-level experiments (scaling laws, memory mechanisms, compute trade-offs)
- Prototype models end-to-end, from research code to training-ready implementations
- Collaborate with inference and systems engineers to ensure architectures are deployable and efficient
- Analyze model behavior, failure modes, and inductive biases
- Read, reproduce, and extend cutting-edge research papers
- Contribute to internal research notes, benchmarks, and open-source efforts (where applicable)
Job Requirements
- Strong background in machine learning fundamentals and deep learning
- Hands-on experience implementing model architectures from scratch
- Solid understanding of:
  - Attention mechanisms, RNNs, state-space models, or hybrid architectures
  - Training dynamics, scaling behavior, and optimization
  - Memory, latency, and compute constraints at the model level
- Comfortable working in PyTorch or JAX
- Ability to move fluidly between theory, experimentation, and engineering
- Clear communicator who can explain architectural trade-offs
Nice to Have
- Experience with non-Transformer architectures (RNN variants, SSMs, long-context models)
- Background in research-driven startups or open-source ML projects
- Experience with large-scale training or custom training loops
- Publications, preprints, or notable research contributions
- Familiarity with inference optimization and deployment constraints
Benefits
- Competitive compensation + meaningful equity