Capital One

At Capital One, we think and work like a tech company, using our digital fluency to transform everything about the customer experience. We’re bending data to our will, and turning a stodgy industry on its head. That’s reflected in our ranking as the number one business technology innovator in the U.S. in the 2016 InformationWeek Elite 100.

Senior Director, Machine Learning Engineering

Machine Learning Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 1994 | H1B Sponsor

Location

Virginia

Posted

55 days ago

Salary

$286.2K - $326.7K / year

Seniority

Senior

Job Description

  • Lead and scale a high-performing engineering organization responsible for the Personalization Platform that powers real-time, personalized product experiences and multi-channel targeted user messaging across Capital One products and services.
  • Define the technical strategy, delivery roadmap, and operating model for a portfolio spanning recommendation systems, ranking, decisioning, GenAI infrastructure, MLOps, and low-latency application-serving systems.
  • Build, develop, and manage engineers and engineering leaders; set a high bar for hiring, performance, talent density, coaching, and succession planning across the organization.
  • Partner cross-functionally with Product, Data Science, Cloud Infrastructure, and Machine Learning Platform teams to align strategy, prioritize investments, and co-develop advanced recommendation systems and algorithms serving Capital One users.
  • Drive the design, buildout, and operation of robust ML infrastructure and pipelines supporting feature extraction, model training, testing, guardrails, evaluation, deployment, and both real-time and batch inference with strong reliability, scalability, and operational rigor.
  • Architect low-latency, event-driven systems for real-time personalization and decisioning based on streaming data, user behavior, and contextual signals.
  • Drive the evolution of MLOps practices through automated, metrics-backed deployment workflows, validation and testing systems, model lifecycle governance, and scalable observability.
  • Guide the adoption of state-of-the-art AI and LLM optimization techniques to improve scalability, cost, latency, throughput, and reliability of large-scale production AI systems.
  • Provide organizational technical and people leadership by influencing architecture, engineering standards, delivery excellence, incident management, and cross-team strategy while mentoring managers, tech leads, and senior engineers.
  • Make high-judgment build-vs-buy decisions across a broad stack of open-source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, VectorDBs, NeMo Guardrails, PyTorch, and more.
  • Attract and retain top talent in the AI industry and nurture personal and professional development for your team.
  • Foster a culture of learning and staying abreast of the state of the art in AI.

Job Requirements

  • Bachelor's degree in Computer Science, Engineering, or AI plus at least 10 years of experience developing or leading AI and ML algorithms or technologies, or Master's degree plus at least 8 years of experience developing or leading AI and ML algorithms or technologies
  • At least 5 years of people leadership experience
  • 7 years of experience managing and leading an engineering team
  • 8+ years of experience deploying scalable, responsible AI solutions on major cloud platforms (AWS, GCP, Azure)
  • Master’s or PhD in Computer Science or a relevant technical field
  • Proven expertise designing, implementing, and scaling personalization platforms and recommendation systems across feed personalization, ads ranking, or targeted marketing messaging
  • Proficiency in Python, Java, C++, or Golang; hands-on experience with ML frameworks (PyTorch, TensorFlow) and orchestration tools (Databricks, Airflow, Kubeflow)
  • Experience optimizing large-scale training and inference systems for hardware utilization, latency, throughput, and cost
  • Deep expertise in cloud-native engineering, containerization (Docker, Kubernetes), and automated CI/CD deployment
  • Deep experience with MLOps, model observability, and production ML lifecycle management
  • Strong track record building organizations, developing managers and senior engineers, and leading through scale and ambiguity
  • Excellent communication and presentation skills, with the ability to influence senior stakeholders and articulate complex AI concepts clearly

Benefits

  • Comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being

Related Job Pages

More Machine Learning Engineer Jobs


Machine Learning Engineer for Master graduates

Microsoft

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to any characteristic protected by applicable local laws, regulations, and ordinances.

Full Time | Remote | Team 10,001+ | H1B Sponsor

Overview

Come build community, explore your passions, and do your best work at Microsoft while helping a team deliver machine learning (ML) capabilities that are reliable, measurable, and ready to integrate into real products and services. This opportunity will allow you to bring your aspirations, talent, potential and excitement for the journey ahead.

As a Machine Learning Engineer (MLE), you will contribute, under guidance and with support from teammates, to understanding ML requirements for a feature, preparing data, training baseline models, and evaluating results using standard metrics. This opportunity will allow you to grow your skills in ML workflows, model integration, and engineering practices that support security, privacy, accessibility, and responsible AI.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

  • Implements data preprocessing steps, baseline models, and evaluation approaches under supervision.
  • Contributes small, well-defined components of ML workflows (e.g., dataset preparation, training utilities, evaluation scripts) with guidance.
  • Supports integration of ML models into existing systems and services, coordinating with engineering partners as needed.
  • Follows established practices for reproducibility, logging, monitoring, and secure development throughout the ML lifecycle.
  • Collaborates with teammates and partner teams to validate end-to-end functionality prior to release.
  • Learns and applies team processes related to security, privacy, accessibility, and responsible AI in day-to-day work.

Qualifications

Required Qualifications

  • Master's degree in Computer Science or a related technical discipline with proven experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • Proficiency in speaking, writing, and reading English.

Preferred Qualifications

  • Proficiency in one programming language (Python preferred) and common ML libraries (e.g., scikit-learn, PyTorch, TensorFlow).
  • Understanding of ML and GenAI fundamentals (e.g., supervised learning, evaluation metrics, overfitting/underfitting, regularization, embeddings, tokenization, transformers).
  • Ability to manipulate structured and unstructured data (e.g., pandas, SQL).
  • Familiarity with Git, code reviews, testing, and debugging in a collaborative environment.
  • Interest in learning deployment patterns, monitoring/observability, and responsible AI practices.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

Mexico

Staff / Principal Machine Learning Engineer

Inworld AI

The leading developer platform for multimodal AI characters. Get generative AI purpose-built for real-time experiences.

Full Time | Remote | Team 51-200 | Since 2021 | H1B No Sponsor

  • Develop best-in-class real-time multimodal models and the orchestration platform optimized for thousands of queries per second.
  • Tackle unclear problems and find solutions that ensure performance, latency, and reliability as core product features.
  • Collaborate with global teams to design benchmarks or prototypes that uncover detailed insights on projects.
  • Ensure that all engineering outputs are stable and ship products that meet market needs.

Switzerland

Senior / Lead Machine Learning Engineer, Serving - Serbia

Inworld AI

The leading developer platform for multimodal AI characters. Get generative AI purpose-built for real-time experiences.

Full Time | Remote | Team 51-200 | Since 2021 | H1B No Sponsor

About Inworld

Inworld is a product-oriented research lab of top AI researchers and engineers, developing best-in-class real-time multimodal models and the only real-time orchestration platform optimized for thousands of queries per second. We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.

Who We're Looking For

A year ago, reliably working agentic systems and sub-second multimodal inference at scale barely existed. Nobody has a decade of experience here. So we're not screening for a resume template; we're looking for strong people from varied backgrounds who learn fast, thrive in ambiguity, and can show us what they've built, broken, and understood.

Experience We Find Useful

You don't need all of this. But you need enough to make a case.

  • Inference Optimization. Deep understanding of modern serving frameworks and techniques like vLLM or TRT-LLM.
  • Model Acceleration. Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
  • High-Performance Systems. Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs.
  • Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.
  • Public work. Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
  • Full-cycle ownership. You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.
  • Background. PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.
  • Professional fluency in English (written and spoken) is required, as you will be collaborating daily with our US-based leadership and engineering teams.

Who Thrives Here

  • You don’t need a roadmap to start walking; you’re comfortable picking a direction and building the map as you go.
  • You believe engineering isn't finished until it’s shipped and stable. You have a bias for impact over purely theoretical optimizations.
  • You don't just ship code; you obsess over the why. You’re the first to question an architecture if you think there’s a better way to solve the core latency or throughput problem.
  • You aren't satisfied with "the PM said so." You thrive on deep context and want to understand the fundamental logic behind every decision we make.

What Working Here Is Like

We hand you unclear problems and expect you to make them clear. We value engineers who say "I don't know yet" and then design the benchmark or prototype that finds out. We treat performance, latency, and reliability as first-class product features, not a box to check before launch. Impact comes before everything else, though we support sharing work and open-source contributions that move the field forward. Your work should be visible. Flat structure, fast iterations, minimal process theater.

For candidates interested in relocating to the San Francisco Bay Area in the future, full U.S. visa and relocation support may be available, subject to business needs and applicable legal and work authorization requirements.

Serbia

MLOps Engineer – IoT

The Formula Consulting

Recruitment consultancy in Berlin. Placement of IT professionals. 📞 03052105983

Full Time | Remote | Team 1-10 | Since 2023 | H1B No Sponsor

  • Join a team of 35 tech-savvy Internet of Things and sensor technology engineers.
  • Partnering with your hiring manager, you design, build, and maintain scalable infrastructure for ML models.
  • Collaborate closely with AI/ML teams to bring models reliably into production.
  • Manage model versioning, experiment tracking, and reproducibility.
  • Implement CI/CD for ML systems.
  • Ensure performance, reliability, and security of deployed models in critical environments.
  • Set up monitoring, logging, and alerting for model performance.
  • Occasionally support field operations by acquiring sensor data on site and validating the technology through controlled field trials.

Germany
€70K - €100K / year
Job Closed