Impact Through Innovation
Principal ML Engineer
Location: United States
Posted: 5 days ago
Salary: $200K - $300K / year
Seniority: Lead
Job Description
Red Cell Partners
- End-to-End Product Engineering: Rapidly prototype and scale full-stack applications, ensuring seamless integration between UI, business logic, and ML inference.
- Architectural Leadership: Design modular, extensible system architectures that support rapid iteration without accruing technical debt.
- Technical Governance: Lead code reviews across the stack and serve as escalation point for engineering, infrastructure, and security decisions.
- Risk & Remediation: Identify scalability bottlenecks and security vulnerabilities and present remediation strategies.
- Bridge Research & Product: Translate experimental ML techniques into production-ready features.
- Infrastructure & Compliance: Design infrastructure aligned with government audit requirements (e.g., HIPAA, FedRAMP).
- Security-First Development: Implement IAM and secure data practices across the lifecycle.
- AI-First Product Ownership: Drive conception and execution of AI-first products, maintaining a strong bias toward applying state-of-the-art capabilities in production environments.
Job Requirements
- Bachelor’s degree in Computer Science or equivalent experience.
- 8–10 years of experience in software development with a track record of shipping full-stack products from 0 to 1.
- Expert proficiency in React/TypeScript and backend systems (Python, Rust, or Go), including API design.
- 5+ years building and operating production ML systems with focus on deployment and inference.
- Deep experience with cloud architecture and identity management (e.g., Microsoft Entra ID, Auth0, Okta) in regulated environments.
- Strong data engineering skills across SQL, NoSQL, and vector databases.
- Demonstrated experience as a technical lead or principal engineer.
- AI Innovation Mindset: Demonstrated passion for staying at the forefront of AI advancements and incorporating cutting-edge capabilities into production systems.
- U.S. Citizenship: Must be a U.S. citizen to support work on government contracts and controlled environments.
Benefits
- Career-track opportunity with potential for rapid advancement, based on strong performance, as the firm grows.
- 100% employer-paid comprehensive health care, including medical, dental, and vision, for you and your family.
- Paid maternity and paternity leave for 14 weeks at the employee's normal pay.
- Unlimited PTO, with management approval.
- Opportunities for professional development and continued learning.
- Optional 401K, FSA, and equity incentives available.
- Mental health benefits are available through Tara Mind.
- Cost-effective GLP-1 solutions available through Crux.
More Machine Learning Engineer Jobs
Machine Learning Engineer – Specialist II
Grupo Boticário
We create opportunities for beauty to transform people's lives, and in doing so transform the world around us.
- Building end-to-end Machine Learning pipelines, including data preparation, code optimization, and automation of model training and prediction routines
- Monitoring quality, efficiency, cost, and usage metrics for data products
- Providing technical leadership for the team's MLOps process, defining best practices for developing and maintaining solutions
- Developing cloud architectures for data and AI solutions (GCP, AWS, Azure)
- Building data pipelines with embedded Gen AI models (e.g., preparing unstructured data into structured form)
- Developing pilots together with the business area to prototype concepts and prove value
AI/ML Engineer – SC Cleared
Cloud Bridge
Harness the full potential of AWS with award-winning Premier Partner, Cloud Bridge
- Attend technical and IT meetings to identify inefficiencies and automation opportunities
- Translate business pain points into clear AI/ML use cases and delivery plans
- Design and build LLM- and ML-based solutions to improve internal processes
- Deploy automation tools and AI agents that enhance quality, speed, and consistency
- Collaborate with architects and stakeholders, owning delivery of well-scoped initiatives
- Rapidly prototype and build full-stack tools and visualizations to support researchers and entrepreneurs.
- Design, implement, test, and debug code across front-end, back-end, and data pipelines.
- Collaborate with research teams to translate cutting-edge ML techniques into production-ready solutions.
- Work with entrepreneurs and users to gather requirements, incorporate feedback, and iterate on product development.
- Develop and optimize robust pipelines for model fine-tuning, evaluation, and deployment.
- Establish best practices for reliable and reproducible ML model development.
- Contribute to the creation of scalable, high-performance infrastructure for AI-driven products.
MLOps Engineer
Rackner
DevSecOps and AI from Cloud to Mission Edge | Kubernetes Partner | Multicloud | 8(a) | HUBZone
Role Description
At Rackner, we build systems where advanced technologies move beyond prototypes and into real-world operational use. We are seeking an MLOps Engineer to support the deployment and lifecycle management of AI/ML systems within a secure, mission-focused environment.
This is not a research role. This is where models become reliable, deployable, and auditable systems.
You will operate at the intersection of:
- machine learning
- cloud-native infrastructure
- distributed systems
…and ensure AI/ML systems are production-ready in environments where reliability and performance matter.
What You’ll Do
- Own the ML Lifecycle (End-to-End)
  - Build and operate production-grade ML pipelines
  - Orchestrate workflows using Kubeflow, Airflow, or Argo
  - Implement model versioning, lineage, and reproducibility standards
- Operationalize AI/ML Systems
  - Deploy models into secure and constrained environments
  - Transition workflows from experimentation → containerized pipelines → production systems
  - Enable both batch and real-time inference architectures
- Engineer for Reliability
  - Design systems for reproducibility, auditability, and stability
  - Monitor model performance and system health using Prometheus, Grafana, OpenTelemetry
  - Detect and resolve issues such as model drift and system degradation
- Build Cloud-Native ML Infrastructure
  - Deploy and manage Kubernetes-based ML workloads
  - Containerize pipelines using Docker
  - Support scalable training and inference workflows
- Establish Data Discipline
  - Support feature engineering and dataset preparation
  - Implement data versioning and governance practices (e.g., lakeFS)
  - Apply metadata and data management standards
- Create Repeatable Systems
  - Develop runbooks, playbooks, and documentation
  - Build systems that are operationally sustainable and transferable
Qualifications
- Experience deploying ML systems into production environments
- Strong programming skills in Python
- Hands-on experience with:
  - ML pipeline tools (Kubeflow, Airflow, Argo)
  - Experiment tracking tools (MLflow, ClearML)
- Experience with Kubernetes and containerized systems (Docker)
- Familiarity with CI/CD pipelines
- Understanding of distributed systems and scalable architectures
- Experience working with:
  - LLMs or transformer-based models
  - Computer vision systems (YOLO, Faster R-CNN)
- Focus on deployment and integration, not pure research
Mindset
- Systems thinker who prioritizes reliability over novelty
- Comfortable operating in complex, evolving environments
- Focused on delivering real-world outcomes
Clearance Requirements
- Active TS/SCI clearance strongly preferred
- Candidates with an active Secret clearance may be considered and supported for upgrade
- Candidates without an active clearance must be:
  - U.S. citizens
  - Eligible to obtain and maintain a clearance
  - Able to work in a CAC-enabled or secure environment
Note: Start timelines and work scope may vary depending on clearance status and program requirements.
Benefits
- 100% covered certifications & training aligned to your role
- 401(k) with 100% match up to 6%
- Highly competitive PTO
- Comprehensive Medical, Dental, Vision coverage
- Life Insurance + Short & Long-Term Disability
- Home office & equipment plan
- Industry-leading weekly pay schedule
Company Description
Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector. We are an energetic, growing team focused on solving complex problems through:
- Distributed systems
- DevSecOps
- AI/ML
- Cloud-native architecture
Our approach is cloud-first, cost-effective, and outcome-driven, delivering systems that scale and perform in real-world environments.
If you’re an engineer who wants to move from building models → owning production systems, we’d like to connect.



