We give your data purpose - Tech Company focused on innovation and artificial intelligence.
Working Student AI Platform Engineer
Location
United States
Posted
73 days ago
Salary
0
Seniority
Entry Level
Job Description
engaige GmbH
- Design and implement complex AI components, from initial concept to production
- Enhance and scale existing AI solutions to improve performance and efficiency
- Develop robust data pipelines, backends, and API interfaces for seamless integration
- Build and maintain automated deployment pipelines to streamline releases and testing
- Ensure the quality and performance of backend processes and NLP algorithms through continuous optimization
- Work closely with the product team to translate data-driven insights into market-ready, innovative solutions
Job Requirements
- Currently enrolled in a degree program in Computer Science, Mathematics, Statistics, Business Informatics, or a related field
- Experience with object-oriented development in Python and familiarity with architectural patterns
- Strong knowledge of Python and FastAPI, particularly for high-performance, scalable backend applications
- Experience developing production-ready machine learning models and scalable data solutions
- Confident working with cloud technologies and infrastructures (e.g., AWS, Azure) and containerization (Docker, Kubernetes)
- Familiarity with modern development tools (e.g., Git, VS Code) and CI/CD pipelines (e.g., GitLab CI/CD)
- Strong communication skills in German and English
- Innovative mindset and the ability to work effectively in an agile, dynamic environment
Benefits
- An innovative working environment with state-of-the-art technologies and exciting challenges
- The opportunity to contribute to cutting-edge AI projects and make a real impact
- Ongoing training, workshops, and room for personal development
- Flat hierarchies and an open company culture with fast decision-making processes
- Flexible working hours, home office options, and attractive compensation models
- The chance to actively shape the future of AI solutions and drive innovation
More Platform Engineer Jobs
Sr Platform Engineer
Jobgether
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Role Description
This role focuses on designing, building, and evolving scalable platforms to support computer vision and machine learning (CVML) workloads. The Sr Platform Engineer will enable ML teams by developing reliable infrastructure, tooling, and workflows that accelerate experimentation, training, and deployment of models at scale. This position combines hands-on engineering with strategic platform guidance, balancing improvements to legacy systems with delivery of new, high-impact platform capabilities. You will collaborate closely with ML engineers, robotics teams, and product stakeholders, shaping platform architecture, reliability, and performance across cloud and on-prem environments. The ideal candidate thrives in a fast-moving, multi-team environment and is motivated by creating durable, widely adopted technical solutions. This role offers the opportunity to influence platform strategy while executing tangible improvements that impact real-world applications in computer vision, robotics, and AI-powered systems.
Responsibilities
- Design, implement, and evolve platform capabilities for ML training, batch inference, and model deployment workflows
- Own and maintain core platform components including compute orchestration, data pipelines, and inference systems used across multiple teams
- Enhance platform reliability, scalability, and performance while addressing real-world ML workload requirements
- Enable ML engineers with intuitive tools, workflows, and documentation across the full model lifecycle
- Develop and optimize hybrid compute environments, leveraging Kubernetes, Slurm, and cloud platforms (AWS preferred)
- Evaluate and improve system architecture, balancing incremental improvements with long-term platform health
- Mentor junior engineers, provide technical guidance, and drive adoption of platform best practices
Qualifications
- 5+ years of professional experience in platform, infrastructure, or systems engineering
- Strong technical judgment in evolving legacy platforms and delivering new, cross-team components
- Proficiency in Python for production systems, tooling, and platform components
- Solid understanding of ML systems and the end-to-end model lifecycle from experimentation to deployment
- Hands-on experience with cloud platforms (AWS preferred) and container orchestration systems such as Kubernetes and Slurm
- Ability to translate requirements from ML, robotics, and product teams into scalable platform solutions
- Experience ramping quickly on new domains, tools, and complex systems
- Preferred: Golang experience, ML pipeline integration (Kubeflow, Airflow), distributed training and inference (Ray), computer vision or robotics ML systems
Benefits
- Competitive US-based annual salary range of $160,000 - $287,000, plus eligibility for bonus programs
- Comprehensive benefits package including healthcare, retirement, and paid time off
- Opportunity to work remotely within the United States
- Exposure to cutting-edge CVML platforms, AI research, and robotics applications
- Mentorship, career development, and learning opportunities in a collaborative, inclusive environment
- Chance to influence platform strategy and technical direction across multiple teams
Role Description
We are looking for a senior professional with strong AWS experience to support, monitor, and evolve a cloud-based data and analytics platform. This is a strategic position that requires end-to-end involvement, including incident analysis and resolution, infrastructure monitoring, proactive risk identification, and proposals for technical and governance improvements. The role operates within a structured AMS model, with a focus on L2/L3 support, platform reliability, and continuous evolution of the environment, directly supporting the operation and stability of the data and analytics services.
Responsibilities
- Operations and technical support
  - Analyze and resolve incidents in the AWS environment (L2/L3)
  - Investigate root causes of failures in data services and infrastructure
  - Troubleshoot performance, availability, and cost issues
  - Provide technical support to the first-level support teams
- Monitoring and observability
  - Monitor and analyze metrics for AWS services, including:
    - Redshift (CPU, queues, queries, storage)
    - EC2 (CPU, memory, disk, status checks)
    - EMR (jobs, resource usage, HDFS)
    - Athena (queries, cost, performance)
    - SQS (backlog, throughput)
    - DynamoDB (throttling, latency)
    - Lambda (errors, duration, concurrency)
    - S3 (storage and errors)
  - Identify performance bottlenecks and operational risks
  - Create and evolve alerting and monitoring mechanisms
- Proactive work
  - Identify opportunities to improve performance, cost, and stability
  - Propose preventive actions to avoid incidents
  - Automate operational and monitoring routines
- Platform governance and evolution
  - Help define cloud architecture best practices
  - Contribute to the evolution of data platform governance
  - Support the analysis and routing of vulnerabilities
- Management and communication
  - Help prepare monthly technical reports (platform health, risks, and improvements)
  - Interact with clients and technical stakeholders
  - Document incidents, analyses, and solutions
Qualifications
- Solid experience with AWS in production environments
- Hands-on experience with:
  - EC2 (monitoring and troubleshooting)
  - CloudWatch (metrics, logs, and alarms)
  - S3
- Experience analyzing and resolving infrastructure incidents
- Previous experience in L2/L3 technical support or AMS
- Experience with performance analysis (CPU, memory, disk, I/O)
- Knowledge of cloud architecture, preferably data-oriented
- Ability to work autonomously in critical scenarios
Requirements
- Experience with AWS tools such as Amazon Redshift, DynamoDB, EMR, and Lambda
- Knowledge of Athena and messaging (SQS)
- Experience with serverless architectures
- Experience with FinOps practices (cloud cost optimization)
- Knowledge of security and vulnerabilities in AWS
- Experience with observability and advanced monitoring tools
- Experience with ITIL methodologies or a structured AMS model
- Experience in data and analytics environments
- Intermediate or advanced English
Profile Expected
- Analytical profile with strong investigative and diagnostic skills
- Proactive in identifying and solving problems
- Organization and a sense of prioritization
- Good communication
- Ability to work hands-on without losing the strategic view
Benefits
- Food and meal vouchers (Swile)
- Flexibility to allocate credit to a home-office allowance (Swile)
- Up to 100% coverage for health and dental plans
- Group life insurance
- Remote work
- Mental health benefit: online and in-person psychotherapy
- Incentives for certifications and courses
- Partnership for postgraduate and MBA courses (Esalq/USP)
- Partnership with language schools
- Partnership with gyms and wellness apps (Wellhub)
- Internal talks and discussion circles
- Referral bonus
- Happy hours
- Gifts on commemorative dates
Role Description
As a Lead AI Platform Engineer, you will be the backbone of our AI production lifecycle. You will bridge the gap between research and real-world application, ensuring that our Data Scientists, AI Researchers, Product teams, and others in the company have the high-performance infrastructure, automated pipelines, and deployment strategies needed to ship state-of-the-art models and agents at scale.
Qualifications
- 5+ years of experience with cloud infrastructure and infrastructure as code.
- Previous experience with the ML and LLM lifecycle: training, hosting, optimisation, observability.
- Used to working closely with researchers and data scientists, taking experiments from worksheets into production.
- Strong grasp of ML fundamentals and the modern GenAI stack.
Requirements
- Infrastructure as Code (IaC): Design and maintain scalable cloud environments (GCP/AWS) using Terraform.
- Resource Provisioning: Manage GPU/TPU resource allocation for training, fine-tuning, and interactive notebooks.
- Custom Tooling: Build internal services and CLI tools to streamline the developer experience for the AI team.
- Automated Pipelines: Design CI/CD and training pipelines using tools such as GitHub Actions, MLflow, and Vertex AI Pipelines.
- Deployment Methodology: Develop reusable patterns for model serving and manage service deployments to Kubernetes.
- Vector Infrastructure: Manage and optimize vector databases and embedding pipelines for RAG-based systems.
- Observability and Reliability: Monitor model drift and resource utilisation, and provide LLM and agent tracing.
- Inference Optimization: Implement techniques to reduce latency and increase throughput (quantisation, distillation, etc.).
- Cold Start Mitigation: Solve scaling bottlenecks for serverless or containerized model deployments.
- Cost Management: Optimize GPU utilization and cloud spend without compromising performance.
- Support AI Agent Deployment: Define and create tooling and service templates around agent deployment (tool libraries, tracing, default agent frameworks, skills, etc.).
- Enablement for non-technical agent users: Help create workflows and guidance on no-code/low-code agent platforms (n8n, LangSmith, or similar).
- Create tooling and policies to enable safe usage of local agents such as Claude Code.
Benefits
- Competitive salary.
- Benefits.
- Remote working within an impactful, mission-driven culture.
- Design and implement the AWS platform foundations used by product and service teams across RWS
- Develop reusable infrastructure patterns aligned with the RWS platform reference architecture
- Implement core cloud capabilities including networking, identity integration, security controls, and platform services
- Support the creation of standardised infrastructure building blocks to accelerate application deployment
- Support engineering and IT teams with guidance as the migration of application workloads from on-premise environments into AWS is completed
- Build and implement a prioritised plan for migrated applications
- Collaborate with application teams to modernise architectures
- Provide guidance and tooling to help teams successfully adopt AWS infrastructure and services
- Build and maintain infrastructure using Infrastructure as Code to ensure consistent, repeatable cloud deployments
- Enable product teams to provision infrastructure and deploy services through self-service platform capabilities



