Job Closed
This listing is no longer active.
Transforming how the world connects
AWS DevOps Engineer
Location
United States
Posted
114 days ago
Salary
Not specified
Seniority
Senior
Job Description
AWS DevOps Engineer
AST SpaceMobile
- Design, deploy, and operate AWS infrastructure supporting data lakes and containerized workloads.
- Implement Infrastructure-as-Code using Terraform, CloudFormation, or similar tools.
- Establish secure, scalable, and highly available AWS architectures following cloud best practices.
- Collaborate with application and data engineering teams to translate requirements into reliable platform solutions.
- Build and manage AWS-based data lakes using services such as S3, Glue, Athena, EMR, Redshift, and Lake Formation.
- Support ingestion, transformation, storage, and access for structured and unstructured datasets.
- Implement data lifecycle management, tiering, and cost optimization strategies.
- Ensure data platforms meet required security, compliance, and governance standards.
- Deploy, manage, and operate containerized applications on Amazon EKS.
- Build and maintain container images, registries, and deployment pipelines.
- Manage Kubernetes clusters including upgrades, scaling, networking, and security.
- Partner with developers to improve application reliability, performance, and deployment consistency.
- Design, implement, and maintain CI/CD pipelines for data and application workloads.
- Automate infrastructure provisioning, application deployment, and operational tasks.
- Implement monitoring, logging, and alerting for AWS services, data pipelines, and Kubernetes workloads.
- Participate in incident response, root cause analysis, and continuous improvement initiatives.
Job Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related STEM field, or equivalent practical experience.
- At least 5 years of experience in DevOps, Site Reliability Engineering (SRE), or cloud infrastructure roles focused on AWS.
- Strong hands-on experience with AWS data services (S3, Glue, Athena, EMR, Redshift, or similar).
- Experience running containerized applications, ideally on Amazon EKS.
- Proficiency with Infrastructure-as-Code and automation tools such as Terraform or CloudFormation.
- Strong Linux systems knowledge and scripting experience (Bash, Python, etc.).
- Solid understanding of AWS networking, security, IAM, and operational best practices.
- Experience with large-scale or enterprise data platforms.
- Experience operating Kubernetes in production environments.
- Familiarity with distributed systems and API-based integrations.
- Experience with monitoring, observability, and logging tools (e.g., CloudWatch, Prometheus, Grafana, ELK).
- Experience designing cost‑optimized cloud architectures.
- Knowledge of data engineering workflows or analytics platforms.
- Strong interpersonal skills and ability to collaborate across technical and non‑technical teams.
- Proven ability to work effectively within cross-functional, fast‑paced environments.
- Excellent written and verbal communication skills.
- Meticulous attention to detail to ensure accuracy of documentation, automation scripts, and deployments.
- Strong problem-solving abilities and willingness to take ownership of issues.
- Ability to manage multiple priorities while maintaining high quality and reliability standards.
- Ability to participate in on-call rotations or off-hours support as required by operational needs.
More DevOps Engineer Jobs
Senior Software Engineer – SRE
Veeva
Headquartered in Pleasanton, California, Veeva is a leading provider of cloud-based software and services for the life sciences industry. As an employer, Veeva
- Build Cloud Infrastructure: Rapidly build new cloud infrastructure from scratch, adhering to software development best practices
- Drive Reliability & Scalability: Ensure our platform meets the scalability and reliability needs of our hundreds of global customers (across North America, Europe, and Asia)
- Lead Incident Management: During an incident, effectively lead triage and mitigation efforts, potentially performing periodic on-call duty for escalations
- Automate & Optimize: Develop tools and automation to eliminate manual work and reduce issue resolution times
- Full-Stack Diagnostics: Proactively learn all necessary systems to provide full-stack diagnostics and determine root causes of production problems
- Strategic Engineering Partnership: Strategize with engineering teams on complex problems, offering insights on what will work at scale (supporting 2M+ users) and guiding development decisions before features ship
- Influence Design: Participate in engineering design reviews of new features and drive initiatives to improve operational efficiency and platform scalability
- Cross-functional Collaboration: Partner effectively with Product Management, Design, and QA to deliver cutting-edge solutions and direct customer value
- Backend Focus: Work across multiple layers of our technology stack, with a primary focus on backend development, and opportunities in frontend and infrastructure
- Effective Communication: Communicate clearly with engineering teams, succinctly describing problems for seamless hand-offs during outages with both technical and non-technical audiences
- Mentorship: Actively mentor team members, contributing to a positive and high-performing team environment
Director, Software Engineering – Site Reliability Engineering
Affirm
Affirm is a financial services company that is on a mission to provide its customers with “honest financial products that improve lives.” As an employer, Af
- Set the vision and drive execution for Reliability Engineering at Affirm
- Own and coordinate delivery of high availability for Affirm’s core services, to meet our service level standards and expectations with external partners
- Iterate on and maintain a best-in-industry global incident response & lifecycle program
- Build software and program management structure to perform continual risk management across the entire Affirm system and Engineering organization
- Run a robust development lifecycle establishing a culture of operational excellence, while experimenting and failing fast
- Work with a wide variety of cross-functional partners outside of engineering, ranging from product, enterprise risk, security, legal, and compliance
- Hire and build a global team of SREs, system engineers, and full-stack engineers
- Cultivate a respectful and supportive environment for all team members that effectively demonstrates the diversity of the team
- Act as the focal point for reliability and performance of data platforms in a Google Cloud Platform (GCP) environment.
- Implement DataOps and SRE practices, ensuring observability, automation, scalability, and resilience in data pipelines.
- Monitor and optimize data ingestion, transformation, and orchestration jobs (Airflow, Composer, Dataflow, Dataproc, BigQuery).
- Support data engineers and architects in implementing best practices for CI/CD, infrastructure as code, and version control.
- Create and maintain monitoring dashboards, alerts, and performance and cost metrics in GCP (Stackdriver, Cloud Monitoring, Prometheus, Grafana).
- Respond to critical data and infrastructure incidents, with fast resolution and root cause analysis (RCA).
- Work with the security and architecture teams to ensure data governance, compliance, and protection.
- Promote a culture of automation and continuous improvement, reducing rework and increasing operational efficiency.



