Job Closed
This listing is no longer active.
Michael Baker International provides development, engineering, intelligence, and technology solutions for high-end and large-scale architecture and infrastructu
Senior Cloud DevOps Engineer
Location
United States
Posted
104 days ago
Salary
$130K - $170K / year
Seniority
Senior
Job Description
Michael Baker International
This description is a summary of our understanding of the job description.
Role Description
As a Senior Cloud DevOps Engineer at MBI, you will take a hands-on role in designing, building, and maintaining our cloud infrastructure across Microsoft Azure and Amazon Web Services.
- Ensure fast, secure, and reliable software delivery through automation and DevOps best practices.
- Implement solutions for infrastructure provisioning, CI/CD pipeline creation, incident response, and cloud vendor partnership management.
- Support MBI’s strategic technology goals, including Vision 2030 initiatives, digital platform delivery, and enterprise-wide AI/ML infrastructure enablement.
Qualifications
- 5+ years of hands-on experience designing and supporting Azure and AWS cloud infrastructure in production environments.
- Strong proficiency with CI/CD pipelines (Jenkins, Azure DevOps, GitLab CI, GitHub Actions).
- Solid scripting and coding abilities in Python, Bash, and PowerShell.
- Proven experience managing cloud resources at enterprise scale.
- Excellent communication skills and a collaborative mindset.
- Bachelor’s degree in Computer Science, Engineering, or related field preferred.
Requirements
- Deep understanding of core services on both Azure and AWS platforms.
- Experience with hybrid or multi-cloud strategies.
- Experience with monitoring/alerting frameworks (Azure Monitor, CloudWatch, Prometheus).
- Strong understanding of cloud security practices and compliance requirements.
Benefits
- Medical, dental, vision insurance
- 401(k) Retirement Plan
- Health Savings Account (HSA)
- Flexible Spending Account (FSA)
- Life, AD&D, short-term, and long-term disability
- Professional and personal development
- Generous paid time off
- Commuter and wellness benefits



