Stefanini Group

The Stefanini Group is a global provider of offshore, onshore, and nearshore outsourcing, IT digital consulting, systems integration, application, and strategic staffing services to Fortune 1000 enterprises around the world. We operate across the Americas, Europe, Africa, and Asia, and serve more than four hundred clients across a broad spectrum of markets, including financial services, manufacturing, telecommunications, chemical services, technology, public sector, and utilities. Stefanini is a CMM Level 5 IT consulting company with a global presence.

Cloud Site Reliability Engineer

Location

United States

Posted

9 days ago

Salary

Not specified

Seniority

Mid Level

Job Description

Role Description

As a Senior Cloud Engineer in the Cloud SRE team, you will be responsible for designing and developing cloud solutions and engineering reliability tools for the Cloud Foundation Services (CFS) platform in the Infrastructure, Platforms & Operations organization. You will apply software engineering practices to build scalable, reusable solutions and utilities that enhance platform reliability.

Responsibilities

- Design, develop, and maintain reliability solutions and SRE utilities to reduce toil, improve cloud platform reliability, and industrialize SRE practices across the system.
- Build and optimize Infrastructure as Code (IaC) using Terraform to manage AWS resources related to SRE solutions, incorporating cost-efficient design principles.
- Develop CI/CD pipelines and automated testing to ensure code quality, reliability, and rapid delivery of the solutions.
- Define SRE standards, best practices, and guidelines for adoption across teams; establish SRE metrics such as SLIs and SLOs.
- Apply software engineering best practices, including version control, code reviews, test-driven development, and documentation, to all development.
- Participate in incident management and the on-call rotation, providing technical support for SRE tools, troubleshooting production issues, and collaborating with teams to reduce incident recurrence through proactive detection and pattern analysis.
- Stay current with emerging AWS services, SRE methodologies, and cloud-native development technologies, and drive adoption of innovative solutions.
- Collaborate within Agile and Scaled Agile frameworks with cross-functional teams to deliver integrated cloud automation solutions.
- Produce clear, blameless postmortems with actionable items and documented failure scenarios.

Qualifications

- Bachelor's degree in Computer Science, Information Systems, or an equivalent background or experience.
- 7+ years of extensive experience in software development with a focus on reliability and platform engineering.
- 5+ years of advanced Python development skills with proven experience building enterprise-grade, highly available tools, APIs, and utilities.
- 3+ years of hands-on experience developing solutions in AWS environments, with a deep understanding of core services (EC2, VPC, S3, Lambda, IAM, CloudFormation, EventBridge, Step Functions, etc.) and resource cost optimization.
- 3+ years of experience applying SRE principles, including observability, toil automation, SLIs/SLOs, and reliability engineering.
- Expert-level proficiency with Infrastructure as Code (IaC) using Terraform, including module development and state management.
- Strong experience with CI/CD pipelines, automated testing frameworks, and DevOps practices.
- Experience with observability tools and practices, including Grafana, AWS CloudWatch, and AWS Canary.
- Experience defining, implementing, and managing SLOs/SLIs and error budgets; familiarity with conducting RCAs and producing postmortem documentation.
- Working experience in Agile and Scaled Agile environments and familiarity with ITSM processes (incident, change, and problem management), resilience testing, and chaos engineering practices.
- Experience with GoLang or additional programming languages is a plus.

More DevOps Engineer Jobs

DevOps, Java

Sensedia

Modern Integration Platform

DevOps Engineer | 9 days ago
Full Time | Remote | Team 501-1,000 | Since 2007 | H1B No Sponsor

• Promote DevOps culture within the project, driving process automation
• Build and maintain infrastructure as code
• Implement CI/CD pipelines
• Support cloud environments
• Ensure reliability, scalability, and security of applications and platforms
• Contribute to an agile and efficient development journey
• Develop and maintain backend applications using Java
• Manage and automate infrastructure in cloud environments (AWS and Azure)
• Administer, monitor, and optimize development, staging, and production environments
• Create and manage containerized environments using Docker and Kubernetes
• Apply Infrastructure as Code (IaC) practices
• Administer web servers and relational and non-relational databases
• Implement and maintain monitoring, observability, and troubleshooting solutions
• Ensure environment security by applying DevSecOps practices
• Perform incident analysis, problem resolution, and continuous system improvements
• Participate in rollouts, deployments, and changes in production environments
• Share technical knowledge and support the growth of development and operations teams

Brazil

Cyber Security Engineer / DevSecOps Engineer

Ad Hoc

Ad Hoc delivers stable, fast, and scalable technology services for governments at the federal and state levels. The company was established by two members of th

DevOps Engineer | 9 days ago

• Support the design, implementation, and maintenance of secure technology solutions within the federal government
• Design, implement, and maintain security controls across cloud and on-premises environments
• Conduct security assessments, vulnerability analysis, and risk evaluations of applications, infrastructure, and systems
• Support continuous monitoring activities, including security event analysis and incident response efforts
• Develop and maintain security documentation, including System Security Plans (SSPs), security procedures, and risk assessments
• Assist with Authorization to Operate (ATO) activities and ongoing compliance requirements
• Design, implement, and maintain secure CI/CD pipelines supporting application development and infrastructure deployments
• Integrate automated security testing into the software development lifecycle
• Develop Infrastructure as Code (IaC) solutions using tools such as Terraform, CloudFormation, or Ansible
• Automate security controls, compliance checks, and deployment processes
• Support Kubernetes, Docker, and cloud-native application deployments
• Analyze security findings and develop remediation recommendations
• Support vulnerability management activities, including tracking, prioritization, and remediation verification
• Participate in security audits and assessments conducted by internal and external stakeholders
• Monitor emerging threats and recommend improvements to security posture

United States
$120K - $150K / year

Senior DevSecOps Consultant – Azure, Secrets Management

Trility Consulting

Start delivering technology solutions that simplify, automate, and secure your business.

DevOps Engineer | 9 days ago
Contract | Remote | Team 51-200 | H1B No Sponsor

• Lead a short-term engagement focused on establishing secure secrets management patterns
• Strengthen application security practices
• Create repeatable DevSecOps standards across a modern Azure-based environment
• Serve as a trusted advisor and hands-on technical leader
• Partner with engineering and architecture teams to assess current-state practices and identify security gaps
• Design future-state patterns and implement foundational security controls
• Improve SDLC controls and create reusable guidance that can be adopted across multiple teams and applications

United States

DevSecOps Engineer – Intelligent Platforms, Agents

Leidos

Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.

DevOps Engineer | 9 days ago
Full Time | Remote | Team 10,001+ | Since 1969 | H1B Sponsor

• Design, implement, and maintain automated CI/CD pipelines that carry code from development through security scanning, compliance validation, and deployment into Navy and DoD environments
• Build and maintain hardened Kubernetes environments aligned to DISA STIG requirements across cloud and restricted network deployment contexts
• Automate security artifact generation including SBOM production, CVE scanning, and continuous compliance validation
• Drive adoption of Infrastructure as Code, GitOps practices, and controls-as-code across the team
• Leverage AI tooling to accelerate pipeline development, vulnerability triage, compliance remediation, and operational documentation
• Partner closely with software engineers, systems engineers, and ISSEs to embed security and compliance requirements from the start of development
• Maintain and evolve deployment infrastructure across multiple secure environments, including cloud and air-gapped or intermittently connected contexts
• Support ATO processes through automated evidence generation, documentation as code, and direct collaboration with the security team
• Establish and promote standards for pipeline design, container security, secrets management, and deployment consistency
• Contribute to feature development when team capacity requires, applying security-first development practices to application code
• Maintain operational documentation including runbooks, deployment guides, and architecture diagrams as version-controlled artifacts

United States
$107.9K - $195.1K / year