Job Closed
This listing is no longer active.
Peraton Corporation, a national security company headquartered in Herndon, Virginia, supplies solutions for mission-critical programs and systems. Founded in 2017, Peraton's missio
Lead DevOps Automation Architect - Web Content Management (WCM)
Location
United States
Posted
73 days ago
Salary
$112K - $179K / year
Seniority
Lead
Job Description
Lead DevOps Automation Architect - Web Content Management (WCM)
Peraton Corporation
Responsibilities
Peraton is looking for a Lead DevOps Automation Architect to lead a team of engineers responsible for the design, implementation, and operations and maintenance (O&M) of the multi-tenant Web Content Management as a Service (WCMaaS) platform. This includes managing all aspects of the underlying infrastructure, such as servers and storage. The Lead Architect will work to continuously enhance the platform, leveraging secure, scalable, cost-effective, and operationally sustainable cloud solutions. Aside from technical qualifications, applicants should have effective communication skills, both written and verbal.

This role requires deep expertise in cloud services, infrastructure automation, DevSecOps, open-source ecosystems, enterprise solution design, and software and infrastructure engineering practices. The ideal candidate brings strong technical leadership, the ability to engage with a diverse set of stakeholders, and the ability to adapt solutions to evolving requirements and priorities.

Location: Remote (but must reside and perform all work within the United States)
Work hours: This position requires working online from 8:00 AM Eastern to 5:00 PM Eastern

Day to Day Roles and Responsibilities:
- Pipeline and infrastructure automation
  - Serve as automation expert relating to pipelines built and managed in a multi-GovCloud environment
  - Demonstrate pipeline orchestration tool experience relating to infrastructure as code (IaC)
  - Develop code using scripting and programming languages
  - Investigate and resolve issues across existing pipelines
  - Perform root cause analysis to address underlying IaC issues and provide solutions to prevent recurrence
- AWS Infrastructure Management
  - Build and maintain Linux infrastructure in AWS, including a deep working knowledge of using Lambda functions
  - Demonstrate an in-depth understanding of all aspects of Cloud Development Platforms, including IT operations, virtualization, containerization, networking, storage, disaster recovery, security policy and controls implementation, system management, and data migration
  - Push Drupal code via a GitLab CI/CD pipeline
  - Build and manage Ansible playbooks for use in various automation processes
  - Manage and maintain the AWS infrastructure, including a deep working knowledge of all aspects of Cloud Development Platforms, including IT operations, virtualization, containerization, networking, storage, disaster recovery, security policy and controls implementation, system management, and data migration
  - Ensure that all tenants’ AWS resources are secure, FedRAMP compliant, and optimized for performance
  - Demonstrate expertise in enterprise solution design, SOA models and approaches to enterprise cloud solutions, and promotion of architecture principles, along with conducting research in specific technology areas and staying up to date with the latest industry trends in application architecture
  - Continuously optimize AWS Cloud platform services across all integration points
  - Independently apply a wide set of engineering disciplines for planning, design, analysis, coding, testing, roll-out, and support of information systems architectures
  - Collaborate with the Architecture team to implement solutions that align with best practices for AWS cloud infrastructure
  - Adhere to Change Management procedures
- Collaboration and Knowledge Sharing
  - Collaborate with other team engineers to resolve development issues/incidents and implement improvements
  - Document solution designs, process procedures, and lessons learned to enhance team knowledge
  - Provide technical mentorship and knowledge sharing to more junior engineers less familiar with pipeline and infrastructure automation

Qualifications
Basic Qualifications:
- Bachelor’s Degree in Computer Science, Management Information Systems, Engineering, or IT-related field of study and 12 years of experience, or a Master’s degree and 10 years of experience in an IT-related field of study
- Must be a U.S. Citizen with the ability to obtain/maintain a DHS Public Trust clearance
- 10+ years of experience providing technical/management leadership on major tasks or technology assignments involving IT Operations
- 5+ years of experience in cloud services and infrastructure
- 3+ years of extensive hands-on experience with automation involving a wide range of AWS services including but not limited to EC2 instances, S3 buckets, VPC configurations, RDS databases, and other services
- Must have extensive demonstrated knowledge in infrastructure architecture, server configuration, and management of Linux servers in AWS
- Must have proven knowledge of Change and Configuration Management best practices (knowledge of DHS Change Management Processes preferred)
- Required Certification:
  - AWS Certified Solutions Architect Professional
- Required pipeline and infrastructure automation experience:
  - Pipeline Orchestration tool experience required, with GitLab preferred
  - IaC experience required, with Terraform preferred
  - Config as Code is required, with Ansible preferred
  - Some PowerShell and bash scripting required
  - Python programming language required
- Experience with incident management, root cause analysis, and resolving high-priority incidents in large, multi-tenant environments
- Extensive knowledge and understanding of AWS GovCloud and deploying in Government environments
- Exemplary communication, analytical skills, and technical knowledge across the client environment
- Ability to produce concise and clear technical documentation

Preferred Qualifications:
- Experience with PowerShell, AWS CLI, or other automation scripts to troubleshoot and resolve issues
- Preferred Certifications:
  - AWS Certified SysOps Developer Associate
  - AWS Certified Developer - Associate
  - AWS Certified DevOps Engineer - Professional
  - Relevant Agile Certification
  - Other Certifications such as Red Hat Ansible

Peraton Overview
Peraton is a next-generation national security company that drives missions of consequence spanning the globe and extending to the farthest reaches of the galaxy. As the world’s leading mission capability integrator and transformative enterprise IT provider, we deliver trusted, highly differentiated solutions and technologies to protect our nation and allies. Peraton operates at the critical nexus between traditional and nontraditional threats across all domains: land, sea, space, air, and cyberspace. The company serves as a valued partner to essential government agencies and supports every branch of the U.S. armed forces. Each day, our employees do the can’t be done by solving the most daunting challenges facing our customers. Visit peraton.com to learn how we’re keeping people around the world safe and secure.

Target Salary Range
$112,000 - $179,000. This represents the typical salary range for this position. Salary is determined by various factors, including but not limited to the scope and responsibilities of the position; the individual’s experience, education, knowledge, skills, and competencies; and geographic location and business and contract considerations. Depending on the position, employees may be eligible for overtime, shift differential, and a discretionary bonus in addition to base pay.

EEO
Equal opportunity employer, including disability and protected veterans, or other characteristics protected by law.
Job Requirements
- Bachelor’s Degree in Computer Science, Management Information Systems, Engineering, or IT-related field of study and 12 years of experience, or a Master’s degree and 10 years of experience in an IT-related field of study
- Must be a U.S. Citizen with the ability to obtain/maintain a DHS Public Trust clearance
- 10+ years of experience providing technical/management leadership on major tasks or technology assignments involving IT Operations
- 5+ years of experience in cloud services and infrastructure
- 3+ years of extensive hands-on experience with automation involving a wide range of AWS services including but not limited to EC2 instances, S3 buckets, VPC configurations, RDS databases, and other services
- Must have extensive demonstrated knowledge in infrastructure architecture, server configuration, and management of Linux servers in AWS
- Must have proven knowledge of Change and Configuration Management best practices (knowledge of DHS Change Management Processes preferred)
- AWS Certified Solutions Architect Professional
- Pipeline Orchestration tool experience required, with GitLab preferred
- IaC experience required, with Terraform preferred
- Config as Code is required, with Ansible preferred
- Some PowerShell and bash scripting required
- Python programming language required
- Experience with incident management, root cause analysis, and resolving high-priority incidents in large, multi-tenant environments
- Extensive knowledge and understanding of AWS GovCloud and deploying in Government environments
- Exemplary communication, analytical skills, and technical knowledge across the client environment
- Ability to produce concise and clear technical documentation
Preferred Qualifications:
- Experience with PowerShell, AWS CLI, or other automation scripts to troubleshoot and resolve issues
- AWS Certified SysOps Developer Associate
- AWS Certified Developer - Associate
- AWS Certified DevOps Engineer – Professional
- Relevant Agile Certification
- Other Certifications such as Red Hat Ansible
More DevOps Engineer Jobs
- Contribute to the design, development, testing, deployment, configuration, and maintenance of infrastructure, software, and components in Azure Cloud
- Collaborate with Enterprise Cloud Architects and Cybersecurity compliance teams
- Implement Infrastructure as Code (IaC) and Configuration as Code (CaC)
- Implement and operationalize Policy-as-Code using tools such as Azure Policy, OPA/Gatekeeper, and Rego
- Support the alignment of use cases and objectives with architectural standards and security requirements
- Build, optimize, and manage Continuous Integration and Continuous Deployment (CI/CD) pipelines.
- Automate build, testing, and deployment processes.
- Ensure faster, reliable, and repeatable software releases.
- Troubleshoot pipeline failures and improve performance.
- Design and manage infrastructure using code instead of manual configuration.
- Automate provisioning of servers, networks, and environments.
- Ensure consistency across development, staging, and production environments.
- Implement version control for infrastructure changes.
- Architect, deploy, and manage cloud-based systems.
- Optimize scalability, availability, and cost efficiency.
- Monitor cloud performance and resource utilization.
- Implement backup, recovery, and disaster recovery strategies.
- Implement monitoring and alerting systems for infrastructure and applications.
- Ensure high system availability and performance.
- Manage incident response and root cause analysis.
- Maintain logging solutions for troubleshooting and auditing.
- Integrate security practices into development and deployment pipelines.
- Manage access control, secrets, and vulnerability scanning.
- Ensure systems comply with organizational and regulatory standards.
- Implement automated security checks.
Staff Engineer - SRE, Retail and Pharmacy
CVS Health
Bringing our heart to every moment of your health.
We’re building a world of health around every individual, shaping a more connected, convenient and compassionate health experience. At CVS Health®, you’ll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold ourselves accountable and prioritize safety and quality in everything we do. Join us and be part of something bigger, helping to simplify health care one person, one family and one community at a time.

The Staff Engineer - SRE, Retail & Pharmacy will implement and maintain comprehensive observability solutions, providing real-time insights into the performance and overall health of systems to proactively identify and address potential issues. This role is responsible for investigating and resolving incidents quickly during critical situations and performing root cause analysis to prevent future recurrence. You will collaborate with cross-functional teams to build robust monitoring, alerting, and telemetry solutions, enabling proactive issue detection and resolution across distributed systems. As a senior member of the SRE team, you will drive best practices, mentor others, and shape the strategic evolution of our observability ecosystem in a complex, edge-centric architecture.

What You Will Do:
- Observability Strategy & Implementation
  - Design and implement comprehensive observability solutions tailored for edge computing environments, including monitoring, logging, tracing, and metrics collection, to provide deep visibility into system performance and health across distributed remote facilities
  - Define and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and business KPIs to measure and enhance system reliability in edge and centralized infrastructure
  - Build and optimize dashboards, visualizations, and alerting systems to enable real-time insights and rapid incident response for edge nodes and remote facilities
  - Implement distributed tracing and log aggregation systems to troubleshoot complex issues in edge computing environments
- System Reliability & Performance
  - Collaborate with engineering teams to ensure applications and infrastructure at edge locations are designed with observability in mind, incorporating best practices for instrumentation and monitoring in resource-constrained environments
  - Drive proactive identification of issues in edge facilities through advanced observability tools, reducing Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) across distributed systems
  - Lead incident postmortems, analyzing root causes specific to edge environments and implementing observability-driven improvements to prevent recurrence
- Tooling & Automation
  - Develop and maintain tools, scripts, and automation to enhance observability pipelines, optimizing for the unique challenges of edge computing, such as bandwidth limitations and intermittent connectivity
  - Evaluate and integrate industry-standard observability tools (e.g., Prometheus, Grafana, ELK Stack, OpenTelemetry) and recommend solutions tailored for edge computing use cases
  - Optimize observability data storage, retention, and querying to balance performance, cost, and scalability across a large number of remote facilities
- Leadership & Collaboration
  - Mentor and guide junior SREs and engineers on observability best practices for edge computing, fostering a culture of reliability and proactive monitoring
  - Partner with solution, engineering, and business teams to align observability efforts with business objectives, ensuring seamless operation of edge and centralized systems
  - Lead cross-functional initiatives to improve observability, reliability, and operational efficiency across distributed edge infrastructure
- Continuous Improvement
  - Stay current with emerging observability trends, tools, and methodologies, particularly those suited for edge computing and distributed systems, and advocate for their adoption
  - Contribute to the development of observability standards, runbooks, and documentation tailored for edge environments to ensure consistency and scalability
  - Drive cost optimization for observability infrastructure while maintaining high-quality monitoring and alerting capabilities across remote facilities

Minimum Qualifications:
- 8+ years of experience in SRE, DevOps, or related technology roles
- 5+ years of experience in delivering software in a large-scale environment with reliability and resilience concepts (multi-region, multi-cloud, containerization, etc.)
- 5+ years of experience with observability and monitoring tools such as Splunk, Dynatrace, Datadog, Prometheus, Grafana, etc.
- 3+ years of experience with programming/scripting languages (e.g., Python, Java) for automation and tooling in distributed environments
- 3+ years of experience with Cloud Technologies (AWS, Microsoft Azure, Google Cloud)
- 3+ years of experience with source control and continuous integration tools like Git/Stash, Bitbucket, or Jenkins
- 2+ years of engineering team leadership or management experience
- Experience using customer feedback tools such as Quantum Metric, Medallia, and Adobe Analytics
- Deep understanding of microservices architecture and cloud-native technologies
- Experience in configuring, supporting, and managing Rancher, Kubernetes, and/or Docker
- Experience in Incident Management, Change Management, Infrastructure Support, and Problem Management concepts and processes
- Excellent interpersonal and communication skills, including the ability to engage technical and non-technical stakeholders

Preferred Qualifications:
- Expertise working in edge computing environments with a large number of remote facilities, managing observability for distributed, high-latency, or resource-constrained systems
- Familiarity with chaos engineering principles to validate observability systems in edge environments
- Experience with retail SRE organizations, including experience with store systems: Point of Sale (POS), hand-helds, etc.
- Expertise in cloud development and deployment technologies, including containerization and multi-cloud configurations
- Demonstrated understanding of various API management and related platforms like Apigee, Vordel, DataPower

Education:
- Bachelor’s degree in Computer Science, Engineering, or related field required
- Master’s degree in Computer Science, Engineering, or related field preferred

Pay Range
The typical pay range for this role is: $118,450.00 - $260,590.00
This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company’s equity award program.

Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve and we are committed to fostering a workplace where every colleague feels valued and that they belong.

Great benefits for great people
We take pride in our comprehensive and competitive mix of pay and benefits, investing in the physical, emotional and financial wellness of our colleagues and their families to help them be the healthiest they can be. In addition to our competitive wages, our great benefits include:
- Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan.
- No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching.
- Benefit solutions that address the different needs and preferences of our colleagues including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility.
For more information, visit https://jobs.cvshealth.com/us/en/benefits

We anticipate the application window for this opening will close on: 04/24/2026
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.
Senior Site Reliability Engineer
Jobgether
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Role Description
This role offers a unique opportunity to ensure the reliability, scalability, and performance of critical platform services in a fast-paced, technology-driven environment. The Senior Site Reliability Engineer (SRE) will combine software engineering expertise with operational excellence to automate processes, improve observability, and reduce operational risk across the platform. You will collaborate closely with development, DevOps, release engineering, and security teams to embed reliability and security best practices throughout the software lifecycle. This position emphasizes proactive problem-solving, automation, and continuous improvement while providing mentorship to peers and contributing to high-impact projects. The role is ideal for someone who thrives on solving complex technical challenges while shaping the platform’s resilience and scalability.
- Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for critical services.
- Lead capacity planning, performance tuning, design reviews, and disaster recovery exercises to validate platform resilience.
- Automate infrastructure provisioning, patching, and operational tasks using Terraform, Ansible, and CI/CD pipelines to eliminate manual processes.
- Partner with security teams to enforce compliance (SOC2, CIS benchmarks), implement least-privileged IAM policies, and maintain hardened, secure systems.
- Serve as Tier-2 escalation during incidents, lead root cause analysis, and continuously improve incident response playbooks and on-call processes.
- Identify repetitive operational tasks and implement automation or self-service modules to reduce toil and improve developer productivity.
- Measure system performance, track reliability metrics, and collaborate with leadership to drive iterative improvements.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field.
- Minimum of 5 years of experience in Site Reliability Engineering, DevOps, or Systems Engineering roles.
- Strong experience with AWS multi-account environments, Terraform, Ansible, CI/CD tools (GitHub Actions, Bitbucket, Jenkins, AWS CodeBuild/CodePipeline), and observability platforms (New Relic, CloudWatch).
- Background with containerized environments (ECS, Fargate, EKS) and resilient system architectures.
- Preferred certifications: AWS DevOps Engineer or Solutions Architect, Kubernetes, or SRE/DevOps practitioner certifications.
- Excellent analytical, troubleshooting, and problem-solving abilities.
- Strong collaboration skills to work effectively with cross-functional teams, mentor peers, and contribute to continuous improvement.
Benefits
- Competitive salary range: USD $120,000 - $125,000 per year.
- Day-one medical, dental, vision coverage with flexible spending options (HSA/FSA).
- 401(k) with company match available from day one.
- Paid sick leave, volunteer time, and parental leave options.
- Employer-paid life and disability insurance.
- Wellbeing on Demand program to support personal health and wellness.
- Flexible work environment with remote opportunities and casual dress code.
Company Description



