Job Closed
This listing is no longer active.
Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.
DevOps Lead
Location
United States
Posted
102 days ago
Salary
$107K - $195K / year
Seniority
Lead
Job Description
Leidos is seeking a DevOps Lead to lead and further mature platform engineering, site reliability engineering (SRE), and application security functions within a large-scale software development and IT operations program. This role is responsible for driving operational excellence, security, resiliency, and scalability across enterprise DevOps platforms supporting mission-critical applications for federal government customers.

Position Summary: The DevOps Lead will oversee a multidisciplinary team tasked with enabling continuous integration/continuous deployment (CI/CD), optimizing cloud and on-premise platforms, securing development pipelines, and ensuring highly reliable and performant systems. This leader will collaborate with architects, engineering leads, project managers, and customer stakeholders to maintain an automation-first culture and introduce best practices for reliability, security, and DevOps maturity. This opportunity is ideal for an accomplished DevOps leader who thrives on solving complex platform, security, and reliability challenges at scale and is passionate about technical excellence and customer mission achievement.

Location: This position may be remote, though periodic visits to customer sites or the Program Office (within Washington, DC region) are required.

Primary Responsibilities
• DevOps & Platform Management: Direct the design, implementation, and maintenance of CI/CD pipelines, automated provisioning, and monitoring processes across cloud and hybrid environments. Lead efforts to standardize and optimize platform engineering practices, adopting Infrastructure-as-Code (IaC) and microservices deployment models.
• Site Reliability Engineering (SRE): Develop and enforce SRE principles, including release management, system reliability, observability, incident management, SLAs/SLOs, and fault tolerance. Implement monitoring and alerting solutions to proactively identify issues, reduce mean time to resolution (MTTR), and drive service uptime objectives.
• Application Security: Integrate security throughout the SDLC, ensuring robust code review, vulnerability scanning, and threat modeling within automated pipelines. Collaborate with security teams to remediate vulnerabilities and achieve compliance with industry and federal standards (e.g., FedRAMP, NIST).
• Team Leadership & Collaboration: Mentor and lead a multidisciplinary team, promoting an agile, collaborative, and innovative work environment. Partner with development, security, and operations teams to align platform strategies with business and mission requirements.
• Platform Engineering: Champion creation and curation of reusable infrastructure patterns, automation scripts, cloud orchestration templates, and developer self-service platforms. Evaluate emerging tools and technologies for enhancing developer productivity and system resilience.
• Service Operations & Incident Response: Oversee operational readiness, incident response, root cause analysis, and continuous improvement initiatives to ensure high availability and rapid recovery from service disruptions.
• Continuous Improvement: Drive a culture of innovation by assessing and implementing advancements in DevOps, platform engineering, and SRE practices. Regularly review system metrics and operational KPIs, and propose enhancements.
• Reporting & Stakeholder Communication: Prepare and present operational dashboards, incident reports, risk assessments, and status updates to program leadership and customers. Ensure transparent communication of operational posture and improvement initiatives.

Required Qualifications
• Bachelor’s degree and 10+ years of progressive experience in software development, DevOps, or platform engineering.
• 3+ years of technical team leadership or management experience.
• Demonstrated expertise in advanced DevOps practices, including CI/CD, configuration management, automation, and cloud-native operations (AWS, Azure, or similar).
• Hands-on experience with SRE frameworks, monitoring, logging, alerting, and reliability engineering techniques.
• Proven background in securing applications and systems, including integrating security into pipelines and coordinating with security/compliance teams.
• Strong technical knowledge of container orchestration (Kubernetes, Docker), IaC (Terraform, CloudFormation), and end-to-end application/platform lifecycle management.
• Excellent interpersonal, written, and verbal communication skills.
• Strong problem-solving skills and ability to thrive in a fast-paced, dynamic environment.
• U.S. citizenship and ability to obtain and maintain required government security clearance.

Preferred Qualifications
• Certifications in cloud platforms (e.g., AWS Certified DevOps Engineer, Azure DevOps), SRE, or security (e.g., CISSP, CISM).
• Experience managing federal or large enterprise DevOps/SRE/Platform Engineering teams.
• Familiarity with federal compliance frameworks (FedRAMP, FISMA, NIST 800-53).
• ITIL Foundations or equivalent IT service management training.
• Prior experience with infrastructure modernization, large-scale migrations, or complex system integrations.
• Familiarity with Department of Justice or federal government IT environments.

If you're looking for comfort, keep scrolling. At Leidos, we outthink, outbuild, and outpace the status quo, because the mission demands it. We're not hiring followers. We're recruiting the ones who disrupt, provoke, and refuse to fail. Step 10 is ancient history. We're already at step 30 and moving faster than anyone else dares.

For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.
More DevOps Engineer Jobs
**Program Description:**
**NOTE:** This is a short-term position with an expected duration of April 1, 2026 to September 15, 2026.
This program provides IT services focused on building, securing, and operating the Department of Veterans Affairs LGY’s home loan product-line technology. The contract’s purpose is to modernize and sustain critical home loan technology systems that support LGY’s delivery of mortgage-related services to program stakeholders, and to provide continuous delivery and security integration.

**Position Description:**
This position focuses on creating and modifying pipelines using GitHub Enterprise Cloud repositories. The role requires expertise in developing and maintaining pipelines using Jenkins servers and troubleshooting deployment issues. Candidates should incorporate metrics such as Mean Time To Build (MTTB) and Mean Time To Deploy (MTTD). Experience with multiple CI/CD tools, GitHub Actions, and code scanning tools like CodeQL, Fortify, SonarQube, and Nexus is desired. Familiarity with automation tools such as Selenium, Cucumber, Maven, and AWS CodeBuild/CodeDeploy is advantageous.

**Responsibilities:**

**CI/CD Pipeline Engineering**
· Design, implement, and maintain CI/CD pipelines aligned to team and program delivery practices.
· Create and modify pipeline definitions and workflows tied to GitHub Enterprise Cloud repositories.
· Develop and maintain pipeline jobs and shared libraries on Jenkins (pipelines-as-code, scripted/declarative approaches as applicable).
· Standardize pipeline patterns and reusable templates to reduce duplication and improve maintainability.

**Deployment Troubleshooting & Operational Support**
· Diagnose and resolve build failures, deployment issues, and environmental inconsistencies across lower and higher environments.
· Perform root cause analysis (RCA) and implement corrective actions to prevent recurring failures.
· Partner with engineering, QA, security, and platform teams to remediate pipeline blockers and streamline deployments.

**DevSecOps Metrics & Continuous Improvement**
· Instrument and report delivery metrics including MTTB and MTTD; identify bottlenecks and implement improvements.
· Monitor pipeline performance (queue time, build duration, failure rates, flaky tests) and drive optimization.
· Improve automation coverage and reduce manual steps through pipeline enhancements.

**Security & Code Quality Integration (“Shift Left”)**
· Integrate code scanning and quality gates into pipelines using tools such as CodeQL, Fortify, and SonarQube, and artifact/repository controls like Nexus.
· Ensure pipelines enforce consistent security and quality checks prior to merge/release.
· Collaborate with security stakeholders to tune scanning thresholds, manage findings, and support remediation workflows.

**Automation Enablement**
· Implement or enhance automation steps using tools such as Selenium, Cucumber, and Maven.
· Support automated build/test/deploy stages and improve feedback loops to developers.

**Documentation & Enablement**
· Document pipeline standards, usage guides, and operational runbooks.
· Provide guidance and mentoring to teams on CI/CD best practices, branching strategies, and pipeline troubleshooting.
Senior DevSecOps Consultant – GitLab Platform
Trility Consulting
Start delivering technology solutions that simplify, automate, and secure your business.
• Design and implement a CMMC-aligned GitLab architecture supporting 250–500+ users
• Deploy and operate self-managed GitLab on Kubernetes using Crossplane
• Architect secure GitLab runner strategies (pooling, isolation, autoscaling) for mixed workloads
• Evaluate and document architectural approaches (single vs. segregated GitLab instances) with clear tradeoff analysis
• Translate NIST 800-171 and CMMC requirements into enforceable GitLab configurations and access controls
• Implement configuration-as-code using Terraform (e.g., GitLab provider) to ensure versioned, auditable, and repeatable platform management
• Design and implement RBAC, least-privilege models, and segregation of duties
• Establish drift detection and audit mechanisms to monitor and remediate unauthorized changes
• Integrate GitLab into the broader Kubernetes platform ecosystem, including GitOps workflows (e.g., ArgoCD)
• Produce architecture documentation, runbooks, and reference patterns to enable internal ownership and long-term sustainability
• Collaborate with cybersecurity, architecture review boards, and platform teams to validate compliance and design decisions

• Design, implement, and test the build, deployment, and configuration management solutions in a Microsoft-based implementation
• Build and test the automation tools for infrastructure provisioning
• Manage CI and CD processes, tools, and configurations with the team
• Contribute new ideas and ways to improve development delivery
• Provide technical guidance and educate team members and coworkers on DevOps methodologies, and help establish and follow industry-standard best practices
• Develop and implement improvements to our IaC codebase (Terraform)
• Develop and maintain CaC standards (Packer, Chef)
• Monitor metrics (Datadog) and propose ways to improve application observability and visibility
• Create and maintain documentation for build and deployment processes
• Participate in sprint planning, implementation, standups and demos
• Create PoC and prototype solutions for varied technical initiatives
Site Reliability Engineer II
InvestorFlow
InvestorFlow is a leading provider of integrated CRM and portals for asset and investment managers.
• Design and implement comprehensive monitoring strategies rather than owning observability platforms outright.
• Collaborate with DevOps and Engineering on shared observability platforms (Grafana, Prometheus/Loki, Azure Monitor/Application Insights).
• Define golden signals dashboards, measure SLOs/SLIs/error budgets, and help implement actionable alerting.
• Drive structured logging standards, distributed tracing patterns, and OpenTelemetry implementation standards for teams to deploy and SRE to validate.
• Conduct monitoring/auditing of production systems to ensure instrumentation completeness.
• Take ownership of production incident response, lead incident handling, and drive remediation.
• Conduct blameless post-incident reviews and ensure follow-through on action items.
• Continuously improve operational processes, reliability practices, and team readiness.
• Monitor system resource utilization and forecast future needs.
• Tune autoscaling configurations in partnership with Engineering teams.
• Evaluate capacity efficiency and support cost optimization strategies.
• Validate DR environments and test failover processes (not build them).
• Ensure DR capabilities are functioning as designed with clear documentation.
• Define and lead regular DR drills in partnership with Engineering/Platform teams.
• Work with the Non-Functional Testing team on resilience and DR scenario simulations.
• Support chaos experiment planning and validation as a nice-to-have capability.




