Job Closed

This listing is no longer active.

Leidos

Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.

Site Reliability Engineer, Artificial Intelligence Engineer

DevOps Engineer | Other | Remote | Senior | Team 10,001+ | Since 1969 | H1B Sponsor

Location

California | Hawaii | Virginia | Washington

Posted

115 days ago

Salary

$131.3K - $237.4K / year

Seniority

Senior

Bachelor Degree | 5 yrs exp | English | Distributed Systems

Job Description

  • Design, develop, and maintain AI/ML models for anomaly detection, trend analysis, and signal correlation across metrics, logs, traces, and events.
  • Reduce alert noise through intelligent alert grouping, suppression, and prioritization.
  • Enhance observability platforms with AI-generated insights supporting SLO and error-budget management.
  • Implement AI-driven incident classification, enrichment, and summarization.
  • Provide probable root-cause analysis recommendations based on historical and real-time telemetry.
  • Support on-call and incident response teams with AI-guided remediation suggestions.
  • Contribute AI insights to post-incident reviews and reliability improvement plans.
  • Apply AI techniques to identify repetitive operational tasks and automation opportunities.
  • Assist in generating, validating, and optimizing automation playbooks and workflows.
  • Analyze automation execution data to improve success rates, resiliency, and reuse.
  • Build and maintain AI-searchable knowledge repositories containing runbooks, SOPs, lessons learned, and historical incident data.
  • Enable natural-language access to operational knowledge for SREs and operations staff.
  • Develop predictive models for capacity planning, failure forecasting, configuration risk, and reliability debt identification.
  • Support proactive remediation strategies to prevent incidents before customer impact.
  • Assist SRE leadership in data-driven prioritization of reliability investments.
  • Ensure AI solutions adhere to organizational security, compliance, and data-handling policies.
  • Establish guardrails for AI recommendations and automation execution.
  • Promote transparency, explainability, and auditability of AI-driven operational decisions.

Job Requirements

  • Bachelor’s degree in Computer Science, Engineering, Information Systems, Data Science, or a related discipline
  • 5+ years in Site Reliability Engineering, DevOps, IT Operations, or Systems Engineering
  • 2+ years applying AI/ML techniques in operational, analytics, or automation contexts
  • Demonstrated experience supporting production systems in high-availability environments
  • Must have an active Secret Clearance in order to be considered for the position
  • Proficiency in data analysis tooling
  • Experience with machine learning fundamentals (anomaly detection, clustering, time-series analysis, NLP)
  • Familiarity with observability platforms (metrics, logs, traces, events)
  • Experience with automation frameworks and infrastructure-as-code concepts
  • Strong understanding of distributed systems and operational telemetry

Benefits

  • Competitive compensation
  • Health and Wellness programs
  • Income Protection
  • Paid Leave
  • Retirement

More DevOps Engineer Jobs

Full Time | Remote | Team 1,001-5,000 | Since 2015 | H1B Sponsor

  • Technology consulting and running client workshops in the DevOps / Cloud Native area
  • Designing, building, and developing Kubernetes and OpenShift environments
  • Migrating applications and services to container architectures
  • Designing and implementing CI/CD pipelines
  • Automating infrastructure (IaC) and operational processes
  • Solving complex technical problems in production environments
  • Collaborating with development teams, architects, and business clients

Poland

Software Engineer II – SRE/DevOps

Flowhub

The cannabis retail platform for modern dispensaries. Making safe cannabis products accessible to every adult on Earth.

DevOps Engineer | 115 days ago
Other | Remote | Team 51-200 | Since 2015 | H1B No Sponsor

  • Own the stability and reliability of all environments, including production, by implementing and supporting SRE practices across the organization.
  • Lead the execution and maintenance of our Observability stack, developing and maintaining comprehensive monitoring, alerting, and logging capabilities.
  • Manage and tune core database systems (OLAP, OLTP) to ensure high availability and reliability as application and business needs evolve.
  • Lead performance testing and engineering efforts for key services, conducting load tests and identifying bottlenecks to maintain system scalability.
  • Maintain and optimize our CDN/Edge infrastructure, ensuring optimal performance and security feature configuration.
  • Provide critical cross-support for core infrastructure, assisting with automation, cluster maintenance, disaster recovery, and CI/CD troubleshooting.
  • Develop internal tooling and platform capabilities to accelerate engineering productivity.
  • Mentor and guide junior engineers and developers to ensure application and infrastructure needs are tightly aligned.

United States
$115K - $145K / year
Job Closed

Site Reliability Engineer

Partly

Building the first global platform for replacement parts, starting with auto parts.

DevOps Engineer | 115 days ago
Full Time | Remote | Team 11-50 | H1B No Sponsor

  • Reliability Engineering: Ensure the stability, scalability, and security of our cloud infrastructure and of Partly and third-party applications in our Kubernetes-powered clusters. Leverage Infrastructure-as-Code and automation (Terraform for GCP, GitOps with ArgoCD, custom scripts in Python/Bash, etc.) to deploy and manage workloads and resources in a repeatable, automated way.
  • Cost Optimisation: Monitor and optimise costs across our cloud and on-prem infrastructure, ensuring we get maximum value from our investments. Make recommendations for resource allocation or architecture changes to improve cost-efficiency without sacrificing reliability or performance.
  • Cross-Functional Collaboration: Work closely with developers, data engineers, and leadership to plan infrastructure needs and improvements. Provide tooling, guidance, and training to the engineering team on SRE practices, and collaborate during software delivery to ensure smooth integrations from code to production.
  • Software Engineering: Make sure our software meets high production readiness standards. When you see a problem or an opportunity to improve, you drive the solution.
  • Troubleshooting: Participate in incident resolution and give developers a helping hand in debugging applications, networks, databases, and compute systems.

Australia
Job Closed

DevOps Engineer

Booz Allen Hamilton

Booz Allen Hamilton is an award-winning provider of strategic innovation, management consulting, technology, and engineering services. Founded in 1914, the comp

DevOps Engineer | 115 days ago

The Opportunity:

Everyone is trying to “harness the cloud,” but not everyone knows how. As a DevOps engineer, you’re eager to develop, manage, and secure a container platform that meets your client’s needs and takes advantage of cloud capabilities. We need you to help us develop container management software to solve some of our clients’ toughest challenges. As a platform DevOps Engineer at Booz Allen, you can use your technical skills to affect mission-forward change. On our team, you’ll strengthen your skills using the latest cloud technologies as you look for ways to improve your client’s environment with current container software to ensure seamless orchestration. Using your DevOps platform knowledge, you’ll support your team as you inform strategy and design while ensuring standards are met throughout the containerization process. You’ll work with your team to recommend resources that will help your client manage and securely adopt containers. Additionally, you’ll gain DevOps skills and experience while supporting the development of critical cloud platforms. Work with us to use cloud platform technology for good. Join us. The world can’t wait.

You Have:

  • 2+ years of experience with containerization technologies
  • 2+ years of experience with container orchestration platforms
  • 2+ years of experience managing software deployments through CI/CD pipelines
  • Experience developing enterprise cloud-native solutions and applying basic principles, theories, and concepts
  • Experience with OOP scripting or programming languages
  • Ability to work with AWS or Azure
  • Ability to work with container orchestration platforms
  • Secret clearance
  • Bachelor's degree

Nice If You Have:

  • Knowledge of automation, programming and scripting languages, infrastructure automation, and microservices
  • Knowledge of triaging and resolving issues related to both open source and commercial tools in public cloud environments
  • Top Secret clearance

Clearance:

Applicants selected will be subject to a security investigation and may need to meet eligibility requirements for access to classified information.

Compensation

At Booz Allen, we celebrate your contributions, provide you with opportunities and choices, and support your total well-being. Our offerings include health, life, disability, financial, and retirement benefits, as well as paid leave, professional development, tuition assistance, work-life programs, and dependent care. Our recognition awards program acknowledges employees for exceptional performance and superior demonstration of our values. Full-time and part-time employees working at least 20 hours a week on a regular basis are eligible to participate in Booz Allen’s benefit programs. Individuals that do not meet the threshold are only eligible for select offerings, not inclusive of health benefits. We encourage you to learn more about our total benefits by visiting the Resource page on our Careers site and reviewing Our Employee Benefits page.

Salary at Booz Allen is determined by various factors, including but not limited to location, the individual’s particular combination of education, knowledge, skills, competencies, and experience, as well as contract-specific affordability and organizational requirements. The projected compensation range for this position is $61,900.00 to $141,000.00 (annualized USD). The estimate displayed represents the typical salary range for this position and is just one component of Booz Allen’s total compensation package for employees. This posting will close within 90 days from the Posting Date.

Identity Statement

As part of the application process, you are expected to be on camera during interviews and assessments. We reserve the right to take your picture to verify your identity and prevent fraud.

Work Model

Our people-first culture prioritizes the benefits of flexibility and collaboration, whether that happens in person or remotely. If this position is listed as remote or hybrid, you’ll periodically work from a Booz Allen or client site facility. If this position is listed as onsite, you’ll work with colleagues and clients in person, as needed for the specific role.

Commitment to Non-Discrimination

All qualified applicants will receive consideration for employment without regard to disability, status as a protected veteran or any other status protected by applicable federal, state, local, or international law.

California
$61.9K - $141K / year
Job Closed