Job Closed
This listing is no longer active.
Centene Corporation is a Fortune 500, mission-driven healthcare leader committed to transforming the health of the communities we service, one person at a time.
Site Reliability Operations Analyst II
Location
Kansas | Louisiana | Missouri | Tennessee | Texas
Posted
60 days ago
Salary
$63.6K - $114.6K / year
Seniority
Mid Level
Job Description
- Collaborate with cross-functional teams to proactively monitor, maintain, and enhance production systems.
- Bridge the gap between software development and IT operations, with an emphasis on incident triage, systems operations and documentation, business user support and communications, toil reduction, work automation, and improving the reliability of services.
- Provide timely and accurate user support and troubleshooting, and manage escalations within enterprise SLAs (Service Level Agreements).
- Research and analyze enrollment/provider data and Utilization Management/Case Management business processes to support application issues while acquiring a more advanced skill set to assist with product design and solutions.
- Perform on-call post-deployment application validations of enhancements and code fixes, and monitor for escalations.
- Act as a key player in application incident management, ensuring prompt detection, triage, and resolution of production incidents.
- Monitor and manage technology business cycle processing within the application environment.
- Establish, document, and refine incident management processes to reduce downtime and service degradation.
- Participate in post-incident reviews, ensuring that lessons learned are incorporated into operational practices and automation efforts.
- Develop, improve, and review runbooks and documentation, keeping them up to date to ensure consistency across SRE teams.
- Document operational processes, including workflows, system configurations, troubleshooting guides, and incident reports, to improve the team’s ability to respond quickly to incidents and system failures.
- Review and provide feedback and process improvement recommendations on topics such as automation, monitoring gaps, toil reduction, and systems resiliency.
- Comply with all policies and standards.
Job Requirements
- A Bachelor's degree in a quantitative or business field (e.g., statistics, mathematics, engineering, computer science)
- 2 – 4 years of related experience, or equivalent experience acquired through accomplishments of applicable knowledge, duties, scope and skill reflective of the level of this position
- 2+ years of demonstrated problem-solving experience, with a focus on identifying and resolving issues in real time and improving systems proactively (required)
- 2+ years of experience analyzing system performance data and recommending solutions for optimization (required)
- 2+ years of experience using strong written and verbal communication skills to interact effectively with technical and non-technical stakeholders (required)
- 2+ years of experience working cross-functionally, building relationships with various IT organizations and Operations
- One or more desired skills:
  - Application support knowledge or testing experience
  - Knowledge of business and process analysis, or healthcare industry experience
  - Experience with data analytics and BI tools, such as Power BI and R
  - Experience with operating systems, such as Linux, Unix, and Windows
  - Experience with cloud platforms, such as AWS or Azure
  - Experience with MongoDB, MySQL, Oracle Database Management System (DBMS), PL/SQL, and the SQL programming language
  - Experience with IT Service Management tools, such as ServiceNow, Atlassian, and BMC
  - Experience with Microsoft Excel
Benefits
- competitive pay
- health insurance
- 401K and stock purchase plans
- tuition reimbursement
- paid time off plus holidays
- flexible approach to work with remote, hybrid, field or office work schedules
More DevOps Engineer Jobs
TL;DR
We're hiring a Site Reliability Engineer to own and evolve deepset's cloud and customer infrastructure end to end. You'll work across SaaS, private cloud, and on-prem environments to make our self-hosted platform production-ready, drive CI/CD and GitOps maturity, and reduce complexity at scale. Your work will directly shape how deepset's AI platform is built, deployed, and scaled for our own cloud and for customers running it in their own environments.

Why deepset
At deepset, we’re on a mission to make custom AI solutions accessible to every organization. With Haystack, thousands of developers build advanced LLM applications every day, while our enterprise-ready AI Platform helps companies turn large language models into business value. We’re remote-first, flexible, and built on a culture of trust and ownership. You’ll collaborate with top-tier tech talent, tackle meaningful challenges, and help transform complex AI into solutions that are simple, powerful, and ready for the real world.

What you will do
You won’t just “keep things running” - you’ll help define how our platform is built, deployed, and scaled across cloud and customer environments.
- Build and operate real-world infrastructure: Design, configure, and evolve infrastructure that runs both in our cloud and inside customer environments (SaaS, private cloud, on-prem).
- Make self-hosted production-ready: Help us deliver a production-grade, self-hosted platform that can be deployed on any Kubernetes setup in weeks - not months.
- Drive automation & platform maturity: Improve CI/CD pipelines, GitHub workflows, and GitOps setups so teams can ship faster with confidence.
- Reduce complexity and cost: Continuously simplify systems and optimize infrastructure spend without compromising performance or reliability.
- Shape how we build: Champion best practices in reliability, scalability, and security across the organization, not as rules, but as working systems.

Requirements
- 2-5 years of experience working with large-scale production infrastructure
- Fluent German language skills
- Experience with distributed or service-oriented architectures
- Hands-on expertise with:
  - AWS
  - Kubernetes
  - CI/CD and GitOps (e.g. ArgoCD)
- Working knowledge of Infrastructure as Code (Terraform preferred)
- Solid troubleshooting skills - you can debug across systems, not just within one layer
- A pragmatic mindset: you balance speed, simplicity, and reliability
- Ownership and accountability - you take responsibility for systems end-to-end
- Ability to work independently while staying aligned with the team’s goals

Nice to have
- Familiarity with observability stacks (e.g. Datadog, Prometheus)
- Experience optimizing cloud costs at scale
- Interest or experience in Machine Learning / LLM systems
- Experience improving developer experience and platform tooling using AI agents
- Contributions to SRE practices like postmortems, SLIs/SLOs, and reliability engineering culture

Benefits
- Remote-first setup with flexible hours & tech of your choice
- 30 days vacation + extra days for family sick leave
- Competitive salary & stock options for every team member
- Monthly sports & mental health support allowance with Oliva
- Annual learning & development budget
- Monthly team socials & in-person meetups
- Dog-friendly Berlin HQ
Senior DevOps Engineer
Abacum
Abacum is the leading business planning platform that empowers Finance teams to drive performance.
- Design and implement our systems to be efficient, scalable, accountable, and secure
- Team up with other Engineers to perform experiments and test new ideas
- Build a strong DevOps culture and tooling that enable our delivery teams to be autonomous while providing best practices (security, observability, scalability, performance, etc.)
- Deploy and manage our infrastructure provisioning
- Develop and drive real-time observability solutions that provide visibility into system health
- Provide technical guidance and educate team members and coworkers on operations and cloud best practices
- Continuously improve development delivery CI/CD
- Develop and implement security measures related to the development processes and operational needs driven by our security and compliance team
- Build and scale our Kubernetes clusters and workloads
- Manage and scale our cloud databases
- Participate in a 24x7 on-call rotation
Role Description
We are looking for a skilled and collaborative DevOps Engineer to join our Cloud Security and Infrastructure team. As a DevOps Engineer, you will be responsible for building, maintaining, and optimizing cloud-based infrastructure. You will work alongside engineering teams to solve complex operational problems, design solutions, and implement practices that drive efficiency at scale. Reporting into the Engineering Manager, DevOps & Infrastructure, your contributions will be crucial in promoting collaboration between engineering teams to ensure the delivery and management of robust services.
- Work closely with Engineering stakeholders to design and maintain a reliable, scalable, and secure platform.
- Optimize and enhance deployment tooling and infrastructure, including creating and maintaining new CI/CD pipelines, improving infrastructure for cost-effectiveness and performance.
- Collaborate with the Engineering team to identify areas for improvement and implement innovative solutions.
- Provide expert DevOps guidance for existing projects and new initiatives.
- Set up and maintain test environments for both manual and automated testing.
- Conduct vulnerability scans and penetration tests, analyzing results and acting on findings.
- Plan and execute ongoing routine application maintenance tasks.
- Participate in on-call rotations for monitoring, alerting, and incident response.
- Identify cost-saving opportunities and efficiency improvements within our infrastructure.

Qualifications
- 3+ years of experience in DevOps, Site Reliability Engineering, or automation engineering.
- Strong understanding of cloud technologies, particularly AWS.
- Hands-on experience managing systems and services on a cloud platform (AWS).
- Solid understanding of database technologies such as MongoDB.
- Knowledge of Docker, Kubernetes, and Linux Systems.
- Proficiency with Terraform and Packer for infrastructure as code.
- Experience deploying and managing cloud services, monitoring and alerting systems, and handling critical issues.
- Agile development experience and familiarity with CI/CD practices.
- Strong attention to detail, problem-solving, and troubleshooting skills.
- Exceptional organizational skills and the ability to manage multiple projects concurrently.
- Ability to work independently and thrive in an autonomous environment with minimal supervision.
- Excellent communication and collaboration skills.
- A passion for continuous learning and technical curiosity.

Requirements
- AWS DevOps Engineer certification.
- Experience with cloud-based security protocols.

Benefits
- We’re a remote-first company built on trust, autonomy, and accountability.
- Minimum of 3 weeks vacation, 5 sick days, and 6 personal/flex days, plus a company-wide winter holiday shutdown.
- Health, dental, and vision, long-term disability, and a Health Spending Account (HSA).
- Flexible parental leave benefits, including top-ups.
- A dedicated work-from-home allowance to get you set up for success.


