Job Closed
This listing is no longer active.
Optum, part of the UnitedHealth Group family of businesses, is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together. At Optum, we support your well-being with an understanding team, extensive benefits and rewarding opportunities. By joining us, you’ll have the resources to drive system transformation while we help you take care of your future. We recognize the power of connection to drive change, improve efficiency and make a difference in health care. Join a team where your skills and ideas can make an impact and where collaboration is key to creating technology that produces healthier outcomes.
Site Reliability Engineer - Remote
Location
Minnesota
Posted
12 days ago
Salary
$72.8K - $130K / year
Seniority
Junior
Job Description
Optum
Requisition Number: 2361712
Opportunities with Logistics Health Incorporated (LHI), part of the Optum family of businesses. We're dedicated to simplifying the logistics of complex workforce health programs with cost-effective solutions and a seamless distribution process. With offices in La Crosse, Wis., a satellite office in Chicago and remote employees throughout the country, we have a variety of rewarding career opportunities for you. Elevate your career as you help us create a healthier tomorrow for everyone and discover the meaning behind Caring. Connecting. Growing together.
We are seeking a Site Reliability Engineer to join our Optum Serve team. In this remote role, you will build, maintain, and operate our AWS-hosted platform to support critical government healthcare services. You will work closely with development teams to identify and measure SLOs, SLAs, and SLIs, ensuring high availability, performance, and scalability. This is an exciting opportunity to implement advanced automation, self-healing mechanisms, and robust monitoring to improve production systems and ensure seamless service delivery.
You'll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.
Primary Responsibilities:
- Build, maintain, and operate the AWS-hosted platform
- Work closely with development teams to identify and measure SLOs, SLAs, and SLIs
- Contribute to the development of platform services including architecture, provisioning, configuration, deployment, and support
- Integrate applications with centralized logging, metrics dashboards, instrumentation, incident monitoring, and management tools
- Participate in an on-call rotation for incident resolution for the platform and any dependent components
- React to production deficiencies by continuously implementing automation, self-healing systems, and real-time monitoring
- Maintain and improve operational tooling and frameworks
- Perform root cause analysis and deliver swift resolution for tools and automation failures
- Build, integrate, and administer systems and tools that enable engineering teams to observe their applications in production with autonomy (Dashboards, APMs)
- Automate alerts for metrics on performance, cost, vulnerabilities, risk, and compliance violations
- Conduct comprehensive postmortems after production issues to drive continuous platform improvement
You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.
Required Qualifications:
- 2+ years of experience building, maintaining, and operating platform infrastructure within AWS (specifically with EC2, VPC, IAM, Lambda, S3, and CloudWatch)
- 2+ years of experience with Infrastructure-as-Code (IaC) using Terraform, AWS CloudFormation, or CDK
- 2+ years of experience with Linux system administration and shell scripting
- 1+ years of experience building or managing CI/CD pipelines using Git and GitLab
- 1+ years of experience monitoring and alerting with tools such as CloudWatch or Dynatrace
- 1+ years of scripting experience in Python or PowerShell
Preferred Qualifications:
- Bachelor's degree in Information Technology, Computer Science or related field
- Experience utilizing AI-driven anomaly detection in CloudWatch for proactive issue resolution
- Experience with automation of patching and scaling using predictive models
- Experience supporting infrastructure for AI-based applications
- Demonstrated understanding of federal security and compliance frameworks, such as FedRAMP Moderate or NIST 800-171
- Familiarity with containerized workloads (e.g., ECS, EKS)
*All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy.
Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. The salary for this role will range from $72,800 to $130,000 annually based on full-time employment. We comply with all minimum wage laws as applicable.
Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected.
Job posting may come down early due to volume of applicants.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.
UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.
UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
More DevOps Engineer Jobs
Data Center Reliability Engineer
Phaidra
Phaidra is an industrial AI company. We create self-learning, intelligent control systems for industrial facilities. The physical world today is filled with static infrastructure. Factories, power plants, and other industrial facilities are frozen in time; they operate in the same way they've operated for decades because their control systems are hard-coded. And hard-coded systems cannot change, leading to performance degradation and a lack of resiliency. Phaidra creates AI-powered control systems that automatically learn, adapt, and get better over time. Just as autonomous vehicles get smarter over time, so too will future industrial facilities. Our team has already delivered 40% energy savings at Google's data centers, and we're rapidly bringing AI technology to other types of industrial facilities.
• Utilize existing data ingestion and delivery platforms to "teach" models to understand the physical world, filling a critical expertise gap in the data center industry.
• Use telemetry tools to analyze sensor data across mechanical (chillers, pumps) and electrical (UPS, switchgear, power feeds) systems to identify "failure signatures" for an LLM-driven monitoring tool.
• Act as a primary user of platforms, identifying gaps in current mechanisms and collaborating with Engineering to influence future features and data quality.
• Translate raw telemetry into "SME-level" logic and directions used by the LLM tool to guide data center operators in real time.
• Cultivate deep domain expertise in all facets of data center infrastructure.
• Move from shadowing peers to directly supporting customers, using the platform to provide clear, data-backed direction on complex problems.
• Oversee pilot projects to test how the AI-driven SME tool interprets real-world stressors, ensuring the output is operationally realistic, accurate, and actionable.
• Remain agile and proactive in a fast-moving team environment.
DevOps Engineer – Secret
Xcelerate Solutions
Xcelerate Solutions is a mission-driven company specializing in security, management, and IT solutions to strengthen America’s national security, safeguard cr
• Automating, optimizing, and securing the delivery pipeline for enterprise-grade, mission-critical systems.
• Integrating DevSecOps best practices and innovative toolsets.
• Working with application development teams to refactor or create solutions that leverage the DevSecOps CI/CD pipeline and tools.
• Facilitating the development of new software solutions and transition of existing solutions from monolithic structures to micro-service structure operating within hardened containers.
• Deploying and sustaining microservices factory utilizing COTS and open-source solutions.
Senior Site Reliability Engineer
DraftKings Inc.
Defining what it means to build and deliver the most extraordinary sports & entertainment experiences. The Crown is Yours
At DraftKings, AI is becoming an integral part of both our present and future, powering how work gets done today, guiding smarter decisions, and sparking bold ideas. It's transforming how we enhance customer experiences, streamline operations, and unlock new possibilities. Our teams are energized by innovation and readily embrace emerging technology. We're not waiting for the future to arrive. We're shaping it, one bold step at a time. To those who see AI as a driver of progress, come build the future together.
The Crown Is Yours
As a Senior Site Reliability Engineer, you'll build and scale the critical infrastructure behind every product. In this role, you'll take on complex challenges across global data centers, multiple cloud platforms, and on-premise systems, designing automation-first solutions that elevate performance and eliminate operational friction. You'll be trusted to drive stability at scale, influence architectural decisions, and build tools that empower our teams to move fast and deliver reliably. This is where your impact won't just be felt, it'll be foundational.
What You'll Do
- Drive stability and scalability across our global compute platform spanning numerous data centers, multiple public clouds, and on-premise environments, serving as the foundation for every product.
- Operate and evolve our GitOps delivery model, using Rancher Fleet and Flux with Helm to deploy core cluster services and application workloads declaratively and repeatably.
- Build self-healing, fault-tolerant infrastructure and internal tooling that eliminates repetitive operational work and reduces toil for both platform and application teams.
- Own cluster autoscaling and capacity strategy, including Karpenter, HPA and KEDA, and predictive scaling driven by event and calendar data.
- Define SLOs and reliability metrics for platform components, using Datadog and our logging pipeline to surface cluster and workload health.
- Support technical growth by sharing knowledge, participating in design discussions, and contributing to a collaborative team culture, including on-call rotation.
What You'll Bring
- Bachelor's degree in Computer Science or relevant education, experience, and training.
- At least 4 years managing distributed cloud and on-premise environments at scale, with strong hands-on AWS experience. Exposure to GCP, vSphere, or Nutanix is a plus.
- Deep expertise in container orchestration with Kubernetes, including the ability to design, scale, and troubleshoot complex workloads.
- Strong experience developing software for automation and infrastructure tooling in languages such as Go and Python.
- Working knowledge of networking and Linux-based systems, including container runtimes such as Docker and containerd, packet-level debugging, and kernel troubleshooting.
- Experience with Infrastructure as Code (IaC) and configuration management tools to ensure scalable and repeatable infrastructure provisioning.
Join Our Team
We're a publicly traded (NASDAQ: DKNG) technology company headquartered in Boston. As a regulated gaming company, you may be required to obtain a gaming license issued by the appropriate state agency as a condition of employment. Don't worry, we'll guide you through the process if this is relevant to your role.
The US base salary range for this full-time position is 128,000.00 USD - 160,000.00 USD, plus bonus, equity, and benefits as applicable. Our ranges are determined by role, level, and location. The compensation information displayed on each job posting reflects the range for new hire pay rates for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific pay range and how that was determined during the hiring process.
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
Systems Administrator, Azure DevOps
Hagerty Group
Hagerty is an automotive company that acts as a lifestyle brand for car lovers and also provides other services. As an employer, the company strives to build a work environment tha
Systems Administrator, Azure DevOps (ADO)
Location: United States
Employment type: Full time
Posted: 5 days ago
Job requisition ID: R5225
Say hello to Hagerty
Hagerty is a company built by drivers for drivers. We put our members at the center of everything we do and are dedicated to making it easier and more enjoyable for enthusiasts to drive and celebrate the machines they love. We’re proud to be the world’s largest insurer of collectible and enthusiast vehicles and are home to the Hagerty Drivers Club, the world’s largest car club. Our Marketplace business presents live and digital sales across the U.S. and Europe, we host a number of driving events and concours, and our award-winning automotive journalists produce the most popular car magazine globally, alongside internationally awarded videos. We’re committed to Never Stop Driving.
Ready to get in the driver’s seat? Join us!
As a Systems Administrator, Azure DevOps (ADO), you will be responsible for establishing, governing, and evolving the use of Azure DevOps as a core platform supporting delivery across value streams. This role ensures that ADO Boards, backlogs, workflows, and reporting structures are configured to enable visibility, consistency, and effective flow of value. Partnering closely with Scrum Masters, Agile Coaches, Engineering Managers, and IT Portfolio Managers, the ADO System Administrator balances governance with flexibility, ensuring teams can operate effectively while maintaining enterprise standards. This role also supports automation, user management, and data access through pipelines, Active Directory integration, and ADO APIs.
What you’ll do
ADO Governance & Operating Model
- Establish and maintain a governance model for ADO Boards and backlogs across value streams.
- Define and enforce standards for:
  - Work item types and hierarchies (Epic, Feature, Story, Bug)
  - Area Paths and Iteration Paths
  - Workflow states and transitions
  - Field usage and data integrity
- Ensure consistency in how teams structure and manage work while allowing for context-specific flexibility.
- Partner with Portfolio Operations and delivery leadership to evolve ADO practices in alignment with delivery maturity.
Backlog & Board Configuration
- Configure and maintain ADO Boards to support value stream delivery, including:
  - Backlog structures and levels
  - Swim lanes, filters, and board views
  - Custom fields and rules
- Support teams in aligning their boards to flow-based delivery practices and readiness standards.
- Troubleshoot and resolve configuration issues impacting team execution or visibility.
User Access & Automation
- Manage user access and permissions in alignment with organizational policies.
- Execute and maintain pipelines and automation for user provisioning via Active Directory.
- Ensure appropriate role-based access controls are in place across projects and teams.
- Support onboarding of new users and teams into ADO.
Reporting, Analytics & API Utilization
- Leverage Azure DevOps APIs and analytics capabilities to:
  - Develop and maintain reporting solutions
  - Enable visibility into delivery metrics (e.g., flow, throughput, backlog health)
  - Support Portfolio and Program-level reporting needs
- Partner with stakeholders to translate reporting requirements into scalable solutions.
- Ensure data quality and consistency to support reliable insights.
Intake, Support & Continuous Improvement
- Serve as the primary intake point for ADO-related requests from:
  - Scrum Masters
  - Agile Coaches
  - Engineering Managers
  - IT Portfolio Managers
  - ADO Business Users
- Triage, prioritize, and fulfill requests related to configuration, reporting, and access.
- Provide guidance and education to teams on effective use of ADO.
- Identify opportunities to improve tooling, automation, and governance practices.
Reporting & Role Context
- Supports: Value Streams, Delivery Teams, and Portfolio Leadership
- Partners closely with:
  - Scrum Masters & Agile Coaches (ways of working)
  - Engineering Managers (team structure & access)
  - IT Portfolio Managers (visibility & reporting)
- Primary platform: Azure DevOps (Boards, Pipelines, Analytics, APIs)
How This Role Fits in the Ecosystem
- IT Portfolio Managers orchestrate the work that is prioritized and how value flows.
- Scrum Masters enable how teams execute and improve delivery.
- ADO System Administrator enables the system of record and visibility, ensuring tooling supports both execution and decision-making.
- Together, these roles ensure alignment between delivery practices, tooling, and portfolio insights.
This Might Describe You
- Strong experience administering Azure DevOps (Boards, Pipelines, Security, Analytics)
- Experience defining and implementing governance models for work management tools
- Familiarity with Active Directory integration and user provisioning automation
- Experience working with REST APIs for reporting or automation
- Strong understanding of Agile delivery practices and work item management
- Systems thinker who balances standardization with team flexibility
- Clear communicator who can translate technical configurations into practical guidance
- Comfortable supporting multiple teams in a dynamic, evolving environment
- Experience in enterprise or regulated environments preferred
Preferred Skills
- Experience with Azure DevOps Analytics or Power BI integration
- Scripting experience (e.g., PowerShell, Python) for automation
- Familiarity with CI/CD pipelines in Azure DevOps
- Experience supporting large-scale Agile or value stream-based delivery models
Other things to note
- This position is open to U.S. remote work. However, team members who reside within 20 miles of the Traverse City headquarters will follow a hybrid schedule, working from the office three days per week.
- May require travel for quarterly events.
- Familiarity with public company requirements, including Sarbanes-Oxley and key regulations, if applicable. For SOX-compliant roles, responsible for designing, executing, and documenting internal controls where they have been identified as owners to prevent errors in financial reporting, processes, and business operations, including attestation to the completeness, accuracy, and compliance of all financial reporting data, where applicable.



