Partnering with health systems to find time for the best care.
Senior Site Reliability Engineer
Location
United States
Posted
4 days ago
Salary
$125K - $165K / year
Seniority
Senior
Job Description
DexCare
Role Description
We're looking for a Senior Site Reliability Engineer who genuinely enjoys the craft. Someone who takes pride in a clean Terraform module, cares about observability because they've felt the pain of flying blind, and believes good documentation is an act of kindness for your teammates.

You'll be hands-on with our AWS infrastructure, especially EKS, IAM, and RBAC, building things that are secure by default, not as an afterthought. You'll own our CI/CD pipelines in GitHub Actions, set up guardrails that let engineers ship quickly and confidently, and keep Datadog tuned so we know what's happening in our systems before our customers do.

In any given week you might be writing Terragrunt modules, building a Python script to eliminate a tedious manual process, writing a runbook that'll save someone's 2am, or digging through a postmortem with the team, focused on learning, not blame.

We work in an Agile environment with an on-call rotation. We approach our processes thoughtfully, with the intent to constantly iterate and make them better. You don't need to have all the answers; you just need curiosity, clear communication, and a willingness to own your slice of the system while keeping it accessible and scalable so that we can build together.

What You’ll Do
- Design, scale, and operate resilient, cloud-native infrastructure in AWS with a strong emphasis on EKS, IAM, RBAC, and modern security-first practices.
- Build and optimize CI/CD pipelines with GitHub Actions and GitHub Advanced Security, enabling velocity without compromising safety.
- Own observability across the stack using Datadog (metrics, logging, alerting, and tracing).
- Write and maintain Terragrunt and Terraform modules and other infrastructure-as-code (IaC) automation.
- Develop internal tools and scripts in Python to automate operational workflows and reduce manual overhead.
- Document everything from runbooks to standards so teams stay aligned and systems stay stable.
- Actively contribute to Agile workflows using Jira, with clear tracking of work, priorities, and progress.
- Participate in on-call rotations, postmortems, and continuous improvement efforts, always with a blameless, team-first mindset.

Qualifications
- 4+ years in a Senior SRE or DevOps role supporting production cloud infrastructure at scale, preferably in a SaaS, PaaS, high-growth, or fast-paced environment.
- Deep experience with AWS (IAM, EKS, VPC, EC2, Secrets Manager, Serverless) and RBAC.
- Knowledge of compliance standards like HIPAA, HITRUST, or SOC 2.
- Hands-on proficiency with Terraform, Terragrunt, Helm, and container orchestration.
- Proven experience building and maintaining GitHub Actions for CI/CD, including GitHub Advanced Security features like secret scanning and code policy enforcement.
- Strong Datadog experience building dashboards, tuning alerts, setting up monitors, and interpreting telemetry.
- Solid Python scripting experience for automation and internal tools.
- You value clear, accurate documentation as a core part of engineering, not an afterthought.
- Comfortable working in Agile/Scrum environments with well-tracked Jira workflows.
- Practical experience with resource analysis and infrastructure optimization.

Preferred Experience
- AWS DevOps Engineer Professional Certification.
- Familiarity with Lambda, Fargate, and serverless infrastructure.
- Experience with multitenant platforms or customer-isolated deployments.
- Experience with Azure or moving from Azure to AWS.

Benefits
- Eligible for Annual Bonus.
- Healthcare benefits, short/long-term disability coverage, life insurance, and 401k.
- Paid Parental Leave.
- Nine paid holidays & Unlimited PTO.
- Remote working arrangements.
More DevOps Engineer Jobs
• Supporting the implementation and operation of automated CI/CD pipelines
• Advising on the design and implementation of scalable cloud and infrastructure architectures
• Introducing automation for deployments and infrastructure
• Developing integrations for monitoring, logging and security solutions
• Supporting colleagues in the service team by sharing knowledge
Lead – Agentic AI Forward Deployment Engineering
Netomi
Empowering the highest quality customer experiences.
• Serve as the primary technical and delivery leader responsible for transforming enterprise customer requirements into production-grade Agentic AI solutions.
• Partner directly with customers to understand their business processes, design scalable AI-powered workflows, lead end-to-end implementations, and ensure successful deployments that deliver measurable business outcomes.
• Own the complete customer deployment lifecycle, from discovery and solution design to implementation, quality assurance, go-live, and continuous optimization.
• Work closely with Customer Success, Product, Engineering, Integration Engineers, and QA teams while acting as a trusted advisor to enterprise stakeholders throughout their AI transformation journey.
DevOps Engineer
Pearster
Helping your business with top-tier IT talent ready to build smarter, scalable, and more human tech solutions.
• Design, build, and maintain scalable Azure infrastructure using Infrastructure as Code.
• Develop and manage CI/CD pipelines to support automated deployments and release processes.
• Monitor system performance, troubleshoot issues, and ensure high availability of environments.
• Collaborate closely with development and QA teams to improve build and release workflows.
• Implement security best practices and compliance standards across cloud environments.
• Create and maintain documentation for infrastructure, processes, and configurations.
• Participate in incident management and root cause analysis when necessary.
• Work with DevOps teams to design, implement, and maintain secure CI/CD pipelines integrating security testing at every stage of the software development lifecycle
• Implement automated security scanning including SAST, DAST, SCA, container scanning
• Deploy and support API Security tools
• Ensure tools consistently report to aggregator
• Collaborate with development teams to promote secure coding practices and provide security guidance throughout the development process
• Ensure compliance with industry standards relevant to the travel industry including PCI-DSS, GDPR, and SOC 2
• Mentor junior engineers and promote a security-first culture across engineering teams




