Amwell (previously known as American Well): digital care delivery will transform healthcare
Senior Site Reliability Engineer
Location
United States
Posted
2 days ago
Salary
$129.3K - $140K / year
Seniority
Senior
Job Description
- Support production systems on platforms such as ESXi, Azure, AWS, and GCP
- Utilize configuration management tools, including Ansible and Puppet, for scalable and repeatable systems management
- Design, develop, and maintain automation frameworks, scripts, and operational tooling to improve scalability, reliability, and operational efficiency across infrastructure and platform services
- Configure, maintain, patch, and troubleshoot Linux operating systems, with basic knowledge of Windows operating systems
- Ensure compliance with security and data handling policies to meet PCI, HIPAA, and other standards
- Develop and maintain Infrastructure-as-Code (IaC) solutions using tools such as Terraform, Ansible, and Puppet to support repeatable and standardized deployments
- Collaborate with peers as an accountable and supportive member of Amwell technology teams
- Participate in 24/7 on-call rotation and scheduled maintenance tasks
Job Requirements
- 5 or more years of experience managing Linux-based systems; certifications are a plus
- Strong experience with Infrastructure-as-Code and configuration management technologies including Terraform, Ansible, Puppet, or similar automation frameworks.
- Experience building automation workflows for system provisioning, patch management, monitoring, configuration management, incident response, and operational remediation.
- Experience with compute and storage on on-premises and cloud-based virtualization platforms such as ESXi, Azure, AWS, and GCP
- Experience with the Elasticsearch/Logstash/Kibana (ELK Stack) analytics engine
- Experience managing Identity and Authentication solutions including LDAP, Active Directory, and Multi-Factor Authentication
- Strong scripting and software development skills in languages such as Python, Bash, or PowerShell, with experience building reusable automation tooling and operational integrations in hybrid cloud and on-premises environments.
- Experience developing monitoring, alerting, and self-healing automation solutions.
- Solid foundation in TCP/IP networking concepts
- Experience supporting large-scale production environments with an automation-first operational mindset.
Benefits
- Flexible Personal Time Off (Vacation time)
- 401K match
- Competitive healthcare, dental and vision insurance plans
- Paid Parental Leave (Maternity and Paternity leave)
- Employee Stock Purchase Program
- Free access to Amwell’s Telehealth Services, SilverCloud and The Clinic by Cleveland Clinic’s second opinion program
- Free Subscription to the Calm App
- Tuition Assistance Program
- Pet Insurance