ESO is a fast-paced, growing data, technology, and research company passionate about improving community health and safety through the power of data. We pioneer innovative, user-friendly software to meet the changing needs of today’s EMS agencies, fire departments, and hospitals. We’re small enough to be nimble and fun, but big enough to be a great place to work. We serve thousands of customers out of our offices across the US, Canada and Northern Ireland.
Site Reliability Engineer
Location
United States
Posted
1 day ago
Seniority
Mid Level
Job Description
ESO
Role Description
The Site Reliability Engineering (SRE) team at ESO is responsible for ensuring the reliability, scalability, and performance of our production systems. We operate at the intersection of engineering and operations, with a strong focus on automation, observability, and continuous improvement.
As a Site Reliability Engineer, you will work hands-on with cloud-native systems, supporting production and pre-production environments to maintain system health, improve resiliency, and optimize performance. You’ll partner closely with engineering, infrastructure, and database teams to troubleshoot complex issues, enhance automation, and ensure our services meet reliability and availability expectations.
This role is ideal for an engineer who enjoys solving challenging problems, digging into application and database behavior, and continuously improving how systems operate in a fast-paced, high-impact environment.

What You’ll Do
- Support and maintain production and non-production cloud environments (Cloud Azure/AWS).
- Troubleshoot complex, distributed, cloud-based applications to identify root causes and implement durable fixes.
- Monitor system health, performance, and reliability using observability tools (e.g., New Relic, ELK and Zabbix).
- Investigate application and database performance issues, including writing and optimizing SQL queries.
- Participate in incident response, debugging, and post-incident reviews focused on continuous improvement.
- Contribute to CI/CD pipelines (e.g., Azure DevOps) to improve automation, reliability, and deployment processes.
- Write and maintain automation scripts (PowerShell, bash, Python or similar) to streamline operational workflows.
- Collaborate with developers to understand code behavior and support troubleshooting efforts in C#/.NET-based systems.
- Help improve reliability standards, documentation, and operational best practices.

Qualifications
- Hands-on experience working in a cloud environment (Microsoft Azure strongly preferred).
- Experience supporting and troubleshooting complex, cloud-native applications in production environments.
- Strong understanding of relational databases and solid experience writing and troubleshooting SQL queries.
- Ability to read and understand application code (preferably C#/.NET) to support debugging and issue resolution.
- Experience working with at least one CI/CD platform (e.g., Azure DevOps).
- Familiarity with monitoring and observability tools (e.g., New Relic) and core concepts such as logs, metrics, and traces.
- Experience with scripting/automation (PowerShell preferred).
- Strong analytical and problem-solving skills with attention to detail.
- Clear written and verbal communication skills.

Requirements
- Passionate about reliability engineering and operational excellence.
- Curious and eager to learn, actively seeking feedback and continuously growing your technical skill set.
- Coachable and adaptable, able to thrive in a fast-paced and evolving environment.
- Comfortable navigating ambiguity and taking ownership of problems through to resolution.
- A collaborative team player who values accountability and continuous improvement.

Nice to Have
- Experience working with Linux-based systems.
- Experience working with Kubernetes and container systems.
- Exposure to infrastructure-as-code tools (e.g., Terraform).
- Familiarity with Git-based version control workflows.

Benefits
- Competitive health plans (medical, dental, & vision insurance).
- PTO (starting at 20 days) & 12 company holidays.
- 401(k) with company match.
- Telemedicine service provided by ESO.
- Savings accounts (FSA, HSA, DCA).
- Employee Assistance Program (EAP).
- Annual health and wellness reimbursement.
- Peace of mind benefits such as life insurance, disability insurance, and worksite benefits.
- Paid parental leave, new child program, & flexible parental return-to-work options.
- Casual office environments and unlimited office snacks and drinks.
More DevOps Engineer Jobs
Title: Senior Site Reliability Engineer
Location: Austin, TX; Eau Claire, WI; Minneapolis, MN
Work Type: Hybrid
Job Description:
At Jamf, we believe in an open, flexible culture based on respect and trust. Our track record and thriving work environment all stem from the freedom we grant ourselves to get the job done right. We take pride in helping tens of thousands of customers around the globe succeed with Apple. The secret to our success lies in our connectivity, while operating with a high degree of flexibility. Work-life balance remains our priority while feeling connected is important to maintain our strong culture, achieve our goals, and thrive as #OneJamf.

What you'll do at Jamf:
As a Senior Site Reliability Engineer, you'll help us balance development velocity with the reliability our customers depend on. You'll partner with engineering teams to shape how their services are measured, lead the work to improve them, and use what you learn from production to build the automation and agentic tooling that improves reliability globally. You'll work fluently with agentic development tools as part of your everyday practice, using them to move faster, to investigate harder problems and to multiply your impact. This is a senior individual contributor role at the intersection of Engineering, Product, Customer Success and Technical Support, where you'll play a meaningful part in shaping how we practice SRE at Jamf.
This role is offered as remote in the Minneapolis, MN; Eau Claire, WI; or Austin, TX metro areas. You may be required to work periodically at a Jamf office or collaborative work location with other Jamf employees in your area for certain events or moments that matter. We are only able to accept applications from those based in one of these locations.

What you can expect to do in this role:
- Partner with engineering teams to define service-level objectives, error budgets, and supporting indicators for their services, and help them use those measures to inform prioritization and reliability investment.
- Investigate complex production issues end-to-end across application, data, infrastructure, and network layers, using AI to correlate logs, metrics, and code and to pressure-test hypotheses before acting.
- Produce clear technical documentation, runbooks, architecture notes, postmortems and proofs of concept for both technical and non-technical audiences, in a form that engineers and AI tools can re-use.
- Identify systemic sources of toil and lead the work to eliminate them through automation, AI agents, tooling, and process change.
- Set the conditions for AI agents to do reliable work in our environment, including repository context, well-specified tasks, integrations such as MCP servers that give AI safe access to the systems it needs, and the tests and guardrails needed for AI-authored change to be trusted.
- Participate in team ceremonies to identify and refine work, communicate findings, and drive opportunities to collaborate.
- Drive cross-team and cross-department collaboration on reliability initiatives, including reviewing designs, influencing roadmaps, and mentoring engineers on SRE practices, including effective AI use in their reliability work.
- Advise senior leadership and stakeholders during critical customer escalations, translating between technical reality and business impact.
- Contribute to scaling the SRE practice itself: improving our standards, our tooling, and how we partner with product engineering teams.

What we are looking for:
- Minimum of 5 years of experience in software engineering, SRE or production operations roles. (Required)
- Strong production troubleshooting skills across the stack. Ability to diagnose issues from first principles using the tools available (profilers, heap and thread dumps, query plans, traces, logs, metrics). (Required)
- Experience working within a form of the Agile development framework process. (Required)
- Hands-on experience operating production services on AWS (e.g., EC2, S3, EKS, RDS/Aurora, CloudFront). (Required)
- Experience utilizing observability tools (e.g., Grafana, Prometheus, LogicMonitor). (Required)
- Experience creating clear and concise technical documentation that is targeted at both technical and non-technical audiences. (Required)
- Experience writing infrastructure as code. (Required)
- Experience writing automation in a general-purpose language (e.g., Python, Go, Java, or similar) to a production standard. (Required)
- Strong judgement about how to apply AI effectively across the full range of SRE work, including high-stakes areas such as production access and sensitive data, knowing how to scope and verify work to make it safe. (Required)
- Hands-on experience using agentic development tools (e.g., Claude Code, Cursor, Copilot) to deliver engineering and operational work, scoping and delegating bounded tasks, verifying the output, and shipping with confidence. (Required)
- Experience improving how a team works with AI, for example authoring reusable skills, repository context files, or prompt patterns that others adopt. (Required)
- Experience optimizing SQL queries and database engine tuning. (Preferred)
- Experience with CI/CD tooling (e.g., GitHub Actions, Jenkins). (Preferred)
- Exposure to chaos engineering, fault injection and disaster recovery exercises. (Preferred)
- Familiar with FinOps practices. (Preferred)
- 2 year / Associates (Required)
- 4 year / Bachelor's Degree (Preferred)
- A combination of relevant experience and education may be considered

OTHER REQUIREMENTS
- This position will perform work that the U.S. government has specified can only be performed by a U.S. citizen located physically in the U.S., and therefore any employment offer will be contingent upon verification of both of these requirements. Applicants who are not U.S. citizens or who are located outside of the U.S. are strongly encouraged to apply for other positions at Jamf, which is an equal-opportunity employer.

SECURITY AND PRIVACY REQUIREMENTS
- Participation in ongoing security training is mandatory
- Established security protocols will be adhered to, sensitive data will be handled responsibly, and data protection practices are followed, including understanding relevant privacy regulations and reporting breaches
- Acknowledging the Jamf Code of Conduct, where applicable security and privacy policies can be found, is a requirement of all roles at Jamf

How we help you reach your best potential:
- Named a 2025 Best Companies to Work For by U.S. News
- Named a 2024 Best Technology Company to Work For by U.S. News
- Named one of Forbes Most Trusted Companies in 2024
- Named a 2024 Best Companies to Work For by U.S. News
- Our developers work in agile delivery teams to produce new features, improve software components, and are the subject matter experts for our Jamf product offerings.
- You will have the opportunity to make a real and meaningful impact for more than 75,000 global customers with the best Apple device management solution in the world.
- We constantly push the boundaries of technology; our developers support new innovations and OS releases the moment they are made available by Apple.
- Several Jamf engineers are named in patents, and with team names like CatDog, ThunderSnow and Dalek you can expect to have some fun while building cutting-edge software.
- You will have the opportunity to work with a small and empowered team where the culture is based on trust, ownership, and respect.
- We offer a clear career path that enables you to grow under supportive leadership and management
- Visit our Jamf Engineering blog to learn more about the innovative projects our team is working on and what we learn from each challenge we solve. A blog written by engineers, for engineers at medium.com/jamf-engineering
- 22 of 25 world's most valuable brands rely on Jamf to do their best work (as ranked by Forbes).
- Over 100,000 Jamf Nation users, the largest online IT community in the world.

Pay Transparency
At Jamf, base pay is one part of our total compensation package and is set within a defined range. These ranges can vary based on hiring location. Where an individual's pay falls within that range depends on several factors, including role scope, location, budget, skills, experience, and qualifications. This approach helps ensure fair, competitive pay and provides room to grow as you develop in your role.
Pay Transparency Range: $113,300 - $205,520 USD

What it means to be a Jamf?
We are a team of free-thinkers, can-doers, and problem-crushers. We value humility and the relentless pursuit of knowledge. Our culture flows from a spirit of selflessness and relentless self-improvement - driving both personal growth and collective progress throughout our company. We unite around common goals while respecting personal approaches, believing that fulfilled individuals create a thriving, vibrant workplace. Our aim is simple: hire exceptionally good people who are incredibly good at what they do and let them do it. We provide the support and resources to let everyone be their authentic, best selves at work, at rest, and at play. We are committed to supporting the continual improvement of Apple in the workplace, the organizations that rely on them and the people who keep it all running smoothly. Above it all, waves our banner of #OneJamf - and the knowledge that when we stand together, we accomplish so much more than we could alone. We seek individuals who share this unwavering journey toward growth to join us in our quest for constant improvement.
• Deploy, manage, and maintain AWS infrastructure across development, staging, and production environments
• Work with AWS services including Amazon Connect, Lambda, S3, EventBridge and Data Bridges
• Build and maintain scalable, reusable and secure Infrastructure as Code (IaC) using Terraform Enterprise
• Develop, implement and manage CI/CD pipelines for automated application and infrastructure deployments
• Collaborate with cross-functional teams to ensure highly available, secure and performant cloud solutions
• Monitor, troubleshoot and optimize cloud infrastructure and deployment processes
• Maintain clean, well-documented and reusable infrastructure code aligned with best practices and organizational standards
• Participate in code reviews and contribute to infrastructure design discussions
Senior DevOps Engineer – Open Sovereign Cloud
Deutsche Telekom IT Solutions Slovakia
Growing bigger, getting better. An IT company that creates value for its customers and helps its region improve.
• Designs, develops, tests and implements infrastructure for CI/CD pipelines and IaC.
• Manages source code, configuration management, release management, build and deployment activities.
• Sets up and manages integration with partner applications.
• Conducts performance analysis and tuning as well as error analysis and troubleshooting.
• Advises on and implements new, innovative technologies in support of the innovation strategy.
• Creates concepts for further automation of services, processes and/or operating models.
• Directly supports project teams in development, in quality assurance alongside development, and in planning and implementing product releases.
• Continuously optimizes the development and system infrastructure.
• Provides consulting to project teams in areas of expertise, including prototypes and proof-of-concept solutions.
• Researches and develops in assigned technology, determines business requirements, proposes changes and prepares implementation plans.
DevOps Engineer
IRIUM
Leaders in the management of integrated IT infrastructure and platform services.
• Join a stable project built on the latest generation of technology
• Work in a 100% remote environment
• Collaborate on developing cloud solutions using Azure and Python
• Take part in implementing agile methodologies



