Empowering eye care providers with world-class software.
Senior Site Reliability Engineer
Location
United States
Posted
119 days ago
Salary
$150K - $200K / year
Seniority
Senior
Job Description
Barti
• Lead and participate in the design, implementation, and maintenance of highly available and scalable infrastructure.
• Monitor system health, performance metrics, and capacity planning to ensure optimal performance.
• Establish and track SLIs, SLOs, and error budgets to measure and improve system reliability.
• Design and implement Infrastructure as Code (IaC) solutions using tools like Terraform, Pulumi, or CloudFormation.
• Build and maintain CI/CD pipelines to enable rapid, safe deployments.
• Automate operational tasks and eliminate toil through scripting and tooling.
• Lead incident response efforts, including on-call rotation, post-mortem analysis, and remediation.
• Debug and resolve complex production issues across the entire stack.
• Implement monitoring, alerting, and observability solutions to detect and prevent issues proactively.
• Provide technical leadership and mentorship to engineers on reliability and infrastructure best practices.
• Collaborate with cross-functional teams, including Engineering and Product to ensure reliable product delivery.
• Lead the technical design of infrastructure solutions, ensuring alignment with architectural principles and business goals.
• Stay updated with emerging technologies and industry trends in SRE, DevOps, and cloud infrastructure.
• Propose and drive the adoption of best practices, tools, and processes to enhance system reliability and developer productivity.
• Conduct chaos engineering experiments and disaster recovery drills to validate system resilience.
• Implement and maintain security best practices across infrastructure and applications.
• Manage secrets, access controls, and security monitoring systems.
• Foster a collaborative environment within the engineering team and across departments.
• Clearly communicate technical concepts and system health to both technical and non-technical stakeholders.
• Work closely with engineering teams to define reliability requirements and ensure operational excellence.
Job Requirements
- 5+ years (ideally 7+) of relevant work experience in Site Reliability Engineering, DevOps, or Infrastructure roles
- 1+ years of hands-on scripting experience in Python, Go, or Bash
- Experience with cloud platforms (ideally GCP) and container orchestration (Kubernetes, Docker)
- Proficiency with Infrastructure as Code tools (Terraform, CloudFormation, or similar)
- Strong understanding of Linux systems, networking, and distributed systems
- Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
- Excellent problem-solving and communication skills
- Able to work independently and as part of a team
Benefits
- Be part of a mission-driven, rapidly scaling company changing the future of eye care
- Work remotely from anywhere in the U.S.
- Collaborate with a passionate, fun, and supportive team
- Competitive salary - $150,000 - $200,000
- Equity in a fast-growing startup
- Health, vision, and dental benefits
- Unlimited PTO
- Annual professional development stipend
- A high-impact role with plenty of room for growth, ownership, and creativity
More DevOps Engineer Jobs
SRE End User Services Product Owner
Leidos
Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.
• Utilize metrics and tools like Aternity to monitor end-user performance and proactively identify potential issues.
• Develop strategies to address recurring incidents and improve system reliability.
• Collaborate with engineering and operations teams to implement automated solutions for incident prevention.
• Lead the planning, coordination, and execution of software deployments across end-user devices.
• Ensure deployments are completed on time, with minimal disruption to end users.
• Analyze service performance metrics to identify areas for improvement and develop initiatives to enhance the quality of end-user services.
• Define and maintain a product vision and roadmap for End User/Seats Services, aligned with organizational objectives.
• Serve as the primary point of contact between the End User/Seats Services team and business stakeholders.
• Ensure clear documentation of product requirements, progress, and updates for stakeholders.

• Analyze current technology utilized within a customer contract, project, and solution and develop steps and processes to improve and expand upon them.
• Building and setting up new development tools and infrastructure.
• Understanding the needs of stakeholders and conveying this to developers.
• Working on ways to automate and improve development and release processes.
• Testing and examining code written by others and analyzing results.
• Ensuring that systems are safe and secure against cybersecurity threats.
• Identifying technical problems and developing software updates and ‘fixes’.
• Working with software developers and software engineers to ensure that development follows established processes and works as intended.
• Establish milestones for necessary contributions from departments and develop processes to facilitate their collaboration.
• Maintain cloud infrastructure and platform services needed for projects to be completed efficiently.
• Mentor and train other engineers and seek to continually improve processes.

• To deploy releases and hotfixes to test environments and production data centers
• Be an excellent and creative problem solver
• Design, test, implement, manage, and maintain business-critical big data platforms
• Safely implement change deployments
• Troubleshoot issues for internal and external customers, providing problem identification and resolution
• Migrate manual configuration to automated framework wherever possible
• Design solutions to meet internal and external customer needs
• Assist in designing and applying system standards
• Work with development, testing, and QA to improve the product
• Create and update tooling used by the support team
Staff Security Engineer
OpenLoop Health
OpenLoop Health is a healthcare technology startup whose services are used by companies that provide telehealth delivery across all 50 states. In past hiring, the award-winning hea
About OpenLoop
OpenLoop was co-founded by CEO Dr. Jon Lensing and COO Christian Williams with the vision to bring care anywhere. Our telehealth support solutions are thoughtfully designed to streamline and simplify go-to-market care delivery for companies offering meaningful virtual support to patients across an expansive array of specialties, in all 50 states.
About The Role
OpenLoop is looking for a Staff Security Engineer (DevOps Integrations) to join our team remotely. In this role, you will be our DevSecOps subject matter expert across the IT, software engineering, and product teams. The ideal candidate can provide strategic oversight, brings broad technical acumen in cybersecurity and software engineering, and can think like an attacker to guide us through potential security issues.
What You’ll Do:
- Build relationships with developers and stakeholders to incorporate security principles into engineering design and deployments.
- Supervise validation of security controls and testing across projects using SAST, DAST, IAST, and RASP tools, documenting any security findings, outlining remediation options, and overseeing mitigation.
- Oversee implementation of defensive practices and countermeasures across infrastructure and applications.
- Draft and uphold CI/CD security strategy and practices in tandem with other technical team leads.
- Lead continuous product and application security reviews, focused on secure development practices, threat modeling, vulnerability management, architecture, and application security design.
- Ensure security principles and validations are consistently implemented throughout the CI/CD pipeline by embedding robust, security-focused practices into all automation processes.
- Attend and participate in product meetings addressing security requirements for new and existing products.
- Build services and tools to enable developers and engineers to use security components successfully.
- Simplify automation that applies security inter-workings with CI/CD pipelines.
- Support the ability to “shift left” and incorporate security early on and throughout the development lifecycle.
- Communicate vulnerability results to both technical and non-technical stakeholders, focused on risk tolerance and threat to the business, in order to gain support through influential messaging.
- Leverage vulnerability database sources to understand the weakness, probability, and remediation options supplied by vendors.
- Join forces and provision security principles in architecture, infrastructure, and code.
- Regularly research and learn new tactics, techniques, and procedures (TTPs).
- Partner with teams to define key performance indicators (KPIs) and metrics across business units.
- Ensure regulatory compliance (e.g., PCI, HIPAA, HITRUST, NIST CSF) through effective security controls and processes.
- Other duties as assigned.
Who You Are:
- Bachelor's degree in computer science (preferred), information assurance, MIS, or a related field, or equivalent.
- 7+ years of security and systems administration-related experience, including 3+ years of related cloud and security engineering experience.
- Experience with operations and security across Amazon Web Services (AWS) and/or Google Cloud Platform (GCP).
- Experience with agile workflows, including Scrum and Kanban.
- Understanding of containers (e.g., Docker) and container orchestration (e.g., Docker Swarm, Kubernetes).
- Proficient in securing Windows and *nix operating systems, endpoint applications, networking protocols, and devices.
- Ability to obtain and maintain technical team and business support to influence a collaborative effort to reduce attack surface while performing rapid, continuous implementation.
- Understanding of OWASP, CVSS, the MITRE ATT&CK framework, and the software development lifecycle (SDLC).
- Knowledge of Payment Card Industry (PCI), Health Insurance Portability and Accountability Act (HIPAA), Gramm-Leach-Bliley Act (GLBA), National Institute of Standards and Technology (NIST), or International Organization for Standardization (ISO) requirements.
- Self-starter mentality requiring minimal supervision.
- Analytical and problem-solving abilities with a proactive, risk-based approach.
- Highly organized and efficient.
- Demonstrated strategic and tactical thinking, along with decision-making skills and business acumen.
- Experience in healthcare or digital health is a plus.
- Strong internal service mindset, providing support to all teams and leadership.
- Adaptability to handle dynamic and challenging environments.
- Energetic and resourceful, with the work intensity needed to get the work done.
- Strong people acumen and relationship skills.
Our Company
We have a relatively flat organizational structure here at OpenLoop. Everyone is encouraged to bring ideas to the table and make things happen. This fits in well with our core values of Autonomy, Competence and Belonging, as we want everyone to feel empowered and supported to do their best work. Sound like a good fit? We’d love to meet you.




