Be Part of Something Great
Site Reliability Engineer
Location
Philippines
Posted
63 days ago
Seniority
Senior
Job Description
BEL USA LLC
- Focus on improving service reliability through automation.
- Reduce operational toil and implement SLOs and error budgets.
- Partner closely with software engineering teams for production operations.
- Define, measure, and manage SLIs, SLOs, and error budgets.
- Analyze system performance and improve reliability, resilience, and scalability.
- Lead reliability reviews and prevent incidents.
- Build and optimize monitoring, logging, and alerting systems.
- Implement distributed tracing.
- Enhance CI/CD pipelines for safe deployments.
- Participate in on-call rotations and lead incident responses.
Job Requirements
- Bachelor's degree in Computer Science, Software Engineering, or related field.
- 4+ years in SRE, DevOps, or Platform Engineering.
- Strong programming or scripting experience.
- Hands-on cloud experience.
- Understanding of distributed systems.
- Experience with observability tools.
- Familiarity with chaos engineering and resilience patterns.
Benefits
- Professional development opportunities
- Remote work options
More DevOps Engineer Jobs
Site Reliability Engineer
DiscountMugs
Empowering you to share your message and create a lasting impression since 1995
- Focus on improving service reliability through automation
- Define, measure, and manage SLIs, SLOs, and error budgets
- Analyze system performance and identify opportunities for improvement
- Build and optimize monitoring, logging, and alerting systems
- Reduce toil through automation and build reliability-focused tools
- Enhance CI/CD pipelines for safe deployments
- Participate in on-call rotations and lead incident response
Senior DevOps Engineer
DocPlanner
At Docplanner Group, we’re on a mission to help people live longer, healthier lives. As the world’s largest healthcare platform, each month, we connect 24 million patients with 280k doctors across 13 countries. Our marketplaces, SaaS and AI tools simplify daily tasks and help doctors, clinics and hospitals work more efficiently.
- Real impact – We help doctors help patients. Your work truly makes a difference.
- At scale, yet agile – 3,000+ employees, but still fast, flexible, and hands-on.
- Shape the future, sustain growth – Make a difference now and build for long-term success.

- Participate in the monitoring, maintenance, and evolution of the system infrastructure supporting the TuoTempo web application.
- Design, create, and support system distribution; test and monitor application code.
- Support our internal development teams in the effective use of our organizational systems.
- Propose and implement new solutions based on innovative technologies.
- Planning and assignment of team activities.
- Providing technical support to resolve complex issues.
- Mentoring and professional development of team members.
- Management of performance evaluations.
Site Reliability Engineer 5 – CDN, Live Streaming, Open Connect
Netflix
Described as the world's top internet television network, Netflix is a publicly-traded entertainment company offering video-on-demand and streaming media. As an
- Support the CDN delivery and day-to-day live-streaming operations for Netflix
- Participate in the preparation, validation, and execution of live streaming focused initiatives in collaboration with related production and engineering teams
- Impact multiple areas of the live event lifecycle, from the planning phase through testing and event launch days
- Lead innovation initiatives, implementing new features, and driving enhancements in the streaming services delivery
- Drive continual improvement in resilience, observability, monitoring, instrumentation, and automation to maintain highly scalable and reliable CDN services with excellent quality of experience (QoE)
- Implement, automate, execute, and analyze the results from a broad range of streaming CDN delivery focused functional, performance, resilience, and fault injection testing
- Coordinate, collaborate, and partner across multiple stakeholders for the smooth execution of live-streaming events
- Aggregate, analyze, and correlate large amounts of server and application performance data
- Use the innovative Netflix Big Data platform as a toolset for service delivery optimization and system reliability improvements
- Participate in an on-call rotation and work flexible hours based on live events schedule, including weekends and holidays
- Work on the automation and standardization of environments using Ansible, Git, and CI/CD pipelines.
- Support and evolve workloads running on Kubernetes/OpenShift clusters.
- Participate in the full deployment and release cycle, always prioritizing stability, risk mitigation, and continuous improvement.
- Collaborate daily with international teams (in English).
- Monitor, analyze, and respond to incidents in production environments.
- Contribute to the maturity of the DevOps team in Brazil by sharing knowledge and best practices.
- Occasionally participate in weekend deployment windows (rotational/on-call schedule).




