Site Reliability Engineer
Location
Worldwide
Posted
30 days ago
Salary
0
Seniority
Mid Level
Job Description
Site Reliability Engineer
66degrees
Role Description
66degrees’ Managed Cloud Optimization (MCO) team works with some of the largest cloud users in the world to help them transform their businesses with technology. Our Site Reliability Engineers (SREs) combine Google Cloud Platform expertise with a passion for DevOps methodologies to help our clients maintain, optimize, and scale their cloud implementations.
On a daily basis, our SREs work with varied and exciting customers on topics including:
- Solving critical outages
- Designing and deploying new cloud workloads
- Building self-healing automation
Our SREs work with cutting-edge Google Cloud technologies like:
- Google Kubernetes Engine (GKE)
- Anthos
- BigQuery
- Data pipelines
They also use leading third-party tools like:
- Prometheus
- Datadog
Additionally, our SREs work with Python and Terraform to:
- Create automation
- Deploy infrastructure
- Contribute to open-sourcing
If you’re looking to continually build and apply your Google Cloud expertise in new and varied environments while acting as a key contributor to building the best Google consulting partner in the industry, let’s talk.
Qualifications
- 3+ years of cloud and infrastructure experience, including demonstrated expertise with Linux, Windows, Kubernetes (k8s), databases, and networking services
- 2+ years of Google Cloud experience and related certifications strongly preferred but not required
- Proficiency with Python required. Other programming language experience is a plus
- Strong provisioning and configuration skills using Terraform
- Experience with 24x7x365 monitoring, incident response, and on-call support
- Experience in troubleshooting that spans systems, network, and code
- Experience determining and negotiating error budgets, SLIs, SLOs, and SLAs with product owners
- Demonstrated ability to work independently and as a member of a greater team, including cross-team activities
- Experience working with Agile methodologies (Scrum, Kanban) in the SDLC
- Proven experience balancing service reliability, metrics, sustainability, technical debt, and operational toil for live services running at scale
- Strong communication skills, as this is a heavily customer-facing role
- Bachelor’s degree in computer science, electrical engineering, or equivalent required
Requirements
- Ensure near-zero downtime with monitoring and alerting, self-healing automation, and continuous improvement
- Create highly automated, available, and scalable systems by applying software and infrastructure principles
- Employ and advise clients on DevOps and SRE principles and practices, covering deployment pipelines, high availability (HA), service reliability, technical debt, and operational toil for live services running at scale
- Provide a proactive approach to our clients’ workloads, anticipating failures, automating tasks, ensuring availability, and providing a great customer experience
- Work closely with clients, your team, and Google engineers to investigate and resolve infrastructure issues
- Contribute to ad-hoc initiatives such as writing documentation, open-sourcing, and improving operations, making a huge impact at a rapid-growth Google Premier Partner
More Engineer Jobs
Title: Senior Technical Engineer
Location: Buffalo, United States
Job Description:
CTG, a Cegeka company, delivers IT and business solutions that enhance clients’ digital agility, empowering them to seize new opportunities and overcome any challenge. Backed by more than 60 years’ experience and a commitment to being a reliable, results-driven partner, we work shoulder to shoulder with clients to shape digital together. Our vision is to be an indispensable partner to our clients and the preferred career destination for digital and technology experts.
With more than 9,000 team members in over 15 countries, we combine global expertise with local insight to deliver innovative solutions. We operate across the Americas, Europe, and India, working with over 3,000 clients in many of today's highest-growth industries. Together, we shape what’s next, working shoulder to shoulder to deliver impactful solutions for our clients and society.
Our culture is built by the people who work at CTG, the values we hold, and the actions we take. It's a living, breathing thing that is renewed every day through the ways we engage with each other, our clients, and our communities. At CTG, you’ll find a workplace where you are encouraged to grow, supported in your ambitions, and empowered to shape your own career journey. For more information, visit www.ctg.com.
CTG will consider for employment all qualified applicants including those with criminal histories in a manner consistent with the requirements of all applicable local, state, and federal laws.
CTG is an Equal Opportunity Employer. CTG will assure equal opportunity and consideration to all applicants and employees in recruitment, selection, placement, training, benefits, compensation, promotion, transfer, and release of individuals without regard to race, creed, religion, color, national origin, sex, sexual orientation, gender identity and gender expression, age, disability, marital or veteran status, citizenship status, or any other discriminatory factors as required by law. CTG is fully committed to promoting employment opportunities for members of protected classes.
Service Desk Engineer L2
Tabby
On a mission to create financial freedom. No interest. No fees. Shariah-Compliant.
- Handle and troubleshoot requests from tech support and internal teams, including root cause analysis of complex issues (HTTP flows, API errors, web integrations, browser/client-side errors, performance problems).
- Standardize solutions, prepare and maintain clear instructions and playbooks for L1 and L2 engineers.
- Investigate product bugs and customer-facing errors, participate in their resolution together with Product and Engineering teams.
- Proactively analyze logs to identify root causes, performance bottlenecks, and unusual patterns.
- Analyze existing problems in production systems and set well-defined tasks for Engineers to fix them.
- Automate routine and repetitive tasks to improve team efficiency.
- Take part in on-call duties for incidents and help improve our monitoring, alerting, and response processes.


