Founded in 2001, Insight Global (IG) offers enhanced staffing, placement staffing, and temporary-to-permanent staffing services, including long-term and short-term job assignments.
Site Reliability Engineer
Location
United States
Posted
63 days ago
Salary
Not specified
Seniority
Mid Level
Job Description
• Join a PagerDuty on-call rotation to respond to production incidents
• Maintain and develop monitoring and alerting solutions to improve the on-call experience
• Design, build, and maintain scalable infrastructure for running our systems
• Assist product developers in debugging and triaging production issues
Job Requirements
- 2+ years of experience working as a Site Reliability Engineer or in a related position
- Experience with AWS, Kubernetes, Docker
- Familiarity with deployment/provisioning tools like Terraform, Helm, Ansible
- Strong knowledge of the Linux platform
- Comfortable working with Go (Golang) and shell scripting
- Experience with observability and monitoring tools: Prometheus, Datadog, New Relic, Grafana, Loki, or similar
- Experience with MySQL (or a similar relational database) and GitLab is a plus
More DevOps Engineer Jobs
Senior Site Reliability Engineer Manager
RemoteStar
Scale Faster, Reduce Costs, Meet Diversity Targets
• Ensuring the reliability, scalability, and performance of infrastructure and services
• Taking full ownership of the production estate from both a technical and process perspective
• Providing consistent smooth operation of live systems
• Designing and operating a new incident tracking process
• Creating and maintaining high-end monitoring and automation tooling
• Driving automation initiatives to improve operational workflows
• Developing and maintaining tools, scripts, and dashboards to monitor system health
• Building a first-class SRE team and providing leadership and guidance
• Join Dev.Pro’s exclusive screening process to gain valuable career insights and access personalized feedback on your skill set.
• Get priority consideration by Dev.Pro for suitable job openings.
• Opportunity to work with top global corporations and participate in industry-shaping projects.
• Send CV in English.
• Schedule a call with recruiters.
• Participate in an experience interview focused on soft skills.
• Undergo online evaluation of technical skills.
Senior Site Reliability Engineer – Team Lead
Dev.Pro
Software Development Partner. Result-driven. Quality-obsessed.
• Oversee a Cloud/SRE support team, ensuring reliable operations, effective processes, and strong collaboration across global and cross-functional teams
• Lead our Cloud/SRE Support team, providing coaching, prioritization, and oversight
• Drive team performance, ensuring high-quality support, SLA compliance, and continuous improvement
• Coordinate with India-based and cross-functional teams for alignment and 24/7 coverage
• Translate complex issues into actionable plans and scalable solutions
• Design and improve support processes and operational frameworks
• Identify gaps and risks, improving operations and team engagement
• Collaborate with cross-functional teams to define priorities and communicate progress, risks, and solutions
• Oversee execution across MDM operations, access management, monitoring, incidents, and RCA
• Maintain clear documentation, runbooks, and escalation procedures
• Promote best practices in reliability and customer-focused support
• Develop, implement, and manage practical solutions to support clients' growth
• Innovate and turn ideas into reality using the best market solutions
• Oversee the entire process from conception through solution implementation



