Job Closed

This listing is no longer active.

Growe Talents

Your growth starts here.

System Reliability Engineer – DevOps

DevOps Engineer · Other · Remote · Senior · Team 11-50 · H1B No Sponsor

Location

United States

Posted

84 days ago

Salary

0

Seniority

Senior

Job Description

  • Ensure availability, performance, and scalability of infrastructure and services through monitoring, automation, and operational best practices;
  • Lead incident response, perform root cause analysis, and implement recovery and long-term fixes;
  • Manage infrastructure using Terraform, Terragrunt, and automation tools for consistency and repeatability;
  • Implement and maintain metrics, logs, and tracing solutions (Prometheus, Grafana, Loki, VictoriaMetrics, CloudWatch) to ensure system visibility;
  • Identify bottlenecks, tune systems, and improve infrastructure performance;
  • Monitor resources, forecast growth, and implement scaling strategies;
  • Integrate security best practices into IaC, CI/CD pipelines, and deployments;
  • Support vulnerability management;
  • Participate in 24/7 rotations (once a week) for timely resolution of critical incidents;
  • Work with DevOps, PRE, development, and security teams to improve reliability and design resilient systems;
  • Maintain operational runbooks, incident reports, and system documentation.

Job Requirements

  • 3+ years in a DevOps, SRE, or related role;
  • Strong hands-on experience with AWS services including EC2, ECS, EKS, RDS, DocumentDB, ElastiCache, Keyspaces, S3, EBS, VPC, Route53, KMS, ACM, and CloudWatch;
  • Proficiency with Terraform, Terragrunt, and Atlantis for reproducible and version-controlled infrastructure;
  • Experience with GitLab CI, FluxCD, Argo Rollouts, and automation tools (Ansible, Python, Bash);
  • Solid experience with Docker, Kubernetes (AWS EKS), and Helm (including custom templates, ChartMuseum);
  • Familiarity with cluster add-ons such as KEDA, VPA, Karpenter, External-DNS, ingress-nginx, aws-alb-controller, and ebs-csi-driver;
  • Experience with Grafana, VictoriaMetrics stack, Tempo, metrics exporters, Pingdom, AWS CloudWatch, and alerting systems like PagerDuty, VMAlert, and Alertmanager;
  • Proficiency with OpenSearch and Vector Agent for centralized logging;
  • Strong understanding of networking concepts, AWS networking (VPC, Network Firewall, Transit Gateway, Site-to-Site VPN), identity and access management, certificate management (ACM, Vault, SOPS), and application security best practices;
  • Familiarity with Cloudflare services, including caching, DNS, and Workers;
  • Exposure to AWS Cost Explorer, KubeCost, and custom cost export tools;
  • Certifications in AWS, Terraform, Kubernetes, or Helm are a plus.

Benefits

  • Health & Wellness Focus;
  • Global Medical Coverage;
  • Growth Opportunities;
  • Benefits Programs (compensation for gym, dental, and psychological services, etc.);
  • Performance-Driven Rewards;
  • Dynamic Work Environment.
