Job Closed

This listing is no longer active.

Growe

Let's grow together and unlock opportunities

System Reliability Engineer – DevOps

DevOps Engineer | Full Time | Remote | Senior | Team 501-1,000 | Since 2019 | H1B No Sponsor

Location

Poland

Posted

89 days ago

Salary

Not specified

Seniority

Senior

Job Description

  • Lead incident response, perform root cause analysis, and implement recovery and long-term fixes;
  • Manage infrastructure using Terraform, Terragrunt, and automation tools for consistency and repeatability;
  • Support vulnerability management;
  • Monitor resources, forecast growth, and implement scaling strategies;
  • Participate in 24/7 rotations for timely resolution of critical incidents.

Job Requirements

  • 3+ years in a DevOps, SRE, or related role;
  • Strong hands-on experience with AWS services including EC2, ECS, EKS, RDS, DocumentDB, ElastiCache, Keyspaces, S3, EBS, VPC, Route53, KMS, ACM, and CloudWatch;
  • Proficiency with Terraform, Terragrunt, and Atlantis for reproducible and version-controlled infrastructure;
  • Experience with GitLab CI, FluxCD, Argo Rollouts, and automation tools (Ansible, Python, Bash);
  • Solid experience with Docker, Kubernetes (AWS EKS), and Helm (including custom templates, ChartMuseum);
  • Familiarity with cluster add-ons such as KEDA, VPA, Karpenter, External-DNS, ingress-nginx, aws-alb-controller, and ebs-csi-driver;
  • Hands-on experience with Grafana, VictoriaMetrics stack, Tempo, metrics exporters, Pingdom, AWS CloudWatch, and alerting systems like PagerDuty, VMAlert, and Alertmanager;
  • Proficiency with Grafana Loki, OpenSearch, and Vector Agent for centralized logging;
  • Strong understanding of networking concepts, AWS networking (VPC, Network Firewall, Transit Gateway, Site-to-Site VPN), identity and access management, certificate management (ACM, Vault, SOPS), and application security best practices;
  • Familiarity with Cloudflare services, including caching, DNS, and Workers;
  • Exposure to AWS Cost Explorer, KubeCost, and custom cost export tools;
  • Certifications in AWS, Terraform, Kubernetes, or Helm are a plus.

Responsibilities

  • Ensure availability, performance, and scalability of infrastructure and services through monitoring, automation, and operational best practices;
  • Lead incident response, perform root cause analysis, and implement recovery and long-term fixes;
  • Manage infrastructure using Terraform, Terragrunt, and automation tools for consistency and repeatability;
  • Implement and maintain metrics, logs, and tracing solutions (Prometheus, Grafana, Loki, VictoriaMetrics, CloudWatch) to ensure system visibility;
  • Identify bottlenecks, tune systems, and improve infrastructure performance;
  • Monitor resources, forecast growth, and implement scaling strategies;
  • Integrate security best practices into IaC, CI/CD pipelines, and deployments;
  • Support vulnerability management;
  • Participate in 24/7 rotations (once a week) for timely resolution of critical incidents;
  • Work with DevOps, PRE, development, and security teams to improve reliability and design resilient systems;
  • Maintain operational runbooks, incident reports, and system documentation.

More DevOps Engineer Jobs

Senior DevOps Engineer

Five9

Helping Companies Bring Joy to CX.

DevOps Engineer | 89 days ago | Full Time | Remote | Team 1,001-5,000 | Since 2001 | H1B Sponsor

  • Design, implement, and automate components of large-scale distributed cloud systems.
  • Implement and support PAM solutions primarily on OpenStack, ensuring secure and reliable access management.
  • Build tools, automation, and workflows to improve availability, scalability, latency, and operational efficiency.
  • Work closely with engineering and delivery teams to deploy high-quality software in a fast-paced environment.
  • Monitor production and development environments and implement preventive and corrective measures to ensure platform reliability.
  • Participate in incident response, debugging, and root cause analysis for production issues.
  • Collaborate across teams to deliver consistent and reliable solutions aligned with.
  • Document designs, operational procedures, and troubleshooting guides clearly and effectively.
  • Contribute to improvements in reliability metrics such as availability, MTTD, and MTTR.

India
Job Closed

Senior DevOps Engineer

Truv

Truv empowers businesses to make confident decisions. Truv is a one-stop income and employment verification solution.

DevOps Engineer | 89 days ago | Other | Remote | Team 51-200 | H1B Sponsor

  • Architect and scale our AWS infrastructure, including container orchestration, autoscaling, networking, and cost optimization.
  • Build our observability and alerting platform from the ground up. You'll own it from design through production deployment.
  • Lead infrastructure builds for compliance (SOC 2, HIPAA). We need someone who scopes, builds, and ships, not just advises.
  • Harden container workloads and secrets management across production, staging, and isolated compliance environments.
  • Own the shared infrastructure stack (Postgres, Redis, Celery). Find bottlenecks, fix them, and add capacity before they become incidents.
  • Build and maintain CI/CD pipelines, optimizing for deploy speed, reliability, and security.
  • Extend our Terraform codebase to keep environments reproducible and audit-ready. We ship IaC changes weekly, not quarterly.
  • Define and own our reliability practices: SLOs, incident response, post-mortems, and the production tooling to back them up.
  • Unblock engineering teams by reducing deploy friction, improving dev environments, and eliminating toil.
  • Share on-call with a small team. When things break, you lead the response, run the post-mortem, and make sure the fix ships.

United States
$100K - $140K / year

Senior NixOS, DevOps Engineer

virtual7 GmbH

We shape Germany's digital future. Find your calling - grow with virtual7.

DevOps Engineer | 89 days ago | Full Time | Remote | Team 51-200 | Since 1996 | H1B No Sponsor

  • As a Senior NixOS / DevOps Engineer (m/f/d), you will support our clients in building modern, declarative, and highly automated software and infrastructure processes.
  • Consulting and implementation around Nix and NixOS: from architecture reviews and the design of modern system landscapes to the production rollout of declarative environments.
  • Analyze and optimize existing software architectures, source code, and development processes with a focus on quality, efficiency, and reproducibility.
  • Design, implement, and evolve CI/CD pipelines (e.g., GitLab, Nix Hydra), including automated build, test, and deployment pipelines.
  • Introduce and secure reproducible builds and modern build systems (e.g., CMake, Meson) in complex enterprise environments.
  • Build and maintain automated testing environments (unit, integration, and HIL tests) to ensure stable, test-driven development workflows.
  • Support developer and DevOps teams in adopting declarative, test- and quality-oriented working practices.

Germany
DevOps Engineer | 89 days ago | Full Time | Remote | Team 51-200 | Since 2016 | H1B No Sponsor

  • Working towards improving the system's non-functional qualities, including availability, scalability, security, and durability.
  • Engaging in Scrum-based work management by participating in team meetings and discussions.
  • Creating automation and process improvements.
  • Supporting and developing new tools used by our Engineering department.
  • Monitoring systems and creating infrastructure documentation.
  • Participating in planning and taking full ownership of our initiatives.
  • Working with our tech stack, including GCP (GKE, Cloud SQL, Memorystore Redis, GCS, VMs, Cloud Run, PSC, Artifact Registry, Secret Manager, CDN), K8S and Helm, Argo CD and Argo Workflows, Pulumi, Istio, Grafana stack, and GitLab.

Poland
Job Closed