Reply

Reply designs and implements innovative solutions in the areas of Digital Services, Technology, and Consulting.

Senior TechOps – DevOps Engineer

DevOps Engineer • Contract • Remote • Senior • Team 10,001+ • Since 1996 • H1B Sponsor

Location

Brazil

Posted

69 days ago

Seniority

Senior

Job Description

  • Support a global project ensuring stability and performance of a live production environment
  • Maintain and improve cloud infrastructure
  • Support deployments and handle operational challenges in a dynamic setting
  • Collaborate with distributed teams

Job Requirements

  • Strong experience with GCP (GCE + GKE)
  • Kubernetes + Docker
  • Linux/Unix systems administration
  • Terraform + CI/CD tools (Jenkins or similar)
  • Experience supporting live production environments (preferably eCommerce)
  • Comfortable with on-call rotation / 24x7 environments

Benefits

  • 100% REMOTE!
  • Temporary 3-month position

More DevOps Engineer Jobs

Senior Site Reliability Engineer

Runlayer

The Simpler, Safer Way to Connect MCPs

DevOps Engineer • 69 days ago
Full Time • Remote • Team 11-50 • H1B No Sponsor

  • Own reliability and performance of our cloud infrastructure across AWS (ECS, Aurora, CloudWatch) and GCP
  • Manage and optimize Kubernetes clusters and container orchestration
  • Drive database reliability engineering, including performance tuning and scaling
  • Build and maintain CI/CD pipelines for rapid, safe deployments
  • Run incident response and on-call rotations
  • Partner with product engineers to design scalable, resilient systems

United States

DevOps Engineer

Muvr

On a mission to make your moves effortless, efficient, and enjoyable.

DevOps Engineer • 69 days ago
Full Time • Remote • Team 11-50 • Since 2023 • H1B No Sponsor

About Muvr

Muvr is building the future of on-demand logistics and moving services. Our platform powers real-time booking, pricing, matching, payments, and fulfillment across customers, drivers, and partners. As we scale, infrastructure reliability and operational excellence become product requirements. This role exists to keep production stable, observable, secure, and scalable so engineering teams can ship quickly without sacrificing uptime, correctness, or customer trust.

Role Overview

The DevOps / Site Reliability Engineer (SRE) owns the reliability foundations of Muvr’s platform. You will design and operate cloud infrastructure, improve deployment speed and safety, strengthen observability, and lead incident practices that prevent repeat failures. This is a hands-on, production-ownership role for someone who values automation, low-toil systems, and practical guardrails that make delivery faster and safer at the same time. You will partner closely with Engineering, Security, Product, and adjacent teams to harden the platform as usage grows.

Key Responsibilities

1) Platform Reliability and Production Ownership

  • Own uptime, latency, availability, and error-rate outcomes for core services.
  • Establish SLOs, SLIs, and alerting aligned to customer impact and service health.
  • Improve reliability through resilient patterns such as retries, timeouts, circuit breakers, load shedding, and queue protections.
  • Reduce operational toil by building automation and self-service tools that improve engineering velocity and operational safety.

2) Cloud Infrastructure and Infrastructure as Code

  • Design, build, and maintain scalable cloud infrastructure across AWS, GCP, or Azure environments.
  • Automate provisioning, configuration, and change management using Infrastructure as Code, preferably Terraform.
  • Improve disaster recovery readiness through backups, restore validation, redundancy, and failover planning.
  • Maintain strong environment consistency across development, staging, and production to reduce deployment surprises and configuration drift.

3) CI/CD and Release Engineering

  • Build and improve CI/CD pipelines to increase deployment frequency while reducing release risk.
  • Standardize deployment practices, including versioning, environment promotion, staged rollouts, canary releases, and rollback mechanisms.
  • Implement release guardrails such as required test gates, policy checks, dependency scanning, and secrets detection.
  • Improve developer experience through faster builds, clearer failure signals, and more reliable deployment workflows.

4) Observability and Operational Excellence

  • Build and maintain observability across logs, metrics, tracing, dashboards, and service-level visibility.
  • Design alerting that catches critical failures early while minimizing noise and paging fatigue.
  • Create runbooks and playbooks that are actionable under pressure and linked to specific alerts or operational scenarios.
  • Improve MTTR through better instrumentation, faster diagnosis paths, and clearer service ownership.

5) Incident Management and Root-Cause Discipline

  • Lead or coordinate incident response, including triage, communication, mitigation, recovery, and follow-through.
  • Run blameless postmortems with clear root-cause narratives, contributing factors, and prevention actions.
  • Ensure corrective actions are tracked to completion and meaningfully reduce recurrence.
  • Establish incident severity levels, escalation paths, and communication templates that improve consistency during outages or degradation events.

6) Security and Compliance Baselines

  • Partner with Engineering to implement security best practices, including least privilege, secrets management, encryption, and audit logging.
  • Improve access hygiene through MFA coverage, key rotation, access reviews, and break-glass procedures.
  • Identify infrastructure risks and drive remediation with clear prioritization, ownership, and operational follow-through.
  • Support audit and compliance readiness through clear documentation, logging, and evidence-friendly processes when needed.

7) AI-Enabled Productivity and Execution

  • Use AI tools thoughtfully to improve productivity, troubleshooting speed, documentation quality, and automation efficiency.
  • Apply AI responsibly to support analysis, scripting, incident investigation, and workflow improvement while maintaining security, accuracy, and sound operational judgment.

Qualifications

Required

  • 3+ years of experience in DevOps, Site Reliability Engineering, Infrastructure Engineering, or similar roles supporting production systems.
  • Strong experience with at least one major cloud provider: AWS, GCP, or Azure.
  • Experience building or maintaining CI/CD pipelines using GitHub Actions, Jenkins, CircleCI, or similar tools.
  • Familiarity with containerization using Docker and orchestration platforms such as Kubernetes.
  • Strong troubleshooting skills across infrastructure, core networking concepts, deployments, and service operations.
  • Ability to write automation scripts and tooling using Bash, Python, or similar languages.
  • Comfortable using AI tools to improve efficiency and work quality, with a willingness to learn emerging AI workflows and apply them responsibly.

Preferred

  • Experience supporting marketplace, logistics, dispatch, delivery, or other real-time operational platforms.
  • Experience with observability tools such as Prometheus and Grafana, Datadog, New Relic, or similar platforms.
  • Strong Infrastructure as Code experience using Terraform, CloudFormation, or equivalent tooling.
  • Experience scaling distributed systems in production, including autoscaling, queue management, caching strategies, and traffic spike handling.
  • Familiarity with security best practices and compliance expectations for production systems.
  • Familiarity with tools and systems such as Slack, Google Workspace, ChatGPT, ClickUp, Hubstaff, GitHub, CI/CD platforms, Kubernetes, Terraform, Datadog, Grafana, cloud consoles, ticketing tools, and other infrastructure or reliability platforms.

Why Join Muvr

  • Own reliability and infrastructure for a fast-growing real-time logistics marketplace.
  • Take on a high-impact role shaping scalability, operational readiness, and production discipline.
  • Partner directly with engineering leadership to build systems that scale safely and sustainably.
  • Work on meaningful infrastructure problems where uptime, speed, and correctness directly affect real-world outcomes.
  • Competitive compensation.

Philippines

Senior DevOps Engineer

Avive Solutions Inc.

We're on a mission to bring AEDs to the Masses!

DevOps Engineer • 69 days ago
Full Time • Remote • Team 11-50 • H1B No Sponsor

  • Design, implement, and maintain robust, scalable infrastructure and automation solutions.
  • Automate and optimize infrastructure provisioning, configuration management, deployment processes, and repetitive operational tasks.
  • Execute infrastructure and application deployments to upper cloud environments.
  • Develop, maintain, and optimize CI/CD pipelines and operational workflows.
  • Monitor and optimize system performance and resource utilization, and identify bottlenecks.
  • Implement scalable and reliable systems that support product growth and team agility.
  • Troubleshoot, determine root causes, and resolve technical issues.
  • Participate in code reviews and provide feedback on best practices.
  • Collaborate cross-functionally with cloud, firmware, and hardware development teams to integrate and align solutions.
  • Document operational processes, tools, and workflows to ensure knowledge sharing and smooth transitions.

United States
$140K - $180K / year

DevOps Engineer, AWS

Kindgeek

We build innovative products that generate value.

DevOps Engineer • 69 days ago
Full Time • Remote • Team 51-200 • H1B No Sponsor

  • Administer and maintain project infrastructures for optimal performance and reliability
  • Deploy and manage infrastructure services, including Kubernetes clusters and cloud environments
  • Manage cloud resources (VMs, storage, databases) using Infrastructure as Code (IaC)
  • Ensure high availability, disaster recovery, and cost-effective optimization of cloud deployments
  • Implement tools for building, deploying, securing, and monitoring infrastructure and services
  • Design, create, and manage CI/CD pipelines for streamlined software delivery and deployment
  • Configure monitoring and logging systems for continuous operation and quick issue resolution
  • Collaborate with developers, operations teams, QA, Architecture, IT, and project management
  • Provide first-line DevOps support and troubleshoot issues for developers

Ukraine
Job Closed