Job Closed

This listing is no longer active.

Staff Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Lead | Team 201-500 | H1B No Sponsor

Location

Romania

Posted

59 days ago

Salary

0

Seniority

Lead

Job Description

Caseware

  • Maintain reliable, high‑performing AWS production systems.
  • Manage EKS clusters for configuration, scaling, and workload stability.
  • Set up and support Istio service mesh for traffic control and security.
  • Oversee GitOps workflows to ensure secure, consistent infrastructure changes.
  • Create automation tools and platform enhancements.
  • Design, implement, and manage monitoring, logging, and tracing solutions across a diverse range of applications (including AI workloads, microservices, and data pipelines) to ensure visibility, reliability, and rapid issue resolution.
  • Respond to incidents, analyze root causes, and recommend lasting solutions.
  • Work with developers and platform teams to enhance deployments and system operations.
  • Support nx‑based monorepos for scalable, effective developer workflows.
  • Participate in the on‑call rotation.

Job Requirements

  • Deep understanding of AWS services commonly used in production (EKS, EC2, IAM, networking, load balancing, etc.).
  • Professional experience with Kubernetes (EKS), including workload autoscaling, networking, RBAC, and cluster operations.
  • Hands‑on experience with service meshes, specifically Istio.
  • Expertise with GitHub, GitHub Actions, and modern CI/CD workflows.
  • Experience working with monorepos, especially nx.
  • Understanding of GitOps practices (we use Flux CD).
  • Strong grasp of Linux systems, networking, containers, and Docker.
  • Familiarity with infrastructure‑as‑code: CDK, Terraform.
  • Knowledge of SLOs, error budgets, incident management, and production readiness best practices.
  • Strong English language communication and collaboration skills.

Benefits

  • Innovation is at our core. We work with cutting-edge technology in accounting and financial reporting, constantly pushing the boundaries to create impactful software solutions.
  • We are committed to a collaborative culture, where your ideas are valued, and knowledge sharing is encouraged within a supportive, inclusive team.
  • Work-life balance is important to us. We offer flexible work options, remote opportunities, and generous time-off policies to support it.
  • We offer a competitive salary and comprehensive benefits.
  • We are driven by impactful work. Your contributions directly affect how our clients manage financial processes and drive their success.
  • Recognition and rewards matter to us. We celebrate hard work through recognition programs, performance bonuses, and opportunities for career growth.
  • We embrace global opportunities. Work on international projects and collaborate with a diverse, global team.

More DevOps Engineer Jobs

Staff Site Reliability Engineer

Visa

Based in Foster City, California, Visa is a global payments technology organization. Visa was founded in 1958, coinciding with Bank of America’s launch of the

DevOps Engineer | 59 days ago

  • Own the end‑to‑end lifecycle (design, provisioning, upgrades, and decommissioning) of core platform components, including cloud infrastructure primitives, Kubernetes clusters, networking, ingress, service discovery, service mesh, and data‑plane components.
  • Design, build, and evolve a highly reliable and resilient containerized platform supporting critical workloads, applying SRE and cloud‑native best practices.
  • Lead the design and implementation of infrastructure bootstrap orchestration, enabling deterministic, repeatable platform bring‑up and teardown across cloud, network, and Kubernetes layers.
  • Drive a strong Infrastructure‑as‑Code and GitOps‑first approach, ensuring platform components are reproducible, auditable, automated, testable, and reversible.
  • Identify and close automation gaps, leading initiatives that significantly reduce manual effort, onboarding time, and operational risk at scale.
  • Apply and promote SRE principles such as fault isolation, graceful degradation, capacity planning, saturation control, and clear failure modes across the platform.
  • Continuously assess platform reliability risks and proactively improve stability, resilience, and operational readiness.
  • Act as a technical reference and escalation point for platform reliability, participating in on‑call rotations, incident response, post‑incident reviews, and problem management.
  • Improve platform operability by simplifying day‑2 operations, standardizing upgrade and rollback strategies, and reducing Mean Time to Detect (MTTD) and Mean Time to Recover (MTTR).
  • Ensure platform operations align with security, compliance, and internal control requirements.
  • Collaborate closely with cross‑functional engineering teams, influencing technical decisions and promoting best practices through hands‑on contributions and technical leadership.
  • Contribute to architectural and technical discussions, supporting continuous improvement and long‑term platform evolution.
  • Stay up to date with emerging technologies, SRE practices, and cloud‑native patterns, sharing insights at squad and collective levels.
  • Be recognized for delivering high‑impact, high‑quality platform and reliability solutions across the organization.

Brazil
Job Closed

Senior DevOps/SRE Analyst

Sicredi

It's not just money, it's having someone to count on.

DevOps Engineer | 60 days ago
Full Time | Remote | Team 10,001+ | H1B No Sponsor

  • Promote continuous delivery, create guides and Golden Paths, and define versioning and security standards.
  • Lead hands-on workshops for integration and effective use of internal tools.
  • Configure CI/CD pipelines, develop provisioning code (Infrastructure as Code), automate deployments, and optimize containers.
  • Integrate tests and quality tools, implement security practices, and participate in audits.
  • Respond to cloud infrastructure incidents and requests; monitor applications and pipelines to ensure high availability.
  • Collect metrics, review projects for performance, security, and cost efficiency, and evaluate new practices and tools.

Brazil
Job Closed

DevOps Engineer (Automation Systems) - Freelance AI Trainer

Mindrift

Apply → Pass qualification(s) → Join a project → Complete tasks → Get paid.

Project time expectations: Tasks are estimated to require around 10–20 hours per week during active phases, based on project requirements. This is an estimate, not a guaranteed workload, and applies only while the project is active.

Note: Rates vary based on expertise, skills assessment, location, project needs, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details are shared per project.

DevOps Engineer | 60 days ago

Mindrift is looking for skilled DevOps / Automation Engineers (Infrastructure & Scaling) to join the Tendem project (https://tendem.ai/) and build and maintain scalable infrastructure for automation workflows within our hybrid AI + human environment.

In this role, as an AI Pilot – that’s how we refer to this position at Mindrift – you’ll collaborate with Tendem Agents that handle repetitive tasks, while you provide infrastructure expertise, system reliability, and performance optimization to ensure stable and scalable automation pipelines.

This part-time remote opportunity is ideal for professionals with hands-on experience in cloud infrastructure, system deployment, and supporting high-load automation environments.

What We Do

The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role

This is a freelance role for a Tendem project. As a DevOps / Automation Engineer, you'll design, deploy, and maintain infrastructure supporting automation workflows, ensuring system stability, scalability, and high availability across environments.

Key Responsibilities

  • Deploy and maintain self-hosted automation environments (e.g., n8n instances or similar systems).
  • Design and manage infrastructure to support high-volume workflows and large-scale data processing.
  • Scale automation systems to handle concurrent workloads and performance-intensive tasks.
  • Set up monitoring, logging, and alerting systems to track workflow performance and detect failures.
  • Implement alerts for critical events (e.g., node failures, performance drops, system instability).
  • Ensure system uptime, reliability, and fault tolerance across automation pipelines.

Compensation

On this project, contributors can earn up to $60 per hour equivalent, depending on their level and pace of contribution. Compensation varies across projects depending on scope, complexity, and required expertise. Please note that other projects on the platform may offer different earning levels based on their requirements.

United Kingdom
Job Closed