Job Closed

This listing is no longer active.

Robin AI

We make contracts simple. For everyone.

SRE

DevOps Engineer

Full Time | Remote | Senior | Team 51-200 | H1B No Sponsor

Location

South Africa

Posted

71 days ago

Salary

Seniority

Senior

Bachelor Degree | 3 yrs exp | English | AWS | Python | Terraform

Job Description

• Help build and maintain the cloud infrastructure and applications that power the Legal AI platform
• Collaborate with engineering teams on monitoring, incident response, and deployment strategies
• Ensure high availability and reliability of proprietary models and services
• Standardise and implement observability practices in a service-based architecture
• Design, deploy, and operate infrastructure to support product teams
• Add automation around manual operational tasks
• Participate in and improve on-call and incident handling processes

Job Requirements

3+ years of experience in DevOps or Site Reliability Engineering roles
Proficiency in at least one backend programming language (We use Python)
Strong knowledge of AWS services (ECS, S3, RDS, Lambda, etc.), managed by Terraform
Comfortable troubleshooting across the full stack
Knowledge of observability frameworks and tools (We use OpenTelemetry, CloudWatch & Datadog)
Excellent problem-solving and communication skills
Experience with AI/ML infrastructure deployments is a plus

Benefits

Competitive
Generous equity scheme - everyone gets to be an owner of Robin AI!
20 days PTO, in addition to the public holidays observed in South Africa.
We prioritise promotions for high performers and help you to progress your career.

More DevOps Engineer Jobs

Senior – Principal Site Reliability Engineer

DataCrunch

Premium dedicated GPU servers and clusters. Raw performance at an unmatched price.

DevOps Engineer | 71 days ago

Full Time | Remote | Team 11-50 | H1B No Sponsor

• Ensure the reliability, scalability, and performance of HPC and cloud systems.
• Build and maintain automation, observability, and monitoring frameworks for compute clusters.
• Collaborate with ML, data, and infrastructure teams to deliver high-availability systems.
• Develop and enhance CI/CD pipelines, deployment workflows, and on-call processes.
• Participate in architecture design and long-term infrastructure strategy discussions.
• Participate in a 24/7 on-call rotation, with at least one full on-call week per month.

Ansible | AWS | Azure | Distributed Systems | DNS | GCP | Linux | Python | Terraform

Germany

Job Closed

Senior Platform, DevOps Engineer

beyonnex.io

Pioneer of smart real estate

DevOps Engineer | 71 days ago

Full Time | Remote | Team 51-200 | H1B No Sponsor

• Design, build and operate our AWS- and Kubernetes-based platform
• Own one or more areas and act as the go-to person in the team
• Operate production AWS environments and Kubernetes clusters
• Maintain observability stack: Metrics, Logs, Traces, Instrumentation
• Define SLOs, dashboards and alerting for teams
• Work on Kubernetes networking, Ingress controllers and traffic routing
• Build and maintain Terraform modules for AWS and Kubernetes
• Support connectivity between cloud and on-prem systems
• Participate in design reviews, incident reviews and on-call.

AWS | DNS | Firewalls | Grafana | Kubernetes | Prometheus | TCP/IP | Terraform

Germany

API Reliability Engineer

Empower

We are an equal opportunity employer with a commitment to diversity. All individuals, regardless of personal characteristics, are encouraged to apply. All qualified applicants will receive consideration for employment without regard to age, race, color, national origin, ancestry, sex, sexual orientation, gender, gender identity, gender expression, marital status, pregnancy, religion, physical or mental disability, military or veteran status, genetic information, or any other status protected by applicable state or local law.

DevOps Engineer | 71 days ago

Full Time | Remote | Team 10,001+ | H1B Sponsor

• Own and improve the reliability, performance, and scalability of API services in production.
• Troubleshoot and resolve P1/P2 production incidents end-to-end, analyzing issues across application, infrastructure, and integrations.
• Work closely with API developers to identify and address reliability issues and application-level security vulnerabilities in service design and implementation.
• Contribute targeted code-level or configuration fixes to resolve issues and prevent recurrence.
• Participate in root cause analysis (RCA) and drive durable, long-term fixes.
• Improve API resilience through patterns such as timeouts, retries, circuit breakers, and graceful degradation.
• Establish and enhance observability and service health metrics, including logs, metrics, traces, and SLOs, using Datadog and Splunk.
• Define and monitor SLAs/SLOs for API performance and availability.
• Work with API Gateway and ALB/NLB for traffic management, routing, and system reliability.
• Contribute to CI/CD pipelines using Jenkins to ensure safe and consistent deployments.
• Contribute to disaster recovery readiness and system resilience planning.
• Collaborate across engineering teams to improve system design and operational readiness.
• Participate in an on-call rotation for critical incidents (P1/P2).

AWS | Distributed Systems | DynamoDB | EC2 | Java | Jenkins | Splunk | Spring | Spring Boot

United States

$87.4K - $123.4K / year

Job Closed

Senior DevOps Engineer

Lucidworks

Leaders in AI-Powered Search

DevOps Engineer | 71 days ago

Full Time | Remote | Team 201-500 | H1B Sponsor

• Build the automation tools that ensure our internal and external customers receive resources quickly and painlessly while making our team’s lives easier
• Work closely with engineering teams to deliver a high quality product to our customers that meets all of their needs
• Aim for at least 99.9% uptime across all of our managed customers
• Work on several major projects including automating parts of our infrastructure, creating new monitors and alerts, creating new tooling for both team consumption and company consumption, etc.
• Take ownership of Lucidworks’ company-wide cloud-first initiative by making the onboarding process for new customers as smooth as possible for them.

Distributed Systems | Kubernetes

United States

$128K - $176K / year

Job Closed
