Job Closed

This listing is no longer active.

Articul8 AI

Solving the world's toughest problems with Generative AI.

Senior Site Reliability Engineer – Chaos Engineering

DevOps Engineer, Full Time, Remote, Senior, Team 11-50, H1B No Sponsor

Location

Brazil

Posted

152 days ago

Salary

Seniority

Senior

Bachelor Degree, 5 yrs exp, Portuguese, AWS, Azure, Distributed Systems, Docker, GCP, Grafana, Kubernetes, NoSQL, Prometheus, Python, SQL, Terraform

Job Description

• Architect and maintain scalable, highly available infrastructure for our GenAI platform.
• Design and implement robust monitoring, alerting, and observability solutions to proactively ensure system health and performance.
• Automate deployment, scaling, and management of our cloud-native infrastructure, reducing toil and improving efficiency.
• Define, measure, and improve Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to deliver outstanding service quality.
• Participate in on-call rotations and provide rapid response to production incidents, minimizing downtime and user impact.
• Collaborate closely with development teams to build reliable, scalable, and efficient systems for complex AI workloads.
• Lead incident response efforts, conduct thorough post-mortems, and champion continuous improvement initiatives.
• Optimize infrastructure for performance, scalability, and cost-effectiveness, especially for high-demand AI workloads.
• Implement and enforce security best practices across all systems and environments.
• Create and maintain comprehensive documentation, including runbooks and knowledge base articles, to foster a culture of shared knowledge.

Job Requirements

Bachelor's degree in Computer Science, Engineering, or related field, or equivalent practical experience
5+ years of experience in DevOps, SRE, or similar roles
Strong experience with cloud platforms (AWS, GCP, or Azure)
Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.)
Hands-on experience with infrastructure as code tools (Terraform, CloudFormation, etc.)
Solid background in containerization technologies (Docker, Kubernetes)
Proven experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, etc.)
Strong understanding of CI/CD pipelines and automation
Exceptional problem-solving skills and the ability to troubleshoot complex systems
Experience with chaos engineering tools such as Chaos Monkey, Gremlin, or similar frameworks
Familiarity with container orchestration platforms like Kubernetes and related chaos tools

Preferred

Experience supporting AI/ML systems in production
Knowledge of GPU infrastructure management and optimization
Familiarity with distributed systems and high-performance computing
Experience with database systems (SQL and NoSQL)
Certifications in cloud platforms (AWS, GCP, Azure)
Experience with chaos engineering and resilience testing
Knowledge of security best practices and compliance requirements

Benefits

Ready to shape the future of resilient software systems? Apply now and help drive the reliability of tomorrow’s AI at Articul8 AI!

NOTE: This position is available via CLT contract only. Thank you!

More DevOps Engineer Jobs

DevOps Engineer

BlueMatrix

The leading technology provider for the global investment research industry.

DevOps Engineer, 152 days ago

Full Time, Remote, Team 51-200, Since 1999, H1B No Sponsor

• Implement and maintain CI/CD pipelines using GoCD and GitLab.
• Manage Terraform and Terragrunt modules to provision and maintain infrastructure.
• Automate configuration management and environment setup using Ansible.
• Administer and optimize Linux-based systems across hybrid cloud environments.
• Support database cluster configurations (e.g., MySQL, Cassandra) and troubleshoot issues.
• Deploy and maintain Docker and Kubernetes environments across multiple tiers.
• Contribute to infrastructure observability using AWS CloudWatch and log pipelines.
• Support secrets management, IAM policies, and environment-specific access control using SSM and AWS best practices.

Ansible AWS Cassandra Docker Kubernetes Linux MySQL Python Terraform

India

₹1,500K - ₹3,000K / year

Cloud DevOps Engineer

Motivity

The only clinically-driven all-in-one practice management solution for ABA. Data collection, scheduling, billing, + more

DevOps Engineer, 155 days ago

Other, Remote, Team 11-50, Since 2015, H1B Sponsor

• Take on varied roles within a small, growing team of engineers
• Tackle full stack development concerns in the frontend, backend and infrastructure
• Work closely with the team on architecture, design and code reviews, while continuing to spend the majority of their time doing hands-on development
• Work closely with business stakeholders to ensure requests meet the needs of the business and clinical product leaders
• Provide technical support as necessary to customers and third-party vendors
• Identify and resolve technical issues

United States

Job Closed

Independent IT Trainer – Cybersecurity, Stormshield/Sophos, AI, DevOps

NEO-VISION

See, Do and Achieve Differently

DevOps Engineer, 155 days ago

Full Time, Remote, Team 1-10, H1B No Sponsor

• Develop your personal brand and professional visibility.
• Create engaging training courses tailored to learners' needs.
• Contribute to the development of IT skills for our learners worldwide.

France

Senior DevOps Engineer

eSimplicity

An engineering firm that delivers high-quality Healthcare IT, Cybersecurity, and Telecommunication solutions.

DevOps Engineer, 155 days ago

Other, Remote, Team 51-200, Since 2016, H1B No Sponsor

• Design, build, and maintain secure CI/CD pipelines using GitHub Actions to deliver applications and infrastructure
• Embed security controls, tools (SAST, DAST, SCA), and processes throughout the software development lifecycle
• Manage and secure cloud infrastructure using Infrastructure as Code (IaC) with Terraform and Terragrunt
• Implement and manage security for containerized applications using Docker
• Collaborate with development teams (Java, Python, Django) to identify and remediate security vulnerabilities in code and dependencies
• Automate security monitoring, logging, and incident response procedures within the AWS cloud environment
• Ensure systems and applications meet federal compliance standards (e.g., FISMA, NIST) and CMS-specific security requirements
• Support the security of data platforms and services, including Databricks and Redshift
• Work with cross-functional teams to foster a culture of security awareness and best practices

Amazon Redshift AWS Django Docker Java Python Terraform

Maryland

$106.3K - $136.6K / year
