Captions

Your AI-powered creative studio.

Software Engineer – Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Mid Level | Team 11-50 | H1B No Sponsor

Location

India

Posted

142 days ago

Salary

Seniority

Mid Level

2 yrs exp | English | AWS, Azure, GCP, Grafana, Jenkins, Kubernetes, Prometheus, Terraform

Job Description

• You will be responsible for the availability and integrity of the infrastructure that underpins Alkira’s Cloud Networking platform
• You hold the production systems together and troubleshoot issues that arise in production deployments
• Provide 24x7 coverage as part of a scheduled shift and on-call rotation
• Work with multiple tools such as Prometheus, Grafana, and Jira to monitor, manage, triage, and document infrastructure issues in real time
• Automate infrastructure deployment using CI/CD
• Build the tools needed to evolve how we maintain and monitor our solution
• Develop and execute system and integration test plans

Job Requirements

At least 2 years’ experience managing production systems
Self-starter with a solution-oriented mindset who sees potential challenges as opportunities to learn and grow
Experience with cloud providers such as AWS, Azure, or GCP
Experience with computer networking and network technologies
Experience with CI/CD pipelines such as Concourse CI or Jenkins
Experience with Kubernetes
Excellent problem-solving skills and the ability to quickly grasp new concepts
Highly desirable: HashiCorp Certified: Terraform Associate

Benefits

Health insurance
Professional development opportunities

Related Categories

DevOps Engineer

Related Job Pages

Remote Full-time Jobs (US)

More Remote Jobs

More DevOps Engineer Jobs

Senior Software Reliability Engineer – AI

MixMode

Automated threat detection, unparalleled network visibility, & deep guided investigation powered by Self-Supervised AI.

DevOps Engineer | 143 days ago

Other | Remote | Team 11-50 | H1B No Sponsor


• Own the reliability, performance, and operational health of production AI systems, focusing on improving complex, existing services.
• Lead efforts to refactor and harden the AI codebase to improve observability, maintainability, and resilience.
• Diagnose and resolve issues across distributed systems, including latency, throughput, data pipelines, and resource utilization.
• Design and build monitoring, alerting, and debugging tools for high-availability services.
• Partner with researchers and ML engineers to productionize models at scale.
• Establish best practices for testing, deployment, capacity planning, and incident response.
• Serve as a technical leader during on-call rotations, driving incident response, postmortems, and continuous system improvements.

Distributed Systems, Java, Apache Kafka, Kotlin, Kubernetes, MySQL, PostgreSQL, Python, Scala, Apache Spark

California

Job Closed

Staff Site Reliability Engineer

PathAI

Improving patient outcomes with AI-powered pathology.

DevOps Engineer | 144 days ago

Other | Remote | Team 501-1,000 | Since 2016 | H1B Sponsor


• Advancing the state of our operations by implementing SRE best practices, focusing on users, monitoring, and automation.
• Engineering infrastructure patterns for cloud environments in Amazon Web Services, building in security, reliability, and scalability.
• Designing, building, and operating our data center to support our rapidly growing Machine Learning team.
• Integrating on-premises datacenter environments with existing cloud infrastructure to create a seamless hybrid cloud environment.
• Improving the reliability and resilience of our infrastructure through root-cause analysis and by reviewing gaps in its design and implementation.
• Participating in platform on-call rotations and assisting with urgent incident response.

Ansible, AWS, Grafana, Prometheus, Python, Terraform

Massachusetts

$165.8K - $224.5K / year

Senior Staff DevOps Engineer

MetaMask

The World’s Leading Web3 Wallet

DevOps Engineer | 144 days ago

Other | Remote | Team 51-200 | Since 2016 | H1B No Sponsor


• Deliver, upgrade, and maintain infrastructure with high cybersecurity standards (ISO/SOC2)
• Drive our code deployment (CI/CD)
• Set up, configure, and run development/test and staging/production infrastructure across multiple products, critical applications, and cloud providers (AWS, Azure)
• Collaborate with developers, SREs, Product Managers, and other roles within the business group
• Empower development teams day to day while thinking strategically and planning for platform growth

Android, AWS, Azure, Firewalls, iOS, JavaScript, Kubernetes, Node.js, Prometheus, Python, Terraform, TypeScript

United States

$160K - $218K / year

DevOps Engineer

Impiricus

The future of HCP-Pharma connectivity. Impiricus is the HCP-preferred platform to engage with Pharma.

DevOps Engineer | 144 days ago

Other | Remote | Team 11-50 | Since 2020 | H1B No Sponsor


• Design, build, and maintain scalable AWS infrastructure using Infrastructure as Code tools such as Terraform or AWS CloudFormation.
• Develop and manage CI/CD pipelines leveraging AWS services (e.g. CodePipeline, CodeBuild, CodeDeploy) and/or third-party tools.
• Operate and optimize containerized and serverless workloads using services such as EKS, ECS, Lambda, and Fargate.
• Monitor, log, and troubleshoot systems using Amazon CloudWatch, AWS X-Ray, and related observability tools to ensure high availability.
• Implement AWS security best practices, including IAM, network security (VPCs, security groups), and secrets management.
• Automate infrastructure operations, scaling, and maintenance using scripting and AWS-native automation services.
• Lead incident response and post-incident reviews, driving continuous improvements in reliability, performance, and cost optimization.
• Support additional infrastructure and operational responsibilities as needed.

AWS, Docker, Amazon EC2, Jenkins, Kubernetes, Python, Ray, Terraform

New York

$110K - $130K / year

Job Closed
