Job Closed

This listing is no longer active.

AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.

DevOps / Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Mid Level | Team 1,001-5,000

Location

Worldwide

Posted

5 days ago

Seniority

Mid Level

Job Description

Role Description

We are looking for a Senior Site Reliability Engineer to maintain operational resilience and 24/7 stability across a multi-cloud security program spanning Azure, AWS, and GCP. You will engineer automated security guardrails using Terraform, design and optimize enterprise CI/CD pipelines for continuous ASPM ingestion, and act on CSPM telemetry using Wiz to secure cloud workloads under Zero Trust and federated IAM principles. The role requires deep compliance experience in PCI-DSS or SOC2 environments.

Qualifications

- 5+ years of experience in Site Reliability Engineering, DevOps, Cloud Security, or related roles;
- In-depth architectural expertise in multi-cloud defense, federated IAM, and Zero Trust security principles;
- Ability to work autonomously while driving the architecture of complex automated runbooks and mentoring mid-level SREs;
- Extensive experience deploying, integrating, and tuning APIs from modern CNAPP/CSPM platforms, specifically Wiz;
- Experience building and operating platforms subject to strict financial compliance standards such as PCI-DSS and SOC2;
- Strong understanding of cloud security, automation, and infrastructure reliability;
- Upper-intermediate English level.

Requirements

- Scale and maintain operational stability across multi-cloud environments including Azure, AWS, and GCP;
- Engineer unified security policies and configuration baselines using Infrastructure as Code (Terraform) to prevent misconfigurations;
- Design, maintain, and optimize enterprise CI/CD pipelines supporting continuous ASPM ingestion and deployment;
- Act on continuous monitoring alerts and security findings using Cloud Security Posture Management (CSPM) platforms such as Wiz;
- Help secure cloud workloads through Zero Trust and federated IAM principles;
- Contribute to reliability, automation, and operational excellence initiatives across the platform.

Benefits

- Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
- Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive compensation.
- Exciting projects: Join projects with modern solutions development and top-tier clients, including Fortune 500 enterprises and leading product brands.
- Flextime: Tailor your schedule for an optimal work-life balance, with options for remote work and flexible hours.

More DevOps Engineer Jobs

Senior Site Reliability Engineer

Optura

Optura is healthcare’s AI orchestration platform. We help healthcare organizations transform disconnected AI pilots into a unified, enterprise-scale program that delivers measurable value. Our platform enables teams to design, execute, and monitor intelligent agents that drive automation, insights, and action—while providing the control and observability needed to scale safely. Built for real-world complexity, Optura supports multiple model providers, integrates seamlessly with existing infrastructure, and offers both SaaS and self-hosted options. Our mission: revolutionize how healthcare deploys and operationalizes AI in production.

DevOps Engineer | 5 days ago

Full Time | Remote | Team 11-50

Role Description

We’re looking for a Senior Platform Engineer to design, build, and operate the core services that power Optura’s AI Platform. In this role, you will own systems end-to-end, from model and agent orchestration to routing, reliability, and observability. You will partner closely with product and application teams to deliver secure, scalable, HIPAA-aware services. You will play a critical role in shaping the foundation that enables customers to safely deploy AI in real-world healthcare environments.

Location: Open to remote or San Francisco Bay Area, Nashville Metro Area, or Raleigh, NC Area

What you'll do

- Architect and own Optura's multi-cloud infrastructure across AWS, GCP, and Azure: provisioning, networking, identity, observability, and cost governance
- Design and operate Kubernetes platforms that run consistently across our cloud environments and inside customer environments, including BYOC and on-prem (potentially air-gapped) deployments
- Build a unified deployment framework so Optura ships the same product to SaaS, BYOC, and on-prem customers without bespoke per-customer engineering: Helm charts, operators, install/upgrade tooling, and release pipelines
- Own SLOs, capacity planning, incident response, and postmortems across the entire infrastructure stack; set the bar for operational readiness
- Drive reliability and performance through error budgets, chaos testing, latency optimization, and disciplined runbook quality
- Harden the platform for regulated deployments: HIPAA controls, tenant isolation, audit logging, RBAC, KMS, and secrets rotation
- Lead the build-out of IaC, GitOps, and progressive delivery (Terraform, Argo CD, Crossplane) as the team's standard
- Partner with engineering and security to set opinionated guardrails: golden paths, base images, policy-as-code, and CI/CD that the rest of the org adopts by default

Qualifications

- 8+ years operating production infrastructure, including 3+ years in a senior SRE, platform, or staff infrastructure role
- Deep Kubernetes expertise across managed (EKS, GKE, AKS) and self-managed/on-prem distributions, not just running it but operating it at scale across heterogeneous environments
- Multi-cloud fluency across AWS, GCP, and Azure, with informed opinions on when to abstract vs. embrace cloud-native primitives
- Expert with Terraform (or Pulumi/Crossplane) and GitOps tooling
- Experience shipping infrastructure that runs in customer environments: packaging, install/upgrade UX, air-gapped artifacts, support escalation paths
- Strong networking, identity, and security fundamentals: VPC design, service mesh, mTLS, OIDC, KMS, secrets management
- Production observability ownership (Prometheus, Grafana, OpenTelemetry, distributed tracing) and on-call leadership
- A track record of writing real code (Go, Python, or similar) to extend the platform, not just configure it

Requirements

- Experience shipping HIPAA-regulated workloads, including BYOC or air-gapped customer deployments
- Background with enterprise software delivery tooling (Replicated, Cluster API, Talos, Rancher, OpenShift)
- Built internal developer platforms (Backstage, golden paths) that measurably reduced lead time for an engineering org
- FinOps experience: driving meaningful cloud spend reductions through architecture, not just rightsizing
- AI/ML infrastructure exposure: GPU scheduling, model-serving stacks, inference autoscaling
- OSS contributions to infrastructure projects, or strong opinions formed running them at scale

Benefits

- Health, dental, and vision insurance
- Generous paid time off
- Opportunities for professional growth and development

Equal Employment Opportunity

At Optura.AI, we’re not just building a product; we are intentionally building the team, culture, and equity we want to see in the tech world. That starts with recognizing that innovation thrives when diverse perspectives come together. Optura is an Equal Employment Opportunity Employer, period. We actively welcome and celebrate every candidate regardless of their race, color, religion, age, marital status, sex (including pregnancy, childbirth, or related medical condition), sexual orientation, gender identity or gender expression, national origin, veteran or military status, disability (physical or mental), genetic information, or any other protected characteristic.

More than compliance, we are deeply committed to diversity and inclusion because it’s a non-negotiable part of our foundation. We believe a truly diverse and inclusive workplace is the engine for long-term professional growth and competitive business success, directly fueling our mission to innovate. As part of the Optura team, your voice will be heard, your contributions will directly matter to our trajectory, and your unique background and experiences won't just be celebrated; they will be a vital part of our success. Let's build something exceptional, together.

United States

Site Reliability Engineer

Offchain Labs

We power fast, private decentralized applications

DevOps Engineer | 5 days ago

Full Time | Remote | Team 11-50 | Since 2018 | H1B No Sponsor

At Offchain, we aren’t just building products: we’re leading a movement. As pioneers in blockchain scalability and security, we're at the forefront of transforming how the world interacts with decentralized applications. We're laying the foundation that will define the next generation of digital commerce, governance, and human interaction. This involves tackling real-world challenges that come with scaling blockchain technology, without compromising on its core principles: decentralization, security and transparency.

At the center of this vision is our people. Our team is made up of thinkers and doers that embrace new challenges and seek solutions that push existing boundaries. If you’re energized by solving unprecedented problems, and believe in the role that decentralized systems will play in creating a more equitable digital future, then we want to hear from you.

AWS, Azure, Cloud, Google Cloud Platform, Linux, Python, Shell Scripting, Go

California + 1 more

Manager, Site Reliability Engineering

Aya Healthcare

DevOps Engineer | 5 days ago

Full Time | Remote | Team 5,001-10,000 | Since 2001 | H1B Sponsor

• Lead and grow the SRE team
• Drive reliability, performance, and availability
• Operational intelligence and AI-native operations
• Platform efficiency and stakeholder trust

AWS, Azure, Google Cloud Platform

United States

$230K - $255K / year

Lead DevOps Engineer

Texas Windstorm Insurance Association

DevOps Engineer | 5 days ago

Full Time | Remote

Role Description

We’re looking for a Lead DevOps Engineer to play a key role in driving the delivery, performance, and reliability of our Guidewire insurance platforms. In this role, you’ll sit at the intersection of development, operations, and security, helping accelerate software delivery while ensuring stable, scalable infrastructure. As a technical leader, you’ll define the DevOps vision by shaping architecture, establishing roadmaps, and continuously improving our CI/CD pipelines, automation strategy, and environment governance. You’ll partner closely with engineering and business teams to identify bottlenecks, streamline workflows, and implement solutions that enhance speed and quality across the software lifecycle.

This is an opportunity to make a lasting impact: mentoring team members, setting engineering standards, and leading initiatives that modernize how we build, deploy, and maintain applications at scale.

Qualifications

- Bachelor’s degree in a relevant field or equivalent experience.
- 4+ years relevant experience, including experience in:
  - Deployment/configuration in Jenkins
  - Network infrastructure, database, cloud and data center operations and security protocols
  - Programming and scripting languages like Python, Perl, Bash, PHP, Java, SQL, Groovy, or C++
- 5+ years hands-on full SDLC software development (with experience in Java preferred)

Requirements

- Strong knowledge of CI/CD pipeline design and automation, including build, test, deployment, and rollback workflows.
- Working knowledge of Java-based application runtimes and configuration/deployment models.
- Working knowledge of relational databases and SQL, including automation of database restores, data integrity processes, and deployment-related operations.
- Strong knowledge of source code management systems (e.g., Git) and modern branching and merging strategies.
- Ability to automate infrastructure, environments, and operational tasks using scripting and Infrastructure-as-Code concepts.
- Proficiency in scripting and automation languages such as Shell, Python, PowerShell, or Groovy.
- Knowledge of secure environment and secrets management practices across multiple environments.
- Ability to integrate quality controls and test automation into CI/CD workflows in agile delivery environments.
- Knowledge of IT operations, monitoring, alerting, and incident response concepts supporting production reliability.
- Ability to work independently with minimal supervision, proactively manage workload, and remain fully engaged during scheduled work hours.
- Good communication skills for coaching and cross-team influence will be critical.

Benefits

- Comprehensive medical, dental, vision, group life, AD&D, and dependent life insurance coverage.
- Flexible Spending Accounts (FSA) and Health Savings Accounts (HSA).
- A 401(k) retirement plan with employer matching contributions up to 6%, in addition to a pension plan.
- Performance-based incentives, ongoing training and professional development, and certification support.
- Generous paid time off, including vacation, sick leave, paid holidays, and personal days.
- Flexible scheduling to support work-life balance.

United States
