StarCompliance

We are Reputation Guardians, on a mission to make compliance simple and easy.

Principal Site Reliability Engineering Lead

DevOps Engineer | Other | Remote | Senior | Team 201-500 | H1B No Sponsor

Location

New York

Posted

122 days ago

Salary

Not specified

Seniority

Senior

Bachelor Degree | 8 yrs exp | English

Job Description

  • Act as a senior custodian of the production promotion process across the software platform estate.
  • Work closely with Technical Leads and QA to define and evolve promotion practices that emphasise quality, performance, and operational readiness.
  • Define and evolve observability standards across metrics, logging, tracing, and alerting.
  • Ensure systems are instrumented to support rapid diagnosis, learning, and recovery.
  • Drive continuous improvement in platform reliability, performance, and release confidence.
  • Partner with engineering, architecture, and platform teams to embed operability and resilience into system design.
  • Lead and participate in on-call and rota-based operational support for production systems.
  • Coordinate and continuously improve incident management practices, including post-incident reviews and preventative actions.
  • Act as a senior technical authority for production readiness, operational risk, and release confidence.
  • Mentor SREs and senior engineers, raising reliability and operational standards across teams.
  • Influence architectural and platform decisions with a strong operational and delivery lens while remaining hands-on.

Job Requirements

  • Based in the East Coast time zone.
  • Typically 8+ years of experience in SRE, platform, operational, or software engineering roles, with a large share of that time spent in multi-tenant environments.
  • Experience supporting production systems with formal on-call or rota responsibility.
  • Experience leading and mentoring a team of SREs, with an emphasis on professional and personal growth.
  • Experience enabling regular, multi-service production releases at scale.
  • Right to work in the country of employment.

Benefits

  • All positions require pre-employment screening because employees may have access to highly sensitive and confidential information involving finance and compliance; candidates must be trustworthy and have a heightened sensitivity to protecting confidential financial and professional information.
  • Equal Opportunity Employer Statement
  • We prohibit discrimination and harassment of any kind based on race, sex, religion, sexual orientation, national origin, disability, genetic information, pregnancy, gender identity or expression, marital/civil union/domestic partnership status, veteran status or any other protected characteristic as outlined by country, state, or local laws.
  • This policy applies to all employment practices within our organisation, including hiring, recruiting, promotion, termination, layoff, recall, leave of absence, compensation, benefits, training, and apprenticeship. StarCompliance makes hiring decisions based solely on qualifications, merit, and business needs at the time.

More DevOps Engineer Jobs

DevOps Engineer | 122 days ago
Other | Remote | Team 201-500 | H1B No Sponsor

  • Design, develop, test, and maintain full-stack product features and services using modern software engineering practices (backend APIs, cloud services, frontend integrations).
  • Translate business and clinical requirements into scalable, secure, and maintainable technical solutions.
  • Build and maintain cloud-native services and microservices in AWS and Azure.
  • Implement and maintain CI/CD pipelines to automate build, test, and deployment workflows.
  • Author and manage Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, Ansible, or ARM templates.
  • Implement containerization and orchestration technologies (Docker, Kubernetes, EKS/AKS).
  • Integrate observability (logging, metrics, tracing) to monitor performance and reliability.
  • Collaborate with cross-functional partners including product, QA, compliance, clinical, and operations teams to deliver impactful software.
  • Apply DevOps and DevSecOps best practices to enhance automation and secure operations.
  • Participate in incident response, troubleshooting, and improving system resilience.
  • Support continuous improvement of systems, tooling, and developer workflows.

Pennsylvania
Job Closed
Other | Remote | Team 501-1,000 | Since 2015 | H1B Sponsor

  • Establish and evolve SRE best practices across the organization.
  • Define and drive observability strategy for system health, performance, and reliability.
  • Design and implement software-driven solutions within the infrastructure domain.
  • Act as a technical leader and force multiplier, helping set priorities and influencing decision-making.
  • Take ownership of large, ambiguous initiatives, driving them from concept to delivery.
  • Combine deep knowledge of software development, infrastructure, and security to improve platform resilience.
  • Proactively identify systemic risks and reliability gaps.
  • Partner with engineering teams to improve developer workflows, tooling, and operational maturity.
  • Provide technical mentorship, architecture guidance, and high-quality design and code reviews.

New York
Job Closed

DevOps Engineer

Arine

Arine optimizes medication to ensure each patient is on the safest, most effective therapy for their unique health needs.

DevOps Engineer | 122 days ago
Other | Remote | Team 11-50 | H1B No Sponsor

  • Design, build, and maintain automated CI/CD pipelines (e.g., GitHub Actions, Jenkins) to enable fast, secure, and reliable deployments.
  • Provision, manage, and optimize core AWS services (EC2, ECS, S3, RDS, Lambda, VPC, IAM) to support scalable, highly available applications.
  • Implement and maintain IaC frameworks (Terraform, CloudFormation, or similar) to ensure infrastructure is version-controlled, repeatable, and auditable.
  • Leverage AI and automation for log/metric anomaly detection, predictive alerting, AI-assisted troubleshooting, and intelligent CI/CD optimizations.
  • Build and maintain robust monitoring, logging, and alerting systems (CloudWatch, Datadog, ELK, etc.) to ensure platform stability and rapid issue detection.
  • Implement DevSecOps best practices, including secrets management, IAM policies, vulnerability scanning, and automated compliance checks.
  • Partner with development, QA, and product teams to ensure smooth releases, infrastructure reliability, and continuous improvement across the SDLC.

United States
$120K - $150K / year
Job Closed

Semi-Senior Site Reliability Engineer, SRE

Bold

Unlocking the potential of entrepreneurs through financial tools

DevOps Engineer | 122 days ago
Full Time | Remote | Team 1,001-5,000 | H1B Sponsor

  • Lead the platform's evolution toward a resilient, low-latency, multi-region architecture with secure global connectivity.
  • Ensure business continuity in the face of regional failures and raise the technical standards that underpin millions of daily transactions.
  • Design and implement geographic high-availability strategies (Active-Active / Active-Passive) and Disaster Recovery (DR) across multiple AWS regions.
  • Orchestrate the organization's complex connectivity using Direct Connect, AWS Transit Gateway, VPC Peering, and VPNs (Site-to-Site / Client).
  • Lead the migration of legacy infrastructure to AWS CDK, ensuring that every network and compute component is defined as code.
  • Manage the EC2 instance lifecycle at an advanced level and configure firewalls (Security Groups/NACLs).
  • Implement and manage firewalls, WAFs, and security groups with a "Zero Trust" approach.
  • Centralize monitoring and alerting across multiple regions to provide a unified view of system health.

Colombia
Job Closed