Job Closed

This listing is no longer active.

AlpacaDB, Inc., also known as Alpaca and Alpaca Securities, is an API stock and crypto brokerage platform that enables services to embed investing and developer

DevOps Team Lead, Core Foundation

DevOps Engineer, Full Time, Remote, Senior

Location

Europe

Posted

89 days ago

Seniority

Senior

English Grafana Kubernetes PostgreSQL Prometheus Terraform

Job Description

• Lead, mentor, and foster a healthy, high-performing globally distributed engineering team.
• Own the execution and delivery of highly critical, complex yearly roadmap items centered around large-scale foundational infrastructure upgrades, high availability, and platform resilience.
• Own and drive the change management processes across engineering and product domains. You will orchestrate the smooth delivery of major systemic changes, ensuring alignment, mitigating friction, and breaking down silos between diverse technical groups to deliver cohesive infrastructure solutions.
• Design, implement, and refine robust support workflows, agile planning methodologies, and deployment/rollout strategies to ensure operational excellence.
• Manage and optimize the global on-call rotation to ensure team well-being while maintaining high availability. Lead incident response (via Rootly), establishing clear communication, rapid resolution processes, and blameless post-mortems.

Job Requirements

Proven experience as an Engineering Manager, DevOps Lead, or Site Reliability Engineering Lead, with a track record of successfully managing globally distributed teams.
Exceptional people management skills, with a deep focus on coaching, mentoring, and fostering team culture across multiple time zones.
Deep expertise in engineering support frameworks, roadmap planning, and team prioritization methodologies.
Proven experience owning Change Management lifecycles. You have a unifying leadership style with a proven ability to break down organizational silos, build trust between disparate teams, and shepherd complex systemic updates from conception to deployment.
Extensive experience managing Incident Management lifecycles and running sustainable, global on-call rotations.
Incredibly strong communication and organizational skills, with a proven ability to drive and coordinate complex, multi-stage tech rollouts and deployments.
A solid technical background in modern DevOps/SRE ecosystems. You don't need to be hands-on daily, but you must fluently understand the concepts and operational realities surrounding Kubernetes (GKE), Infrastructure as Code (Terraform), Relational Databases (PostgreSQL), and Observability stacks (Prometheus, Grafana, Thanos).
A strategic mindset capable of navigating shifting priorities, acting as the steady organizational force for the company's core infrastructure foundation.

Benefits

Competitive Salary & Stock Options
Health Benefits
New Hire Home-Office Setup: One-time USD $500
Monthly Stipend: USD $150 per month via a Brex Card

More DevOps Engineer Jobs

Principal DevOps Engineer

Zeta Global

We deliver better experiences for consumers and better results for your brand.

DevOps Engineer, posted 89 days ago

Full Time, Remote, Team 1,001-5,000, Since 2007, H1B Sponsor

• Design, build, and operate production-grade CI/CD pipelines enabling multiple developers on multiple teams to deploy concurrently to production, multiple times daily, with zero-downtime guarantees.
• Implement and optimize advanced deployment strategies including canary releases, blue/green deployments, rolling updates, incremental rollouts, and feature flag-gated releases via Statsig.
• Build self-service deployment tooling that empowers developers to own their release process while enforcing safety guardrails, automated rollback triggers, and automated compliance gates.
• Establish deployment observability with real-time canary analysis, automated health scoring, and progressive delivery metrics integrated with Grafana, Prometheus, and Honeycomb.
• Champion CI/CD workflows using GitLab CI/CD, Helm charts, and Terraform to ensure infrastructure and application deployments are version-controlled, auditable, and reproducible.
• Define and enforce SLOs/SLIs/SLAs across services, establishing error budgets that balance velocity with reliability.
• Lead incident response processes, including on-call rotations, runbook development, blameless postmortems, and incident command structure.
• Design and implement robust observability stacks leveraging Grafana, Prometheus, Loki, and Honeycomb for metrics, logging, tracing, and alerting at scale.
• Proactively identify and eliminate reliability risks through chaos engineering, load testing, capacity planning, and failure mode analysis.
• Reduce operational toil through automation, self-healing infrastructure patterns, and intelligent alerting to minimize mean time to detection (MTTD) and recovery (MTTR).

Apache AWS DNS Docker DynamoDB EC2 Grafana Java JavaScript Kafka Kubernetes Node.js Prometheus Python React Ruby TCP/IP Terraform

United States

$180K - $210K / year

Senior Platform Engineer – SRE, PK

PrideLogic

Specializes in building world-class development teams and extending runways for groundbreaking startups.

DevOps Engineer, posted 89 days ago

Full Time, Remote, Team 11-50, H1B No Sponsor

• Lead IaC architecture
• Drive GitOps at scale
• Architect and operate Kubernetes infrastructure
• Build self-service infrastructure automation
• Own reliability
• Set observability standards
• Partner with security on zero-trust architecture
• Mentor mid-level engineers

AWS Kubernetes

Pakistan

Job Closed

Senior Site Reliability Engineer, Database

Airwallex

Empowering businesses to grow beyond borders

DevOps Engineer, posted 89 days ago

Full Time, Remote, Team 1,001-5,000, Since 2015, H1B Sponsor

• Design and build the platforms, automation, and AI-driven tooling that power Airwallex's database infrastructure.
• Build a unified database observability platform providing real-time visibility into availability, security posture, reliability metrics, and latency.
• Design and implement secure interfaces that allow AI agents to safely query and interact with production databases.
• Develop AI-powered automation that handles routine DBA tasks, reducing manual toil.
• Create tooling that enables product engineering teams to provision, configure, scale, and manage Postgres and Redis instances through self-service workflows.
• Establish and enforce database best practices across the organization.
• Partner with product engineers to diagnose database performance issues, review schema designs, and provide guidance on data modeling and access patterns.

Cloud Docker Google Cloud Platform Java Kotlin Kubernetes PostgreSQL Python Redis Terraform Go

Washington

$140K - $240K / year

Senior Platform Engineer – SRE

Wizdaa

Specializes in building world-class development teams and extending runways for groundbreaking startups.

DevOps Engineer, posted 89 days ago

Full Time, Remote, Team 11-50, H1B No Sponsor

• Lead IaC architecture
• Drive GitOps at scale
• Architect and operate multi-tenant Kubernetes infrastructure on AWS EKS
• Build self-service infrastructure automation
• Lead the use of agentic coding tools for infrastructure work
• Own reliability
• Set observability standards
• Partner with security on zero-trust architecture
• Contribute to technical roadmap
• Mentor mid-level engineers

AWS Kubernetes

Brazil
