Job Closed
This listing is no longer active.
We help eCommerce merchants grow by empowering them with the #1 shipping solution tool needed to save time and money.
Senior Site Reliability Engineer
Location
Hawaii + 6 more
All locations: Hawaii | Nevada | New Mexico | Ohio | Oregon | Virginia | West Virginia
Posted
109 days ago
Salary
$156K - $212K / year
Seniority
Senior
Job Description
Senior Site Reliability Engineer
Shippo
- As a Senior Site Reliability Engineer (SRE) on our team, you will leverage platform engineering principles to ensure that Shippo's services are reliable, scalable, and performant.
- You will be a hybrid software development and operations engineer, responsible for designing, building, and maintaining the infrastructure that supports our applications.
- Your work will directly impact our ability to meet and exceed SLAs, and you will collaborate closely with other engineering teams to create services that are automatable, measurable, and resilient to failure.
- Design, scale, and secure infrastructure to stay ahead of business needs through fault-tolerant architecture design, performance testing, profiling and tuning, and capacity planning.
- Design, build, deploy, and maintain automation, monitoring, and alerting systems, and design, implement, and test disaster recovery solutions.
- Ensure scalability and maintainability through microservices adoption, decoupling of concerns and data models, job queuing, and application layering.
- Enhance and maintain our CI/CD pipeline for smooth and safe production releases via automated testing and verification.
- Verify and ensure the performance and correctness of systems in terms of response time and throughput.
- Participate in peer reviews, testing, and design reviews for new features, products, and systems, and contribute to automated test suites.
- Participate in an on-call rotation.
Job Requirements
- Experience developing, managing and troubleshooting highly available distributed systems, including operational experience with Kubernetes in a production environment
- Extensive expertise with at least one public cloud provider (AWS, GCP, Azure)
- Exceptional verbal, written, and interpersonal communication skills
- Interest in and understanding of best-in-class security practices, and automation and testing methods
- Familiarity with configuration and maintenance of common infrastructure components such as Redis, Elasticsearch, and Hadoop
- Deep understanding of customer needs and passion for customer success
- BS or MS degree in Computer Science or equivalent experience
- Bonus: advanced knowledge of managing and optimizing PostgreSQL server configuration
- 3+ years of experience in software development
- Experience with:
  - Managing service meshes (e.g. Istio)
  - Defining and monitoring Service-Level Objectives (SLOs) and Service-Level Agreements (SLAs) to ensure that systems meet reliability and performance targets
  - Monitoring tools such as New Relic, Prometheus, Grafana, and/or Datadog
  - OpenTelemetry for distributed tracing and metrics collection, including experience using it in production environments
  - Managing Python and Golang applications in production
  - Microservices architectures
  - DevOps tooling such as Docker, Terraform, Argo CD, Argo Workflows, CircleCI, GitHub Actions, New Relic, PagerDuty, etc.
  - AWS/Cloud services such as EKS, EC2, S3, Lambda, Route 53, CloudFront, Cloudflare, IAM, etc.
Benefits
- Healthcare coverage for medical, dental, and vision (90% covered by the company, incl. dependents).
- Pet coverage is also available!
- Take-as-much-as-you-need vacation policy & flexible working hours
- One week-long, company-wide winter slowdown
- 3 Volunteer Days Off (VTOs)
- WFH stipend to set up your home office
- Charity donation match up to $100
- Dedicated programs, coaching, tools, and resources for your professional and career growth as well as an individual learning stipend for your personal and focused growth
- Fun in-person team time through our Shippos Everywhere program, which includes regular team and company off-sites as well as local Shippos gatherings throughout the year
More DevOps Engineer Jobs
- Lead the initial setup of our DevOps and platform engineering practices
- Design and deliver an internal platform for personal or feature environments to boost developer velocity
- Build and maintain AWS-based infrastructure for performance, scale, and security
- Build CI/CD pipelines and automate release processes end to end
- Implement observability tooling (logging, monitoring, alerting) to detect and resolve issues early
- Collaborate with developers to remove bottlenecks and improve reliability
- Ensure high availability through monitoring, incident response, and recovery strategies
- Drive Infrastructure as Code best practices and infrastructure automation
- Document systems and processes for clarity and reproducibility
Senior DevOps Engineer
Brahma
The only account you'll ever need to secure, transact, and explore onchain like never before.
- Own build, deploy, and runtime reliability across BRAHMA AI’s hybrid estate.
- Deliver secure, scalable infrastructure for Gen AI based workflows and products across hybrid environments.
- Partner with infrastructure and multidisciplinary product and research teams to help them innovate and ship fast.
- Design, implement, and operate Slurm and Kubernetes-based platforms across cloud and on-prem GPU nodes, including autoscaling, rollout strategies, and multi-cluster operations.
- Build CI/CD pipelines for services, model training, and model serving; standardise artifact/version management and environment promotion.
- Implement Infrastructure as Code with Terraform/Terragrunt and configuration management; enforce drift detection and repeatable environments.
- Design and implement observability stacks (metrics, logs, tracing); drive incident response and postmortems.
- Secure the stack with least privilege, secrets management, network policy, and hardened baselines; support ISO/MPA controls with the security team.
- Operate model-serving infrastructure for real-time and batch workloads; optimise GPU utilisation, concurrency, and latency.
- Drive cost visibility and efficiency across compute, storage, and egress; forecast capacity and plan lifecycle of hardware and licenses.
DevSecOps, Cloud Platform Engineer
Ad Hoc
Ad Hoc delivers stable, fast, and scalable technology services for governments at the federal and state levels. The company was established by two members of th
- Design, build, and operate the scalable, secure Treasury Cloud (TCloud) platform and its services.
- Automate infrastructure provisioning via IaC and integrate pipelines with the IRS EDP 2.0.
- Ensure platform stability, high availability, and optimal cost management.
- Writing and maintaining Terraform modules to manage cloud landing zones.
- Implementing the HashiCorp Vault centralized secrets management solution.
- Developing reusable CI/CD pipeline templates for microservices and API services.
- Configuring and monitoring platform observability.
DevSecOps Engineer
MetalBear
Creators of mirrord | Faster, cheaper Kubernetes Development | OSS | DevEx
- Maintain and improve our IaC setup, ensuring reliability, scalability, and security.
- Oversee security architecture, implementing best practices for cloud security and compliance.
- Lead certification efforts, including ISO 27001, SOC 2, and other relevant frameworks.
- Continuously assess and enhance security posture across infrastructure and applications.
- Design, implement, and maintain CI/CD pipelines to streamline deployment and development workflows.




