DroneUp is a leader in drone flight services that transforms organizations using drone technology and delivery solutions. The company develops SaaS platforms that have mobile app t

SRE – Platform Engineer

DevOps Engineer | Other | Remote | Lead

Location

United States

Posted

105 days ago

Salary

$125K - $150K / year

Seniority

Lead

Bachelor Degree | 8 yrs exp | Experience accepted | English

AWS Azure GCP Grafana Kubernetes Linux macOS Node.js Prometheus Python Terraform Unix

Job Description

• Serve as broad-domain architect for the internal developer platform and all cloud engineering
• Drive architecture for tooling or in-house software
• Mentor other platform engineers to drive strong engineering practices
• Enable platform engineering technical capabilities in our internal client teams in software engineering
• Peer with the senior architects and engineers in software engineering
• Focus architecture and engineering on the GCP environment
• Architect and oversee GKE cluster operations and workload management
• Provide feedback to others and participate in peer reviews / pair programming
• Drive the broad adoption of Test-Driven Development by designing, developing, and debugging unit and integration tests for new and existing infrastructure and code
• Stay continuously curious about existing implementations and new technologies, and share findings with the team
• Practice continuous improvement across all job areas, both personally and professionally
• Communicate clearly with platform engineering teams and other stakeholders, and provide technical direction while doing so
• Stay current with platform changes and third-party libraries
• Proactively investigate better alternatives to current solutions
• Understand OpenTelemetry and true observability, and how it differs from monitoring and logging
• Grow the engineering culture towards a high-performing team
• Practice self-service, least privilege, and security by default in all solutions
• Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets
• Lead incident response, including on-call rotations, root cause analysis, and post-mortem reviews
• Implement and optimize monitoring, alerting, and observability systems for system reliability
• Collaborate on capacity planning and performance optimization to ensure high availability
• Other duties as assigned

Job Requirements

Bachelor's degree in Computer Science, Computer Engineering, or a related field, or 8+ years of experience as a software engineer
Proficiency in Kubernetes. Optional: CKA, CKAD
Extensive experience in Unix / Linux
Polyglot, with proficiency in multiple languages (ideally Golang, Node.js, Python, HCL, and more)
Knowledge of multi-cloud environments, including GCP, AWS, and Azure (familiarity with at least two of these)
Experience using Git in trunk-based development models
Experience using feature flagging in infrastructure and runtime (k8s)
Experience with backend database technology is a plus, including support and performance enhancements
Advanced experience working with and creating public cloud resources in Terraform or other infrastructure-as-code tools
Experience participating in a 24/7 on-call schedule without supervision and successfully resolving issues without escalation
Experience using OpenTelemetry for observability, as well as other monitoring tools such as Datadog, New Relic, and others
Good understanding of networking and routing principles
Experience in dockerizing applications and orchestrating them with Kubernetes
Familiarity with security configuration for web/API services (SSL, access control)
Experience with Jira or other work tracking systems
Ability to resolve tickets in priority order and collaborate with the Technical Product Manager to adjust priorities
Excellent, detailed documentation using Confluence or similar tooling; this could include support notes, runbooks, ADRs, etc.
Familiarity with creating an end-to-end CI/CD pipeline using various tools with artifact storage
Familiarity with macOS as a desktop and with predominantly CLI interfaces
Experience applying a “product mindset” by understanding stakeholder needs, priorities, and business value
Experience with security compliance frameworks, including FedRAMP, NIST, and SOC 2
Proven experience in SRE practices, including incident management and reliability engineering
Familiarity with monitoring tools like Prometheus, Grafana, or Honeycomb for observability
Experience with chaos engineering, load testing, or reliability testing frameworks

Benefits

Employees are expected to provide a high level of security to any personal or private information accessed as part of their work, whether at a DroneUp facility or remotely.
Participate in security training.
Remain sensitive to individual rights to personal privacy.
Comply with company policies.

More DevOps Engineer Jobs

DevOps Engineer

Bart & Associates, Inc.

DevOps Engineer | 105 days ago

Other | Remote | Team 1,001-5,000 | H1B No Sponsor

• Design, implement, and maintain CI/CD pipelines to support automated build, test, and deployment workflows
• Partner with engineering teams to streamline release processes and improve deployment reliability
• Implement and manage monitoring, logging, and alerting solutions to ensure system health and performance
• Define and maintain cost monitoring and alerting strategies to optimize cloud spend and prevent unexpected usage
• Automate infrastructure provisioning and configuration using Infrastructure as Code (IaC)
• Troubleshoot production issues and lead root cause analysis efforts
• Establish DevOps best practices around reliability, security, and operational excellence
• Continuously evaluate tools and processes to improve scalability, availability, and efficiency
• Mentor junior engineers and contribute to a strong DevOps culture

AWS Azure Docker GCP Grafana Jenkins Kubernetes Prometheus Python Terraform

United States

Job Closed

Site Reliability Engineering Manager

ECI Software Solutions

DevOps Engineer | 105 days ago

Other | Remote | Team 1,001-5,000 | H1B No Sponsor

• Lead and manage SRE operations supporting 24/7/365 availability
• Own uptime, SLA compliance, SLIs, SLOs, error budgets, MTTR, and incident trends
• Oversee incident management, on-call rotations, and post-incident reviews
• Lead FinOps practices across hybrid environments
• Drive right-sizing, optimization, and elimination of infrastructure waste
• Establish cost visibility, allocation, and reporting
• Define and maintain observability standards across hybrid environments, such as AWS, Azure, and vSphere
• Utilize platforms such as Coralogix, OpenTelemetry, and FireHydrant
• Champion GitOps practices and pull request governance
• Lead Terraform-based infrastructure automation initiatives
• Partner across Product, Engineering, Infrastructure, Finance, and Support teams
• Lead, mentor, and develop a high-performing SRE team

AWS Azure Terraform

United States

Job Closed

Senior DevOps Engineer

spiderSilk

spiderSilk delivers tip of the spear threat detection technology for the public and private sectors, globally.

DevOps Engineer | 105 days ago

Full Time | Remote | Team 11-50 | H1B No Sponsor

• Design, build, and maintain robust CI/CD pipelines and cloud infrastructure to accelerate software delivery.
• Monitor system performance, troubleshoot issues, and proactively respond to incidents to minimize downtime.
• Collaborate closely with software engineers to enable rapid, secure, and reliable releases.
• Automate deployment, testing, and scaling processes to enhance operational efficiency.
• Implement Infrastructure as Code (IaC) and cloud security best practices to ensure compliance and reduce risk.
• Optimize system reliability and performance through proactive capacity planning and tuning.
• Champion DevOps best practices, including observability, disaster recovery, and cost optimization.
• Stay ahead of emerging technologies and evaluate new tools to improve our tech stack.

Ansible AWS Azure Docker GCP Jenkins Kubernetes Linux NoSQL Oracle Database Python SQL Terraform

United Arab Emirates

Senior DevOps Engineer, Strong Kubernetes

MAS Global Consulting

Modern digital solutions. Exceptional nearshore delivery.

DevOps Engineer | 105 days ago

Other | Remote | Team 51-200 | Since 2013 | H1B Sponsor

• Design, build, and maintain systems that power ephemeral developer environments used in both local development and CI workflows
• Improve environment provisioning, stability, and teardown workflows to enhance developer velocity and reliability
• Develop tools and automation for testing, debugging, and feature validation across distributed systems
• Collaborate with Developer Productivity teams to integrate environment management with CI/CD pipelines and internal tooling
• Diagnose and resolve issues related to performance, scalability, and dependency management in developer environments
• Contribute to observability and monitoring, including environment health, metrics, and resource utilization
• Write and maintain high-quality documentation and internal guides to support developer onboarding and environment usage
• Participate in design and code reviews, advocating for maintainability, reliability, and engineering best practices

AWS Azure Distributed Systems Docker GCP Jenkins Kotlin Kubernetes Python

Florida

Job Closed
