Jump uses AI to help financial managers automatically take notes, stay compliant, update their CRM, and serve clients.
Senior DevOps Engineer
Location
United States
Posted
102 days ago
Salary
$150K - $210K / year
Seniority
Senior
Job Description
Jump - Advisor AI
- Build and scale infrastructure with Terraform, GCP, GitHub Actions, and Kubernetes (GKE).
- Collaborate with engineering teams to design and implement resilient, secure, and efficient infrastructures.
- Participate in the entire software development lifecycle (SDLC), including coding, testing, and deployment.
- Lead technical initiatives and mentor junior engineers.
- Manage CI/CD pipelines to ensure efficient deployment and integration processes.
- Respond to operational incidents and provide troubleshooting support.
Job Requirements
- At least 5 years of experience in DevOps/SRE/Infrastructure roles, with at least 3 years at a senior level.
- Deep expertise with Terraform and Infrastructure-as-Code practices.
- Strong experience with GCP (Google Cloud Platform), including networking, IAM, GKE (Kubernetes), Cloud SQL, and logging/monitoring stacks.
- Proven track record designing, scaling, and operating distributed systems in production.
- Hands-on experience with CI/CD pipelines (GitHub Actions is a plus!).
- Knowledge of observability tools (Prometheus, Grafana, or equivalents).
- Comfort with system design tradeoffs, performance optimization, and reliability engineering.
- Ability to design resilient, secure, and cost-effective infrastructure for fast-moving engineering teams.
- Experience setting up scalable Kubernetes clusters and managing production workloads.
- Familiarity with security and compliance requirements (SOC 2, Vanta, etc.).
- Strong written and verbal communication skills.
- Experience leading technical initiatives across teams, building consensus, and mentoring others.
- Founder-minded: proactive, self-driven, and eager to take ownership of problems end-to-end.
Benefits
- Health/Dental/Vision insurance
- 401k (no match right now)
- Take the time you need PTO (4 weeks-ish, but we don’t keep track)
More DevOps Engineer Jobs
Site Reliability Engineer
Peach
Giving lenders the tools to scale and modernize through integration to our API-first, cloud-native platform.
- Help build an effective, inclusive SRE team.
- Keep reliability over 99.99%.
- Design, develop, and maintain new data products for our customers.
- Automate reporting and financial processes for the company.
- Provide architectural expertise to product teams optimizing for availability and performance.
- Participate in infrastructure on-call and the incident response process.
- Create infrastructure that is compliant with fintech regulatory frameworks.
- Own the reliability, scalability, and performance of Peec AI’s core systems and infrastructure.
- Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available.
- Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one.
- Develop and refine incident response practices, ensuring issues are triaged quickly and resolved with minimal user impact.
- Proactively identify and address bottlenecks, single points of failure, and operational inefficiencies across the stack.
- Champion operational excellence and a culture of reliability, driving best practices across the engineering organization.
- Optimize release deployments and maintain secure cloud infrastructure.
- Handle day-to-day operations and problem-solving.
- Ingest new solutions and products from the Build/Automation organization.
- Use monitoring and logging tools to solve issues.
- Conduct post-mortem analysis and identify potential issues for improvement.
- Set up, monitor, and maintain cloud-based DevOps SaaS products and solutions.
- Maintain security and data privacy and ensure compliance.
- Work with architects on deployment architecture, security, and CI/CD implementations.
- Set up and maintain Kubernetes clusters in cloud environments.
- Analyze and solve operational issues, and respond to incidents.
- Conduct root cause analysis and implement continuous improvements.
- Evaluate new technology options and vendor products.
SRE – Platform Engineer
DroneUp
DroneUp is a leader in drone flight services that transforms organizations using drone technology and delivery solutions. The company develops SaaS platforms that have mobile app t
- Serve as broad domain architect for the internal developer platform and all cloud engineering.
- Drive architecture for tooling and in-house software.
- Mentor other platform engineers to drive strong engineering practices.
- Enable platform engineering technical capabilities in our internal client teams in software engineering.
- Peer with the senior architects and engineers in software engineering.
- Focus architecture and engineering on the GCP environment.
- Architect and oversee GKE cluster operations and workload management.
- Provide feedback to others and participate in peer reviews and pair programming.
- Drive the broad adoption of Test Driven Development by designing, developing, and debugging unit and integration tests for new and existing infrastructure and code.
- Stay curious about existing implementations and new technologies, and share findings with the team.
- Practice continuous improvement across all job areas, personally and professionally.
- Communicate clearly with platform engineering teams and other stakeholders, and provide technical direction while doing so.
- Stay current with platform changes and third-party libraries.
- Proactively investigate better alternatives to current solutions.
- Understand OpenTelemetry and true observability, and how it differs from monitoring and logging.
- Grow the engineering culture towards a high-performing team.
- Practice self-service, least privilege, and security by default in all solutions.
- Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
- Lead incident response, including on-call rotations, root cause analysis, and post-mortem reviews.
- Implement and optimize monitoring, alerting, and observability systems for system reliability.
- Collaborate on capacity planning and performance optimization to ensure high availability.
- Other duties as assigned.




