Job Closed
This listing is no longer active.
20% of Fortune 500 fintech companies trust Kunai for engineering talent.
Senior DevOps Engineer
Location
United States
Posted
72 days ago
Salary
0
Seniority
Senior
Job Description
Kunai
- Design and implement resilient, secure, and scalable cloud environments to support client platforms in production.
- Drive production readiness and operations: monitoring and alerting, incident support, runbooks, capacity planning, reliability improvements, and release readiness.
- Build and maintain CI/CD workflows and reconfigure/enhance an existing proprietary pipeline using Argo.
- Automate infrastructure provisioning and configuration using Infrastructure as Code (Terraform, CloudFormation, CDK).
- Support containerized deployments and orchestration using Docker and ECS.
- Develop automation scripts and utilities in Python and/or Bash for deployment, configuration, and operational tasks.
- Implement and maintain service configuration and deployment automation across environments (dev/test/stage/prod).
- Configure and manage cloud networking and access controls, including Security Groups.
- Implement and maintain monitoring/observability capabilities (metrics, logs, traces, dashboards) and establish actionable SLOs/SLIs.
- Plan and execute performance testing and scalability validation; partner with engineering to remediate bottlenecks and improve system performance.
- Collaborate with engineering, architecture, security, and client stakeholders to triage issues, estimate work, and continuously improve delivery and reliability.
Job Requirements
- 5+ years of hands-on DevOps / Platform / SRE experience supporting production systems.
- Strong experience with at least one public cloud provider (AWS).
- Demonstrated practical experience with DevOps tools and practices, with a clear focus on production readiness and operations.
- Experience designing and operating resilient systems (availability, scalability, fault tolerance).
- Strong Infrastructure as Code experience with Terraform, CloudFormation, and/or CDK.
- CI/CD experience, including adapting and improving existing pipelines; experience with Argo preferred.
- Containerization and orchestration experience with Docker and ECS.
- Scripting/automation skills with Python and/or Bash.
- Experience with service configuration and deployment automation.
- Experience configuring and managing Security Groups and related cloud networking controls.
- Hands-on experience with monitoring/observability and performance testing in production-like environments.
Benefits
- Competitive compensation
- Professional development opportunities
- Flexible work arrangements
More DevOps Engineer Jobs
DevOps Engineer – B2B SaaS, EdTech
Edusign
Paperless attendance-sheet solution for training organizations.
- Manage and evolve our AWS environment (EC2, ECS Fargate, Lambda, S3)
- Optimize costs, performance, and resilience
- Build and maintain reliable deployment pipelines
- Set up or improve monitoring, logging, and alerting
- Strengthen the security of the infrastructure and data
- Contribute to optimizing our databases (MySQL, PostgreSQL, Redis, Elasticsearch)
- Work hand in hand with developers and share best practices
Site Reliability Engineer
Zimperium
The leader in enterprise mobile endpoint protection and mobile app protection against threats on Android, iOS, and Chromebooks.
- Design, code, test, and deliver software to automate manual operational work
- Troubleshoot priority incidents, facilitate blameless post-mortems, and ensure permanent closure of incidents
- Engage with the development team throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
- Identify application patterns and analytics in support of better service level objectives
- Design self-healing and resiliency patterns
- Design automated software and product upgrades, change management, and release management solutions
- Participate in 24×7 support coverage as needed
- Mentor and guide junior developers
- Engineering the Platform: You’ll work hand-in-hand with product teams to lay the infrastructural groundwork for new apps. If it’s repetitive, you’ll automate it. If it’s fragile, you’ll harden it.
- Killing Toil: You’ll identify the "friction points" in our dev lifecycle and build the Terraform modules or Python tooling to make them disappear.
- Architectural Advocacy: You aren't a gatekeeper; you’re an enabler. You’ll champion sound software design and "paved road" patterns that allow devs to move fast without breaking things.
- Observability over Monitoring: Moving us beyond simple alerts to deep introspection. You’ll help us use tools like Honeycomb and Grafana to understand *why* things happen, not just *that* they happened.
- High-Agency Incident Response: You’ll help us evolve our incident remediation from "fixing the fire" to "learning from the heat," driving a culture of blameless retrospectives and systemic improvements.
- Design, implement, and maintain CI/CD pipelines and shared build tooling that improve throughput, reliability, and developer experience, with clear success metrics captured in Focals.
- Deliver pipeline migrations (for example, Jenkins or Tekton to GitHub Actions) by creating automation, templates, guardrails, and documentation that reduce friction for service owners.
- Diagnose and resolve CI incidents and performance bottlenecks through hands-on troubleshooting, root cause analysis, and follow-up improvements that prevent repeat issues.
- Build and evolve Kubernetes-based automation workflows (for example, Argo Workflows) to reduce manual steps and standardize common delivery patterns.
- Partner with internal stakeholders to understand requirements, set expectations, and communicate progress clearly, especially during high-impact changes.
- Uphold security and compliance best practices across the delivery stack, including access controls, secrets handling, and infrastructure as code practices using Terraform.




