Job Closed
This listing is no longer active.
Senior DevOps Engineer
Location
United States
Posted
37 days ago
Salary
0
Seniority
Senior
Job Description
Monad Foundation
- Build and maintain testnets and test automation for components of the Monad blockchain
- Prototype new deployment patterns across validator and full-node topologies
- Design and run performance benchmarks and load tests to surface bottlenecks before they reach production
- Drive observability, alerting, and security across the stack
- Define best practices for node operators and support them
- Develop tooling for internal and external use
- Participate in on-call rotation for production support
Job Requirements
- 2+ years of DevOps experience at a professional RPC provider / professional validator company / L1 blockchain
- 3+ years of experience with Infrastructure as Code and Configuration Management (Terraform, Ansible)
- 2+ years of experience with container technologies (Docker, Kubernetes)
- Proficient in bare metal server management and scripting
- Proficient in at least one back-end language such as Python or Go
- Hands-on experience running highly available containerized systems on Kubernetes, including packaging (Helm) and GitOps-style deployment (FluxCD, ArgoCD, or GitHub Actions)
- Strong experience with a major cloud provider (AWS preferred)
- Comfort with network setup, maintenance, and troubleshooting at the systems level
- Experience operating open-source observability stacks (Prometheus, Grafana, Loki, or similar)
- Proficient with command-line git
- Able to juggle competing priorities and move quickly during incidents
- Fluent written and spoken English.
Benefits
- Challenging problems. You’ll tackle deeply complex and technically demanding problems, with autonomy and impact.
- Endless Opportunity for Impact. The Ethereum Virtual Machine (EVM) standard is ubiquitous, but existing EVM-compatible chains are slow and bandwidth-constrained. Monad’s core innovations offer developers and founders the best of both worlds (portability and performance) and are a game-changer to power global on-chain finance.
- The right team. You’ll be part of a world-class team of exceptional, highly motivated people.
- Culture. We’re a lean team working together to achieve very ambitious goals. We are united in our culture of collaboration, low ego, and high-quality output.
- Strong Ecosystem. The broader Monad ecosystem has attracted support from leading investors, builders and long-term contributors.
More DevOps Engineer Jobs
Site Reliability Engineer
LineTen
LineTen is a cloud-based, SaaS technology platform that enables businesses to aggregate technical transactions.
- Ensure global coverage of our products.
- Responsible for ensuring all engineering teams have a first-class development experience.
- Drive rollout of Docker/Kubernetes across all engineering workstations.
- Ensure a top-notch observability setup is in place.
- Provide engineering support across all products.
- Work with the Architecture team on product development direction.
- Participate in post-incident reviews.
NEORIS, now part of EPAM, is a digital accelerator that helps companies step into the future, with more than 20 years of experience as a digital partner to some of the world's most important companies. We are more than 4,000 professionals in 11 countries, with a multicultural, startup-style culture where we foster innovation, continuous learning, and high-impact solutions for our clients.
We are looking for: DevOps Semi Senior - Mid!
Main responsibilities:
- Implement, optimize, and maintain Cloud Native environments, ensuring scalability, security, and performance.
- Manage and automate CI/CD pipelines using GitHub, GitLab, or Jenkins.
- Design, deploy, and manage container-based solutions.
- Configure and maintain observability tooling, ensuring end-to-end monitoring with Dynatrace, Prometheus, and Grafana.
- Collaborate with development and architecture teams to define DevOps best practices.
- Take part in the continuous improvement of processes, automation, and technical standards.
Requirements:
Required:
- A minimum of 2 to 4 years of experience in DevOps roles (semi-senior level).
- Solid knowledge of Cloud Native practices and container management.
- Experience applying observability tools such as Dynatrace, Prometheus, and Grafana.
- Proficiency with GitHub or GitLab and experience with Jenkins pipelines.
- Knowledge of quality and security tools such as Kiuwan.
Nice to have:
- Experience with Kubernetes or other container orchestrators.
- DevOps or Cloud certifications.
- Experience in high-availability environments or digital transformation projects.
- Knowledge of advanced automation and Infrastructure as Code.
We offer:
- Permanent contract with a competitive salary.
- Flexible work arrangement with the option of remote work.
- Personalized career plan and continuous training (certifications, English, etc.).
- Participation in stable projects with a strong technical component.
- Flexible hours and a focus on work-life balance.
- Social benefits tailored to your needs.
We invite you to get to know us at http://www.neoris.com, or on Facebook, LinkedIn, Twitter, or Instagram: @NEORIS.
Senior DevOps Engineer
TekHQS
TekHQS is a global technology and AI-driven solutions company delivering scalable SaaS, Cloud, AI/ML, Blockchain/Web3, DevOps, and enterprise software solutions to startups and enterprise clients worldwide. With a team of 300+ professionals across the USA, UK, UAE, Qatar, Pakistan, and India, we specialize in building high-performance digital products across Logistics, FinTech, Healthcare, and emerging technology sectors. At TekHQS, we foster a culture of innovation, ownership, and continuous growth, empowering our teams to build impactful technology that drives real business transformation.
About the Role
We are seeking a skilled DevOps Engineer to strengthen our infrastructure, automation, and CI/CD capabilities across multiple projects. The ideal candidate will drive automation, streamline CI/CD pipelines, and ensure reliable deployments across development and production environments. This role requires strong hands-on experience with AWS and/or Azure, containerization, orchestration tools, infrastructure as code, and continuous integration systems.
Key Responsibilities
Infrastructure & Cloud Management
- Design and manage cloud infrastructure on AWS and Azure.
- Deploy and maintain services such as EC2/VMs, S3/Blob Storage, RDS/Azure SQL, VPC/VNet, IAM/Azure AD, Load Balancers, and related services.
- Ensure high availability, scalability, and security of production systems.
Containerization & Orchestration
- Build and manage Docker containers.
- Deploy and maintain Kubernetes clusters (EKS/AKS preferred).
- Optimize container orchestration for performance and cost efficiency.
CI/CD & Automation
- Design and maintain CI/CD pipelines using Jenkins (experience with Azure DevOps is a plus).
- Automate build, test, and deployment processes.
- Implement Infrastructure as Code using Terraform.
- Automate configuration management using Ansible.
Monitoring & Reliability
- Implement monitoring, logging, and alerting mechanisms (CloudWatch, Azure Monitor, Prometheus, Grafana, etc.).
- Troubleshoot production issues and ensure minimal downtime.
- Improve system reliability and deployment velocity.
Security & Compliance
- Implement security best practices in infrastructure and pipelines.
- Manage IAM roles, access controls, and secrets securely across AWS/Azure environments.
- Conduct regular system audits and vulnerability checks.
Collaboration
- Work closely with development, QA, and product teams.
- Support release planning and environment readiness.
- Document processes, workflows, and infrastructure architecture.
Preferred Qualifications
- AWS and/or Azure Certifications (Associate or Professional level).
- Experience with microservices architecture.
- Experience with monitoring tools (Prometheus, Grafana, CloudWatch, Azure Monitor, etc.).
- Knowledge of security best practices and DevSecOps concepts.
Soft Skills
- Strong analytical and troubleshooting skills.
- Ability to work independently and within cross-functional teams.
- Strong documentation and communication skills.
- Proactive and ownership-driven mindset.
- Ability to manage multiple environments and deadlines efficiently.
Job Details
Experience: 5 years
Job Type: Fully Remote
Location: 9 pm to 5 am
Senior Engineer, Site Reliability
Zensar
At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better futures. Whether for our clients, our people, or the world around us, this belief powers everything we do. At the heart of our culture is ONE with Client - a set of four core values that reflect who we are and how we work: One Zensar, Nurturing, Empowering, and Client Focus. Part of the $4.8 billion RPG Group, we’re a community of 10,000+ innovators across 30+ global locations, including Milpitas, Seattle, Princeton, Cape Town, London, Zurich, Singapore, and Mexico City. We believe the best work happens when individuality is celebrated, growth is encouraged, and well-being is prioritized. We are an equal employment opportunity (EEO) and affirmative action employer, committed to creating an inclusive workplace. All qualified applicants will be considered without regard to race, creed, color, ancestry, religion, sex, national origin, citizenship, age, sexual orientation, gender identity, disability, marital status, family medical leave status, or protected veteran status.
Role Description
The Software Engineer / Site Reliability Engineer (SRE) will play a critical role in driving reliability, scalability, and performance for the Banking Solutions, Payments, and Capital Markets platforms. This role blends core SRE principles, performance engineering, and service health management to support large-scale, mission-critical systems. The ideal candidate will help modernize platforms through automation-first practices, data-driven reliability metrics, and proactive performance optimization, ensuring exceptional customer experience and business continuity in a highly regulated environment.
What You Will Be Doing
Core SRE & Reliability Engineering
- Design, implement, and operate highly available, resilient, and scalable systems aligned with SRE best practices.
- Define and manage Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets to balance reliability and delivery velocity.
- Build and maintain service health dashboards to provide real-time visibility into platform stability and customer experience.
- Reduce toil through extensive automation of operational workflows, alerts, and remediation activities.
Monitoring, Observability & Service Health
- Design and maintain end-to-end monitoring and observability solutions covering infrastructure, applications, APIs, and user journeys.
- Implement advanced alerting strategies to reduce noise and improve mean time to detect (MTTD) and mean time to resolution (MTTR).
- Leverage metrics, logs, and traces to drive root cause analysis and proactive incident prevention.
- Enable reliability reporting for stakeholders using SLO compliance and service health metrics.
Performance Engineering & Testing
- Lead performance engineering initiatives, including load testing, stress testing, endurance testing, and capacity validation.
- Identify performance bottlenecks across application, middleware, database, and infrastructure layers.
- Conduct capacity planning and performance tuning to support business growth and peak traffic scenarios.
- Partner with development and QA teams to embed performance testing into CI/CD pipelines.
Incident Management & Operations
- Lead and participate in incident response activities, including triage, mitigation, recovery, and post-incident reviews.
- Drive blameless post-mortems and ensure corrective actions are tracked to completion.
- Participate in on-call rotations, providing 24x7 support for critical production systems.
- Continuously improve operational readiness and resilience.
Automation, CI/CD & Cloud Operations
- Design and manage deployment pipelines, configuration management, and environment consistency across lower and production environments.
- Implement Infrastructure as Code (IaC) practices for repeatable and secure cloud provisioning.
- Collaborate with DevOps teams to improve deployment reliability, rollback mechanisms, and release safety.
- Develop and test disaster recovery plans, backup strategies, and failover mechanisms.
Collaboration & Governance
- Work closely with Development, QA, DevOps, Security, and Product teams to align on reliability and performance goals.
- Ensure platforms meet security, compliance, and regulatory requirements common in financial services.
- Act as a reliability and performance advocate throughout the SDLC.
Qualifications
- Strong experience in Core SRE practices, including reliability engineering, incident management, and automation.
- Proven hands-on experience in Performance Engineering / Performance Testing for large-scale distributed systems.
- Deep understanding and implementation experience with SLI / SLO / Error Budget frameworks.
- Proficiency in cloud platforms (AWS, Azure, or Google Cloud).
- Hands-on experience with containerization and orchestration (Docker, Kubernetes).
- Strong background in monitoring, observability, and logging tools such as Prometheus, Grafana, Datadog, Splunk, ELK Stack.
- Experience with CI/CD pipelines (Jenkins, GitLab CI/CD, Azure DevOps).
- Proficiency in scripting and automation using Python, Bash, Terraform, Ansible.
- Strong troubleshooting skills across application, infrastructure, and network layers.
- Experience designing and running incident response and post-mortem reviews.
- Ownership mindset with accountability for service reliability and customer outcomes.
- Excellent communication, collaboration, and stakeholder management skills.
Nice to Have (SRE+ Skills)
- Experience with Keptn or similar tools for automated SLO-based quality gates and continuous delivery.
- Programming experience in Java, especially for debugging, performance profiling, or building automation tools.
- Familiarity with chaos engineering practices and tools.
- Experience working in banking, payments, or capital markets domains.
- Knowledge of security best practices and regulatory compliance in enterprise environment.


