Weekday (YC W21)

We are a Y-Combinator-backed startup building your AI-powered Recruiter Agent

Senior DevOps Lead – Infrastructure Engineer

DevOps Engineer | Full Time | Remote | Senior | Team 11-50 | Since 2021 | H1B No Sponsor

Location

India

Posted

9 days ago

Salary

₹3,000K - ₹4,000K / year

Seniority

Senior

Job Description

  • Drive cloud infrastructure operations across AWS environments
  • Design, implement, and optimize CI/CD pipelines and deployment workflows
  • Drive Infrastructure as Code adoption using Terraform and automation tools
  • Manage Kubernetes clusters and containerized application deployments
  • Ensure platform scalability, reliability, security, and high availability
  • Implement monitoring, observability, logging, and incident response systems
  • Support SOC2, HIPAA, and enterprise-grade compliance requirements
  • Manage identity and access management solutions including Okta and Azure AD
  • Optimize deployment strategies with rollback and disaster recovery mechanisms
  • Collaborate with engineering teams to improve release velocity and platform performance
  • Maintain and modernize legacy Jenkins pipelines where required
  • Oversee security integrations, SIEM monitoring, and secure access frameworks
  • Enable AI-assisted DevOps automation and operational efficiency improvements
  • Support enterprise integrations, ETL systems, databases, and backend services
  • Mentor DevOps engineers and establish operational best practices

Job Requirements

  • 8+ years of experience in DevOps roles
  • Strong hands-on expertise in AWS cloud infrastructure and DevOps practices
  • Proven experience with Kubernetes, Docker, Terraform, and CI/CD automation
  • Deep understanding of security, compliance, and enterprise infrastructure standards
  • Experience handling large-scale, high-traffic production environments
  • Strong knowledge of observability, monitoring, logging, and incident management
  • Familiarity with identity management platforms such as Okta and Azure AD
  • Expertise in automation-first and AI-driven DevOps environments
  • Strong troubleshooting, analytical, and system optimization skills
  • Ability to work in fast-paced, globally distributed engineering teams
  • Strong leadership, ownership, and cross-functional collaboration abilities

More DevOps Engineer Jobs

Full Time | Remote | Team 51-200 | H1B Sponsor

  • Lead installation, commissioning, startup, and validation activities for modular data center deployments across domestic and international environments
  • Serve as the senior technical field resource during deployment execution and operational readiness activities
  • Independently execute and oversee deployment plans, commissioning procedures, and infrastructure validation processes
  • Conduct site assessments and provide technical leadership for field execution, issue resolution, and operational acceptance
  • Drive deployment quality, operational consistency, and adherence to Armada engineering and safety standards
  • Lead troubleshooting and root cause analysis efforts across electrical, mechanical, controls/BAS, networking, and monitoring systems
  • Interpret and apply electrical schematics, mechanical drawings, control diagrams, and infrastructure documentation to resolve complex operational issues
  • Make independent technical decisions during commissioning, startup, and live operational events
  • Drive corrective action implementation and long-term reliability improvements based on field findings
  • Escalate and coordinate resolution of critical infrastructure issues impacting deployment timelines or operational readiness
  • Lead BMS, EPMS, and DCIM integration, validation, and operational readiness testing activities
  • Define and execute infrastructure validation procedures to ensure systems meet performance, safety, and operational standards prior to customer turnover
  • Support incident response, infrastructure recovery efforts, and operational continuity initiatives
  • Provide technical oversight during system startup, monitoring validation, alarm testing, and infrastructure acceptance activities
  • Partner directly with Engineering, Manufacturing, Supply Chain, Deployment Leadership, and Customer Operations teams to drive successful deployment execution
  • Coordinate technical activities with vendors, subcontractors, and field service partners
  • Provide field-driven operational insights that influence infrastructure design, deployment methodologies, and product reliability improvements
  • Act as a trusted technical advisor during deployment planning, execution, and operational review activities

United States
$137.0K - $171.3K / year

Role Description

  - Support customers in adopting application and infrastructure modernization solutions on AWS
  - Develop solutions, templates, and tools for the AWS platform
  - Design, validate, and evolve scalable, secure, and efficient cloud architectures, both for new products and for the evolution of existing solutions
  - Provision, configure, and manage infrastructure on AWS
  - Define and implement automation pipelines using Infrastructure as Code (IaC) with Terraform, CloudFormation, and AWS CDK
  - Configure and maintain CI/CD pipelines for deploying agents, applications, and tools
  - Ensure adherence to the best practices of the AWS Well-Architected Framework
  - Gather requirements together with stakeholders
  - Share technical knowledge and define standards and best practices for cloud architecture, infrastructure, and governance

Qualifications

  - AWS Solutions Architect - Professional certification, demonstrating knowledge of AWS Well-Architected
  - Solid experience with AWS
  - Consistent hands-on experience with AWS CDK (Python), AWS CloudFormation, and Terraform
  - Experience in requirements gathering, architectural design, provisioning, configuration, and development of AWS resources
  - Experience structuring CI/CD pipelines and working with containers, Git, DevOps best practices, and security standards

Requirements (nice to have)

  - Experience or work with Artificial Intelligence (AI) and MLOps
  - Knowledge of Analytics

Brazil
Full Time | Remote | Team 11-50 | H1B No Sponsor

  • Design, build, and manage AWS infrastructure using Terraform, with a focus on reusable modules and standardisation
  • Operate and optimise AWS services including ECS, EC2, Lambda, SQS/DLQ, CloudWatch, and IAM
  • Develop and improve CI/CD pipelines (GitHub Actions, CodeDeploy) for consistent, reliable deployments
  • Build and enhance observability frameworks (logging, monitoring, alerting) across distributed systems
  • Implement and manage identity and access controls, including SSO and access brokering
  • Collaborate with Security on platform hardening and integration with security tooling (e.g. SIEM, DLP)
  • Contribute to platform engineering initiatives
  • Drive cost optimisation efforts across AWS (rightsizing, reserved capacity, scaling strategies, and cost visibility)
  • Troubleshoot production issues, perform root cause analysis, and implement long-term fixes
  • Continuously improve infrastructure through automation, documentation, and best practices
  • Work closely with the Engineering team to design, deploy, and harden containerisation and deployment, and to keep them consistently secure
  • Work with Compliance teams on PCI DSS, ISO 27001, and SOC 2 standards to ensure the infrastructure is compliant

United Kingdom
£68K / year

DevOps Engineer

Lakeside Software

Lakeside Software helps IT teams monitor and optimize environments by focusing on the quantified end-user experience.

DevOps Engineer | 9 days ago
Full Time | Remote | Team 201-500 | H1B No Sponsor

Role Description

We are seeking a driven and technically skilled DevOps Engineer with strong Microsoft Azure experience to support, troubleshoot, and improve cloud infrastructure, CI/CD pipelines, automation, monitoring, and operational reliability across production environments. This role is highly operational and troubleshooting-focused, requiring someone who is comfortable diagnosing production issues, responding to alerts and outages, managing escalated support tickets, and serving as a key escalation point for infrastructure and application support. The ideal candidate enjoys problem solving, identifying root causes, stabilizing environments, and partnering cross-functionally to resolve complex operational issues quickly and effectively. This position operates within Agile/Scrum environments while balancing real-time operational support priorities.

Responsibilities

  - Build, deploy, maintain, and troubleshoot scalable Azure cloud infrastructure
  - Develop and maintain Infrastructure as Code (IaC) solutions
  - Create, manage, troubleshoot, and improve CI/CD pipelines and deployment automation
  - Monitor production systems and actively respond to operational alerts, incidents, outages, and performance degradation
  - Own and manage escalated support tickets and serve as a technical escalation point for operational issues
  - Investigate and troubleshoot infrastructure, deployment, networking, database, and application-related problems
  - Perform root cause analysis and implement corrective actions to improve long-term system stability
  - Support highly available environments aligned with SLA/SLO objectives
  - Participate in on-call rotations and support critical production incidents as needed
  - Perform application maintenance, patching, upgrades, and environment support activities
  - Collaborate with development, security, infrastructure, and support teams to resolve operational issues quickly
  - Work within Agile/Scrum processes while also handling ad hoc operational and troubleshooting priorities
  - Implement operational best practices for reliability, security, monitoring, and performance optimization
  - Maintain operational documentation, deployment standards, troubleshooting guides, and support procedures

Qualifications

  - 5+ years of experience working in technology, infrastructure, cloud engineering, DevOps, or IT operations roles
  - 3+ years of hands-on experience with Microsoft Azure cloud services
  - Experience supporting and troubleshooting production environments with SLA/SLO requirements
  - Strong experience responding to operational alerts, incidents, outages, escalations, and infrastructure troubleshooting activities
  - Experience diagnosing and resolving deployment, networking, application connectivity, and system performance issues
  - Experience working in fast-paced Agile/Scrum and ad hoc operational support environments
  - Experience acting as a ticket owner or escalation resource for infrastructure and application-related support cases
  - 3+ years of Infrastructure as Code (IaC) experience using Terraform preferred; ARM templates and/or Bicep acceptable
  - 2+ years of experience working with SQL databases and Active Directory environments
  - Experience designing, managing, and troubleshooting CI/CD pipelines using GitHub Actions, Bitbucket Pipelines, and/or Azure DevOps
  - Strong experience with Git-based version control systems, primarily GitHub
  - Experience with automation and scripting using PowerShell, Bash, or Python
  - Hands-on experience with monitoring and observability platforms such as Azure Monitor, Grafana, Uptrends, and Application Insights
  - Experience troubleshooting Azure networking components including VNets, NSGs, Private Endpoints, peering, load balancing, and application connectivity
  - Understanding of cloud security, operational reliability, and infrastructure best practices

Preferred Qualifications

  - Microsoft Certified: Azure Administrator Associate (AZ-104)
  - Experience with containerization and orchestration technologies such as Docker or Kubernetes
  - Experience supporting or integrating AI/ML-related Azure services such as Azure OpenAI, Azure AI Foundry, or Azure AI Search
  - Familiarity with GitOps or platform engineering concepts
  - Strong troubleshooting, analytical thinking, and root cause analysis skills
  - Strong communication and cross-team collaboration skills

Benefits

  - 20 Days Annual Leave
  - 45 Days Annual Leave Maximum
  - 4 Festival Days Named
  - 8 Festival Days Select
  - 12 Days Sick Leave
  - 100% Paid Medical Insurance & GPA
  - Wellness Programme
  - 3x CTC Group Life Insurance
  - Pension
  - Employee Referral Scheme

India