Job Closed
This listing is no longer active.
If you’re a digital change maker, we’re your team.
Senior Software Engineer – Site Reliability
Location
India
Posted
69 days ago
Seniority
Senior
Job Description
Axelerant
• Architect and implement highly reliable, scalable, and cost-effective infrastructure solutions for mission-critical applications across multi-cloud environments (AWS and Azure).
• Lead the definition and refinement of service level objectives (SLOs), service level indicators (SLIs), and error budgets, establishing reliability standards across the organization.
• Design and implement sophisticated Infrastructure as Code (IaC) solutions using Terraform, Ansible, and Azure Resource Manager (ARM) templates or Bicep.
• Drive automation strategies to eliminate toil, improve operational efficiency, and enable self-service capabilities for development teams.
• Lead incident response efforts, conduct thorough post-incident reviews, and implement systemic improvements to prevent recurrence.
• Champion cloud-native architectures and modern reliability practices, serving as a technical advisor for infrastructure and platform decisions.
• Mentor junior SREs and engineers, fostering a culture of reliability, observability, and continuous improvement.
• Participate in and help optimize the on-call rotation, ensuring sustainable practices and effective escalation procedures.
• Establish and maintain comprehensive documentation standards, runbooks, and knowledge repositories that enable team autonomy and effective incident response.
• Design and implement advanced monitoring, logging, and alerting strategies using observability platforms to enable proactive issue detection and resolution.
• Lead container orchestration initiatives using Kubernetes (AKS, EKS) and implement sophisticated deployment strategies including blue-green, canary, and progressive delivery patterns.
• Ensure security, compliance, and governance standards are embedded throughout the infrastructure lifecycle, implementing security-as-code practices.
• Drive capacity planning, performance optimization, and cost management initiatives across cloud platforms.
• Collaborate with architecture and security teams to establish platform standards, reference architectures, and best practices.
Job Requirements
- 5+ years of proven experience as a Site Reliability Engineer or similar role, with demonstrated expertise in designing, implementing, and operating large-scale, distributed systems.
- Deep expertise in Infrastructure as Code (IaC) with Terraform and Ansible, including module development, state management, and multi-environment orchestration.
- Extensive hands-on experience with both AWS and Azure cloud platforms, including advanced services, networking, and security features in both environments.
- Expert-level knowledge of container orchestration with Kubernetes, including architecture, custom resource definitions (CRDs), operators, service mesh implementations, and production-scale cluster management.
- Advanced proficiency in Linux system administration, performance tuning, and troubleshooting complex system-level issues.
- Proven experience implementing GitOps workflows using ArgoCD, Flux, or similar tools, including advanced deployment patterns and progressive delivery.
- Deep understanding of observability principles and hands-on experience with tools such as Prometheus, Grafana, Datadog, Azure Monitor, or the ELK stack.
- Expert knowledge of networking concepts, including load balancing, CDNs, DNS, VPNs, service mesh architectures, and distributed systems communication patterns.
- Strong programming and scripting capabilities in Python, Bash, Go, or PowerShell, with the ability to develop custom tooling and automation frameworks.
- Extensive experience designing and optimizing CI/CD pipelines using Jenkins, GitLab CI, Azure DevOps, GitHub Actions, or CircleCI.
- Demonstrated ability to lead incident response, conduct root cause analysis, and drive systemic reliability improvements.
- Excellent communication and leadership skills with proven ability to influence technical decisions and collaborate with stakeholders at all levels.
- Current certification in AWS (Solutions Architect Associate/Professional or equivalent) and Azure (Azure Administrator or Azure Solutions Architect), with practical experience managing production workloads on both platforms.
Benefits
- Excellent work exposure - Some of our recent clients were the UN, the University of East London, and Doctors Without Borders.
- Meaningful projects to contribute back - Most of our projects are in the education, government, healthcare, and not-for-profit sectors. We also encourage and support team members for open-source contributions.
- Work-life flexibility and remote work - You decide when and where to work. This has allowed many team members, who couldn’t have held a regular job otherwise, to have thriving careers.
- Eight-hour workdays - We don't say 8 hours and expect 12 hours minimum.
- No micromanagement - Micromanagement makes us grunt like the Hulk, so nobody will be looking over your shoulder. But help is always available when you ask.
- No discrimination - We believe in equal pay for equal work. Personal decisions like planning to have children will not stop you from getting promoted.
- Championing inclusivity - We like diversity. It enriches our lives and products. If you see something wrong or that could be better on day 1, share through established channels to bring positive change. We listen.
- Meaningful time off - 52 weekends and 40 days per year of consolidated leave, plus maternity, paternity, adoption, and sabbatical allowances. We also have Kindness leaves for emergencies.
- Family Medical Insurance - You want your family’s health secured. So do we. We’ve got you, your spouse, and your little ones covered, along with free doctor, health, and wellness consultations with medical experts whenever you need them.
- Performance coaching - Our professional, empathetic coaches will help you become your best version through career and personal development.
- Event sponsorship - If your session at any event is selected and aligns with sponsorship guidelines, we cover all expenses for the trip, whether domestic or international.
- Continuing education allowance - We’ll cover up to 2% of your annual salary yearly for classes, certifications, or buying books to further your capabilities.
- Health and wellness allowance
- Generous home office set-up allowance
- Sponsored team meet-ups
- Co-working space allowance
- Event allowance
More DevOps Engineer Jobs
DevOps Engineer (Senior/Lead) ID55632
AgileEngine
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
WHY JOIN US
If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!
ABOUT THE ROLE
As a DevOps Engineer, you will drive the reliability and scalability of complex, multi-stack systems, supporting both legacy and modern cloud-native environments. Working with AWS, Kubernetes, Terraform, and CI/CD pipelines, you’ll automate infrastructure, enhance observability, and manage critical database operations. This role offers strong ownership, cross-team collaboration, and the opportunity to work as a versatile engineer shaping resilient, high-performance platforms.
WHAT YOU WILL DO
- Manage and automate infrastructure across multiple stacks including legacy and modern environments;
- Build and maintain scalable environments using Terraform in AWS;
- Implement monitoring strategies to improve infrastructure performance and reliability;
- Execute database migrations and manage database instances;
- Collaborate with team members to share knowledge and provide cross-product support;
- Take end-to-end ownership of projects from provisioning to maintenance;
- Create runbooks and technical documentation to support multiple products.
MUST HAVES
- 5+ years of experience in DevOps or Infrastructure Engineering;
- Advanced proficiency with Terraform;
- Strong experience with AWS including RDS and Lambda;
- Hands-on experience with EKS (Kubernetes) and Docker;
- Experience building and maintaining pipelines using GitLab CI or Jenkins;
- Experience with SQL management and database migrations;
- Strong scripting skills using Python or Bash;
- Experience with configuration management tools such as Ansible or Puppet;
- Ability to adapt quickly to changing scopes and technologies;
- Ability to take ownership of reliability and quality of work;
- Proactive communication and collaboration skills;
- Flexibility to work with evolving technical stacks;
- Upper-intermediate English level.
NICE TO HAVES
- Experience with Azure cloud environments;
- Experience with monitoring and observability tools such as Prometheus or Grafana;
- Experience building internal developer tooling or automation frameworks;
- Experience supporting large-scale, high-availability platform environments.
PERKS AND BENEFITS
- Professional growth: Mentorship, TechTalks, and personalized growth roadmaps.
- Competitive compensation: USD-based pay with education, fitness, and team activity budgets.
- Exciting projects: Modern solutions with Fortune 500 and top product companies.
- Flextime: Flexible schedule with remote and office options.
Company Description
Version 1 has celebrated 30 years in business and continues to be trusted by global brands to deliver technology and transformation solutions that drive customer success. Our deep expertise enables our customers to navigate the rapidly evolving technology landscape. We foster strong partnerships with global technology leaders including Microsoft, AWS, Oracle, Red Hat, OutSystems, Snowflake, ensuring that our customers are provided with the highest quality solutions and services.
We’re an award-winning employer reflecting how our employees are at the very heart of what we do:
- UK & Ireland's premier AWS, Microsoft & Oracle partner
- 3300+ strong, €350/£300m revenue business
- 10+ years as a Great Place to Work in Ireland & UK
- Best Workplace for Women in the UK & Ireland by GPTW
- Best Workplace for Wellbeing in the UK by GPTW
We’re a core values driven company, we hire people who share our values, and we reward those who display and foster them, it’s deeply embedded within our DNA. Invest in us and we’ll invest in you!
Job Description
Key Responsibilities
- Design, build, and maintain cloud-native applications using AWS serverless services (e.g., Lambda, API Gateway, DynamoDB, SQS, Cognito, CloudWatch).
- Develop infrastructure as code using AWS CDK or similar tools (e.g., Terraform, CloudFormation).
- Contribute to event-driven and microservices architectures.
- Collaborate with cross-functional teams, support DevOps processes (CI/CD pipelines, configuration management), and troubleshoot issues across the stack.
- Manage AWS platform to high standards while meeting all the KPIs
Qualifications
Essential Skills & Experience
- Strong hands-on experience with AWS services, especially Lambda, API Gateway, DynamoDB, SQS, Cognito, CloudWatch.
- Experience with infrastructure as code (AWS CDK, Terraform, or CloudFormation).
- Solid understanding of microservices and event-driven architectures.
- Familiarity with CI/CD tooling such as GitLab, GitHub or Jenkins
- Experience with test-driven development and relevant frameworks.
- Strong troubleshooting skills across the stack.
Additional experience for senior candidate:
- Experience in leading team and line management
- Experience with stakeholder management
- Experience with agile ways of working
- Strong communication skills
Optional skills:
- Development experience with Java or Python
- AWS solution architect, security specialist or other certifications
Additional Information
Why Version 1?
At Version 1, we believe in providing our employees with a comprehensive benefits package that prioritises their wellbeing, professional growth, and financial stability.
- Share in our success with our Quarterly Performance-Related Profit Share Scheme, where employees collectively benefit from a share of our company's profits
- Strong Career Progression & mentorship coaching through our Strength in Balance & Leadership schemes with a dedicated quarterly Pathways Career Development programme
- Flexible/remote working, Version 1 is tremendously understanding of life events and people’s individual circumstances and offer flexibility to help achieve a healthy work life balance
- Financial Wellbeing initiatives including; Pension, Private Healthcare Cover, Life Assurance, Financial advice and an Employee Discount scheme
- Employee Wellbeing schemes including Gym Discounts, Bike to Work, Fitness classes, Mindfulness Workshops, Employee Assistance Programme and much more. Generous holiday allowance, enhanced maternity/paternity leave, marriage/civil partnership leave and special leave policies
- Educational assistance, incentivised certifications, and accreditations, including AWS, Microsoft, Oracle, and Red Hat
- Reward schemes including Version 1’s Annual Excellence Awards & ‘Call-Out’ platform.
- Environment, Social and Community First initiatives allow you to get involved in local fundraising and development opportunities as part of fostering our diversity, inclusion and belonging schemes.
And many more exciting benefits… drop us a note to find out more.
Version 1 is an equal opportunities employer. We are committed to building a diverse, inclusive and respectful workplace where everyone feels valued and able to thrive. We welcome applications from people of all backgrounds, identities and lived experiences, and we value the different perspectives people bring.
We want every candidate to have a positive and accessible recruitment experience. If you need reasonable adjustments at any stage of the process, please contact [recruiter email address] at Version 1. We will consider all requests carefully, respectfully and confidentially.
- Department: Digital, Data and Cloud
AWS DevOps Engineer
Cloud Bridge
Harness the full potential of AWS with award-winning Premier Partner, Cloud Bridge
• Own and evolve AWS-based environments for a defined customer account
• Drive automation, CI/CD, and Infrastructure as Code adoption
• Improve security posture, reliability, and scalability
• Work closely with customer teams, developers, and internal specialists
• Provide advanced technical support for AWS infrastructure (EC2, S3, RDS, VPC, IAM, Lambda, etc.)
• Troubleshoot system, application, and network issues in line with SLAs
• Monitor platform performance and availability using CloudWatch and observability tooling
• Contribute to post-incident reviews and continuous improvement
• Design, implement, and maintain Infrastructure as Code using Terraform and/or AWS CloudFormation
• Build and manage CI/CD pipelines (e.g. GitHub Actions, GitOps workflows)
• Support and operate containerised environments using Docker and Kubernetes (EKS)
• Implement deployment strategies and improve release processes
• Automate operational tasks to improve efficiency and reduce manual intervention
• Manage AWS environments across multiple accounts
• Support modern AWS architectures including Landing Zones, multi-account strategies, and Transit Gateway
• Maintain and optimise services such as RDS/Aurora, backups, and performance tuning
• Collaborate with development teams to ensure scalable and resilient application deployments
• Embed security best practices across infrastructure and delivery pipelines
• Implement and manage security tooling (e.g. vulnerability scanning, container/image security)
• Apply principles such as least privilege, encryption, and secrets management
• Create and maintain runbooks, knowledge base articles, and technical documentation
• Identify opportunities for automation and service improvement
• Assist in setting up and maintaining AWS services such as EC2, S3, RDS, and VPC.
• Perform basic monitoring of cloud resources using tools like AWS CloudWatch.
• Support cloud backup and disaster recovery processes.
• Participate in initial incident response for issues in the AWS environment.
• Escalate complex issues to senior engineers.
• Help document cloud setups and configurations.
• Prepare reports on cloud resource utilization and incident logs.
• Assist in the installation, configuration, and maintenance of Linux servers.
• Monitor system performance and report any issues.
• Apply basic security patches and updates under supervision.
• Help manage user accounts and access rights on Linux systems.
• Support backup operations and participate in recovery drills.
• Assist in deploying and managing web applications on Linux servers.
• Perform basic monitoring of web application performance and uptime.
• Support the updating and patching of web applications.
• Work with development teams to understand application requirements and assist in deployment strategies.



