Everything starts with a smile :)
Senior DevOps Engineer, AWS Platform
Location: United Kingdom
Posted: 5 days ago
Salary: £68K / year
Seniority: Senior
Job Description
Enigmatic Smile
• Design, build, and manage AWS infrastructure using Terraform, with a focus on reusable modules and standardisation
• Operate and optimise AWS services including ECS, EC2, Lambda, SQS/DLQ, CloudWatch, and IAM
• Develop and improve CI/CD pipelines (GitHub Actions, CodeDeploy) for consistent, reliable deployments
• Build and enhance observability frameworks (logging, monitoring, alerting) across distributed systems
• Implement and manage identity and access controls, including SSO and access brokering
• Collaborate with Security on platform hardening and integration with security tooling (e.g. SIEM, DLP)
• Contribute to platform engineering initiatives
• Drive cost optimisation efforts across AWS (rightsizing, reserved capacity, scaling strategies, and cost visibility)
• Troubleshoot production issues, perform root cause analysis, and implement long-term fixes
• Continuously improve infrastructure through automation, documentation, and best practices
• Work closely with the Engineering team to design, deploy, and harden containerisation and deployment, and keep them consistently secure
• Work with Compliance teams on PCI DSS, ISO 27001, and SOC 2 standards to ensure the infrastructure is compliant
Job Requirements
- Strong experience as a Senior DevOps / Platform Engineer in AWS environments
- Deep hands-on expertise with AWS services (ECS, EC2, Lambda, IAM, CloudWatch, SQS, etc.)
- Strong knowledge of AWS networking (VPC design, routing, security groups, NACLs, private/public architectures)
- Proven experience with Terraform, including building reusable modules
- Openness to and experience with infrastructure-as-code approaches, with Terraform preferred but alternative IaC tools considered
- Solid understanding of AWS Well-Architected Framework and cloud best practices
- Experience designing and operating multi-region architectures
- Strong CI/CD experience (GitHub Actions, CodeDeploy or similar)
- Experience with identity and access management, including SSO
- Strong Linux and containerisation knowledge (Docker, ECS)
- Experience building and maintaining observability and monitoring systems
Benefits
- Competitive salary: £68,000+ DOE
- Flexible working hours and remote-first setup
- Work-from-abroad flexibility
- 28 days holiday + bank holidays
- Private medical insurance (including dependants)
- Pension contributions
- Training, certifications, and professional development support
More DevOps Engineer Jobs
DevOps Engineer
Lakeside Software
Lakeside Software helps IT teams monitor and optimize environments by focusing on the quantified end-user experience.
Role Description
We are seeking a driven and technically skilled DevOps Engineer with strong Microsoft Azure experience to support, troubleshoot, and improve cloud infrastructure, CI/CD pipelines, automation, monitoring, and operational reliability across production environments. This role is highly operational and troubleshooting-focused, requiring someone who is comfortable diagnosing production issues, responding to alerts and outages, managing escalated support tickets, and serving as a key escalation point for infrastructure and application support. The ideal candidate enjoys problem solving, identifying root causes, stabilizing environments, and partnering cross-functionally to resolve complex operational issues quickly and effectively. This position operates within Agile/Scrum environments while balancing real-time operational support priorities.
Responsibilities
- Build, deploy, maintain, and troubleshoot scalable Azure cloud infrastructure
- Develop and maintain Infrastructure as Code (IaC) solutions
- Create, manage, troubleshoot, and improve CI/CD pipelines and deployment automation
- Monitor production systems and actively respond to operational alerts, incidents, outages, and performance degradation
- Own and manage escalated support tickets and serve as a technical escalation point for operational issues
- Investigate and troubleshoot infrastructure, deployment, networking, database, and application-related problems
- Perform root cause analysis and implement corrective actions to improve long-term system stability
- Support highly available environments aligned with SLA/SLO objectives
- Participate in on-call rotations and support critical production incidents as needed
- Perform application maintenance, patching, upgrades, and environment support activities
- Collaborate with development, security, infrastructure, and support teams to resolve operational issues quickly
- Work within Agile/Scrum processes while also handling ad hoc operational and troubleshooting priorities
- Implement operational best practices for reliability, security, monitoring, and performance optimization
- Maintain operational documentation, deployment standards, troubleshooting guides, and support procedures
Qualifications
- 5+ years of experience working in technology, infrastructure, cloud engineering, DevOps, or IT operations roles
- 3+ years of hands-on experience with Microsoft Azure cloud services
- Experience supporting and troubleshooting production environments with SLA/SLO requirements
- Strong experience responding to operational alerts, incidents, outages, escalations, and infrastructure troubleshooting activities
- Experience diagnosing and resolving deployment, networking, application connectivity, and system performance issues
- Experience working in fast-paced Agile/Scrum and ad hoc operational support environments
- Experience acting as a ticket owner or escalation resource for infrastructure and application-related support cases
- 3+ years of Infrastructure as Code (IaC) experience using Terraform preferred; ARM templates and/or Bicep acceptable
- 2+ years of experience working with SQL databases and Active Directory environments
- Experience designing, managing, and troubleshooting CI/CD pipelines using GitHub Actions, Bitbucket Pipelines, and/or Azure DevOps
- Strong experience with Git-based version control systems, primarily GitHub
- Experience with automation and scripting using PowerShell, Bash, or Python
- Hands-on experience with monitoring and observability platforms such as Azure Monitor, Grafana, Uptrends, and Application Insights
- Experience troubleshooting Azure networking components including VNets, NSGs, Private Endpoints, peering, load balancing, and application connectivity
- Understanding of cloud security, operational reliability, and infrastructure best practices
Preferred Qualifications
- Microsoft Certified: Azure Administrator Associate (AZ-104)
- Experience with containerization and orchestration technologies such as Docker or Kubernetes
- Experience supporting or integrating AI/ML-related Azure services such as Azure OpenAI, Azure AI Foundry, or Azure AI Search
- Familiarity with GitOps or platform engineering concepts
- Strong troubleshooting, analytical thinking, and root cause analysis skills
- Strong communication and cross-team collaboration skills
Benefits
- 20 Days Annual Leave
- 45 Days Annual Leave Maximum
- 4 Festival Days Named
- 8 Festival Days Select
- 12 Days Sick Leave
- 100% Paid Medical Insurance & GPA
- Wellness Programme
- 3x CTC Group Life Insurance
- Pension
- Employee Referral Scheme
Senior Platform Engineer – Kubernetes, Middleware, DevOps
Trace3
We Believe All Possibilities Live in Technology
• Support day-to-day operations of the following CU Boulder systems:
• Maintain, monitor, patch, and upgrade application and middleware environments
• Manage and support logging, monitoring, and reporting platforms
• Troubleshoot and resolve complex issues; perform root cause analysis
• Support system implementations, upgrades, and production rollouts
• Collaborate with cross-functional teams and stakeholders
• Document system configurations, processes, and operational procedures
• Evaluate and recommend tools to improve performance, reliability, and cost efficiency
• Coordinate and participate in maintenance windows, including after-hours activities
DevOps Engineer - GCCA Remote
TransUnion
Founded in 1968, TransUnion is a credit information management services provider for consumers, businesses, and the global credit community. An equal opportunity employer recognize
What We'll Bring:
We Are TransUnion: TransUnion is a major credit reference agency, and we offer specialist services in fraud, identity and risk management, automated decisioning and demographics. We support organisations across a variety of sectors including finance, retail, telecommunications, utilities, gaming, government and insurance.
What You'll Bring:
Technical Expertise:
- Strong Linux systems administration experience, including firewalls and hardening.
- Expertise in Docker and container orchestration.
- Proficiency with Infrastructure as Code (IaC) tools, particularly Terraform.
- Experience with network design, administration, and troubleshooting.
- Knowledge of programming languages (e.g., JavaScript, Node.js, PHP).
- Experience with version control systems, ideally Git.
- Web server configuration (Apache, Nginx + nice to have: MSSQL Server).
- Database management (MySQL, MongoDB), including high availability and backup solutions.
- Hands-on experience managing cloud providers, with significant experience in AWS and Google Cloud Platform (GCP).
- Familiarity with GCP services such as Compute Engine, Kubernetes Engine (GKE), Cloud Storage, BigQuery, and IAM.
- Familiarity with configuration management and IT automation tools.
- Strong understanding of DevOps and SRE principles.
Impact You'll Make:
Infrastructure & Operations:
- Participate in the design, implementation, and maintenance of our infrastructure, ensuring reliability, scalability, and security.
- Support, monitor, and enhance the live infrastructure and platform solutions, ensuring high availability and performance.
- Help plan and execute the integration of our current infrastructure into TransUnion's group-wide cloud platform while minimising disruptions.
- Participate in the migration of infrastructure from AWS to Google Cloud Platform (GCP), ensuring a smooth transition and leveraging GCP services effectively.
DevOps & Security:
- Maintain robust CI/CD pipelines, collaborating closely with development teams to streamline deployment processes.
- Maintain and enhance our security posture, ensuring compliance with industry standards and frameworks (e.g., SOC 2, ISO 27001).
- Diagnose and resolve infrastructure outages and incidents, ensuring timely resolution and root cause analysis.
Documentation & Best Practices:
- Ensure comprehensive documentation of infrastructure, systems, and processes to support onboarding, troubleshooting, and scalability.
- Promote and implement DevOps and Site Reliability Engineering (SRE) best practices across the organisation.
For positions based in South Africa, preference will be given to suitably qualified candidates from designated groups in line with the company's Employment Equity plan and targets. Should you not have heard from TransUnion within 3 weeks of applying, please regard your application as unsuccessful.
Please note it is a requirement of the Global Capability Centre Africa that you reside in a home that is fibre ready and has space for you to work comfortably and confidentially on a day-to-day basis for the purpose of your proposed employment. You can be based anywhere in South Africa that has fibre, but will not be able to work in a location outside of South Africa. A minimum of a 100/100 Meg fibre line is required; should you be successful, you will need to upgrade your line or install fibre for day one.
Please note that being a credit bureau, some positions require a clear credit record.
At TransUnion, we encourage and are committed to creating a real, positive impact and shared sense of purpose within our Workforce for Good, which empowers our people to grow, innovate and contribute to a better future for our communities and customers. We strive to build an environment where our associates are in the driver's seat of their professional development, while having access to help along the way. We recognize that success comes when our associates thrive both professionally and personally; that's why we prioritize work/life flexibility and offer resources for our teams across the globe to collaborate and drive excellence. Be a part of our Workforce for Good: you'll work with great people, pioneering products and cutting-edge technology.
At GCC Africa, you'll join a purpose-driven organisation that invests deeply in its people through competitive rewards, comprehensive benefits, and meaningful career growth. We offer flexible, permanent work-from-home arrangements, strong wellbeing and support programmes for you and your family, and access to global exposure, continuous learning, and accredited development opportunities. Our inclusive culture, focus on recognition, and commitment to work-life balance ensure you can grow your career while thriving personally.
This is a hybrid position and involves regular performance of job responsibilities virtually as well as in-person at an assigned TU office location for a minimum of two days a week.
TransUnion Job Title: Sr Engineer, Development Ops
• Maintain and optimize existing monitoring and automation solutions
• Collaborate with stakeholders to gather requirements
• Define monitoring strategies and engineer solutions
• Design and implement cloud automation and orchestration workflows
• Develop and maintain integrations with RESTful APIs
• Create and maintain technical documentation
• Continuously analyze and improve monitoring KPIs and incident response processes




