The AI Factory. Accelerating the Future.
Senior Infrastructure Engineer
Location
United Kingdom
Posted
44 days ago
Salary
0
Seniority
Senior
Job Description
Senior Infrastructure Engineer
NexGen Cloud
- Own the design, deployment, and operation of OpenStack and Kubernetes environments, ensuring platform performance, scalability, and resilience for GPU workloads
- Build and improve infrastructure using infrastructure-as-code and GitOps practices, driving automation across provisioning, deployment, and operational workflows
- Optimise GPU workload scheduling using Kubernetes and NVIDIA tooling, and implement monitoring, logging, and alerting to ensure platform stability
- Lead incident response and drive continuous improvement of reliability across the platform
- Maintain strong security controls across infrastructure and container layers: RBAC, network policies, and tenant isolation
- Work closely with Platform, DevOps, AI, Product, and Support teams to align infrastructure capabilities with customer and platform requirements
Job Requirements
- Strong hands-on experience running OpenStack in production environments
- Proven experience operating Kubernetes at scale — ideally bare-metal or private cloud
- Solid understanding of Linux, networking, and storage systems
- Experience with infrastructure automation, CI/CD, and Git-based workflows
- Strong ownership mindset — comfortable operating without heavy oversight and able to simplify and scale systems in a fast-moving environment
Benefits
- Competitive salary and annual discretionary bonus scheme
- Employee wellbeing benefits
- 25 days of holiday, plus public holidays
- Flexible working arrangements (remote or hybrid, depending on role and location)
- Real ownership and autonomy, with the trust to take initiative and experiment
- The opportunity to make a visible, meaningful impact as we scale
- Clear career progression and growth opportunities in a fast-growing company
- A collaborative, international culture built on trust, transparency, and ownership
- The chance to help shape NexGen Cloud's team, culture, and future alongside ambitious, mission-driven colleagues
More Infrastructure Engineer Jobs
- Own and drive the design, deployment, and operation of OpenStack and Kubernetes clusters optimised for GPU workloads
- Lead and develop a team of 4–5 infrastructure engineers, setting clear direction and standards
- Build and improve infrastructure through automation: IaC, GitOps, and CI/CD pipelines
- Ensure platform reliability through strong monitoring, observability, and incident management practices
- Collaborate closely with DevOps, Product, and Support teams to align infrastructure with real-world customer needs
- Take ownership of operational governance including incident, problem, and change management
- Identify opportunities to simplify, standardise, and scale systems as the platform grows
- Communicate clearly with leadership on platform performance, risks, and improvements
The Opportunity
Henosia has product-market fit. People are building real apps with us. They're automating work. They're integrating systems and connecting data. We've proven that vibe coding works, not just as a toy, but for real businesses, in production. You can describe software in plain text and get production-ready code.
Here's the challenge: our current infrastructure doesn't scale to what's coming. We need to go from supporting hundreds of concurrent sandboxes to tens of thousands. And we need to do it without our cloud bills spiralling out of control. Fully isolated development environments that start in a second.
That's where you come in. You'll own the entire cloud infrastructure. You'll architect how we scale our sandbox environment using micro VMs. You'll build the systems that keep users isolated from each other. You'll make sure we can handle 100x growth without everything catching fire.
There's no infrastructure team. No senior engineer to guide you. It's you, the founders, and the product engineers. You'll make the calls on what tech to use, how to architect it, and when to ship.
If you get this right, you'll have built the foundation that lets millions of people build software on Henosia. No pressure.
What You'll Do
You'll own infrastructure. Specifically:
- Build and scale our Micro VM-based sandbox infrastructure from the ground up
- Design and build resilient systems for rolling out new releases
- Design isolation and security systems so users can run untrusted code safely
- Architect for massive scale: think 10x, 100x, 1000x current load
- Optimize costs: every Micro VM costs money, every second matters
- Work with product engineers to make sure infra integrates smoothly
- Monitor, debug, and fix infrastructure issues before users notice them
- Make hard calls on technical tradeoffs
- Ship fast: this is infrastructure, but it's still a startup
Who You Are
You've been building infrastructure for 5-10+ years. You know how to scale systems. You've dealt with the pain of production outages at 3am and learned from them.
You understand Micro VMs, Linux networking, and containerization deeply. Maybe you've worked with Cloud-hypervisor, Firecracker, gVisor, or similar tech.
You know or want to learn TypeScript and Go. TypeScript is our main language, but we're open to adopting Go for key infrastructure components. Ideally, you know both languages, or you're willing to learn them.
You don't need someone to tell you what to build. You see a scaling problem, you architect a solution. You see a security hole, you fix it. You're comfortable making big technical decisions with incomplete information.
You're paranoid about security. Not in a theoretical way, but in a "users will run potentially malicious code in our sandbox, and I need to make sure they can't escape" way.
You're fine with chaos. Infrastructure at early-stage startups is held together with duct tape and prayers. You know how to move fast without breaking everything.
Requirements
Must have:
- 5-10+ years building and scaling cloud infrastructure (you've been on-call)
- Deep experience with Linux containerization and Linux VMs
- Track record of architecting systems for scale
- Experience with one or more cloud platforms (AWS, GCP, Azure, Hetzner)
- Self-starter mindset: you figure things out
Nice to have:
- Experience with Cloud-hypervisor, Firecracker, or similar Micro VM technologies
- Experience managing a fleet of bare metal servers at scale, e.g. Hetzner Robot
- Advanced level experience with Linux networking, namespaces, file systems, and memory management
- TypeScript and Go knowledge
- Worked at an early-stage startup before
- Built coding sandbox systems
- Dealt with security in multi-tenant environments
What We Offer
- Base salary is 70-80K DKK/month (depends on experience)
- ESOP. For the right profile, real equity in a fast-growing startup
- Remote friendly. Denmark-based is a plus but not required
- Own the infrastructure. Founding role means you decide how we scale
- Work with experienced founders. Jim and Janne have been building products for 20 years
- Solve real problems. This isn't maintaining someone else's infra: you're building it from scratch
AWS Cloud Infrastructure Engineer
Leidos
Leidos is an innovation company rapidly addressing the world's most vexing challenges in national security and health.
Leidos was awarded the U.S. Air Force Cloud One Architecture and Common Shared Services contract, and currently has an opening for Cloud Engineers across AWS, Azure, Google, and Oracle clouds. This is an exciting opportunity to use your experience to modernize a leading, global-scale multi-cloud environment in support of a critical mission, supporting USAF system resiliency, security, and cost effectiveness.
Location: This position will be remote. Preferred candidates will be located near Hanscom AFB (Boston, MA) or work in Huntsville, AL.
Primary Responsibilities:
We are seeking an AWS Cloud Operations and Support Engineer with expertise in multiple cloud platforms. A successful individual will be responsible for developing scalable cloud-native solutions and ensuring best practices across architecture, development, deployment, and security, from design, test, and integration through production, sustainment, and maintenance. This is a hands-on technical role that requires rolling up your sleeves to architect, code, debug, and mentor.
- Perform cloud operations and engineering tasks to enhance, sustain, and maintain scalable, resilient, and secure cloud solutions for the AWS cloud environment
- Perform AWS cloud operations, sustainment, and maintenance activities to maintain optimum cloud
- Adopt and utilize DevSecOps practices, infrastructure as code, and automation frameworks
- Through development and sustainment activities, optimize application performance and reliability in cloud environments
- Design, implement, and sustain secure cloud architectures and networks implementing zero-trust principles and defense-in-depth strategies
- Maintain compliance with industry standards (SOC 2, HIPAA, PCI-DSS, etc.) and regulatory requirements
- Architect, implement, and maintain cloud networking security controls including STIG requirements
- Implement identity and access management solutions and security monitoring frameworks
- Support development of migration methodologies and ensure minimal organizational disruption during transitions
- Utilize CI/CD workflows and infrastructure-as-code development using Jenkins, Terraform, Ansible, Kubernetes, Jira, Confluence, Artifactory, and Guacamole to support DevSecOps practices
- Containerize applications to enhance scalability and deployment efficiency
- Support the design and development of Shared Services
- Configure and troubleshoot cloud, virtual, and physical hardware and software systems
- Establish and maintain SQL and NoSQL databases, ensuring their performance and reliability
- Support preparation of detailed technical documentation of development and operational processes
- Work in cross-functional teams including development, operations, security, and product management
Minimum Qualifications
- Bachelor's degree and 4+ years of experience, or Master's degree and 2+ years of experience. Additional experience may be accepted in lieu of degree.
- Secret clearance required
- US citizenship required
- Certifications: CompTIA Security+ or equivalent (IAT-2)
- Practiced verbal and written communication skills
- Ability to participate in team efforts to accomplish assigned tasks
- Demonstrated experience in cloud operations and sustainment and in performing the tasks and actions described in the primary responsibilities section
Preferred Qualifications
- Experience with USAF Cloud One or Platform 1
- Knowledge of Zero Trust Architecture. Experience a plus.
- Capable of working in high-powered teams and maintaining positive interpersonal relationships while delivering products and services to the customer
- Understanding of Active Directory, AWS AD, and SAML, and of the standards, procedures, and processes
- Experience with Ansible, AWS console, Elastic, AWS, Jira, Confluence, Git, Bitbucket and various cloud Software as a Service (SaaS) offerings to conduct DEV/SEC/OPS pipeline development activities
- Administration experience with cloud-based applications (MS O365, SharePoint, AWS AD, AWS)
- Experience administering Windows Server and related services
- Cloud certifications in AWS, Azure, Google, or Oracle clouds
- Certification examples: AWS Certified Solutions Architect (Professional), Azure Solutions Architect (Expert), MCSE (Server), Certified AWS SysAdmin, AWS Certified Cloud Practitioner, AWS Certified Developer, AWS Certified Solutions Architect (Dev/Associate), AWS Certified DevOps Engineer, AWS Certified Advanced Networking, AWS Certified Security, Azure Developer Associate, Azure Solutions Architec
If you're looking for comfort, keep scrolling. At Leidos, we outthink, outbuild, and outpace the status quo, because the mission demands it. We're not hiring followers. We're recruiting the ones who disrupt, provoke, and refuse to fail. Step 10 is ancient history. We're already at step 30, and moving faster than anyone else dares.
Original Posting: April 13, 2026
For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.
Pay Range: The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.


