The #1 Platform for Intelligent Banking Interactions
Senior Infrastructure Engineer, Platform
Location
Portugal
Posted
72 days ago
Salary
0
Seniority
Senior
Job Description
Glia
- Maintaining and updating Glia’s core infrastructure
- Troubleshooting and resolving infrastructure-related issues
- Improving our security posture
- Implementing and managing infrastructure pipelines
- Implementing support for telemetry
- Consulting other teams on infrastructure-related topics
- Working with third-party vendors and service providers
Job Requirements
- Experience in building and maintaining reliable and highly available systems
- Experience in working with cloud infrastructure and running containerized applications
- At home with Linux/Unix tools and ecosystem
- Infrastructure-as-Code enthusiast
- Experience in coding that extends beyond scripting
- Proficiency in written and spoken English.
Benefits
- Flexible work arrangements
- Professional development opportunities
More Infrastructure Engineer Jobs
The Role
As an Infrastructure Engineer, you'll build and deploy massive computational infrastructure that positions Boundless as the leading decentralized proving network. You'll architect GPU clusters at unprecedented scale, orchestrate proving across every major blockchain, and manage the complex systems that power billions of cycles of ZK proofs daily. This role demands expertise in both bare-metal optimization and cloud-native architectures.
What You'll Do
- Build Massive Proving Clusters: Design and deploy proving infrastructure with thousands of GPUs across both on-premises data centers and cloud services (AWS, GCP, Azure)
- Orchestrate Multi-Chain Proving: Build infrastructure that coordinates proving workloads across every major blockchain, ensuring optimal resource allocation and throughput
- Optimize Container Topology: Design and refine the topology of complex containerized services, maximizing efficiency and minimizing latency in proof generation
- Bare Metal Engineering: Work at the hardware level, optimizing GPU performance, managing CUDA installations, and tuning kernel parameters for maximum throughput
- Cloud Infrastructure: Architect highly available, auto-scaling cloud infrastructure that can dynamically respond to proving demand across multiple regions
- Release Management: Manage deployment pipelines and release schedules for complex distributed software, ensuring zero-downtime upgrades
- Performance Monitoring: Build comprehensive monitoring and alerting systems to track GPU utilization, proof generation metrics, and system health
- Cost Optimization: Implement strategies to minimize infrastructure costs while maintaining performance, including spot instance management and resource scheduling
Role Description
Our client is a venture-backed financial technology firm dedicated to transforming the global movement of money through stablecoin infrastructure. They are currently seeking a Senior Infrastructure Engineer to design and build an internal platform that empowers product teams to deploy software with confidence, reliability, and speed.
- Platform Architecture: Own and evolve core platform components, including a TypeScript-based Pulumi codebase and Kubernetes-based runtime environments.
- Engineering Standards: Enforce high engineering standards through code, architecting scalable systems that prioritize reliability and security.
- Developer Experience: Improve developer productivity by building internal development platforms focused on self-service and "golden paths."
- System Observability: Design and maintain the monitoring stack, defining SLIs/SLOs, error budgets, and operational dashboards.
- Infrastructure as Code: Build reusable systems and infrastructure primitives rather than one-off scripts to ensure a scalable and maintainable environment.
- Operational Excellence: Participate in the maintenance and operations of production-grade systems, including on-call rotations and incident response tooling.
Qualifications
- 5+ years of software engineering experience, with a significant focus on infrastructure and cloud domains.
- Strong programming skills in TypeScript or another strictly typed language.
- Deep understanding of AWS architecture and proven experience designing, operating, and scaling Kubernetes workloads.
- Strong grasp of distributed systems fundamentals, including availability, consistency, and fault tolerance.
- Familiarity with GitOps patterns, deployment automation, and Infrastructure as Code (IaC) systems.
- Ability to operate in short feedback loops and a desire to build foundational infrastructure for the future of digital finance.
- Ability to provide significant overlap with Eastern Time business hours.
Requirements
- Opportunity to build foundational infrastructure for a next-generation financial fabric.
- Work alongside a small, mission-driven group of builders from high-performance finance and crypto backgrounds.
- Engagement with a fast-moving, well-funded startup during a period of high growth.
- Commitment to fostering a diverse and equitable workplace regardless of race, religion, gender identity, or veteran status.
Interview Process
- Recruiter / HR Initial Screening: Candidates will be asked to respond to a set of questions via video recording.
- Hiring Manager Interview: A technical and background discussion with the Head of Engineering.
- Technical Interview I: Deep dive into engineering capabilities and systems design.
- Technical Interview II: Further assessment of infrastructure expertise and coding proficiency.
- Final Interview: Comprehensive review and cultural alignment.
Commitment to Equality and Accessibility
At MLabs, we are committed to offering equal opportunities to all candidates. We ensure that there is no discrimination, that job adverts are accessible, and that information is provided in accessible formats. Our goal is to foster a diverse, inclusive workplace with equal opportunities for all.
- Design and implement a highly scalable, multi-tenant control plane that supports Firmus’ growing AI and infrastructure needs
- Contribute to the development of exabyte-scale, S3-compatible object storage, distributed file systems, and high-performance filesystems
- Work with bare-metal provisioning tools such as Base Command Manager, Warewulf, Ironic, MaaS, and similar platforms
- Apply a deep understanding of operating systems, computer networks, software-defined storage, and high-performance applications
- Work with technologies including RDMA, GPU Direct Storage, RoCE, InfiniBand, DPDK, Ceph, Weka, DAOS, and others
- Collaborate with operations teams to monitor, analyse, and optimise internal clusters and storage platforms
- Document architecture designs, operational procedures, and performance results
- Collaborate with L2 SRE engineers, site operations, and networking teams to ensure platform reliability, reproducibility, and performance
- Contribute to continuous improvement in cluster validation, CI/CD automation, and provisioning and testing frameworks
- Apply knowledge of Kubernetes and composable storage clusters
- Contribute to the development of custom Kubernetes operators and intelligent orchestration frameworks to optimise AI workload performance for large-scale GPU cluster commissioning
Senior Infrastructure Engineer
dbt Labs
dbt Labs is a technology consultancy on a mission to “help analysts create and disseminate organizational knowledge.” Specializing in analytics, data engine
- Design, operate, and support infrastructure systems with parity across tenancy models (single vs multi) and public clouds (AWS, Azure, and GCP), and work with engineering teams to get their services consistently deployed to those environments
- Bring cloud infrastructure expertise to the team, helping us strengthen and scale our infrastructure as we expand dbt Cloud’s multi-cloud capabilities
- Help create a great developer experience while working with our close partners in Architecture, Release Engineering, Product Engineering, and Security
- Leverage tools and languages such as Terraform, Kubernetes, Python, Bash, Helm, ArgoCD, Go, and DataDog
- Design and build automation to eliminate manual toil and streamline infrastructure operations at scale
- Identify and implement infrastructure optimizations that reduce cloud spend without sacrificing reliability
- Participate in a balanced on-call rotation in an environment that values continuous improvement, and help to upgrade our tooling and reduce toil



