Make is an AI-first design and engineering agency based in Texas, with team members worldwide. We’re proud to have created award-winning software that has been featured in TechCrunch, Mashable, US Weekly, CBS News, Texas Monthly, and The Today Show. A multi-disciplinary team of engineers and designers, we are passionate about creating world-class software that people enjoy using. We are a team of talented individuals who take ownership of the entire project beyond their own craft. We value proactive communication, autonomy, and initiative. You'll be joining a team of 'Managers of One'—people who set their own direction, identify what needs to be done, and dive in without waiting for permission. We’re not only passionate about our craft but also about our culture. We deeply believe that work is purposeful, and that culture is one of the most important parts of any team. Our culture informs our decisions, sets our standards, and guides our interactions.

Site Reliability Engineer - Infrastructure

Infrastructure Engineer | Other | Remote | Team 11-50

Location

United States + 7 more

Posted

86 days ago

Job Description

This description is a summary of our understanding of the job description.

Role Description

This role involves designing and implementing architectural blueprints for our global automation platform.

- Design and implement the architectural blueprints that allow our global automation platform to scale while maintaining high availability.
- Define the SLIs, SLOs, and error budgets that guide our engineering teams' balance between rapid feature velocity and system stability.
- Build and maintain observability pipelines using metrics, logs, and traces to provide engineers with immediate, actionable clarity on service behavior in production.
- Participate in the resolution of production incidents and follow the blameless postmortem process to transform system failures into permanent technical improvements.
- Cultivate an engineering environment focused on continuous learning from outages to proactively harden our platform against future regressions.
- Develop and automate our CI/CD pipelines to ensure code changes are validated and deployed safely using strategies such as canary or blue/green releases.
- Introduce and scale chaos engineering experiments to identify and fix infrastructure weak points before they can impact our customers.
- Collaborate with developers during early design phases to ensure all new services meet our strict standards for scalability, security, and reliability.
- Mentor senior engineers across the organization and represent SRE principles in technical leadership forums to ensure long-term platform health.
- Participate in an on-call rotation to respond to incidents and maintain the 24/7 availability of the make.com platform.

Job Requirements

6+ years of experience in Software Engineering or SRE roles, with a proven track record of technical leadership.
A thorough understanding of how to apply SLI and SLO principles to drive meaningful reliability outcomes.
A development-first mindset where you approach infrastructure challenges through the lens of a software engineer.
Significant experience in mentoring and leveling up other senior engineers within a high-growth environment.
Deep proficiency in managing and operating Linux/Unix-based infrastructure at scale.
Extensive practical knowledge of cloud providers, with a strong preference for AWS.
Expert-level experience with container orchestration, specifically running production workloads on Kubernetes.
Advanced skills in Infrastructure as Code (IaC) using tools like Terraform to maintain version-controlled environments.
Direct experience building and optimizing CI/CD pipelines and executing modern deployment strategies like canary or blue/green.
Excellent communication skills in English to collaborate effectively with our international teams.
Proficiency in back-end technologies: Node.js, TypeScript, PostgreSQL, RabbitMQ, Redis, Elasticsearch.
Experience with front-end technologies: Angular, TypeScript, Redux, Web Components, Canvas, Nx.
Knowledge of infrastructure technologies: Amazon AWS, Docker, Kubernetes.
Familiarity with CI/CD tools: GitHub, CircleCI, ArgoCD.
Experience with monitoring tools: DataDog.
Familiarity with AI tools: Claude Code, Cursor, Gemini, GitHub Copilot.

Benefits

RSUs grant in a rapidly growing company raising its value every day.
Annual bonus.
Multinational team with 42 nationalities creating the future of automation.
Learning & Development plan (online language, professional courses, conference tickets and other trainings) & 2 learning days per year.
Notebook/Macbook and 34’’ curved monitor.
25 days of vacation, 4 sick days, Company day off 31.12.
10 care days to care for your loved ones.
Extra parental vacation (3-6 months).
RSUs grant for a newborn child.
Life insurance.
Benefit Plus Cafeteria (incl. MultiSport Card).
Remote working allowance.
Snack bar, coffee, tea, fruit and vegetable, and sweets all day - every day - available for everyone.
Wednesday lunch, and Friday break, with company-provided food and drinks, with music and lively discussion.
Flexible working hours + home office.
Company therapy pets in Prague's office (dog-friendly office).
Company 3D printer.
Team buildings, parties, and company events multiple times a year.

More Infrastructure Engineer Jobs

Infrastructure Engineer

Quavo Fraud & Disputes

Quavo is a leading provider of automated dispute management SaaS solutions for issuing financial institutions.

Infrastructure Engineer | 86 days ago

Other | Remote | Team 51-200 | Since 2015 | H1B No Sponsor

About the role: A successful Infrastructure Engineer will work closely with Sr. Infrastructure Engineers and the Infrastructure Team Lead to support internal processes and compliance. This role will be tasked with completing user requests, infrastructure maintenance, and company initiatives as they relate to cloud environments. You will work in a fast-paced environment and support an agile workflow. This role is an instrumental part of our technology team.

Responsibilities include:

- Maintaining Linux operating systems
- Maintaining cloud infrastructure environments
- Troubleshooting and researching issues
- Resolving incidents and working change requests
- On-call rotation and ability to work off hours for scheduled maintenance
- Ability to work with non-Infrastructure employees to help with PM tools (Jira, Confluence, etc.)

Required Qualifications:

- Linux proficiency
- AWS proficiency
- Kubernetes knowledge
- Ability to balance multiple tasks at once
- Ability to troubleshoot and resolve operating system issues
- Able to work independently on well-defined, less complex tasks
- Strong verbal and written communication skills, including the ability to effectively communicate with internal and external business associates
- Strong teamwork focus and the ability to foster collaboration within and across teams

Preferred Qualifications:

- 1+ years of Cloud/SaaS experience
- Startup experience with proven growth
- Experience using Confluence, SharePoint, and Jira

United States

$65K - $85K / year

Job Closed

Cloud Infrastructure Engineer

Bloom

Building better workplaces for everyone.

Infrastructure Engineer | 86 days ago

Other | Remote | Team 1-10 | Since 2018 | H1B Sponsor

This description is a summary of our understanding of the job description.

Role Description

This role involves building, deploying, optimizing, and securing cloud-based and containerized solutions.

- Ensure availability, scalability, and security of cloud applications and related services
- Focus on automation, performance, efficiency, security, and compliance
- Work with teammates to improve legacy processes for operational excellence
- Continuously maintain documentation and foster process improvement in the IT department

Qualifications

- 4+ years of administrative experience in networking, storage systems, and operating systems, plus hands-on systems engineering experience
- 2+ years of non-internship professional software development experience
- 1+ years of experience designing or architecting new and existing systems

Requirements

- Servant-leader qualities with a desire to enhance the work experience for others
- Ability to have a technical conversation and drive well-architected solutions
- Knowledge of systems engineering fundamentals (networking, storage, operating systems)
- Experience programming with at least one modern language such as C++, C#, Java, Python, Golang, or PowerShell
- Experience with various ITSM tools and ITIL (Incident, Problem, and Change management) preferred
- Experience working in an Agile environment using the Scrum methodology
- Experience automating and configuring systems using Desired State Configuration (DSC)
- Experience utilizing AWS cloud solutions in a DevOps environment
- Strong problem-solving abilities to diagnose and resolve process issues
- Good organizational and interpersonal skills with experience interacting with technical and non-technical audiences
- Demonstrated ability to thrive working independently or as part of a team

Benefits

- Competitive compensation
- Comprehensive health coverage
- Long-term growth opportunities
- Remote work environment
- BeBloom™ employee training and engagement program
- Opportunities for mentorship and leadership programs
- Employee-led councils for involvement and connection

Core Values

- Put People First: Uphold and promote a people-first culture emphasizing empathy and kindness
- Be Stronger Together: Embrace a team player mentality to collaborate as one team
- Do What’s Right: Adhere to high ethical standards and act with integrity
- Embrace a Growth Mindset: Foster a culture of continuous learning and professional development
- Drive Solutions: Share ideas and solutions that drive the mission forward

United States

Network Infrastructure Engineer

Akamai Technologies

At Akamai, we make life better for billions of people, billions of times a day. Every moment, billions of people, all over the world, are using the internet to shop, play games, look after finances, learn remotely, share videos, connect across the world, and so much more. These life-shaping digital experiences wouldn’t be possible without Akamai. We power and protect life online. It’s an extraordinary mission, and our global teams achieve it by solving the toughest challenges and turning the impossible into the possible. With the world’s most distributed compute platform, from cloud to edge, we make it easy for businesses to develop and run applications, while we keep experiences closer to users and threats farther away. That’s why innovative companies worldwide choose Akamai to build, deliver, and secure their digital experiences. Thanks to the world’s most distributed platform for cloud computing, security, and content delivery, Akamai keeps applications and experiences closer and threats farther away. Devoted, determined problem-solvers who share a passion for technology, we’re always pushing ground-breaking ideas and driving innovation. Do you want to power and protect life online by solving the toughest challenges with us? Be part of an amazing team!

Infrastructure Engineer | 86 days ago

Full Time | Remote | Team 5,001-10,000 | Since 1998 | H1B Sponsor

- Remotely diagnose hardware problems
- Facilitate router repairs, manage Akamai's hardware assets, and interact with data center and ISP staff
- Manage projects for server deployments and network installations through collaboration with cross-functional technical teams
- Manage remote hardware assets and troubleshoot issues remotely
- Diagnose switch and router problems to ensure the stability of the Akamai platform
- Work closely with field technicians, ISPs, and Akamai partners on the installation and maintenance of Akamai network infrastructure
- Train and mentor new engineers and technicians to share knowledge and up-skill your team

Colombia

Job Closed

Senior Infrastructure Engineer

Twilio

Twilio is a Platform-as-a-Service (PaaS) company established in 2007. In support of a flexible workplace, Twilio has previously posted freelance, flexible schedule, part-time, hybr

Infrastructure Engineer | 86 days ago

Other | Remote

This description is a summary of our understanding of the job description.

Role Description

This position is needed to build and scale Stytch’s core product as we bring Stytch capabilities to Twilio’s full customer base, delivering secure, developer-first identity experiences. Join the Stytch team at Twilio to shape the next generation of identity products. You’ll design and build infrastructure that powers authentication, authorization, and emerging agentic and non-human identity use cases. You’ll partner closely with product and engineering leadership to set direction, deliver high-impact features, and evolve our platform for Twilio scale. This is a rare chance to operate like a startup within Twilio: you’ll work in a high-ownership environment with startup speed, backed by Twilio’s reach and scale.

Responsibilities

- Build and own the infrastructure and platform capabilities that power Stytch’s identity platform as it scales across Twilio, ensuring security, reliability, and performance for every customer.
- Design, implement, and operate scalable cloud infrastructure (AWS/EKS, ECS, networking, data stores), balancing uptime, cost, and developer velocity.
- Partner closely with Product and Engineering leadership to set infrastructure direction, translate platform needs into technical plans, and deliver high-impact roadmap work.
- Collaborate across Twilio and Stytch teams to align on architecture, integrate platform capabilities, and unblock cross-team initiatives.
- Operate with deep technical ownership: author design docs, drive key technical decisions, review code, and stay close to the systems you ship.
- Build in ambiguity: break down complex problems, make pragmatic tradeoffs, and adopt new technologies or strategies when they improve outcomes.
- Improve production quality and resilience through strong observability, incident response, automated remediation, and continuous reliability engineering.
- Make developers’ lives easier by building self-service tooling, safer deployment patterns, and reliable platform primitives that accelerate product teams.
- Mentor and support other engineers through pairing, feedback, and knowledge-sharing, helping raise the team’s technical bar and culture.

Qualifications

- 6+ years of experience as an Infrastructure or Platform Engineer building and operating high-scale, mission-critical cloud production systems.
- Strong experience with containerization and orchestration (Kubernetes/EKS, Docker), Infrastructure as Code (Terraform, GitOps, or similar) and AWS.
- Hands-on proficiency in at least one modern programming language used in production.
- Experience designing and running observability and on-call systems (e.g., Datadog, ELK, Prometheus/Grafana).
- Experience scaling cloud infrastructure for distributed systems, including relational databases and high-availability service architectures.
- Excellent written and verbal communication skills; comfortable writing design docs and leading technical discussions.
- Bachelor’s degree in Computer Science or equivalent practical experience.
- Schedule: ability to work non-standard, on-call rotation weekend and holiday hours.

Requirements

- Experience with multi-region or global infrastructure, including disaster recovery and data replication strategies.
- Familiarity with enterprise-scale platform challenges: multi-tenant infrastructure, compliance, and cost/performance optimization.
- Builder at heart. Through a hobby or your profession, you are passionate about being hands-on and seeing your work come to life.

Location

This role will be remote, but is not eligible to be hired in San Francisco, CA, Oakland, CA, San Jose, CA, or the surrounding areas.

Travel

We prioritize connection and opportunities to build relationships with our customers and each other. For this role, you may be required to travel occasionally to participate in project or team in-person meetings.

Benefits

- Competitive pay
- Generous time off
- Ample parental and wellness leave
- Healthcare
- Retirement savings program
- And much more. Offerings vary by location.

Compensation

Please note the salary range information provided applies only to candidates residing in California, Colorado, Hawaii, Illinois, Maryland, Massachusetts, Minnesota, New Jersey, New York, Vermont, Washington D.C., and Washington State due to local requirements. Compensation for candidates in other locations will be discussed during the hiring process.

- Based in Colorado, Hawaii, Illinois, Maryland, Massachusetts, Minnesota, Vermont or Washington D.C.: $141,520 - 176,900.
- Based in New York, New Jersey, Washington State, or California (outside of the San Francisco Bay area): $149,840 - 187,300.
- Based in the San Francisco Bay area, California: $166,400 - 208,000.

This role may be eligible to participate in Twilio’s equity plan and corporate bonus plan. All roles are generally eligible for the following benefits: health care insurance, 401(k) retirement account, paid sick time, paid personal time off, paid parental leave. The successful candidate’s starting salary will be determined based on permissible, non-discriminatory factors such as skills, experience, and geographic location. Applications for this role are intended to be accepted until March 01, 2026, but may change based on business needs.

Company Description

Twilio thinks big. Do you? We like to solve problems, take initiative, pitch in when needed, and are always up for trying new things. That's why we seek out colleagues who embody our values, something we call Twilio Magic. Additionally, we empower employees to build positive change in their communities by supporting their volunteering and donation efforts. So, if you're ready to unleash your full potential, do your best work, and be the best version of yourself, apply now! If this role isn't what you're looking for, please consider other open positions. Twilio is proud to be an equal opportunity employer.

United States

$141.5K - $208K / year
