Job Closed
This listing is no longer active.
A new platform for working with data
Staff Infrastructure Engineer
Location
New York | California
Posted
98 days ago
Salary
$215K - $270K / year
Seniority
Senior
Job Description
Hex
About the role
We're seeking an experienced infrastructure engineer to join us as a technical leader who will shape the future of our platform architecture! You'll work directly with our engineering leadership to drive infrastructure strategy, mentor our growing team, and build systems that scale with our ambitious growth plans. We recently raised a Series C and are experiencing rapid growth not just in the number of customers and users, but also in the kinds of data workflows we can support with our kernel compute backend.

This isn't a hands-off leadership role – you'll be deeply technical while providing strategic direction. We need someone who has strong opinions backed by experience and isn't afraid to make the hard decisions that come with rapid scaling.

What you will do
Strategic Leadership
- Define and execute our infrastructure roadmap across our multi-tenant and single-tenant stacks
- Establish engineering standards, practices, and tooling across the infrastructure team
- Collaborate with product and engineering teams to align infrastructure investments with business objectives
- Lead deep database performance optimization and scaling strategies
- Lead infrastructure cost optimization and capacity planning initiatives

Technical Ownership
- Architect and implement scalable solutions on our AWS/Kubernetes/PostgreSQL/Redis stack
- Design container orchestration strategies with Kubernetes patterns and resource optimization
- Design and build robust CI/CD pipelines and deployment strategies
- Drive reliability engineering practices including monitoring, alerting, and incident response
- Evaluate and integrate new technologies that enhance our platform capabilities

Team Development
- Mentor engineers and help grow their technical skills
- Participate in hiring and building out the infrastructure team
- Foster a culture of technical excellence and continuous learning
- Lead technical design reviews and architecture discussions

About You
Technical Expertise
- 7+ years of infrastructure engineering experience with 3+ years in technical leadership roles
- Deep expertise with AWS services (EC2, RDS, EKS, networking, security)
- Production experience with Kubernetes orchestration and container management
- Experience with database performance engineering - query optimization, execution plan analysis, and datastore selection for different workload patterns
- Proficiency with infrastructure as code (Terraform, CloudFormation, or similar)
- Solid understanding of application deployment and scaling
- Knowledge of security best practices and compliance frameworks

Leadership Qualities
- Track record of leading technical initiatives in fast-growing companies
- Strong opinions on engineering best practices with the flexibility to adapt
- Excellent communication skills and ability to influence across organizations
- Comfortable with ambiguity and rapid decision-making in a startup environment

Startup Experience
- Understanding of the unique challenges of scaling infrastructure during hypergrowth
- Ability to balance technical debt with feature velocity
- Experience with resource constraints and scrappy problem-solving

Bonus Points
- Advanced Kubernetes operators development and custom resource definitions
- Linux kernel development or systems programming experience
- Background with observability tools (Datadog, New Relic, Prometheus/Grafana)
- Contributions to open source infrastructure projects
- Experience with multi-region deployments and disaster recovery planning

Our stack
Our product is a web-based notebook and app authoring platform. Our frontend is built with Typescript and React, using a combination of Apollo GraphQL and Redux for managing application state and data. On the backend, we also use Typescript to power an Express/Apollo GraphQL server that interacts with Postgres, Redis, and Kubernetes to manage our database and Python kernels. Our backend is tightly integrated with our infrastructure and CI/CD, where we use a combination of Terraform, Helm, and AWS to deploy and maintain our stack.

In addition to our unique culture, Hex proudly offers a competitive total rewards package, including but not limited to, market-benched salary & equity, comprehensive health benefits, and flexible paid time off.

The salary range for this role is: $215,000 - $270,000

The salary range shown may be a reflection of additional factors such as geographical location and skill ranges/levels we’re open to. Placement in the salary range will be decided upon completion of the interview process, taking into account factors like leaving room for growth, internal fairness & parity, your demonstrated skills, and the depth of your experience. Our Recruiting team will be able to provide more details during the interview process.

By submitting an application the candidate consents to the use of their personal information in accordance with the Hex Privacy policy: https://learn.hex.tech/docs/trust/privacy-policy.
Benefits
- 401(K), 401(K) matching, Company equity, Company-sponsored outings, Continuing education stipend, Dental insurance, Disability insurance, Flexible Spending Account (FSA), Flexible work schedule, Generous parental leave, Health insurance, Life insurance, Open office floor plan, Paid holidays, Paid sick days, Relocation assistance, Remote work program, Return-to-work program post parental leave, Free snacks and drinks, Unlimited vacation policy, Vision insurance, Some meals provided, Mental health benefits, Home-office stipend for remote employees, Hybrid work model, Company-wide vacation
More Infrastructure Engineer Jobs
• Designing, implementing, and managing the cloud infrastructure and platform services that power the Company’s applications, data systems, and internal tools.
• Working closely with engineering, data, and product teams.
• Assisting with managing cloud environments, automating infrastructure provisioning, supporting application deployments, and monitoring system performance.
• Contributing to maintaining stable technology platforms and improving infrastructure processes.

• Automate infrastructure provisioning, deployment, and operational workflows
• Build and improve internal tools and platforms that make other engineers more productive
• Manage and evolve our AWS infrastructure using Terraform
• Participate in on-call rotations and turn incident patterns into permanent fixes
• Collaborate with product engineering teams to understand their infrastructure needs and reduce friction
• Identify repetitive operational work and replace it with software
Infrastructure Engineer
Attune Insurance
An insurance company passionate about small business, Attune Insurance was founded in 2016 to help brokers and their clients thrive. Headquartered in New York,
Position Summary
The Infrastructure Engineer is responsible for designing, implementing, and managing the cloud infrastructure and platform services that power the Company’s applications, data systems, and internal tools. This role helps ensure systems are secure, reliable, and scalable to support ongoing business operations and growth. Working closely with engineering, data, and product teams, the Infrastructure Engineer assists with managing cloud environments, automating infrastructure provisioning, supporting application deployments, and monitoring system performance. This role contributes to maintaining stable technology platforms and improving infrastructure processes that support the Company’s products and services.

Essential Duties and Responsibilities
- Cloud Architecture: Architect and manage scalable, secure, and highly available cloud infrastructure primarily within Amazon Web Services (AWS), including EC2, RDS, S3, VPC, and IAM.
- Infrastructure-as-Code (IaC): Lead the implementation and maintenance of infrastructure using Terraform or similar tools, ensuring all environments are version-controlled and reproducible.
- Automation & Scripting: Develop and maintain automation scripts (Python, Bash, or Go) to eliminate manual tasks and improve operational efficiency.
- Containerization: Manage and optimize containerized application environments using Docker and orchestration platforms (e.g., Kubernetes or AWS ECS).
- CI/CD Pipeline Management: Design and support robust Continuous Integration and Continuous Deployment (CI/CD) pipelines to automate application releases and reduce deployment risk.
- Observability & Reliability: Implement and manage modern observability frameworks (logging, metrics, and tracing) to proactively monitor system health and performance.
- Performance Engineering: Monitor system capacity and performance, performing right-sizing exercises and scaling configurations to meet business growth.
- Security & Compliance: Enforce infrastructure security best practices, including access control (Least Privilege), network hardening, and automated vulnerability scanning.
- FinOps & Cost Control: Partner with leadership to monitor cloud spend and implement cost-optimization strategies.
- Incident Response: Troubleshoot complex infrastructure-related issues and participate in a blameless post-mortem culture to prevent recurrence.
- Documentation: Maintain high-quality technical documentation for infrastructure designs, standard operating procedures, and disaster recovery plans.
- On-Call Support: Participate in an on-call rotation to ensure the 24/7 reliability of production systems.
- Other duties as assigned.

Qualifications
Education
Bachelor's degree in computer science, Information Systems, Engineering, or a related field preferred. Equivalent practical experience may be considered.

Experience
- 3–5 years of experience in infrastructure engineering, DevOps, systems engineering, or related technology roles.
- Proven experience managing production-grade AWS environments.
- Strong hands-on experience with Terraform (module development and state management).
- Experience managing Linux-based systems at scale (Ubuntu, Amazon Linux, or RHEL).
- Direct experience building and maintaining CI/CD pipelines (e.g., GitHub Actions, GitLab CI, or Jenkins).

Knowledge, Skills, and Abilities
- Cloud Proficiency: Deep understanding of AWS networking (VPC, Peering, Transit Gateways) and security groups.
- Container Mastery: Proficiency in Docker; experience with Kubernetes (EKS) is highly desirable.
- Scripting: Proficiency in at least one scripting language (Python preferred).
- Version Control: Expert knowledge of Git-based workflows.
- Problem Solving: Strong analytical skills with the ability to troubleshoot complex, distributed systems.
- Communication: Excellent verbal and written communication skills with the ability to explain technical concepts to non-technical stakeholders.

Preferred Qualifications
- Experience with Serverless architectures (AWS Lambda, Fargate).
- AWS Certified Solutions Architect or SysOps Administrator certification.
- Experience with Data Infrastructure (managing Snowflake, Redshift, or Kafka clusters).
- Familiarity with container orchestration technologies such as Kubernetes or ECS.

Physical Requirements
This position primarily operates in a professional office or remote environment and routinely uses standard office equipment such as computers and phones. The ability to remain stationary for extended periods and operate a computer is required.

What We Offer:
- Flexible PTO
- Generous parental and caregiver leave
- 401K match
- Excellent medical, dental, and vision plans
- Remote-first culture
- Annual $1000 tuition reimbursement stipend
- And more!

The expected annual salary for this position is between $120,000 and $140,000, with the opportunity to earn an annual bonus of up to 15% of the base salary.
Senior Infrastructure Engineer – Cost Optimization, Efficiency
Sysdig
Confidently secure containers, Kubernetes and cloud services with #SecureDevOps.
• Architectural Optimization: Design and implement Kubernetes scaling strategies (HPA, VPA, Karpenter) that align resource consumption with real-time demand.
• The Reliability/Cost Trade-off: Act as the technical lead in determining where we can safely optimize (e.g., Spot instances for non-critical workloads) and where we must invest in over-provisioning to protect our SLOs.
• Proactive Analysis: Regularly audit cloud environments to identify underutilized resources and ghost infrastructure, providing actionable data to leadership on potential savings.
• Automation & Guardrails: Develop IaC modules and CI/CD policies that prevent "cost-drift" before it happens, ensuring developers have the resources they need without excess waste.
• Cross-Functional Advocacy: Partner with Finance and Product teams to translate technical infrastructure metrics into business value and cost-per-feature insights.




