Job Closed

This listing is no longer active.

Onebrief

Software for rapid military planning: make planning fast enough for today's environment

Principal Infrastructure Engineer

Infrastructure Engineer | Other | Remote | Lead | Team 1-10 | Since 2019 | H1B No Sponsor

Location

United States

Posted

122 days ago

Salary

$220K - $270K / year

Seniority

Lead

Bachelor Degree | 10 yrs exp | English | Ansible | AWS | Kubernetes | Python | Terraform

Job Description

  • Define and execute a one- to two-year technical vision for Onebrief’s infrastructure in partnership with engineering leadership
  • Design and evolve a deployment strategy focused on AWS and on-prem environments, with support for partner-hosted deployments in other cloud platforms as required
  • Build security and compliance directly into the infrastructure lifecycle through automation and policy-driven systems
  • Identify and resolve the most complex technical debt and performance bottlenecks affecting the broader Product and Engineering organizations
  • Design and lead the implementation of the next generation of Onebrief’s platform infrastructure

Job Requirements

  • 10+ years of hands-on experience building and scaling infrastructure systems
  • 5+ years of experience in Platform Engineering, DevOps, or Site Reliability Engineering
  • Proven experience leading organization-wide infrastructure initiatives and executing roadmaps aligned with business objectives
  • Deep experience operating in highly regulated, secure, and/or air-gapped environments, including the Department of War or the Intelligence Community
  • Expert-level knowledge of Kubernetes, including multi-cluster design, security, and the broader CNCF ecosystem
  • Experience designing enterprise-grade infrastructure-as-code and automation frameworks using tools such as Terraform or Ansible
  • Proficiency in Python and Bash for building custom tooling and automation
  • Ability to communicate complex technical strategies to engineers and stakeholders and build alignment around long-term architectural decisions
  • Willingness to obtain and maintain eligibility for a Top Secret clearance with SCI

Benefits

  • Offers Equity

More Infrastructure Engineer Jobs

Staff Infrastructure Engineer

Sysdig

Confidently secure containers, Kubernetes and cloud services with #SecureDevOps.

Other | Remote | Team 201-500 | Since 2013 | H1B Sponsor

  • Participate in the design, implementation, and maintenance of Sysdig's Infrastructure at scale on different clouds and on-prem.
  • Create / improve automation tools for infrastructure deployment, monitoring, and maintenance.
  • Participate in incident response efforts, perform root cause analysis, and implement preventive measures.
  • Ensure compliance with industry standards for security and data protection.
  • Participate in handling infrastructure of Sysdig's Data-stores and the unique problems of data-stores resiliency, scalability and cost optimization.
  • Design solutions with security controls embedded as first-class requirements.
  • Guide and influence technical decisions across teams toward strong, aligned outcomes.

United States
$163K - $204K / year
Job Closed

Staff ML Engineer - Infrastructure

ChipStack

Founded in 2023, ChipStack delivers agentic AI products that enable RTL designers and design verification engineers to quickly perform verification tasks traditionally requiring specialized expertise, dramatically reducing time and shifting verification earlier in the design cycle while empowering verification engineers with unprecedented design insight. ChipStack's five-agent suite reduces unit verification time from weeks to days. The suite begins by creating a "design intent" mental model that transforms the verification process through deeper design understanding. The agents then provide guided test plan creation, automated testbench generation and updates, verification tool execution, and intelligent debug assistance across formal verification, unit simulation, UVM, and functional coverage. Engineers can control the process using intuitive natural language interfaces.

Other | Team 21

About Us

Chips are at the center of today's tech-driven world. But how we design them has not changed in decades, while their complexity and specialization have skyrocketed due to increasing performance demands from applications like AI. We want to change that. Our team is small, technical, and fast-moving. We’ve built and shipped at the intersection of AI, EDA, and systems software, with deep roots at companies like Qualcomm, Nvidia, Google, Meta, and the Allen Institute for AI. We’re backed by top investors including Khosla Ventures, Cerberus, and Clear Ventures, and already deployed with 10+ innovative customers, from Fortune 100s to cutting-edge AI silicon startups.

About This Role

This role offers a unique opportunity to be part of the founding team at ChipStack, where we are reinventing how modern silicon chips are designed. You will work alongside highly experienced chip designers who have built complex chips, ML scientists who have trained LLMs at scale, and top-notch infrastructure and software engineers. You will get to leverage your experience building ML and data infrastructure and apply it to some of the hardest problems in chip design.

About You

You want to be at a startup because you love to be at the center of all the dynamism that a startup offers. You are willing to put in the hours and go the extra mile to ensure every customer has an exceptional experience. You are self-motivated with a sense of urgency and can operate independently without much guidance. You are not afraid of difficult problems and enjoy venturing into areas you have not explored before.

This Role

We’re looking for a strong, experienced ML Infrastructure Engineer to join our founding team. We are seeking someone with experience designing and scaling ML infrastructure and training pipelines. You’ll be responsible for building the core infrastructure that enables training, fine-tuning, evaluation, and deployment of LLMs across cloud and on-premise environments. Your work will directly impact product capabilities and speed of iteration.

What's needed

  • 5+ years of experience in ML infrastructure or adjacent roles
  • Deep expertise in Python and experience with training frameworks like PyTorch or TensorFlow
  • Strong systems engineering skills and experience with distributed training, data pipelines, and performance optimization
  • Experience deploying ML models to production (REST APIs, batch jobs, streaming pipelines)
  • Proficiency with cloud platforms (e.g., GCP, AWS) and containerized systems (Docker, Kubernetes)
  • Experience managing GPU/TPU workloads efficiently
  • Good communication skills and the ability to work directly with engineers and customers
  • Prior experience training or fine-tuning LLMs
  • Experience setting up observability, monitoring, and evaluation pipelines for ML models

What's good to have

  • Exposure to chip design fundamentals (via coursework or elsewhere)
  • Experience at an early-stage startup

Our Culture

  • Challenge status quo: We are innovators who can challenge the status quo and push forward our vision of the world.
  • Strong opinions, loosely held: We are low on ego, but high on collaboration. We are okay to be wrong and are always open to learning.
  • Ship fast, ship quality: We ruthlessly prioritize what matters. We build a few things, but at lightning speed with high quality.
  • Proud of our craft: Attention to detail is in our DNA. We take pride in what we build and ensure it exceeds the high standards of the semiconductor industry.

Washington
Other | Remote | Team 51-200 | Since 2002 | H1B No Sponsor

  • Support and maintain the ServiceNow platform and its underlying infrastructure in a disconnected (dark site / air-gapped) environment
  • Design, deploy, configure, and maintain secure, highly available infrastructure including servers, virtualization platforms, operating systems, and application deployments
  • Deploy, configure, and manage VMware vCenter, ESXi hosts, virtual machines, templates, and resource pools
  • Administer Windows Server and RHEL environments, including Active Directory and identity services
  • Create and maintain OVAs, golden images, and VM templates for consistent deployments
  • Automate provisioning and maintenance using PowerShell, Bash, or similar tools
  • Ensure scalability, availability, and operational continuity
  • Provide infrastructure-level support for a disconnected ServiceNow deployment
  • Assist with application deployments, configuration, and troubleshooting
  • Manage storage, basic networking, and backup/restore processes (e.g., Cohesity or similar)
  • Provide light support for related systems such as MS SQL tools (WSUS, Volume Activation)
  • Collaborate with Architecture, DevOps, Security, and application teams to deliver compliant solutions
  • Produce and maintain infrastructure diagrams, build guides, and compliance documentation

United States
$140K - $170K / year
Job Closed
Other | Remote | Team 51-200 | Since 2002 | H1B No Sponsor

We are seeking a hands-on Infrastructure Engineer with a Secret clearance to support and maintain the ServiceNow platform and its underlying infrastructure in a disconnected (dark site / air-gapped) environment. This role is responsible for designing, deploying, configuring, and maintaining secure, highly available infrastructure including servers, virtualization platforms, operating systems, and application deployments. The ideal candidate is proactive, security-focused, and experienced with VMware, Windows and Linux administration, automation, and restricted/offline environments. Basic database support may be required; deep DBA expertise is not necessary.

Infrastructure & Virtualization

  • Deploy, configure, and manage VMware vCenter, ESXi hosts, virtual machines, templates, and resource pools.
  • Administer Windows Server and RHEL environments, including Active Directory and identity services.
  • Create and maintain OVAs, golden images, and VM templates for consistent deployments.
  • Automate provisioning and maintenance using PowerShell, Bash, or similar tools.
  • Apply security hardening and compliance standards (CIS, NIST, STIGs).
  • Support offline patching, updates, and secure change management in air-gapped environments.
  • Ensure scalability, availability, and operational continuity.

Platform & Application Support

  • Provide infrastructure-level support for a disconnected ServiceNow deployment.
  • Assist with application deployments, configuration, and troubleshooting.
  • Manage storage, basic networking, and backup/restore processes (e.g., Cohesity or similar).
  • Provide light support for related systems such as MS SQL tools (WSUS, Volume Activation).

Database (Secondary)

  • Perform basic MariaDB/MySQL administration including configuration, backups, monitoring, and minor tuning.
  • Assist with availability and recovery activities as needed.

Collaboration & Documentation

  • Partner with Architecture, DevOps, Security, and application teams to deliver compliant solutions.
  • Produce and maintain infrastructure diagrams, build guides, deployment procedures, and compliance documentation.
  • Contribute to incident response and root-cause analysis in secure environments.

United States
Job Closed