Senior Site Reliability Engineer, Robotics & Cloud Infrastructure
Location: United States
Posted: 3 days ago
Salary: $164K - $220K / year
Seniority: Senior
Job Description
Senior Site Reliability Engineer, Robotics & Cloud Infrastructure
Bedrock Ocean Exploration
Role Description

We’re looking for an SRE who is equally comfortable on the robotics side (compute on the vehicle, topside operator machines, field deployments) and the cloud side (data ingestion, processing pipelines, and our customer-facing platform). You’ll build the automation, observability, and operational guardrails that let a small team run continuous AUV operations without continuous heroics, turning manual recovery steps into self-healing systems and shrinking the set of failures that only one person knows how to fix.

This is a hands-on senior infrastructure role with a strong automation mandate and a shared on-call rotation. You’ll set reliability direction across vehicle-side and cloud-side systems, raise the operational bar for the team, and mentor others toward it. You’ll be a force multiplier for reliability across the company, not a ticket queue.

Reports to: Head of Software.

East Coast location is required to support coverage across both European operations and the East Coast during 12-hour on-call shifts. Travel to field deployments and Richmond HQ is expected (approximately 5–15%).

What You’ll Do

- Own reliability across the full path from vehicle to customer: AUV onboard compute (Jetson-class modules, ROS 2), topside/operator systems, cloud data pipelines, and the platform that delivers data products.
- Build and extend infrastructure automation (provisioning, configuration management, deployment, and self-recovery) so that routine field operations and pipeline runs require minimal manual intervention.
- Design and improve observability: metrics, logging, tracing, and alerting that give both robotics and data teams early, actionable signal across vehicle fleets and cloud services.
- Drive down on-call burden by identifying and eliminating single points of failure, writing runbooks, and automating the manual steps that currently require tribal knowledge.
- Participate in a shared on-call rotation covering both robotics-side and cloud-side incidents in 12-hour shifts spanning European and East Coast business hours; lead and contribute to blameless post-incident reviews.
- Define and track reliability targets (availability, data yield, recovery time) tied to continuous-operations goals, and partner with robotics and data teams to meet them.
- Manage cloud infrastructure on AWS (compute, storage, networking, IaC, cost, and security posture) for data processing and platform workloads.
- Improve fleet- and vehicle-level configuration management, deployment safety, and rollback so changes reach the field reliably and predictably.

Qualifications

- 5+ years in an SRE, DevOps, or infrastructure engineering role running production systems with real uptime and on-call responsibilities, including senior-level ownership of reliability outcomes.
- Experience implementing a scalable incident management and operational excellence mechanism that treats operators as customers, building processes and tooling that serve the people running operations day to day, not just the engineering team.
- Strong automation instincts: comfortable scripting and building tooling in Python and/or Go and Bash, and using infrastructure-as-code (Terraform or equivalent).
- Hands-on AWS experience across compute, storage, networking, and IAM, plus containerization and orchestration (Docker, Kubernetes or similar).
- Working knowledge of Linux internals, networking, and observability tooling (Prometheus/Grafana or equivalents).
- Comfort operating across environments that aren’t just cloud: embedded or edge compute, intermittent connectivity, and physical systems that fail in messy ways.
- A reliability mindset: you instrument before you guess, you automate the second time you do something manually, and you write things down so the next person or the system can handle it without you.
- Strong ownership and communication in a small, fast-moving team.

Nice to Have

- Experience with robotics or embedded systems: ROS / ROS 2, Jetson or similar edge compute, sensor integration.
- Background supporting field operations, autonomous systems, or hardware-in-the-loop environments.
- Familiarity with data pipelines and geospatial or large-binary data formats.
- Experience standing up on-call practices and incident response from an early stage.
- Some connection to the ocean: professional, academic, or personal. You’re excited to be around people who dive, sail, build, and explore offshore.
- Active U.S. Secret security clearance or above.

Compensation

$164,000–$220,000 base salary annually, depending on location. The upper end of the range reflects compensation in the New York, NY metro. In addition, we offer comprehensive employee benefits and equity.

Work Authorization

Candidates must have legal authorization to work in the United States without visa sponsorship. Bedrock does not sponsor employment visas. Due to the nature of our government and defense work, candidates must be eligible to obtain a U.S. Secret security clearance if requested. An active Secret or higher clearance is not required to apply, but candidates who hold one are strongly preferred.

Not a Fit If…

- You prefer environments where cloud and hardware never mix.
- You’d rather build tickets than eliminate them.
- You’re not comfortable with on-call ownership on a small team.
- You want to optimize existing systems, not build the reliability practice alongside the product.

Company Description

Bedrock Ocean builds and operates autonomous underwater vehicles (AUVs) that collect georeferenced ocean-floor data at commercial scale. We deliver bathymetric and imagery data products to customers through our own platform, and we’re scaling toward continuous, around-the-clock data collection campaigns spanning months at a time. Keeping vehicles in the water and data flowing reliably is a core engineering problem, and this role owns the reliability of the systems on both ends of that pipeline. Headquartered in Richmond, California, Bedrock Ocean Exploration is building autonomous ocean intelligence that will enable the ocean economy to solve the world’s most pressing challenges in maritime security, infrastructure, energy, and climate.
More Cloud Engineer Jobs
• Participate in the migration of processes and analytical solutions from SAS to Databricks.
• Develop and maintain data pipelines (ETL/ELT) using Python, PySpark and Databricks.
• Analyze existing SAS code and processes, proposing alternatives and improvements for the new platform.
• Automate data ingestion, transformation, and delivery processes.
• Support the development of solution designs and data architecture.
• Collaborate with business stakeholders to understand requirements and validate results.
• Prepare and maintain technical and functional documentation for migrated solutions.
• Ensure data quality, performance, and reliability during and after the migration process.
• Cloud Delivery Leadership: Lead the implementation and operationalization of cloud solutions, ensuring high-quality, on-time delivery throughout the project lifecycle.
• Cloud Infrastructure Implementation: Design, deploy, and manage scalable cloud infrastructure across networking, compute, storage, identity, and platform services to support business and application requirements.
• Hybrid Cloud Connectivity: Implement and maintain secure connectivity between cloud and on-premises environments using services such as Network Connectivity Center, Partner Interconnect, and HA VPN.
• Cloud Security Deployment: Establish and enforce a strong cloud security posture through the implementation of cloud-native security services, including Cloud NGFW Enterprise, Cloud IDS, Cloud Armor, identity controls, and encryption capabilities.
• Access and Service Controls: Implement appropriate access boundaries, segmentation, and data protection guardrails using tools such as VPC Service Controls, IAM policies, and Network Security Endpoints.
• Cloud Observability and Operations: Configure and maintain logging, monitoring, alerting, and diagnostic capabilities to support platform reliability, performance management, troubleshooting, and security investigations.
Cloud Engineer
Inspira Financial
Inspira Financial provides health, wealth, retirement, and benefits solutions that strengthen and simplify the health and wealth journey. With more than 7 million clients, representing over $62 billion in assets, Inspira works with thousands of employers, plan sponsors, recordkeepers, TPAs, and other institutional partners, helping the people they care about plan, save, and invest for a brighter future. Inspira relentlessly pursues better outcomes for all with our automatic rollover services, health savings accounts, emergency savings funds, custody services, and more. Learn more at inspirafinancial.com.
Role Description

The Cloud Engineer (CE) will report to the Cloud Engineering Manager in the Technology Department. This role is responsible for the administration, maintenance, and support of the enterprise infrastructure for a dynamic, growing business. The environment spans private and public clouds, networks, firewalls, servers, operating systems, applications, mobile devices, process schedulers, telecommunications, and general databases. The CE will interface closely with the development teams and client service teams to integrate, support, operate, and provide infrastructure related to the Platform Architecture within Inspira. Working within a team environment, the CE participates directly in solution creation, providing hands-on support as well as operational support and training. This individual must be creative, client-focused, solutions-driven, organized, and have the ability to thrive in a dynamic environment.

- Provide direct support and improve the day-to-day operations of hardware and operating systems, including Cloud Services:
- Evaluate system utilization, monitor response time, and provide primary support for detection and correction of operational problems.
- Coordinate and perform additions and changes to servers, network, operating systems and attached devices, including investigation, analysis, recommendation, configuration, installation and testing of new network hardware and software.
- Ensure servers, operating systems and network components are implemented and adhere to the information security policies and infrastructure standards.
- Utilize metrics and cloud native consumption-based services to improve cost efficiencies.
- Configure and support Istio Service Mesh and Helm Chart deployments.
- Configure and support firewalls and security appliances.
- Configure and support the Disaster Recovery and Business Resumption Plan as it relates to the backup and restoration of the technology infrastructure:
- Ensure run books are updated on a regular basis.
- Maintain the VMware and Cloud virtual environments.
- Maintain the Microsoft Active Directory Domain.
- Provide infrastructure problem resolution for various applications throughout the organization.
- Provide general SQL Server database troubleshooting and support.
- Utilize programming skills to design and develop programs or scripts for various repetitive functions.
- Perform all duties with a focus on goals of Inspira, which includes risk mitigation.
- Support inbound calls/emails, maintaining tickets within the issue tracking application related to Infrastructure Support.
- Cross-train other team members to facilitate coverage.
- Other duties as assigned.

Qualifications

- Years of Experience: 3-5 years of applicable experience.
- Degree: Bachelor’s degree in Computer Science or equivalent experience.
- Certification: AZ-900, AZ-104.
- Minimum of 3 years of experience with:
- Windows and/or Linux operating systems, including building, security and deployment in both a physical and Cloud environment.
- Microsoft Azure PaaS and SaaS solution development technologies including Azure API Management, Azure Functions, Logic Apps etc.
- JSON, REST and data-based APIs and high scale performance services.
- Azure Service Bus and Azure Notification Hubs.
- Networking knowledge and the skill sets to configure, maintain, and support Cisco Meraki devices, Azure VNets, VPN Gateways, Route Servers, Route Tables, and Private Endpoints.
- Configure and support Infrastructure as Code deployments utilizing CI/CD pipelines.
- Exposure to logging and monitoring tools of native Azure.
- Exposure to building DR and HA solutions.
- Applying rule sets in Information Security best practices and ability to apply information security methodology to operating systems, network, and databases.
- Scripting languages such as PowerShell, Bash, JS, Python, etc.
- Experience with Cloud Services: Azure (preferred), Google or AWS.
- Experience with BDR solutions such as Veeam, VMware Site Recovery, and Azure Backup/Site Recovery.
- Configure and support Microsoft 365 Services (Exchange, Intune, SharePoint, and Teams).
- Experience with general SQL Server database troubleshooting.
- Ability to work independently with minimal supervision.
- Must have excellent written and verbal communication skills.
- Strong analytical skills, follow-up capability, and problem-solving ability.
- Ability to conduct research into hardware and software issues and products as required.
- Ability to effectively prioritize and execute tasks in a high-pressure environment.
- Ability to use strong interpersonal and presentation skills to share ideas and solutions and to build strong working relationships with business units including non-technical users, technical leads, and developers.
- Knowledge of TCP/IP protocol and firewalls.
- Knowledge of Virtual Machines and Container concepts.
- Knowledge of Security as it relates to Cloud Environments including the Shared Security Model.
- Experience working with a ticketing system and internal clients.
- Ability to respond to emails and text messages after hours to resolve critical issues.
- Must possess strong skills in personal diplomacy and client service while consistently demonstrating a high level of motivation, commitment to teamwork, professionalism, and trustworthiness.
- Strong vendor management skills.
- Highly self-motivated and directed.
- Experience in a high availability environment preferred.
- Knowledge of ITIL/ITSM practices and framework preferred.
• Architect and develop modern back-end systems following 12-factor and cloud best practices, with a focus on reliability and quality;
• Monitor and troubleshoot production errors;
• Work as an engineer within agile, multidisciplinary squads;
• Develop secure applications and perform code reviews;
• Build unit, integration, functional, mutation, and automated tests;
• Work with Git and be familiar with Git Flow;
• Develop using MVC and MVVM software architectures;
• Use continuous integration and continuous deployment (CI/CD) workflows;
• Write reusable components, work with analytics tools, and develop RESTful APIs.



