Job Closed
This listing is no longer active.
Collaborative Robotics' mission is to create a world where humans and robots collaborate in a trusted partnership.
Senior Data Infrastructure Engineer
Location
California | Colorado | Massachusetts | Pennsylvania | Washington
Posted
132 days ago
Salary
$180K - $215K / year
Seniority
Senior
Job Description
Collaborative Robotics
- Own the full ingestion path from edge to cloud, ensuring robot telemetry, sensor data, and warehouse events are reliably captured, transported, and made available for downstream systems.
- Design, build, and operate scalable pipelines and foundational data layers (streaming and batch) that deliver low-latency, reliable data for analytics, AI/ML, and product features.
- Build and maintain ingestion pipelines from object storage (e.g., S3) into Databricks, including raw → staged → analytics-ready layers, supporting both streaming and batch workloads.
- Own the reliability and CI/CD of the data warehouse and foundational data layers, enabling safe, repeatable deployment of schema changes, transformations, and infrastructure that analytics engineers depend on.
- Implement observability, monitoring, and data quality checks to ensure pipeline correctness, detect failures or drift, and maintain trust in data used by Vista, Portal, and Scoutmap.
- Scale and optimize multi-tenant data infrastructure, balancing performance, reliability, and cost-efficiency as Cobot’s customer base and data volume grow.
- Collaborate directly with robotics, AI/ML, product, and analytics teams to translate product requirements into resilient data systems that unlock customer-facing features.
- Establish and enforce best practices for data engineering, reliability, security, and CI/CD across ingestion, staging, and warehouse layers, owning the foundations while enabling analytics engineers to ship metrics, marts, and dashboards efficiently.
Job Requirements
- 5+ years of professional experience in data engineering or data infrastructure roles.
- Strong proficiency in Python and SQL, with the ability to write production-quality, scalable, and well-tested code.
- Proven experience designing and operating ingestion pipelines and staging layers (streaming and batch) that support downstream analytics and product use cases.
- Experience deploying and managing cloud data infrastructure in AWS using infrastructure-as-code and container tooling (e.g., Terraform, Kubernetes, Docker).
- Hands-on experience with cloud-based data platforms, storage systems, and infrastructure.
- Familiarity with data quality practices, testing frameworks, and CI/CD for data pipelines.
- Highly motivated teammate with excellent oral and written communication skills.
- Enjoys working in a fast-paced, collaborative, and dynamic start-up environment as part of a small team.
- Willingness to travel occasionally for on-site support or testing, as needed.
- Must have and maintain US work authorization.
Benefits
- Equity
- Comprehensive benefits
More Infrastructure Engineer Jobs
Site Reliability Engineer
ProArch
Consulting and technology, enabled by cloud, guided by data, fueled by apps, and secured by design.
Role Description
ProArch is looking for a passionate and skilled Site Reliability Engineer (SRE) to join our team. As an SRE, you will be responsible for ensuring the reliability, availability, and performance of our systems and services. You will collaborate with various teams to optimize production environments, troubleshoot performance issues, and implement best practices for service reliability. Your contributions will be critical to improving system uptime and enhancing user satisfaction.
- Monitor system performance and reliability, ensuring uptime meets organizational SLAs.
- Implement and maintain observability tools to gather metrics and logs for proactive issue detection.
- Troubleshoot and resolve complex production issues across various components of our infrastructure.
- Collaborate with software engineering teams to design and implement scalable, fault-tolerant architectures.
- Develop and maintain automation scripts for deployment, monitoring, and system management.
- Participate in on-call rotation to respond to production incidents and perform root cause analysis.
- Contribute to capacity planning and performance tuning to ensure optimal resource utilization.
- Document infrastructure, processes, and incident responses to promote knowledge sharing.
Qualifications
- 8+ years of experience as a Site Reliability Engineer, DevOps Engineer, or related role.
- Strong experience with cloud providers such as AWS, Azure, or GCP.
- Proficiency in scripting languages such as Python, Bash, or Go.
- Experience with container orchestration tools like Kubernetes.
- Familiarity with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
- Experience in Snowflake.
- Account Admin expertise for Snowflake.
- Solid understanding of networking and security principles.
- Experience with monitoring and logging tools such as Prometheus, Grafana, or ELK stack.
- Excellent problem-solving skills and a proactive attitude.
- Strong communication and teamwork skills, with an emphasis on collaboration.
Preferred Qualifications
- Experience with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Knowledge of service mesh architectures and modern microservices patterns.
- Background in software development and familiarity with Agile methodologies.
Senior Infrastructure Engineering Manager – FedHealth, Platform
Nava
Building simple, effective government services. Want to contribute? We're hiring!
- Guide and develop a team of 10-12 platform engineers by providing coaching, feedback, and growth opportunities, setting clear goals, managing performance, and ensuring accountability.
- Foster a positive, inclusive culture, support employee well-being, and lead by example, while aligning team efforts with organizational goals, removing obstacles, and enabling the team to achieve results effectively.
- Work closely with development, operations, security, and architect engineers to identify needs, incorporate feedback, and deliver scalable and secure platform solutions.
- Contribute directly to the design, architecture and implementation of platform features and capabilities.
- Manage platform engineering budgets, coordinate resource allocation, and optimize costs for cloud and on-premise infrastructure.
- Define and enforce platform engineering standards, documentation, automated workflows, and compliance requirements.
- Ensure thorough documentation and facilitate knowledge sharing across technical teams, promoting internal awareness and adoption of platform solutions across Nava.
Cloud Infrastructure Engineer – GCP
Egen
Engineering new possibilities with platforms, data, and generative AI
- Implement cloud-based IaC solutions.
- Develop and implement automation to support continuous delivery and continuous integration solutions.
- Use GCP services to deploy highly available, scalable, and secure applications.
- Implement workflows to automate the release and upgrade process for applications in Development, Test, and Production environments.
- Implement secure integrations using GCP security and networking technologies.
- Administration and engineering of IAM user Role-Based Access Controls and processes.
- Create and update support documentation and standards.
- Develop automated methodologies for deployment activities, configuration management, supporting systems, and business processes.
- Investigate and contribute to solving various issues in production environments.
Role Description
We are looking for a Site Reliability Engineer to join our Network and Security Operations Center (NOC), a team at the heart of platform reliability for mission-critical SaaS environments. You will help maintain, optimize, and ensure the reliability and performance of the systems that power our cloud infrastructure across AWS and Kubernetes, with a strong focus on automation, observability, and continuous improvement. This role blends reliability engineering with incident command, giving you real ownership over uptime, performance, and innovation. You will be part of a highly skilled team that values creative problem-solving, operational excellence, and continuous improvement through automation and resilience engineering.
Your Responsibilities
- Collaborate with other Engineering teams to support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
- Innovate relentlessly: Identify pain points, propose creative solutions, and drive initiatives that simplify, scale, and strengthen the platform.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Own observability: Enhance and expand monitoring and alerting using Datadog; define SLOs/SLIs and create actionable dashboards that drive reliability outcomes.
- Drive automation: Develop and improve internal tooling, IaC frameworks, and pipelines (Terraform, GitLab CI/CD) to reduce manual intervention and enable self-healing systems.
- Scale systems sustainably through automation and evolve systems by pushing for changes that improve reliability and velocity.
- Act as an agent orchestrator using Amazon Kiro: run multiple activities in parallel by leveraging AI agents to accelerate execution, while personally validating results and completing selected tasks manually when needed.
- Be on-call.
- Practice sustainable incident response and blameless postmortems. Lead post-incident reviews (RCAs) and identify long-term fixes that improve stability, reliability, and developer experience.
- Implement monitoring, logging, alerting, and SLA reporting.
- Create and maintain technical documentation.
- Implement, maintain and mature SRE best practices.
- Lead incidents: Act as Incident Commander for incidents; coordinate cross-team response, manage communications, and ensure rapid service restoration.
- Provide support for our planning and deployment teams to enable stability, predictability, and scale in our continued growth.
- Collaborate with members of the Platform Engineering team to implement and support far-reaching strategic efforts, provide constructive feedback, and foster a collaborative environment.
- Work cross-functionally with internal teams and vendors to manage our growth around the globe, with a strong focus on maintaining the high level of performance, availability, and reliability for our users.
Qualifications
- 5+ years in Site Reliability, Cloud, or DevOps Engineering, ideally in SaaS or large-scale production environments.
- Experience designing and deploying large scale systems, multi-vendor platforms and globally distributed infrastructure.
- Proven experience managing cloud infrastructure in AWS (multi-account, VPC, EC2, EKS) and Kubernetes at scale.
- Strong hands-on experience with IaC and automation (Terraform, Ansible, or similar).
- Familiarity with CI/CD pipelines and release automation (GitLab preferred, Jenkins acceptable).
- Deep understanding of monitoring and observability using Datadog (or equivalent), including metric design, log pipelines, alerting, and dashboards.
- Experience with incident management, on-call participation, escalation, and structured postmortems.
- Scripting skills in Python, Bash, Java or equivalent for automation and diagnostics.
- Curiosity, ownership, and a bias for action; you see a problem, you solve it, and you share the lessons learned.
- Experience with FedRAMP compliance is a strong asset.
- Basic knowledge of Java- or .NET-based development required.
- Strong English communication skills, both written and spoken, are essential for effective correspondence with customers, business partners and colleagues beyond the province of Quebec.
Requirements
- Escalation on-call rotation.
- Occasional travel (quarterly offsites, conferences – less than 10%).
Benefits
- We understand that experience comes in many forms and that careers are not always linear. If you don't meet every requirement in this posting, we still encourage you to apply.
- At Tecsys, we are committed to fostering a diverse and inclusive workplace where all employees feel valued, respected, and empowered.
- We believe that diversity drives innovation and strengthens our ability to deliver exceptional solutions.
- We welcome and encourage applicants from all backgrounds, experiences, and perspectives to join our team.
- Tecsys is an equal opportunity employer. Accommodation is available for applicants selected for an interview.
- NB: if you are applying to this position, you must be a Canadian Citizen or a Permanent Resident of Canada, OR, have a valid Canadian work permit.




