Insulet is a medical device company dedicated to improving the lives of people on insulin and other injectable drugs by enabling medicine delivery without the u

Senior Data Engineer, Reliability and Observability

DevOps Engineer, Full Time, Hybrid, Senior

Location

Massachusetts

Posted

39 days ago

Salary

$118.8K - $178.2K / year

Seniority

Senior

Bachelor Degree

Job Description

Title: Senior Data Engineer, Reliability & Observability (Hybrid - Acton, MA)
Location: Acton, United States
Full time

Job Title: Senior Data Engineer, Reliability & Observability
FLSA Status: Exempt

Company Overview

Insulet started in 2000 driven to achieve our mission of enabling our customers to enjoy simplicity, freedom and healthier lives through the use of our Omnipod product platform. In the last two decades we have improved the lives of hundreds of thousands of patients who have insulin-requiring diabetes by using innovative technology that is wearable, waterproof, and lifestyle accommodating. We are on an exciting trajectory of significant growth and global expansion enabling us to reach more patients around the globe.

We are looking for highly motivated, performance driven individuals who want to be part of building our Center of Excellence and be at the forefront of our rapidly growing global footprint. We are looking to hire amazing people who are guided by shared values and desire to exceed customer expectations. Our continued success depends on it.

Team Overview

The Pod Software Reliability (PSR) team is focused on improving the reliability, robustness, and observability of embedded software systems through scalable automation, data-driven insights, and close cross-functional collaboration. PSR defines reliability scenarios and metrics; automation and lab functions execute large-scale testing and aggregate data; DevOps integrates those capabilities into CI/CD; and leadership consumes dashboards and reports to make informed engineering and release decisions.

Role Summary

We are seeking a highly skilled Senior Data Engineer, Reliability & Observability to architect, build, and evolve the data foundation that powers PSR's reliability insights.
This role will lead the design of scalable data models, resilient ingestion patterns, schema strategy, and observability architecture across automated testing, lab execution, and CI-integrated reliability workflows. This individual will operate with a high degree of independence and technical judgment, shaping how reliability and automation data is structured, governed, and consumed across engineering and leadership stakeholders. The role requires the ability to design solutions in areas where standards may be immature, fragmented, or still evolving, and to translate ambiguous reliability needs into durable, scalable data architectures.

The Senior Data Engineer will serve as a technical advisor across PSR initiatives, influencing architecture decisions, guiding best practices, and helping other engineers adopt scalable approaches to data modeling, telemetry, and observability. This role is intended for someone with deep prior experience in data engineering and observability systems who can reduce operational fragility, improve long-term maintainability, and accelerate self-service visibility across the organization.

Key Responsibilities

- Architect and own scalable, maintainable, and extensible data models for reliability, automation, lab, and telemetry-generated data.
- Lead the design and evolution of database schemas that support cross-test analytics, traceability, repeatability, and long-term platform growth.
- Design and implement robust ingestion and transformation pipelines across automated test systems, CI/CD workflows, lab infrastructure, and supporting engineering tools.
- Define and standardize shared identifiers, metadata strategies, and data contracts that enable reliable correlation across runs, sessions, devices, builds, environments, and programs.
- Design complex or novel data solutions in areas where standards, tooling, or historical patterns are limited, fragmented, or outdated.
- Provide technical guidance and advisory support to reliability engineers, lab engineers, software/test automation engineers, and DevOps partners on data architecture, observability design, and scalable reporting patterns.
- Influence and help establish engineering standards for schema design, ingestion patterns, telemetry structure, data governance, and dashboard consumption.
- Enable observability workflows by structuring and integrating metrics, logs, events, and related telemetry into fit-for-purpose systems that support both operational debugging and strategic analysis.
- Support the development of dashboards and reporting experiences for both technical users and leadership stakeholders, including engineering deep dives, program health reporting, and executive-level views.
- Drive improvements in data quality, performance, consistency, integrity, and usability across the PSR ecosystem.
- Identify architectural bottlenecks and lead remediation strategies across current data storage, schema design, pipeline reliability, and visualization workflows.
- Create and maintain documentation, standards, and best practices for schema design, ingestion patterns, data governance, and dashboard enablement.
- Mentor and influence other engineers through design reviews, technical recommendations, and practical guidance on scalable data solutions.
- Contribute to and help shape a longer-term data and observability strategy that scales with evolving Pod programs, new test types, and future platform needs.

Required Leadership/Interpersonal Skills & Behaviors

- Exceptional verbal and written communication skills, with the ability to influence technical and non-technical stakeholders at multiple levels of the organization.
- Proven ability to work independently with minimal supervision while driving complex technical initiatives forward.
- Strong technical leadership skills, including the ability to guide, mentor, and advise engineers beyond direct project collaboration.
- Demonstrated ability to navigate ambiguity, evaluate tradeoffs, and make sound architectural decisions in evolving technical environments.
- Strong time management, multitasking, and prioritization skills.
- Proven ability to work collaboratively across cross-functional teams and build alignment without direct authority.

Qualifications

- Bachelor's degree in Computer Science, Software Engineering, Data Engineering, Information Systems, or a related technical field required; Master's degree preferred.
- 7+ years of experience in data engineering, analytics engineering, platform engineering, or related roles; or 5+ years of experience with an advanced degree in a related field.
- Strong experience architecting relational database schemas and modeling structured engineering or operational data for scale, maintainability, and long-term reuse.
- Deep SQL expertise and hands-on experience with relational databases such as PostgreSQL or equivalent platforms.
- Strong experience building and evolving ETL/ELT or other data ingestion and transformation pipelines in production environments.
- Experience using Python or another programming language for data processing, automation, or integration tasks.
- Strong understanding of observability and telemetry concepts, including metrics, logs, events, and time-series data.
- Experience creating scalable reporting and visualization solutions in tools such as Grafana, Power BI, Tableau, or similar platforms.
- Demonstrated ability to influence architecture decisions and partner effectively across engineering, infrastructure, and leadership stakeholders.
- Strong written and verbal communication skills, including the ability to make complex technical concepts understandable and actionable for a broad audience.

Preferred Qualifications

- Experience working in test automation, CI/CD, software quality, reliability engineering, or related engineering environments.
- Experience with observability tooling such as Grafana, Loki, Prometheus, InfluxDB, or similar technologies.
- Experience handling engineering telemetry, event-based data, time-series data, or log-heavy systems at scale.
- Experience supporting embedded systems, hardware/software integration environments, or device-adjacent data platforms.
- Experience in regulated industries such as medical devices, healthcare technology, or other high-quality engineering environments.
- Experience defining standards, governance approaches, retention strategies, lineage, and access control patterns for shared engineering data platforms.
- Experience helping organizations move from fragmented or ad hoc reporting toward unified, scalable, and strategically governed data architectures.
- Demonstrated history of providing technical leadership, architectural guidance, or mentorship to engineers across teams.

Preferred Skills and Competencies

- Strong systems thinking and architectural judgment
- Ability to balance short-term delivery with long-term maintainability and scale
- High attention to detail and commitment to data integrity
- Comfort operating in ambiguous spaces and defining structure where one does not yet exist
- Ability to translate unclear or evolving engineering needs into practical, scalable data solutions
- Strong collaboration, advisory, and influence skills across teams without direct authority
- Bias toward simplification, clarity, reuse, and technical durability
- Sound judgment in selecting patterns, tradeoffs, and technologies appropriate for long-term platform evolution

Additional Information

- The position can be hybrid or in-person at our Acton, MA facility. The preferred location is Acton, MA.
- Travel is estimated at 10% but will flex depending on business needs.

NOTE: This position is eligible for hybrid working arrangements (requires on-site work from our Acton, MA office; may work remotely other days).
Additional Information: Compensation & Benefits

For U.S.-based positions only, the annual base salary range for this role is $118,800.00 - $178,200.00. This position may also be eligible for incentive compensation.

We offer a comprehensive benefits package, including:

- Medical, dental, and vision insurance
- 401(k) with company match
- Paid time off (PTO)
- And additional employee wellness programs

More DevOps Engineer Jobs

DevOps Consultant

Converge Technology Solutions

Converge Technology Solutions provides specialized IT services tailored to meet customers' individual needs. The company offers a wide range of services, including advanced analyti

DevOps Engineer, 39 days ago

Full Time Remote

Role Description

The DevOps Consultant will work with customers to embrace DevOps philosophies to empower the implementation of tools and processes which enable the rapid development and deployment of software, continuous integration/delivery, automated quality checks, and operational metrics that can be consumed by development and product teams. This role will have a blend of development and operational experience that provides them with a good understanding of the developers they are working with, while also balancing customer satisfaction and maintaining critical systems uptime.

- Design, implement, and maintain reliable, scalable, and secure infrastructure and CI/CD pipelines that enable continuous delivery of software.
- Automating deployments, managing cloud environments, monitoring system performance, and ensuring operational stability.
- Building and maintaining CI/CD pipelines.
- Automating infrastructure using Infrastructure as Code (IaC).
- Monitoring system health and responding to incidents.
- Implementing security and compliance controls.
- Supporting development teams with deployment and environment needs.
- Specialized expertise in automation, cloud platforms, system architecture, and deployment strategies.

Qualifications

- Proven working experience in development using languages like Java, Python, Bash, etc.
- Develop logic and write complex code.
- Knowledge of DevOps process.
- Working knowledge of any Cloud (AWS/Azure preferred).
- Experience with Containers/Docker, Kubernetes.
- Experience in Jenkins, Prometheus, ELK.
- Experience in Ansible, Terraform.
- Excellent written and verbal communication skills.
- Ability to work in a dynamic, fast paced environment with interest and ability to learn.

Requirements

- Bachelor’s degree in Computer Science, Programming or related field.
- 4-5 years of related experience preferred.

Physical Requirements

- Prolonged periods of sitting at a desk and working on a computer.
- Must be able to lift up to 15 pounds at times.

Canada

Senior Platform Engineer – DevOps

OZmap

Discover the best solution for documenting fiber optic networks.

DevOps Engineer, 39 days ago

Full Time, Remote, Team 11-50, H1B No Sponsor

• Evolve and structure the observability stack (metrics, logs, and traces), ensuring operational visibility and reliability;
• Build and improve CI/CD pipelines with a focus on security, predictability, and deployment efficiency;
• Implement DevSecOps practices (SAST, DAST, hardening, and vulnerability analysis);
• Manage and evolve AWS environments (EC2, networking, IAM, and multi-environment setups);
• Work closely with the development team to raise the technical level of operations;
• Perform advanced troubleshooting, incident analysis (RCA), and structured post-mortems;
• Automate processes and drive continuous improvements in infrastructure;
• Support the evolution of the architecture towards more modern and scalable environments.

AWS, Cloud, EC2, Grafana, Kubernetes, Linux, Prometheus

Brazil

R$20K / month

DevOps Engineer – Backend

Air Apps

DevOps Engineer, 39 days ago

Full Time, Remote, Team 51-200, H1B No Sponsor

• Design, implement, and maintain CI/CD pipelines to automate build, test, and deployment processes.
• Manage and optimize cloud infrastructure on AWS, Azure, or Google Cloud Platform (GCP).
• Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or Pulumi.
• Deploy and manage containerized applications using Docker and Kubernetes.
• Monitor system performance, identify bottlenecks, and improve infrastructure reliability.
• Establish and enforce security best practices, including identity management, logging, and vulnerability management.
• Work with development teams to improve deployment efficiency and troubleshoot infrastructure issues.
• Optimize cloud resource utilization to ensure cost-effectiveness and scalability.
• Automate infrastructure provisioning, configuration management, and system maintenance.
• Stay up-to-date with emerging DevOps technologies, tools, and best practices.

AWS, Azure, Cloud, Docker, Google Cloud Platform, Grafana, Jenkins, Kubernetes, Linux, Prometheus, Python, Terraform, Go

Switzerland

€50K - €62K / year

Job Closed

Senior Software Engineer, Site Reliability

Babylist

Babylist eases the path to parenthood, offering helpful content, a curated store, and a universal online baby registry through which new parents can discover, request, and buy prod

DevOps Engineer, 39 days ago

Full Time Remote

Who We Are

Babylist is the leading registry, e-commerce, and content platform for growing families. More than 9 million people shop with Babylist every year, making it the go-to destination for seamless purchasing, trusted guidance, and expert product recommendations for new parents and the people who love them. What began as a universal registry has grown into a full ecosystem for new parents, including the Babylist Shop, Babylist Health, and a flagship showroom in Los Angeles. Hundreds of brands in baby and beyond partner with Babylist to engage meaningfully with families during one of life’s most important transitions.

With over $1 billion in annual GMV, and more than $500 million in 2024 revenue, Babylist is reshaping the $320 billion baby product industry. We’re helping parents feel confident, connected, and cared for at every step. As we build the generational brand in baby, our mission remains simple: to connect growing families with everything they need to thrive. To learn more, visit www.babylist.com.

Our Ways of Working

Babylist is remote-first with team members across the U.S. and Canada who move fast, think smart, and use AI as part of how they work every day: not as an experiment, as an expectation. We come together twice a year to build the relationships behind the work, and we hire people who are genuinely excited about what's possible and prove it through how they show up.

How We Build

Babylist is in the middle of a fundamental shift in how software gets made, and we are not tiptoeing into it. We are rebuilding our engineering culture around a simple belief: AI changes everything. How teams are structured, how decisions get made, how fast ideas become working software. Our engineers own problems end to end, working directly with product, design, and business partners with short feedback loops and real stakeholder access. We ship, learn, and iterate fast.
When something is not working, we throw it out and start over; project failure and personal failure are not the same thing here.

AI tools are as natural to our workflow as an IDE or version control. We are not exploring this, we are living it. Our engineers use AI to explore tradeoffs, pressure-test designs, and move from problem to solution in hours instead of days. They generate code with AI so they can stay focused on the decisions that actually require human judgment, not the routine ones. More velocity means more time for craft: better test coverage, stronger architecture, and deeper customer understanding. We hold ourselves to a higher quality bar because of AI, not in spite of it.

We are building this playbook in real time, and we are looking for people who want to build it with us. If you have already changed how you work because of AI (or you are ready to) and you care more about shipping something great than following a prescribed process, we should talk.

Our Tech Stack

- Ruby on Rails
- React
- AWS
- Sidekiq
- MySQL
- Redis
- Native iOS and Android

What the Role Is

Babylist is looking for a Senior Software Engineer, Site Reliability to join our Platform team. In this position, you will play a vital role in ensuring our systems and services' stability, scalability, and reliability. You will work closely with all Babylist Engineering teams to support shared infrastructure and developer tools. Your expertise in site reliability engineering, AWS cloud infrastructure, and modern DevOps practices will be instrumental in optimizing our systems and driving continuous improvement.
Who You Are

- 8+ years of experience as a Site Reliability Engineer or similar role, demonstrating a strong background in maintaining highly available and scalable systems
- Experience supporting high-traffic consumer-facing websites, understanding the unique challenges and considerations in maintaining such systems
- Proficiency with Terraform is a must, as you will be a member of the team responsible for managing and building our AWS infrastructure using Infrastructure as Code (IaC) practices
- You possess strong experience working with AWS cloud-based infrastructure and services, ensuring their reliability, performance, and security
- Proficiency with Docker and Kubernetes is essential, as you will contribute to the design, deployment, and management of containerized applications in our environment
- You have a solid understanding of cloud-native systems design, including CDNs, load balancers, cloud networking, DNS, caching, and distributed systems
- Troubleshooting and debugging are second nature to you, allowing you to quickly identify and resolve issues across various environments
- Experience designing and supporting CI systems such as CircleCI, Jenkins, or GitHub Actions
- You are familiar with monitoring and alerting best practices, utilizing tools like Datadog, Cronitor, Sentry, and PagerDuty to ensure proactive identification and resolution of issues
- Proven experience in on-call management best practices, including effective incident response, escalation procedures, and post-incident reviews to drive continuous improvement and ensure system reliability
- You have excellent verbal and written communication skills, and the ability to collaborate effectively with cross-functional teams
- You're genuinely excited about what AI can do, not just as a concept, but as something you want to get your hands on. At Babylist, every team uses AI daily, and we're looking for people who lean in.
How You Will Make An Impact

- Manage and build our AWS infrastructure using Infrastructure as Code (IaC) tools like Terraform. You will ensure that our EKS clusters and databases are running up-to-date versions, optimizing performance and reliability
- Improve the speed and reliability of our Continuous Integration (CI) systems to support the entire Engineering Team, enabling faster and more efficient development and deployment processes
- Provide support to developers in troubleshooting issues across local development, staging, and production environments
- Establish, communicate, and support best practices for monitoring and alerting. This will involve setting up effective monitoring systems and defining actionable alerts for proactive incident management

About Compensation

We use a market-based approach to compensation. The starting salary range for this role is:

US: $186,818 to $224,183
Canada: $185,600 to $232,000 CAD

Your starting salary will be based on your location, experience, and qualifications, with increases over time tied to performance, role growth, and internal pay equity.

Why You Will Love Working At Babylist

Our Culture

- We work with focus and intention, then step away to recharge
- We believe in exceptional management and invest in tools and opportunities to connect with colleagues
- We build products that positively impact millions of people's lives
- AI tools are as natural to how we work as your IDE or version control; we're not exploring this, we're living it.
Growth & Development

- Competitive pay and meaningful opportunities for career advancement
- We believe technology and data can solve hard problems
- We're committed to career progression and performance-based advancement

Compensation & Benefits

- Competitive salary with equity and bonus opportunities
- Company-paid medical, dental, and vision insurance
- Retirement savings plan with company matching and flexible spending accounts
- Generous paid parental leave and PTO
- Remote work stipend to set up your office
- Perks for physical, mental, and emotional health, parenting, childcare, and financial planning

Important Notices

Recorded Interviews

Babylist uses an interview recording tool to record and transcribe interviews for evaluation purposes in accordance with applicable privacy laws. By participating in an interview, you consent to this recording and transcription.

Interview Integrity

At Babylist, every team uses AI daily and we love it. During interviews though, we want to see you: your thinking, your problem-solving, your creativity. All interviews and assessments should be completed independently without AI tools or third-party assistance unless we tell you otherwise. We'll always be clear when AI is welcome. Misrepresentation during the process may result in removal from consideration.

Protect Yourself from Scams

All official communication comes from the Babylist Talent Team via @babylist.com email addresses. We will never ask for payment, bank information, or personal financial details. If you receive outreach via WhatsApp, Telegram, or a non-Babylist email, it's not us. Verify open roles on our careers page.

Connections at Babylist

In line with our conflict of interest policy, please let us know if you have a family member or close personal relationship with a current Babylist employee. This helps us keep our process fair for everyone.

Text Message Updates

You may opt in to receive SMS updates about your application. Opting out won't affect your status.
Message and data rates may apply. Reply STOP to unsubscribe or HELP for assistance. See our Privacy Policy for details.

United States + 1 more

Job Closed
