Pioneer of the Connected Operations Cloud
Senior Software Engineer – Data Platform
Location: United States
Posted: 4 days ago
Salary: $130.9K - $220K / year
Seniority: Senior
Job Description
Senior Software Engineer – Data Platform
Samsara
- Design, build, and operate high-scale data ingestion and replication systems from Samsara’s primary production data stores, including RDS, DynamoDB, internal APIs, and event-driven systems, into our data lakehouse.
- Build and maintain reliable, scalable, and modern data platform infrastructure capable of handling petabytes of data across Samsara’s analytics, AI, product, and operational use cases.
- Improve the reliability, observability, scalability, security, and developer experience of Samsara’s Spark and Databricks-based data processing platform.
- Develop internal libraries, APIs, frameworks, and tooling in languages such as Go and Python to help teams across Samsara move, process, discover, and access data safely and efficiently.
- Work on foundational data lake and lakehouse technologies, including Delta Lake on S3, data catalogs, metadata services, orchestration systems, and platform automation.
- Collaborate closely with infrastructure, product engineering, data science, analytics, security, and data engineering teams to understand platform needs and deliver durable, scalable solutions.
- Stay connected to modern data platform technologies and help shape Samsara’s long-term data infrastructure roadmap, including support for AI, privacy, security, global scale, and customer-facing data products.
- Champion, role model, and embed Samsara’s cultural principles (Focus on Customer Success, Build for the Long Term, Adopt a Growth Mindset, Be Inclusive, Win as a Team) as we scale globally and across new offices.
Job Requirements
- 4+ years of professional software engineering experience in production environments.
- 4+ years of experience building or maintaining large-scale production data infrastructure, data platforms, distributed systems, or data lake systems.
- Strong experience with Apache Spark or similar distributed data processing systems.
- Experience operating production infrastructure in AWS, including services such as S3, RDS, DynamoDB, SQS, Kinesis, Lambda, or similar.
- Experience designing, building, and operating reliable systems with strong ownership of scalability, observability, security, and operational excellence.
- Proficiency in at least one production programming language such as Go, Python, Scala, or Java.
- Ability to collaborate effectively with cross-functional partners, including software engineers, data scientists, analysts, security teams, and product stakeholders.
Benefits
- Flexible, employee-led remote model.
- Professional development stipend.
- Comprehensive health and parental leave plans.

