Airbnb is a community based on connection and belonging.
Software Engineer, Unified Data Store Substrate Team
Location
Brazil
Posted
61 days ago
Salary
0
Seniority
Mid Level
Job Description
Airbnb
Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.

The Community You Will Join:
The Unified Data Store (UDS) team is on a mission to build a reliable, scalable, and globally distributed system-of-record storage platform for Airbnb. We design, build, and operate the infrastructure that powers all critical Airbnb data, including user, listing, reservation, and financial data. Supporting over 150 million users and hosts worldwide, our work delivers a world-class user and developer experience grounded in exceptional reliability, scalability, efficiency, and security.

As a member of this team, you’ll collaborate with top-tier engineers to build and evolve a modern distributed infrastructure service. You will be the expert on data storage systems and high-performance infrastructure service APIs, and you will guide Airbnb product teams on the effective use of technologies in large-scale systems and on performance optimization.

The Difference You Will Make:
We’re looking to hire Engineers or Senior Engineers who are hands-on and excited to tackle broad technical challenges in the following areas:
- UDS Client Stack - Design, build, and operate a high-performance, highly available, and scalable data access layer that provides a seamless and unified interface for accessing online product data. You’ll abstract away the underlying complexity, such as storage, indexing, replication, security, and lifecycle management, so product developers can move faster with confidence.
- Tooling, Automation and Developer Experience - Empower engineers across Airbnb by simplifying how they work with data. You’ll build tools and automation to define, test, and deploy data schemas, as well as solutions to monitor, migrate, and debug production data systems, ultimately improving developer productivity and system reliability.

Your Expertise:
- 3-5 years of relevant industry experience
- Hands-on experience in building and operating distributed systems
- Good understanding of systems and infrastructure fundamentals
- Ability to own and dive deep into a complex code base
- Commitment to writing clean, readable, testable, maintainable code
- Excellent collaboration and communication skills in a remote-working environment
- Interest in leveraging cutting-edge technologies, including AI, to build innovative solutions
- Professional English fluency is required

Our Commitment To Inclusion & Belonging:
Airbnb is committed to working with the broadest talent pool possible. We believe diverse ideas foster innovation and engagement, and allow us to attract creatively-led people, and to develop the best products, services and solutions. All qualified individuals are encouraged to apply.

We also strive to provide a disability-inclusive application and interview process. If you are a candidate with a disability and require reasonable accommodation in order to submit an application, please contact us at: reasonableaccommodations@airbnb.com. Please include your full name, the role you’re applying for and the accommodation necessary to assist you with the recruiting process. We ask that you only reach out to us if you are a candidate whose disability prevents you from being able to complete our online application.
More Data Engineer Jobs
Data Engineer
Clever Digital Marketing
Built to unlock the next stage of growth for home improvement companies.
• Implement and manage robust, scalable ETL/ELT pipelines to ingest, transform, and load data from Google Ads, Meta Ads, and Microsoft/Bing Ads APIs, as well as CRM, call tracking, and lead event sources, into our BigQuery data ecosystem
• Own Bronze layer ingestion with full fidelity, audit trails, and zero silent failures
• Build and maintain the permanent Cloud Run connector for the Microsoft Ads Reporting API (SOAP/OAuth2), replacing interim Dataslayer bridges as the platform matures
• Integrate Pub/Sub-based event streaming for real-time lead data flows and speed-to-lead use cases
• Translate designs and specifications from the Data Architect into functional, production-grade data infrastructure and code
• Extend the Medallion Architecture (Bronze → Silver → Gold), owning Silver layer transformation logic and Gold layer Dataform models serving Command Centre, client reporting, and future AI/ML consumers
• Own the CDMID join logic that unifies CRM, ad platform, and call tracking data into a coherent lead record across all client accounts
• Develop and manage data models within BigQuery, ensuring data is organized efficiently for analytics and AI/ML workloads
• Apply dimensional modeling and star schema design principles to build a single source of truth for all reporting and analytics across our home improvement advertiser base
• Deliver the clean, trusted, cost-metric-driven datasets that allow our team to move beyond vanity metrics to cost-per-issued-lead and cost-per-demo insights that drive real client decisions
• Write Python-based automation scripts and leverage GCP services (Cloud Run, Cloud Functions, Pub/Sub, and Dataflow) to orchestrate data workflows and eliminate manual processes
• Graduate CRM push workflows off Zapier and onto a reliable, auditable Cloud Run pipeline
• Proactively monitor and tune data pipelines and BigQuery queries for performance and cost-efficiency
• Implement data quality checks, validation rules, and monitoring across the full pipeline to ensure accuracy, completeness, and timeliness of all data assets
• Build the observability and alerting layer so the team knows about data issues before clients do
• Maintain living architecture documentation as a reliable source of truth across workstreams
• Own and manage the data pipeline for updating and adding records to our internal database of contact and company information
• Complete data analysis requests for the engineering and product team
• Evaluate and configure new data services
• Own the overall improvement of our internal database of contacts and company records
Data Engineer – Databricks
Addepto
Big Data, Machine Learning, Data Science, AI, MLOps and DataOps Services
• Design scalable data processing pipelines for streaming and batch processing using Big Data technologies like Databricks, Airflow and/or Dagster.
• Contribute to the development of CI/CD and MLOps processes.
• Develop applications to aggregate, process, and analyze data from diverse sources.
• Collaborate with the Data Science team on Machine Learning projects, including text/image analysis and predictive model building.
• Develop and organize data transformations using Databricks/DBT and Apache Airflow.
• Translate business requirements into technical solutions and ensure optimal performance and quality.
• Develop and maintain a high-performance data processing platform for automotive data, ensuring scalability and reliability.
• Design and implement data pipelines that process large volumes of data in both streaming and batch modes.
• Optimize data workflows to ensure efficient data ingestion, processing, and storage using technologies such as Spark, Cloudera, and Airflow.
• Work with data lake technologies (e.g., Iceberg) to manage structured and unstructured data efficiently.
• Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
• Monitor and troubleshoot the platform, ensuring high availability, performance, and accuracy of data processing.
• Leverage cloud services (AWS) for infrastructure management and scaling of processing workloads.
• Write and maintain high-quality Python (or Java/Scala) code for data processing tasks and automation.



