Job Closed

This listing is no longer active.

BLACKBIRD.AI

Deception Detection for the Information Age.

Staff Data Engineer

Data Engineer | Other | Remote | Lead | Team 11-50 | H1B No Sponsor

Location

New York + 2 more

Posted

126 days ago

Salary

$160K - $190K / year

Seniority

Lead

8 yrs exp | English | Apache HTTP Server, AWS, Azure, Elasticsearch, Python, Apache Spark, SQL

Job Description

• Design and implement scalable data platform architecture on Databricks, supporting both batch and streaming ingestion
• Build robust, fault-tolerant data ingestion pipelines that integrate with multiple third-party APIs and data providers
• Design and implement AI-powered enrichment stages within pipelines, applying ML clustering, generative AI summarization, classification, and entity extraction to transform raw data into actionable intelligence
• Build analytical systems with full-text search capabilities using Elasticsearch for rapid querying and analysis of enriched data
• Work with AI/ML researchers to implement, integrate, and scale AI processing
• Expose data platform capabilities as APIs and other interfaces for downstream consumption by applications and services
• Optimize data lake and lakehouse architecture for performance, cost-efficiency, and scalability
• Design and implement data quality frameworks, monitoring, and alerting systems
• Design efficient architectures for calling external AI APIs and managing rate limits, costs, and reliability
• Architect solutions with cost-efficiency as a first-class concern, implementing monitoring and optimization strategies for compute and storage
• Make critical build-vs-buy decisions and establish architectural standards for the data organization
• Mentor engineers and elevate the team's technical capabilities through code reviews, design discussions, and knowledge sharing

Job Requirements

8+ years of software engineering experience with 5+ years focused on data platforms or data engineering
Deep expertise with Databricks, Apache Spark, and data lakehouse architectures
Strong experience building and operating data pipelines at scale (handling TBs+ of data)
Experience integrating AI/ML capabilities into data pipelines (clustering, LLM APIs, classification, summarization)
Proficiency in Python, dbt, and SQL for data processing and pipeline development
Experience with both batch and streaming large-scale data processing patterns
Strong understanding of cloud platforms (AWS, Azure)
Excellent communication skills and ability to mentor engineers
Preferred Qualifications:
Experience designing both batch and streaming/near real-time data architectures
Proficiency with Elasticsearch for building analytical systems with full-text search capabilities
Hands-on experience with LLM APIs and understanding of rate limiting and cost optimization
Experience with Agentic AI, context engineering, and evaluation
Background in trust & safety, security, or content moderation domains
Experience with data observability tools and building comprehensive monitoring systems
Prior experience at a startup or fast-paced environment
Experience applying agentic coding tools to day-to-day development
Familiarity with Databricks' Lakeflow, Agent Bricks, and vector databases

Benefits

Competitive compensation package, 401(k), and equity - everyone has a stake in our growth!
Comprehensive health benefits for you and your loved ones, including wellness days and monthly wellness reimbursements - an apple a day doesn't always keep the doctor away!
Generous vacation policy, encouraging you to take the time you need - we trust you to strike the right work/life balance!
A flexible work environment with opportunities to collaborate with your team in person - you can have it all!
Inclusion and Impact - soar to new heights!
Professional development stipend - never stop learning!

More Data Engineer Jobs

Senior Data Engineer

SmarterDx

SmarterDx, founded in 2020 in New York, New York, is a health technology company focused on clinical AI solutions that enhance hospital revenue integrity and ca

Data Engineer | 126 days ago

Other | Remote

• Design, develop, and maintain dbt data models that support our healthcare analytics products.
• Integrate and transform customer data to conform to our data specifications and pipelines.
• Design and execute initiatives that improve data platform and pipeline automation and resilience.
• Participate in a rotation of engineers that diagnose, triage, and solve production data issues.
• Apply industry standards and best practices to data testing, observability, and platform stability.

Airflow, Apache HTTP Server, AWS, Informatica, Apache Spark, SQL

United States

$200K - $220K / year

Job Closed

Data Engineer

Mento

Data Engineer | 126 days ago

Other | Remote

• Build data pipelines for coaching and user analytics
• Create data systems that power product features
• Establish data infrastructure and architecture
• Process and analyze AI/LLM outputs
• Support business operations and analytics
• Create accessible analytics infrastructure
• Work closely with product on instrumentation and data collection

ETL, PostgreSQL, Python, SQL

United States

Job Closed

Data Platform Engineer II

Jamf

The Standard in Apple Enterprise Management

Data Engineer | 126 days ago

Other | Remote | Team 1,001-5,000 | Since 2002 | H1B Sponsor

This description is a summary of our understanding of the job description.

Role Description

Business Intelligence at Jamf powers data-driven decision-making across the organization. As a Data Platform Engineer II, you’ll be responsible not just for building & transforming data, but for owning critical data infrastructure: from ingestion and storage, to governance, quality, and consumption by analytics/ML tools. You will partner with analysts, data scientists, product owners and engineers to ensure that Jamf’s data assets are reliable, high-performance, secure, and scalable. For those candidates who live near a Jamf office, you may be expected to work periodically in-office or at a collaborative work location with other Jamf employees in your area for certain events or moments that matter.

What you can expect to do in this role:

• Design, build, maintain, and improve the data platform infrastructure (Snowflake environments, Airflow workflows, orchestration, CI/CD pipelines for dbt / transformations)
• Develop and maintain Terraform (or equivalent IaC) definitions for provisioning data infrastructure (compute, storage, permissions, networking where needed)
• Automate deployment of data transformations (e.g. dbt CI/CD, staging / production pipelines)
• Ensure data platform availability, reliability, security and performance (e.g. enforce roles & permissions in Snowflake, resource monitoring, concurrency/usage optimisation)
• Instrument monitoring, logging and alerting of data workflows (Airflow / Kubernetes / dbt jobs)
• Collaborate with Data Engineers / Analysts / Architects to define platform capabilities, set standards & best practices around schema design, governance, version control, and performance
• Run capacity planning, ensure cost-efficiency, scaling strategy (e.g. concurrency limits in Snowflake warehouse sizing, cluster autoscaling, etc)
• Facilitate onboarding of teams to the data platform: document usage patterns, create templates or utilities (for example dbt macros, shared libraries)
• Participate in architecture reviews, evaluate new platform tooling (e.g. enhancements to orchestration, transformation frameworks, security strategy, etc)
• Troubleshoot critical incidents and participate in incident / post-mortem cycles for platform issues

Qualifications

• Minimum of 3 years experience building data pipelines with Python (Required)
• Minimum of 3 years experience working with data warehouse or other cloud based database technology, with strong proficiency in SQL. (Required)
• Experience with Docker / Kubernetes (Required)
• Exposure to Infrastructure-as-Code (IaC) such as Terraform or DevOps (Required)
• Experience working with dbt (Preferred)
• Strong experience with cloud infrastructure: AWS (EC2, ECR, S3, Glue, RDS, etc) or equivalent public cloud provider
• Hands-on experience in CI/CD, version control, unit / integration testing for data pipelines
• Comfortable working in agile teams, and mentoring others
• Strong Communication Skills
• Excellent Interpersonal Skills
• Excellent Organizational Skills
• Proven Analytical Skills
• Ability to communicate complex technical terms in an easy to understand, non-technical manner
• Ability to interact effectively with co-workers in a result driven culture
• Self-starter, energetic multi-tasker, highly motivated and team player
• Ability to engage with and establish trust and rapport with all levels of customers and employees
• Agile practitioner experienced in Scrum or Kanban
• General knowledge of Apple products and eco-systems
• Bachelor's Degree in Mathematics, Computer Science or related field (Required)
• A combination of relevant experience and education may be considered

Requirements

• Participation in ongoing security training is mandatory
• Established security protocols will be adhered to, sensitive data will be handled responsibly, and data protection practices are followed, including understanding relevant privacy regulations and reporting breaches
• Acknowledging the Jamf Code of Conduct, where applicable security and privacy policies can be found, is a requirement of all roles at Jamf

Benefits

• Named a 2025 Best Companies to Work For by U.S. News
• Named a 2024 Best Technology Company to Work For by U.S. News
• Named one of Forbes Most Trusted Companies in 2024
• Named a 2024 Best Companies to Work For by U.S. News
• Opportunity to make a real and meaningful impact for more than 75,000 global customers
• Support for new innovations and OS releases the moment they are made available by Apple
• Work with a small and empowered team where the culture is based on trust, ownership, and respect
• Clear career path that enables you to grow under supportive leadership and management
• Access to the Jamf Engineering blog for insights on innovative projects
• Pay Transparency Range: $85,100 - $181,700 USD

Python, Snowflake, SQL, Docker, Kubernetes, Terraform, AWS, dbt, CI/CD

United States

$85.1K - $181.7K / year

Job Closed

Data Architect, AWS, DataBricks, MySQL

Solvd, Inc.

Get things Solvd. | Software Development & QA

Data Engineer | 126 days ago

Full Time | Remote | Team 501-1,000 | Since 2010 | H1B No Sponsor

• Define the target architecture for Customer360 on Snowflake or Databricks, including ingestion patterns, modeling standards, and governance.
• Design and lead the Golden Record / identity resolution approach (deterministic matching first), including identifiers, survivorship rules, confidence scoring.
• Create the canonical customer model (core entities/relationships) and align marts/domains (e.g., insurance, cards, loans) into a unified customer layer.
• Establish data quality frameworks: checks (null/uniqueness/RI/thresholds), monitoring/alerts, lineage/source-of-truth mapping, and data SLAs.
• Define activation-ready outputs (customer attributes, segments, eligibility indicators) and support low-latency enablement patterns where needed.

Apache HTTP Server, AWS, Apache Kafka, MySQL, NoSQL, SQL

Argentina

Job Closed
