Senior Data Engineer

Data Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 1980 | H1B Sponsor

Location

Illinois

Posted

3 days ago

Salary

$108.1K - $125.3K / year

Seniority

Senior

Job Description

Senior Data Engineer

CCC Intelligent Solutions

  • Senior Data Engineers for various and unanticipated worksites throughout the U.S. (HQ: Chicago, IL).
  • Develop large-scale, end-to-end data pipeline applications covering multiple data sources spread across the data center and AWS cloud.
  • Use developed software applications to locate and analyze source data; create data flows to extract, profile, and store ingested data; define and build data cleansing and imputation; map to a common data model; transform to satisfy business rules and statistical computations; and validate data content.
  • Produce software data building blocks, data models, and data flows, such as dimensional data, data feeds, dashboard reporting, and data science research and exploration.
  • Produce automated software tests of data flow components and of data content quality.
  • Automate orchestration and error handling for use by production operation teams.
  • Provide technical expertise to diagnose errors reported by production support teams.
  • Guide junior team members in performance tuning of applications in distributed computing environments.
  • Perform root cause analysis on all data and processes and identify opportunities for improvement.
  • Develop metadata-driven and fully parameterized data processing tools.
  • Mentor junior engineers.

Job Requirements

  • Master’s degree in Computer Science, Computer Engineering, Management Information Systems or related field plus 2 years of experience in software development/data processing or analysis required.
  • Hands-on experience with: programming in Python and PySpark; Hadoop; HDFS, MapReduce, YARN, AWS EMR, Redshift, and Terraform; Hive, HBase, Parquet, ORC, Spark SQL, Sqoop, and Apache Hudi; orchestrating ETL pipelines involving data sourcing, transformation, and publishing using Apache Airflow; performance tuning of applications in distributed computing environments; designing and developing data pipeline applications with Apache Kafka; advanced SQL for data profiling and data validation; Unix commands and scripting; performing root cause analysis on internal and external data and processes to identify opportunities for improvement; JIRA, GitLab, and Subversion; development of metadata-driven and fully parameterized data processing tools; AWS.

Benefits

  • 401K Match
  • Paid time off
  • Annual Incentive Plan
  • Performance Bonus
  • Comprehensive health insurance
  • Adoption Assistance
  • Tuition Reimbursement
  • Wellness Programs
  • Stock Purchase Plan options
  • Employee Resource Groups

More Data Engineer Jobs

Data Engineering Manager

EY

Building a #BetterWorkingWorld by providing trust through assurance and helping organizations grow, transform & operate.

Data Engineer | 3 days ago
Full Time | Remote | Team 10,001+ | Since 1989 | H1B Sponsor

  • Lead and mentor a team of data engineers
  • Define and drive the overall data architecture strategy
  • Oversee the design and implementation of data ingestion frameworks and integration solutions
  • Develop and manage CI/CD pipelines
  • Collaborate closely with clients and internal stakeholders
  • Act as a trusted advisor to clients
  • Ensure adherence to data governance and security standards
  • Drive the adoption of DevOps/DataOps principles within the team
  • Manage project priorities and delivery timelines

India
Full Time | Remote | Team 501-1,000 | H1B Sponsor

  • Support scalable data operations through development of ETL processes, SQL-based integrations, Power Platform solutions, and Power BI reporting capabilities.
  • Design, build, maintain, monitor, and troubleshoot data-processing automations.
  • Develop and maintain data ingestion pipelines from external sources into SQL databases.
  • Manage automated flows to trigger Logic Apps and handle lightweight processes.
  • Perform full-stack BI development including data modeling, DAX development, and report publishing.
  • Leverage Microsoft Fabric as the unified access platform.
  • Ensure alignment with security and compliance requirements.
  • Conduct root-cause analysis to reconcile discrepancies between systems.

Washington
$125K - $150K / year
Full Time | Remote | Team 501-1,000

Role Description

We are looking for a Senior Data Engineer to join the Innovation team as a core member of the PF-LLM programme, our initiative to build a from-scratch multivariate time-series foundation model across a fleet of ~1,000 wind and PV sites. You will be the connective tissue of the entire programme:

  • Owning the data foundation that makes state-of-the-art model training possible.
  • Managing the inference service that makes model outputs usable.
  • Overseeing platform integration that puts those outputs in front of pilot customers.

From production ETL through to shadow-mode validation pipelines, you will be the engineer who keeps every track moving. This role is critical-path from day one.

Qualifications

  • 6+ years of back-end and data engineering experience, with a proven track record of shipping production systems.
  • Production-grade ETL/ELT pipeline design at scale: idempotency, retry logic, backfill jobs, incremental loading, and cost-controlled warehouse compute.
  • Schema design and data modelling across heterogeneous sources, with experience reconciling signals from disparate systems into a canonical, queryable format.
  • Data quality engineering: automated quality gates (sparsity, flatline detection, outlier flagging, freshness checks), alerting pipelines, and dataset versioning for ML reproducibility.
  • API design and development: RESTful inference services with contract testing, latency and throughput budgeting, and structured observability (logs, metrics, traces).
  • Experience integrating ML model outputs into SaaS product surfaces: auth and authorisation, customer isolation, and feature flag management.
  • Cloud infrastructure proficiency (AWS preferred), containerisation (Docker, Kubernetes), and CI/CD pipeline ownership.
  • Python and SQL as core tools; hands-on experience with modern warehouse technologies (Snowflake, BigQuery, or Databricks).
  • Pipeline orchestration with Airflow, Prefect, Dagster, or equivalent.
  • Excellent written and verbal communication skills in English.

Requirements

  • Design and build the production ETL pipeline from source systems to warehouse and feature store at fleet scale, covering thousands of wind and PV sites across multiple OEMs.
  • Own canonical signal schema design across wind and PV asset classes and OEMs, the deepest technical unknown in the programme and the foundation everything else depends on.
  • Implement automated data quality gates: sparsity and missingness checks, flatline detection, outlier flagging, and freshness validation, with alerting that generates tickets automatically.
  • Implement dataset versioning sufficient to reproduce every trained model from scratch.
  • Build and maintain backfill jobs, idempotency guarantees, and retry logic that survive mid-run failure without duplicating data.
  • Govern storage and compute costs on the warehouse from day one.
  • Build the batch and on-demand inference API with contract tests, sized for fleet-wide daily runs.
  • Establish latency and throughput baselines; own the cold-start and model-loading strategy.
  • Instrument the service with structured logs and metrics from the outset.
  • Integrate forecasts into the Power Factors product platform: auth and authorisation with customer isolation, observability hooked into the existing stack, and feature flags per customer and per site.
  • Build and maintain the shadow validation pipeline: run live inference in parallel with the existing forecast path, log predictions and actuals, and produce weekly validation reports broken down by asset class, OEM, and region.
  • Support the pilot customer rollout: enable the product for friendly customers behind flags and own incoming data and integration tickets throughout the pilot window.
  • Work closely with the ML Engineer to align on data quality requirements, feature store interfaces, and the handoff between the data platform and training pipeline.
  • Partner with the Tech Lead and Frontend Engineer during platform integration to ensure a clean, maintainable integration surface.
  • Contribute to architectural decisions across the programme and document data flows, schemas, and pipeline runbooks to a standard that supports the broader team.

Benefits

  • Comprehensive benefits package including health, dental, and vision coverage, plus dedicated wellness support.
  • Generous paid vacation policy.
  • Employer RRSP matching program.
  • Work-from-abroad opportunities with manager approval.
  • Exposure to a global team operating across multiple countries and time zones.
  • A humble cause with a clear purpose: you will help us fight climate change with code every day at work.

Worldwide

Senior Data Engineer – Enterprise B2B Marketplace

Truelogic Software

Premium boutique software development company that helps brands with big ideas to make a difference in people’s lives.

Data Engineer | 3 days ago
Full Time | Remote | Team 501-1,000 | Since 2004 | H1B No Sponsor

  • Data Platform Evolution: Guide the foundational architecture, scaling strategies, and long-term roadmap of the enterprise data platform.
  • Pipeline Engineering: Design and lead the development of highly scalable data pipelines using Airflow, dbt, and Python.
  • Modern Stack Integration: Build and maintain high-throughput integrations across core modern data stack tools, including Fivetran, Redshift, and Sigma.
  • Serverless Architecture: Develop and optimize serverless data services and ingestion layers leveraging AWS infrastructure (e.g., AWS Lambda).
  • Advanced Data Modeling: Partner with cross-functional stakeholders to define reliable, performant data warehouse architectures and analytical datasets.
  • Observability & Reliability: Implement automated testing, rigorous monitoring frameworks, and tracing to maximize pipeline reliability and minimize operational downtime.
  • Technical Leadership & Governance: Mentor data engineers and analysts on engineering best practices, while driving continuous improvements in data governance and documentation.

Mexico