Job Closed
This listing is no longer active.
Boutique Recruitment Agency sourcing Leaders for growth businesses.
Lead Data Engineer
Location
United States
Posted
65 days ago
Salary
$180K - $200K / year
Seniority
Senior
Job Description
Serv Recruitment Agency
- Lead the architecture and evolution of scalable, distributed data pipelines, ensuring high availability and performance at scale
- Design and implement robust data models to support reporting and advanced data applications
- Build and maintain distributed web scraping systems using tools such as Playwright, Selenium, and BeautifulSoup
- Develop systems capable of handling anti-scraping measures, proxy rotation, and high-volume data extraction
- Integrate AI and LLMs into engineering workflows for code generation, automation, and optimization
- Apply prompt engineering techniques to improve data processing, documentation, and troubleshooting
- Identify and implement system and process improvements to optimize performance and efficiency
- Manage and scale cloud-based data infrastructure, including data warehouses, object storage, and search systems
- Deploy and maintain containerized workloads using Kubernetes
- Implement data quality monitoring and governance processes to ensure accuracy and reliability
- Mentor junior engineers through code reviews, documentation, and knowledge sharing
- Communicate technical concepts clearly and provide business context for engineering decisions
Job Requirements
- 5+ years of experience in Data Engineering with a track record of scaling systems
- Expert proficiency in Python and advanced SQL, including performance tuning and optimization
- Strong experience with workflow orchestration tools such as Airflow or Prefect and transformation tools such as dbt
- Proven experience building resilient web scraping systems using Playwright, Selenium, and BeautifulSoup
- Deep understanding of relational and NoSQL databases including Postgres, MongoDB, and ElasticSearch
- Experience working with large-scale data systems such as BigQuery
- Strong proficiency with CI/CD pipelines, Git, and Docker
- Experience designing and maintaining distributed systems with high availability and fault tolerance
More Data Engineer Jobs
Principal Engineer – PCS Data Fabric
GE HEALTHCARE
GE HealthCare is a leading global medical technology and digital solutions innovator. Our purpose is to create a world where healthcare has no limits. Unlock your ambition, turn ideas into world-changing realities, and join an organization where every voice makes a difference, and every difference builds a healthier world.
Job Description Summary
Join us in building clinical grade data platforms that power meaningful insights across connected devices, diagnostics, and digital health. In this role, you will shape scalable data architecture on AWS that enables secure ingestion, governance, analytics, and responsible AI/ML. Your work will have a direct impact on improving patient outcomes and supporting clinicians, product teams, and partners around the world. We welcome applicants from all backgrounds, especially those historically underrepresented in tech, and encourage candidates who meet most, but not all, of the qualifications to apply. We value curiosity, collaboration, and a growth mindset.
Job Description
What You’ll Do
Data Platform Architecture
- Design and evolve cloud‑native data platforms using S3, Lake Formation, Glue (catalog/ETL), Athena, EMR/EKS‑Spark, Redshift (including serverless), and Kinesis/MSK for streaming.
- Define lake and lakehouse patterns, real‑time and batch pipelines, and governed self‑service analytics capabilities.
Governance & Privacy
- Implement PHI tokenization/pseudonymization, fine‑grained access controls (column/row level), Macie discovery, encrypted storage (KMS), and data retention/lineage strategies using Glue and tags.
- Apply DLP and other privacy‑preserving controls aligned with HIPAA, GDPR, HITRUST, and FDA/ISO frameworks.
Interoperability
- Enable data exchange using FHIR, DICOM, HL7, and device telemetry through IoT Core into streaming and lake layers.
ML & MLOps
- Build governed ML workflows with SageMaker pipelines, model registry, lineage tracking, explainability, and bias reporting.
- Support dataset versioning and incorporate human‑in‑the‑loop processes when needed.
Self‑Service Data & Data Products
- Lead data mesh/product governance, enable Redshift/Athena consumption, support DataZone cataloging and access workflows, and utilize Clean Rooms for privacy‑preserving collaboration.
Reliability & Performance
- Architect for resiliency across multi‑AZ/multi‑region deployments, including S3 replication, lifecycle management, partitioning/compaction, and cost‑efficient performance tuning.
Validation & Auditability
- Maintain validation packages for regulated analytics and AI pipelines, including traceable lineage and CFR Part 11 evidence.
Required Qualifications
This role will need to work out of the central time zone.
- 12+ years of experience in data or analytics platforms.
- 6+ years leading AWS data architecture at scale.
- Deep expertise with S3, Lake Formation, Glue, Athena, EMR, Redshift, Kinesis/MSK, and SageMaker.
- Experience governing PHI and regulated ML workflows.
Preferred Qualifications
- Experience with table formats such as Apache Iceberg, Delta Lake, or Hudi, and ACID‑on‑lake patterns.
- Knowledge of CDC ingestion (DMS).
- Familiarity with curated imaging pipelines (DICOM) and vector search for clinical text/notes.
- FinOps practices for data platforms (tiering, compression, query optimization).
What We Offer
- A collaborative environment where diverse perspectives are valued.
- Opportunities for ongoing learning, mentorship, and professional growth.
- Flexibility, autonomy, and support from peers across engineering and product teams.
- The chance to build solutions that have a real impact in healthcare.
We will not sponsor individuals for employment visas, now or in the future, for this job opening.
For U.S. based positions only, the pay range for this position is $188,000.00-$282,000.00 Annual. It is not typical for an individual to be hired at or near the top of the pay range and compensation decisions are dependent on the facts and circumstances of each case. The specific compensation offered to a candidate may be influenced by a variety of factors including skills, qualifications, experience and location.
In addition, this position may also be eligible to earn performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI). GE HealthCare offers a competitive benefits package, including but not limited to medical, dental, vision, paid time off, a 401(k) plan with employee and company contribution opportunities, life, disability, and accident insurance, and tuition reimbursement.
Additional Information
GE HealthCare offers a great work environment, professional development, challenging careers, and competitive compensation. GE HealthCare is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, national or ethnic origin, sex, sexual orientation, gender identity or expression, age, disability, protected veteran status or other characteristics protected by law. GE HealthCare will only employ those who are legally authorized to work in the United States for this opening. Any offer of employment is conditioned upon the successful completion of a drug screen (as applicable). While GE HealthCare does not currently require U.S. employees to be vaccinated against COVID-19, some GE HealthCare customers have vaccination mandates that may apply to certain GE HealthCare employees.
Relocation Assistance Provided: No
Application Deadline: May 29, 2026
Data Engineer
GFT Technologies
As a pioneer for digital transformation, GFT develops sustainable solutions across new technologies.
- Design, build, and maintain scalable, resilient data pipelines
- Work with large volumes of structured and unstructured data
- Ensure data quality, consistency, and data governance
- Collaborate with software engineers, analysts, and data scientists
- Participate in technical decisions about data architecture and tooling
Senior Data Engineer
GFT Technologies
As a pioneer for digital transformation, GFT develops sustainable solutions across new technologies.
- The professional will work on building data pipelines, developing CDS Views and integrating SAP → Azure Data Factory → Azure SQL, ensuring data quality, consistency and availability for analytics and strategic dashboards.
- The Senior Data Engineer will be responsible for developing and evolving data pipelines and integrations between the SAP environment and the corporate Data Warehouse in Azure, ensuring data quality, consistency and scalability.
- The role works directly on extracting data via CDS Views, integrating with Azure Data Factory and building data layers in Azure SQL, supporting the definition of the final Data Warehouse architecture.
- The professional will also help ensure the data environment is prepared to support strategic dashboards and the migration of the SICOF system to SAP.
- Create and adjust CDS Views in SAP
- Extract data via Azure Data Factory
- Enrich the Data Warehouse (Azure SQL)
- Ensure data quality and consistency
- Support the construction of the final data architecture
- Develop advanced CDS Views in SAP (associations, annotations, performance tuning)
- Adjust and optimize existing views
- Integrate data SAP → Azure Data Factory → Azure SQL
- Implement ingestion pipelines and incremental loads
- Ensure data quality, consistency and integrity
- Support the definition of the Data Warehouse architecture
- Assess the need for dimensional modeling (Star Schema)
- Support the evolution of data governance
- Collaborate with data, BI and architecture teams to ensure availability and reliability of information
Principal Data Engineer (Research, Manufacturing, Supply Chain) - REMOTE
Vertex Inc.
Vertex is a global biotechnology company that invests in scientific innovation.
Job Description
We are seeking a dedicated Principal Data Engineer with a passion for pharmaceutical research to lead and manage our critical Data Engineering team as they strategically enable scientists across Vertex with data. As part of the Data & Software Engineering (DSE) Team, you will be responsible for developing, curating, and maintaining data assets that enable our scientists and manufacturing and supply chain teams to make timely, informed, and critical decisions. You will work closely with data scientists, data engineers, and platform engineers to scale the Vertex Data Platform (VDP), which is Vertex’s cutting edge technology ecosystem for Data Engineering, Data Science, and Advanced Analytics. Using the VDP, you will help ensure that our scientists are equipped with information to drive their analysis, models, and investigations.
Vertex Pharmaceuticals is in a transformational period where we are accelerating our capabilities, technologies, and data to augment our scientific mission, enable Vertex to grow in scale, and continue to be on the forefront of science, medicine, and technology. As part of this effort, it is critical to ensure that Vertex’s research data platform continues to accelerate and enable our data scientists to drive innovation and insight.
Key Responsibilities:
- Data Engineering: Integrate & curate data from research systems & artifacts to support analytics, modelling, machine learning, and investigation. Collaborate closely with our research engagement teams to understand requirements, and translate them into data solutions
- Data Engineering management: Manage, maintain, and improve the Vertex Data Platform solutions that support & enable research scientists and manufacturing and supply chain analytics teams
- Delivery management: Estimate, architect, and execute on delivery of critical data solutions across research, manufacturing, and supply chain domains in partnership with the DSE leadership team
- Operations Management: Manage a team of data engineers to maintain compliant, timely, secure, and reliable data workloads for Research. Work alongside our DSE MLOps team to support complex workloads that leverage curated & model ready data
- Innovation champion: Advocate for process enhancements and opportunities to improve our capabilities with a focus on efficiency, scale, and data connectivity
Qualifications:
- Minimum of 9 years of development experience using Snowflake, Databricks, Spark, Redshift, or equivalent data technologies
- Minimum of 9 years of experience in pharmaceutical research, with an emphasis on data engineering, data science, data integrity, and data governance
- Prior experience leading Data Engineering projects and teams
- 3+ years leveraging Databricks, Snowflake, AWS, or equivalent cloud data platforms
- Demonstrated experience with pipeline technologies like Astronomer / Airflow, MLFlow, etc.
- Demonstrated ability to work independently and manage multiple projects that require collaboration across functional areas.
- Skillful, collaborative team player able to develop rapport and credibility with stakeholders.
- Demonstrated ability and willingness to teach, engage and support others as they learn new technologies and concepts.
- Enthusiasm for and the ability to quickly learn new technologies and tackle difficult problems
- Strong presentation, verbal, and written communication skills
- Working knowledge of key workflow tools, including JIRA and Confluence
Pay Range: $148,000 - $222,000
Disclosure Statement: The range provided is based on what we believe is a reasonable estimate for the base salary pay range for this job at the time of posting. This role is eligible for an annual bonus and annual equity awards. Some roles may also be eligible for overtime pay, in accordance with federal and state requirements. Actual base salary pay will be based on a number of factors, including skills, competencies, experience, and other job-related factors permitted by law.
At Vertex, our Total Rewards offerings also include inclusive market-leading benefits to meet our employees wherever they are in their career, financial, family and wellbeing journey while providing flexibility and resources to support their growth and aspirations. These range from medical, dental and vision benefits to generous paid time off (including a week-long company shutdown in the Summer and the Winter), educational assistance programs including student loan repayment, a generous commuting subsidy, matching charitable donations, 401(k) and so much more.
Flex Designation: Remote-Eligible
Flex Eligibility Status: In this Remote-Eligible role, you can choose to be designated as:
1. Remote: work remotely five days per week and come into the office on occasion (you’re always welcome on-site)
2. Hybrid: work remotely up to two days per week
3. On-Site: work five days per week on-site with ad hoc flexibility
Note: The Flex status for this position is subject to Vertex’s Policy on Flex @ Vertex Program and may be changed at any time.
Company Information
Vertex is a global biotechnology company that invests in scientific innovation.
Vertex is committed to equal employment opportunity and non-discrimination for all employees and qualified applicants without regard to a person's race, color, sex, gender identity or expression, age, religion, national origin, ancestry, ethnicity, disability, veteran status, genetic information, sexual orientation, marital status, or any characteristic protected under applicable law. Vertex is an E-Verify Employer in the United States. Vertex will make reasonable accommodations for qualified individuals with known disabilities, in accordance with applicable law. Any applicant requiring an accommodation in connection with the hiring process and/or to perform the essential functions of the position for which the applicant has applied should make a request to the recruiter or hiring manager, or contact Talent Acquisition at ApplicationAssistance@vrtx.com

