Provectus

We help businesses leverage cloud, data, and AI to reimagine the way they operate, compete, and deliver customer value.

Senior Data Engineer, AI

Data Engineer | Full Time | Remote | Senior | Team 501-1,000 | Since 2012 | H1B Sponsor

Location

Poland

Posted

63 days ago

Salary

0

Seniority

Senior

Job Description

  • Design, build, and maintain robust data pipelines and ML systems for production environments.
  • Develop and deploy ML and LLM-based solutions addressing real client business challenges.
  • Build and maintain ETL/ELT workflows using modern orchestration and distributed computing tools.
  • Implement MLOps practices: CI/CD, automated testing, model monitoring, and experiment tracking.
  • Architect and implement cloud-native data and AI/ML solutions, primarily on AWS.
  • Collaborate closely with Data Scientists, AI/ML Engineers, Backend Engineers, and client stakeholders.
  • Participate in code reviews, contribute to technical documentation, and share knowledge within the team.
  • Engage in client-facing discussions to understand requirements and propose technical solutions.

Job Requirements

  • 6+ years of hands-on engineering experience with production systems, not just building POCs.
  • Full-stack mindset, comfortable across AI, Backend development, Data, and cloud infrastructure.
  • Autonomous working style — you drive work forward without needing heavy process overhead.
  • Experience adopting AI tools in day-to-day workflows (e.g. Claude Code, GitHub Copilot, or similar).
  • Strong sense of ownership and proactivity; you spot problems before they're handed to you.
  • Openness to broadening skills into adjacent areas.
  • B2+ English, comfortable collaborating across distributed, multicultural teams.
  • Strong Python and SQL skills and solid software engineering fundamentals.
  • Hands-on experience with Apache Spark for large-scale data processing.
  • Proficiency with cloud data warehouse technologies: Snowflake, Redshift, or ClickHouse.
  • Proven experience building batch data workflows with Apache Airflow or similar orchestration tools.
  • Experience with real-time data processing using Kafka and streaming frameworks.
  • Experience with LLM-based application patterns, including RAG architectures, prompt design, and agentic workflows.
  • Basic understanding of embedding models, vector databases, and semantic search.
  • Awareness of LLM evaluation techniques and quality assurance approaches.
  • Experience deploying and maintaining ML models in production environments.
  • Understanding of CI/CD practices applied to ML pipelines.
  • Hands-on experience with AWS (SageMaker, Bedrock, Lambda, Glue, S3, ECR, or similar); GCP considered. Relevant cloud certifications are a plus.

Benefits

  • Impactful work: projects span GenAI, MLOps, and NextGen data platforms for global enterprises across multiple industries.
  • Senior-calibre peers: collaborate with top ML and Data professionals across North America, LATAM, and EMEA.
  • Career growth: a clear path toward Tech Lead if you have the ambition — we actively develop our engineers.
  • Recognised expertise: AWS Premier Consulting Partner featured in Forrester’s AI Technical Services Landscape.

More Data Engineer Jobs

Data Engineer, Azure

Keyrus

#MakeDataMatter #HumanizingTheFuture

Data Engineer | 63 days ago
Full Time | Remote | Team 1,001-5,000 | Since 1996 | H1B Sponsor

  • Design and implement scalable data architectures on Azure, ensuring data is integrated and made available efficiently to the company’s various departments.
  • Develop, manage, and optimize ETL pipelines, monitor databases, ensure information security and integrity, and automate processes to increase operational efficiency.
  • Work in partnership with data scientists and analysts to ensure data is clean, structured, and ready for analysis.
  • Implement data governance policies, ensure compliance with regulations, maintain systems in proper working order, and perform continuous improvements as needed.

Brazil

Staff Data Engineer

General Motors

Join us on our journey toward a world with zero crashes, zero emissions, and zero congestion.

Data Engineer | 63 days ago
Full Time | Remote | Team 10,001+ | Since 1908 | H1B Sponsor

Description

The Role

As a Staff Data Engineer within the Marketing Applied Sciences organization, you will be responsible for designing and developing high-quality, well-managed, and reusable data and analytics products (including AI/ML, Data, and Marketing solutions) for applied analytics solutions and insights. You will design, build, and optimize data engineering pipelines and marketing data products, as well as contribute to AI agents, LLM-based solutions, and Ad Ops pipelines, while owning and enforcing data and AI standards and ensuring security and compliance.

Additionally, the Data Engineer will contribute to building and maintaining LLM-based AI agents using both platform-native capabilities and custom agent frameworks, integrating them with enterprise-scale AI models and data systems. Agents will include wrappers and UI to seamlessly integrate sub agents. You will also be responsible for monitoring, maintaining, and enhancing data and ML pipelines and associated agents. This role will also be responsible for integrating with 3rd party data providers and marketing platforms, APIs, and providing marketing insights.

As a Staff Data Engineer you will collaborate closely with product managers, data engineers, data scientists, and other partners to develop state-of-the-art AI and Data Engineering solutions that enable the future of marketing. Success in this role requires a blend of technical aptitude, marketing know-how, and cross-functional collaboration. This role is ideal for someone who thrives at the intersection of data engineering, data science, and marketing strategy, and who is eager to shape the future of customer experiences at one of the world's most iconic automotive brands.

What You'll Do

  • Define, own, and enforce data and AI/ML standards and industry best practices and dedicate efforts to mentoring and overseeing AI/ML efforts across the Marketing Applied Sciences portfolio.
  • Build and maintain scalable data pipelines, services, and AI agents to accelerate and optimize business processes and tooling.
  • Identify and implement optimizations that improve the runtime, performance, scalability, stability, and cost efficiency of data platforms, pipelines, and AI/ML agents and workflows.
  • Own data stewardship for key marketing datasets, driving data quality, documentation, access controls, and responsible use in partnership with data governance, privacy, and marketing stakeholders.
  • Help design and evolve marketing and Ad Ops data infrastructure, including privacy-safe environments such as data clean rooms, to enable advanced audience insights, activation, and measurement.
  • Design and implement robust data models, ingestion, and transformation workflows that ensure high-quality, well-documented, and discoverable data for analytics and AI use cases.
  • Collaborate with the Engineering community to highlight best practices, implement tactics, and provide feedback to the community.
  • Build unit and integration tests, data quality checks, logging and observability, and all industry-standard best practices to ensure products are production ready.
  • Elevate data and system design, diagnostics, and operational excellence to higher standards.

Your Skills & Abilities (Required Qualifications)

  • Bachelor's degree in Computer Science, ML Engineering, Data Engineering, Information Systems, Mathematics, or a related technical field.
  • 8+ years of experience in Software Engineering, Data Engineering, Data Science or related field, building, maintaining and optimizing distributed systems.
  • Experience building high performance data and AI solutions.
  • Experience in Python and/or other Object-Oriented programming languages (Java, C++ etc.).
  • Proficiency in SQL and distributed data processing frameworks such as PySpark.
  • Proficient with modern data platforms and tools such as Databricks, Airflow, dbt, Snowflake, Kafka, or similar.
  • Proficient with cloud architecture systems such as Azure, AWS, GCP, etc.
  • Ability to communicate complex solutions with fellow Engineers and non-technical business stakeholders alike.
  • Demonstrated ability to lead projects that bridge marketing, data science, and technology to drive measurable outcomes.
  • Strong understanding of modern data architectures and pipelines, and familiarity with LLMs, AI agents, automation, and related best practices.
  • Demonstrated success in collaborating with cross-functional teams and translating business requirements into scalable data and AI/ML solutions.

What Can Give You a Competitive Advantage (Preferred Qualifications)

  • Master's degree in Computer Science, ML Engineering, Data Engineering, Information Systems, Mathematics, or a related technical field.
  • 2+ years of experience with paid, earned, and owned media or marketing analytics.

The salary range for this role is ($160,200 - $246,300). The actual base salary a successful candidate will be offered within this range will vary based on factors relevant to the position.

Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance.

Benefits: GM offers a variety of health and wellbeing benefit programs. Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts and more.

GM does not provide immigration-related sponsorship for this role. Do not apply for this role if you will need GM immigration sponsorship now or in the future.
This includes direct company sponsorship, entry of GM as the immigration employer of record on a government form, and any work authorization requiring a written submission or other immigration support from the company (e.g., H1-B, OPT, STEM OPT, CPT, TN, J-1, etc.).

This role is categorized as remote. This means the selected candidate may be based anywhere in the country of work and is not expected to report to a GM worksite unless directed by their manager. The selected candidate will be required to travel <25% for this role.

About GM

Our vision is a world with Zero Crashes, Zero Emissions and Zero Congestion and we embrace the responsibility to lead the change that will make our world better, safer and more equitable for all.

Why Join Us

We believe we all must make a choice every day, individually and collectively, to drive meaningful change through our words, our deeds and our culture. Every day, we want every employee to feel they belong to one General Motors team.

Total Rewards | Benefits Overview

From day one, we're looking out for your well-being, at work and at home, so you can focus on realizing your ambitions. Learn how GM supports a rewarding career that rewards you personally by visiting Total Rewards resources.

Non-Discrimination and Equal Employment Opportunities (U.S.)

General Motors is committed to being a workplace that is not only free of unlawful discrimination, but one that genuinely fosters inclusion and belonging. We strongly believe that providing an inclusive workplace creates an environment in which our employees can thrive and develop better products for our customers. All employment decisions are made on a non-discriminatory basis without regard to sex, race, color, national origin, citizenship status, religion, age, disability, pregnancy or maternity status, sexual orientation, gender identity, status as a veteran or protected veteran, or any other similarly protected status in accordance with federal, state and local laws.

We encourage interested candidates to review the key responsibilities and qualifications for each role and apply for any positions that match their skills and capabilities. Applicants in the recruitment process may be required, where applicable, to successfully complete a role-related assessment(s) and/or a pre-employment screening prior to beginning employment. To learn more, visit How we Hire.

Accommodations

General Motors offers opportunities to all job seekers including individuals with disabilities. If you need a reasonable accommodation to assist with your job search or application for employment, email us [email protected] or call us at 1-800-865-7580. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.

United States
$160.2K - $246.3K / year
Job Closed

Senior Data Engineer - Precision Health Informatics - Dallas, Tx

The US Oncology Network

The US Oncology Network is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.

Data Engineer | 63 days ago
Full Time | Remote | Team 10,001

Overview

The US Oncology Network is looking for a Remote Senior Data Engineer to join our Precision Medicine Informatics Team. Some travel may be necessary.

As a part of The US Oncology Network, Texas Oncology delivers high-quality, evidence-based care to patients close to home. Texas Oncology is the largest community oncology provider in the country and has 600+ providers in 300+ sites across Texas. Our founders pioneered community-based cancer care because they believed in making the best available cancer care accessible to all communities, allowing people to fight cancer at home with the critical support of family and friends nearby. Our mission is still the same today: at Texas Oncology, we use leading-edge technology and research to deliver high-quality, evidence-based cancer care to help our patients achieve “More breakthroughs. More victories.”® in their fight against cancer. Today, Texas Oncology treats half of all Texans diagnosed with cancer on an annual basis.

The US Oncology Network is one of the nation’s largest networks of community-based oncology physicians dedicated to advancing cancer care in America. The US Oncology Network is supported by McKesson Corporation, focused on empowering a vibrant and sustainable community patient care delivery system to advance the science, technology, and quality of care.

Why work for us?

Come join our team that is responsible for helping lead Texas Oncology in treating more patients diagnosed with cancer than any other provider in Texas. We offer our employees a competitive benefits package that includes Medical, Dental, Vision, Life Insurance, Short-term and Long-term disability coverage, a generous PTO program, a 401k plan that comes with a company match, a Wellness program that rewards you for practicing a healthy lifestyle, and lots of other great perks such as Tuition Reimbursement, an Employee Assistance program and discounts on some of your favorite retailers.

What does this position do?

The Senior Data Engineer is pivotal in designing, building, and maintaining robust data pipelines that ingest, transform, and manage large-scale clinical and molecular datasets from a diverse range of sources, including somatic and germline molecular results, lines of therapies, pharmaceutical treatments, treatment timelines, and more. You will play a key role in enabling data-driven insights for oncology care and research by ensuring seamless integration and high-quality management of complex healthcare data, primarily sourced from clinical lab service providers in a variety of clinical data formats. Supports and adheres to the US Oncology Compliance Program, to include Code of Ethics and Business Standards, and US Oncology's Shared Values.

Responsibilities

The essential duties and responsibilities (including but not limited to):

Data Pipeline Development

  • Build and optimize scalable ETL/ELT pipelines for ingesting and transforming clinical, molecular, and lab data.
  • Improve data architectures and ensure efficient, accurate data integration.

Analytics & Enablement

  • Run advanced data queries to support analytics, reporting, and validation activities.
  • Document data assets, pipelines, and logic for transparency and compliance.
  • Support regulatory reporting and research workflows, including PHI protection and risk modeling.
  • Translate clinical and research needs into scalable technical solutions.

AI and LLM Integration

  • Integrate AI and LLM capabilities into data platforms and collaborate with AI teams on implementation.
  • Stay current with emerging AI technologies to enhance platform functionality.

Monitoring and Maintenance

  • Ensure data quality and integrity through monitoring, validation, and reconciliation.
  • Implement alerting systems and resolve issues proactively.
  • Perform regular cloud infrastructure maintenance and improvements.

Collaboration and Support

  • Partner with cross-functional teams to deliver robust, end-to-end data solutions.
  • Manage multiple priorities in a fast-paced environment.

Documentation and Best Practices

  • Maintain clear documentation of architectures, processes, and procedures.
  • Promote best practices in cloud engineering, ETL development, and AI integration.

Qualifications

The ideal candidate will have the following background and experience:

Education

  • Bachelor’s, Master’s, or Ph.D. in Computer Science, Engineering, or related field.

Experience

  • 7–10 years of data engineering experience focused on ETL/ELT development.
  • 5+ years in healthcare data engineering or equivalent certifications.
  • Hands-on experience integrating AI/LLM capabilities into data platforms.

Technical Skills

  • Expertise in scalable ETL/ELT pipeline design and maintenance.
  • Strong SQL and relational database skills (e.g., SQL Server).
  • Proficiency in scripting/automation (Python, Perl, PHP, Bash).
  • Experience with cloud data platforms (Azure, AWS, GCP) and big-data tools (Databricks).
  • Skilled in data modeling and schema design for clinical/molecular data.
  • Knowledge of AI technologies and large language models.

Healthcare & Oncology Domain Knowledge

  • Understanding of precision oncology workflows and clinical data formats.
  • Familiarity with molecular/genomic data (NGS, variants, biomarkers).
  • Experience integrating lab, pathology, and molecular testing data.
  • Knowledge of healthcare standards (HL7, FHIR, ICD-10, LOINC, SNOMED).
  • Experience with EHR systems (iKnowMed, Epic, Orchard Enterprise Labs).

Collaboration & Communication

  • Strong communication skills with ability to work independently and collaboratively.
  • Excellent problem-solving, prioritization, and stakeholder-management skills.

Preferred Qualifications

  • Relevant certifications such as SQL Certified Associate or Azure Data Fundamentals.
  • Experience with code repositories (e.g., GitHub) and knowledge-management tools (e.g., Confluence).
  • Strong understanding of data-warehousing concepts and technologies.
  • Ability to design, deploy, and manage cloud infrastructure across AWS, Azure, or GCP.

PHYSICAL DEMANDS

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations will be offered to enable individuals with disabilities to perform the essential functions. Requires sitting for long periods of time. Some bending and stretching are required. Adequate finger dexterity and feeling to perform keyboarding and substantial repetitive motions involving the wrists, hands and/or fingers. Requires vision and hearing corrected to normal range. Must be able to view computer screens and printed material accurately. Occasionally lifts and carries items weighing up to 40 lbs.

WORK ENVIRONMENT

The work environment characteristics described here are representative of those an employee encounters while performing the essential functions of this job. Reasonable accommodations will be offered to enable individuals with disabilities to perform essential functions. The work environment is typical of an office setting.

#USONTX

United States
Job Closed

Data Capturing Specialist

Syndigo

Products move when content flows™

Data Engineer | 63 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2013 | H1B Sponsor

Title: Data Capturing Specialist
Location: Remote - Germany

Job Description:

Syndigo powers the continual flow of data and content throughout the entire commerce ecosystem, accelerating delivery of accurate and compelling information that increases sales on every shelf. We are the recognized leader in software and services for the management of master data, product information, digital assets, and content syndication and analytics across industries including grocery, foodservice, hardlines, home improvement, oil & gas, pet, health and beauty, automotive, apparel, and healthcare products. Syndigo serves the industry’s largest two-sided network, connecting more than 50,000 global users across 12,000+ global brands with more than 1,750 global retailers.

Basically, we're the people that deliver the rich, accurate product content that helps consumers shop online with confidence, and helps brands and retailers operate efficient product supply chains. We cannot do all of this without our amazing employees who make the magic happen here at Syndigo. As we continue to grow, we’re always looking to identify talented individuals to join our team.

This role is open to candidates currently residing in Germany.

The Data Capturing Specialist is responsible for capturing, maintaining, and managing product data within the GDSN data pool. The role includes handling customer orders, ensuring data quality and correct billing, maintaining customer communication, and supporting administrative and project-related tasks.

HOW WE’LL BE WINNING TOGETHER DAY TO DAY

  • Responsible for data capturing within the GDSN (Global Data Synchronization Network)
  • Handling orders for a defined customer portfolio within the GDSN data pool
  • Documenting and archiving customer orders in the ticketing system
  • Managing individual customer communication
  • Ensuring correct billing of services provided according to the defined pricing structure
  • Maintaining customer master data in the CRM system
  • Contributing to defined projects and continuously expanding knowledge of GDSN data maintenance requirements
  • Performing general administrative and commercial tasks

WE SHOULD TALK IF THIS SOUNDS LIKE YOU

  • 1+ years of Data Management Experience. GDSN experience is preferred.
  • German: Full professional proficiency is non-negotiable. All customer communication, systems, and documentation are in German.
  • A completed commercial apprenticeship or degree, or a career changer with a commercial background
  • Ability to quickly and independently familiarize yourself with complex topics
  • Understanding of and interest in technical processes
  • Ideally, prior experience in customer service or customer-oriented work

Diversity, Equity & Inclusion

To achieve the best version of our organization, we know it takes new ideas, new approaches, new perspectives and new ways of thinking. A purpose we are 100% committed to cultivating. Diversity is woven into our fabric at Syndigo and it’s how we stay an industry leader, innovating technology solutions that equip our customers with everything they need to be successful! All are welcome here and we invite you to join our team if you are ready to help us continue that growth!

GDPR/CCPA

Syndigo, to process applications, holds onto data for a "reasonable time" after applications are submitted. This data is stored for Syndigo's internal use by HR/Recruiting Staff only. Verified requests for data deletion and exports will be completed upon request.

Syndigo Job Applicant Privacy Notice

At Syndigo, we care about your privacy. As you go through our recruitment process, we are committed to being transparent about how we process your personal data. To learn more about how Syndigo processes your personal data, go to our Job Applicant Privacy Notice.

Germany