Vulcury

Vulcury invests in early stage startups and advises companies of all sizes on strategy, growth, and efficiency

Data Engineer – Pipelines, Structured Markup

Data Engineer | Full Time | Remote | Senior | Team 1-10 | Since 2022 | H1B No Sponsor

Location

India

Posted

123 days ago

Salary

0

Seniority

Senior

Bachelor Degree | 4 yrs exp | English | Airflow | ETL | PostgreSQL | Python | SQL

Job Description

  • Design and maintain ingestion pipelines (Python-based ETL/ELT)
  • Design structured transformation workflows using dbt, SQLMesh, or equivalent
  • Convert unstructured transcripts and documents into normalized database records
  • Maintain PostgreSQL architecture (structured tables, JSONB, indexing strategy)
  • Develop attribute extraction frameworks for technical, commercial, and risk signals
  • Ensure data quality, consistency, and lineage from raw interaction to structured output
  • Collaborate with AI/ML engineers to ensure clean model inputs

Job Requirements

  • 4-5 years of experience
  • Strong Python (data pipelines, orchestration)
  • Advanced SQL (PostgreSQL preferred)
  • Experience with ETL/ELT frameworks (dbt, Airflow, SQLMesh, etc.)
  • Experience handling semi-structured data (JSON, transcripts, document parsing)
  • Strong schema design and normalization skills
  • Familiarity with cloud storage systems (S3 or equivalent)

Benefits

  • Competitive salary
  • Health insurance
  • Paid time off
  • Flexible work arrangements
  • Professional development opportunities

More Data Engineer Jobs

Data Engineer

Koala Health

Koala Health works to provide people with all of the medications and health products their pets need. The company packages medication and health products by date and time to help m

Data Engineer | 123 days ago

  • Own and evolve Koala Health’s end-to-end data infrastructure, including ingestion, transformation, modeling, and delivery.
  • Design and maintain reliable data pipelines from production systems (e.g., application databases, third-party tools, vendors).
  • Build and manage data models that support analytics, reporting, and operational use cases.
  • Establish and enforce best practices for data quality, testing, monitoring, and documentation.
  • Partner with stakeholders across product, operations, finance, and marketing to understand data needs and translate them into scalable solutions.
  • Improve the reliability, performance, and cost-efficiency of the data stack as the business grows.
  • Own incident response and debugging for data issues, proactively identifying and resolving root causes.
  • Create and maintain clear documentation so data assets are understandable and usable across the company.
  • Evaluate and implement tooling improvements where it meaningfully improves developer velocity or data quality.
  • Act as a thought partner to leadership on how data can better support decision-making and operational efficiency.

United States
$125K - $150K / year
Data Engineer | 123 days ago
Other | Remote | Team 51-200 | Since 2016 | H1B No Sponsor

This description is a summary of our understanding of the job description.

Role Description

ROIT is a disruptive and innovative company. We develop and bring to market disruptive technology solutions built on the principle that human intervention is the exception. We offer an intelligent transformation for companies that want to evolve and achieve surprising results with robotization, artificial intelligence, and analytics. Come be part of this transformation!

Responsibilities

  • Implement monitoring systems and routines (data, applications, queries, etc.)
  • Evolve data models, architecture, and data pipeline construction to meet new engineering and business requirements
  • Implement data migration, processing, and storage routines (ETL)
  • Develop integrations between different data sources (RDS, external APIs, etc.) to centralize them in a data lake
  • Implement and run load tests
  • Monitor running data pipelines

Qualifications

  • Experience with Python
  • Experience with GCP
  • Experience with continuous integration and cloud deployment
  • Experience with data lakes, data warehouses, and data marts
  • Knowledge of SQL and NoSQL databases
  • Knowledge of cloud platform resources
  • Knowledge of event-driven architecture

Requirements

Experience in the following is a plus:

  • Work on streaming projects
  • Experience with Airflow
  • Distributed data processing (Spark or similar)

United States + 1 more (all locations: United States | Canada)
Job Closed
Data Engineer

Enode

APIs for connecting to EVs, thermostats and other energy hardware. Building the technology behind a green energy system

Data Engineer | 123 days ago
Full Time | Remote | Team 11-50 | H1B No Sponsor

  • Deliver Flex-critical data needs: build and maintain reliable pipelines and datasets that enable Flex models (e.g., demand/availability signals, aggregations, monitoring).
  • Evolve the data platform: assess what we have today and drive pragmatic improvements in architecture, tooling, and operating practices.
  • Own data quality and trust: implement testing, lineage/definitions, and guardrails (e.g., dbt tests, anomaly detection, freshness checks) so stakeholders can trust outputs.
  • Enable self-serve analytics: produce well modeled datasets and documentation that make it easy for others to answer questions without bespoke work.
  • Partner on data science work: collaborate on data readiness for modelling, feature pipelines, evaluation workflows, and productionization concerns (even if you’re not the primary model builder).
  • Make high-leverage tech choices: propose and justify changes (or non-changes) to tools and processes, prioritizing impact and delivery over long platform rewrites.

Europe
Job Closed
Senior Data Engineer

Theoria Medical

We don’t meet the standards, we set them.

Data Engineer | 123 days ago
Other | Remote | Team 1,001-5,000 | H1B Sponsor

  • Design, build, and maintain scalable data pipelines using Microsoft Fabric and Apache Airflow
  • Ingest, transform, and integrate data from a variety of sources, including relational systems, APIs, and MongoDB
  • Implement and manage data solutions aligned to Medallion architecture principles (Bronze, Silver, Gold)
  • Design and maintain analytical data models, including fact and dimension tables, to support reporting and analytics
  • Optimize data storage, performance, and reliability across lakehouse and warehouse environments
  • Ensure data quality, observability, and lineage through validation, monitoring, and documentation
  • Collaborate with data analysts and BI developers to enable performant, well-modeled datasets for Power BI
  • Partner with clinical, operational, and technical stakeholders to understand data requirements and constraints
  • Support data governance, security, and compliance efforts, including HIPAA-related controls
  • Mentor junior data engineers and contribute to engineering standards and best practices

Michigan
Job Closed