Keyrus

#MakeDataMatter #HumanizingTheFuture

Data Engineer, Google Cloud, AI & Machine Learning

Data Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 1996 | H1B Sponsor

Location

Brazil

Posted

145 days ago

Salary

Seniority

Senior

Bachelor Degree | Portuguese | BigQuery | GCP | Python

Job Description

• Design, develop, and maintain scalable, reliable data pipelines on GCP to support Analytics, Machine Learning, and AI initiatives.
• Build and orchestrate data ingestion, transformation, and processing pipelines using Python, Jupyter Notebooks, and Dataproc.
• Prepare, organize, and make data available for Machine Learning and Generative AI models.
• Work with BigQuery for data analysis, transformation, modeling, and performance optimization.
• Support the development of Generative AI applications by integrating data into solutions based on LangChain, Google ADK, and Vertex AI.
• Manage data artifacts, models, and experiments using Artifact Registry.
• Use Google Vector Database for solutions involving embeddings, semantic search, and RAG (retrieval-augmented generation) use cases.
• Version code, pipelines, and notebooks using GitLab, following engineering best practices.
• Collaborate with multidisciplinary teams in an agile environment, supporting data-driven architectures.
• Contribute to data governance, quality, security, and observability practices.

Job Requirements

Strong proficiency in Python for data engineering.
Solid experience with BigQuery (modeling, performance, and optimization).
Experience with Jupyter Notebooks and distributed processing with Dataproc.
Experience with Vertex AI, especially supporting ML pipelines.
Knowledge of LangChain and Generative AI applications.
Experience with Google ADK and Google AI Workspace.
Practical knowledge of Google Vector Database.
Experience with version control of code and pipelines using GitLab.
Familiarity with data engineering best practices, automated testing, and documentation.

Benefits

Diverse and inclusive hiring practices
Agile, multidisciplinary work environment

More Data Engineer Jobs

Data Engineer

LMI

Innovation at the Pace of Need™

Data Engineer | 145 days ago

Other | Remote | Team 1,001-5,000 | Since 1961 | H1B Sponsor

This description is a summary of our understanding of the job description.

Role Description

LMI seeks an experienced Data Engineer to support the U.S. Army’s Holistic Health & Fitness (H2F) initiative as a member of the Analytics functional team within the H2F Program Support Team. The Data Engineer is responsible for designing, building, and maintaining data pipelines and data services that enable scientific analysis, analytics, and user engagement activities within the Holistic Health and Fitness Management System (H2FMS). This role focuses on data ingestion, transformation, storage, and accessibility, ensuring that data supporting research, analytics, and decision support is reliable, well-structured, and available for authorized use. The Data Engineer works closely with the Technical Project Manager, data governance specialists, epidemiologists, research psychologists, tactical sports scientists, data scientists, and software teams to translate analytic and research requirements into scalable data solutions under Government direction. This role does not set independent data policy or analytic strategy.

Responsibilities

- Design, develop, and maintain data ingestion and transformation pipelines supporting H2FMS analytics and research use cases.
- Integrate data from multiple sources, including surveys, wearable and performance data, health and injury datasets, and operational systems as directed.
- Ensure pipelines are reliable, scalable, and support downstream analytic and reporting needs.
- Support development of analytic data models optimized for reporting, dashboards, and advanced analytics.
- Work with data governance staff to ensure data models align with approved data definitions and standards.
- Assist with management of structured and semi-structured data stores used within H2FMS.
- Enable data access and preparation for data scientists and AI/ML engineers, ensuring data is usable for modeling and analysis.
- Support feature preparation and data validation activities under Government and senior analytic direction.
- Assist in troubleshooting data-related issues impacting analytics or models.
- Support monitoring and resolution of data quality issues, including completeness, consistency, and timeliness.
- Implement basic validation, logging, and error-handling mechanisms within data pipelines.
- Coordinate with data governance and analytics teams to address recurring data issues.
- Collaborate with epidemiologists, research psychologists, and tactical sports scientists to understand analytic data needs.
- Coordinate with software teams to support integration between data services and application components.
- Support documentation and communication of data pipeline designs and dependencies.

Qualifications

- Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
- Demonstrated experience building and maintaining data pipelines and data integration solutions.
- Familiarity with data transformation, storage, and analytics enablement concepts.
- Experience working with structured and semi-structured data.
- Ability to collaborate effectively within multidisciplinary teams spanning analytics, research, and software.
- Strong problem-solving and communication skills.
- Ability to obtain and maintain a Secret security clearance.

Requirements

- Experience supporting analytics or research-driven data environments.
- Familiarity with cloud-based data services or analytics platforms.
- Experience preparing data for dashboards, reporting, or AI/ML workflows.
- Prior experience supporting DoW or federal customers.

Location & Travel

- Duty Location: This position may be performed in a remote or hybrid capacity.
- Travel: Limited travel to Fort Eustis, Virginia or LMI Headquarters may be required to support planning, integration, or stakeholder engagement.

Salary

Target salary range: $101,986 - $170,154. The salary range displayed represents the typical salary range for this position and is not a guarantee of compensation. Individual salaries are determined by various factors including, but not limited to, location, internal equity, business considerations, client contract requirements, and candidate qualifications, such as education, experience, skills, and security clearances. Applicants must meet eligibility requirements for a U.S. Government security clearance. Only US Citizens are eligible for a security clearance. For this position, LMI will only consider applicants with security clearances or applicants who are eligible for security clearances, due to the nature of the work.

Python SQL ETL Git AWS Apache Spark

United States

$102.0K - $170.2K / year

Job Closed

Postdoctoral Fellow, Data Engineering, Pipelines, PySpark

Sistema Fibra

For the Future of Industry | For the Future of Work

Data Engineer | 145 days ago

Full Time | Remote | Team 1,001-5,000 | Since 1972 | H1B No Sponsor

• Plan and align the project with the Androidization strategy
• Gather and validate functional and technical requirements
• Design the solution architecture and the data model
• Automate ingestion and processing of operational data
• Model and automate refined tables
• Implement monitoring and proactive alerts
• Publish and validate tables in the production environment
• Document the entire technical and functional structure of the platform
• Conduct training sessions and formalize the technical handover

Python

Brazil

R$9K / month

Senior Data Engineer – Databricks Specialist

Inetum

Data Engineer | 145 days ago

Full Time | Remote | Team 10,001+ | H1B No Sponsor

• Design and develop efficient, scalable data pipelines using Databricks, Spark, and related technologies.
• Implement data ingestion, transformation, and storage processes in cloud environments (Azure, AWS, or GCP).
• Optimize Spark cluster performance and manage costs on Databricks platforms.
• Ensure data quality, governance, and security at every stage of the data lifecycle.
• Collaborate with Data Science and BI teams to enable predictive models and dashboards.
• Automate processes using CI/CD and orchestration tools (Airflow, Azure Data Factory, etc.).
• Document architectures and data flows, ensuring best practices and standards.

Airflow Apache HTTP Server AWS Azure GCP Apache Kafka Python Scala Apache Spark SQL

Spain

Job Closed

Senior Data Engineer

Hiflylabs

We create business value in a world full of data.

Data Engineer | 145 days ago

Full Time | Remote | Team 51-200 | Since 2013 | H1B No Sponsor

• Understanding and aligning with business needs, identifying related data requirements
• Participating in technical pre-sales activities, including effort estimation and proposal creation
• Architecting and planning robust data models and modern data platforms
• Acting as the senior technical lead for the development of cloud data solutions, with a focus on lakehouse architectures
• Planning and leading testing activities
• Leading and mentoring project teams (3-6 members)

AWS Azure

Hungary
