#MakeDataMatter #HumanizingTheFuture
Data Engineer, Google Cloud, AI & Machine Learning
Location
Brazil
Posted
145 days ago
Seniority
Senior
Job Description
Keyrus
- Design, develop, and maintain scalable, reliable data pipelines on GCP to support Analytics, Machine Learning, and AI initiatives.
- Build and orchestrate data ingestion, transformation, and processing pipelines using Python, Jupyter Notebooks, and Dataproc.
- Prepare, organize, and make data available for Machine Learning and Generative AI models.
- Work with BigQuery for data analysis, transformation, modeling, and performance optimization.
- Support the development of Generative AI applications by integrating data into solutions based on LangChain, Google ADK, and Vertex AI.
- Manage data artifacts, models, and experiments using Artifact Registry.
- Use Google Vector Database for solutions involving embeddings, semantic search, and RAG (retrieval-augmented generation) use cases.
- Version code, pipelines, and notebooks using GitLab, following engineering best practices.
- Collaborate with multidisciplinary teams in an agile environment, supporting data-driven architectures.
- Contribute to data governance, quality, security, and observability practices.
Job Requirements
- Strong proficiency in Python for data engineering.
- Solid experience with BigQuery (modeling, performance, and optimization).
- Experience with Jupyter Notebooks and distributed processing with Dataproc.
- Experience with Vertex AI, especially supporting ML pipelines.
- Knowledge of LangChain and Generative AI applications.
- Experience with Google ADK and Google AI Workspace.
- Practical knowledge of Google Vector Database.
- Experience with version control of code and pipelines using GitLab.
- Familiarity with data engineering best practices, automated testing, and documentation.
Benefits
- Diverse and inclusive hiring practices
- Agile, multidisciplinary work environment
More Data Engineer Jobs
This description is a summary of our understanding of the job description.
Role Description
LMI seeks an experienced Data Engineer to support the U.S. Army’s Holistic Health & Fitness (H2F) initiative as a member of the Analytics functional team within the H2F Program Support Team. The Data Engineer is responsible for designing, building, and maintaining data pipelines and data services that enable scientific analysis, analytics, and user engagement activities within the Holistic Health and Fitness Management System (H2FMS). This role focuses on data ingestion, transformation, storage, and accessibility, ensuring that data supporting research, analytics, and decision support is reliable, well-structured, and available for authorized use. The Data Engineer works closely with the Technical Project Manager, data governance specialists, epidemiologists, research psychologists, tactical sports scientists, data scientists, and software teams to translate analytic and research requirements into scalable data solutions under Government direction. This role does not set independent data policy or analytic strategy.
Responsibilities
- Design, develop, and maintain data ingestion and transformation pipelines supporting H2FMS analytics and research use cases.
- Integrate data from multiple sources, including surveys, wearable and performance data, health and injury datasets, and operational systems as directed.
- Ensure pipelines are reliable, scalable, and support downstream analytic and reporting needs.
- Support development of analytic data models optimized for reporting, dashboards, and advanced analytics.
- Work with data governance staff to ensure data models align with approved data definitions and standards.
- Assist with management of structured and semi-structured data stores used within H2FMS.
- Enable data access and preparation for data scientists and AI/ML engineers, ensuring data is usable for modeling and analysis.
- Support feature preparation and data validation activities under Government and senior analytic direction.
- Assist in troubleshooting data-related issues impacting analytics or models.
- Support monitoring and resolution of data quality issues, including completeness, consistency, and timeliness.
- Implement basic validation, logging, and error-handling mechanisms within data pipelines.
- Coordinate with data governance and analytics teams to address recurring data issues.
- Collaborate with epidemiologists, research psychologists, and tactical sports scientists to understand analytic data needs.
- Coordinate with software teams to support integration between data services and application components.
- Support documentation and communication of data pipeline designs and dependencies.
Qualifications
- Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
- Demonstrated experience building and maintaining data pipelines and data integration solutions.
- Familiarity with data transformation, storage, and analytics enablement concepts.
- Experience working with structured and semi-structured data.
- Ability to collaborate effectively within multidisciplinary teams spanning analytics, research, and software.
- Strong problem-solving and communication skills.
- Ability to obtain and maintain a Secret security clearance.
Requirements
- Experience supporting analytics or research-driven data environments.
- Familiarity with cloud-based data services or analytics platforms.
- Experience preparing data for dashboards, reporting, or AI/ML workflows.
- Prior experience supporting DoW or federal customers.
Location & Travel
- Duty Location: This position may be performed in a remote or hybrid capacity.
- Travel: Limited travel to Fort Eustis, Virginia or LMI Headquarters may be required to support planning, integration, or stakeholder engagement.
Salary
Target salary range: $101,986 - $170,154.
The salary range displayed represents the typical salary range for this position and is not a guarantee of compensation. Individual salaries are determined by various factors including, but not limited to, location, internal equity, business considerations, client contract requirements, and candidate qualifications, such as education, experience, skills, and security clearances. Applicants must meet eligibility requirements for a U.S. Government security clearance. Only US Citizens are eligible for a security clearance. For this position, LMI will only consider applicants with security clearances or applicants who are eligible for security clearances, due to the nature of the work.
Postdoctoral Fellow, Data Engineering, Pipelines, PySpark
Sistema Fibra: For the Future of Industry | For the Future of Work
- Plan and align the project with the Androidization strategy
- Gather and validate functional and technical requirements
- Design the solution architecture and the data model
- Automate ingestion and processing of operational data
- Model and automate refined tables
- Implement monitoring and proactive alerts
- Publish and validate tables in the production environment
- Document the entire technical and functional structure of the platform
- Conduct training sessions and formalize the technical handover
- Design and develop efficient, scalable data pipelines using Databricks, Spark, and related technologies.
- Implement data ingestion, transformation, and storage processes in cloud environments (Azure, AWS, or GCP).
- Optimize Spark cluster performance and manage costs on Databricks platforms.
- Ensure data quality, governance, and security at every stage of the data lifecycle.
- Collaborate with Data Science and BI teams to enable predictive models and dashboards.
- Automate processes using CI/CD and orchestration tools (Airflow, Azure Data Factory, etc.).
- Document architectures and data flows, ensuring best practices and standards.
- Understanding and aligning with business needs, and identifying related data requirements
- Participating in technical pre-sales activities, including effort estimation and proposal creation
- Architecting and planning robust data models and modern data platforms
- Acting as the senior technical lead for the development of cloud data solutions, with a focus on lakehouse architectures
- Planning and leading testing activities
- Leading and mentoring project teams (3-6 members)




