Optimizing business performance through people, data, tech & analytics
Senior Data Engineer – AI, AWS
Location
Brazil
Posted
6 days ago
Seniority
Senior
Job Description
Blend360
• Build and maintain the data infrastructure that supports the company’s and clients’ analytical products, ML models, and GenAI solutions
• Design and implement high-scale batch and real-time data ingestion and transformation pipelines
• Build and maintain lakehouse architectures using AWS S3, Glue, Redshift, and Apache Iceberg
• Develop and orchestrate ML/AI pipelines using AWS SageMaker and Apache Airflow
• Implement real-time streaming solutions with Apache Kafka and/or AWS Kinesis
• Explore and apply GenAI patterns via AWS Bedrock, including RAG pipelines, embedding workflows, and integration with LLMs
• Apply Data Mesh practices to decentralize data domains and improve team autonomy
• Ensure data quality, lineage, and governance using dbt and AWS Glue Data Catalog
• Optimize cost and query performance in Redshift and Athena environments
Job Requirements
- 5+ years of experience as a Data Engineer with a focus on cloud
- Strong proficiency in Python, PySpark, and SQL for large-scale data processing and transformation
- Solid experience with AWS: S3, Glue, Redshift, Athena, Lambda, SageMaker, and Kinesis
- Experience orchestrating pipelines using Apache Airflow
- Knowledge of data streaming with Apache Kafka or AWS Kinesis
- Familiarity with dbt for data transformation and documentation
- Experience with Infrastructure as Code using Terraform for provisioning data infrastructure
- Knowledge of Apache Iceberg for table management in data lakes
- Demonstrated interest in AI/ML — experience with ML or GenAI pipelines is a strong plus
- AWS Certified Data Analytics – Specialty or AWS Certified Machine Learning (preferred)
Benefits
- Remote work available
More Data Engineer Jobs
• Lead the transversal platform and reliability efforts, driving industrialization, automation, CI/CD pipelines, and overall platform stability in a fully remote, multicultural environment.
• Define and evolve technical standards, ensuring the reliability, scalability, and performance of data workflows.
• Lead industrialization and automation initiatives across development and deployment processes, supporting multiple squads on DataOps best practices.
• Contribute to technical roadmap definitions, architecture decisions, and continuous ecosystem improvements.
• Design, maintain, and evolve internal development and deployment tooling around dbt, Airflow, and Snowflake.
• Develop and optimize internal CLI tools for automated dbt model generation, YAML testing, DAG creation, and deployment automation.
• Contribute to the integration of AI/LLM capabilities into development and DataOps workflows to reduce manual operations.
• Design, implement, and maintain secure and automated multi-environment Data CI/CD pipelines.
• Ensure deployment quality during release cycles, collaborating with project squads and supporting release governance.
• Supervise, optimize, and design Apache Airflow orchestration workflows and execution DAGs.
• Implement monitoring, alerting, and observability capabilities to maximize platform stability and operational efficiency.
• Contribute to incident resolution and root cause analysis.
Data Engineer
Bluelight Consulting
Bluelight is a leading software consultancy dedicated to designing and developing innovative technology that enhances users' lives. With a steadfast commitment to delivering exceptional service to our clients, Bluelight excels in its focus on quality and customer satisfaction. Our mission is not only to create cutting-edge applications but also to foster a collaborative and enriching work environment where each team member can grow and thrive. With a presence across the United States and Central/South America, Bluelight is in an exciting phase of expansion, continually seeking exceptional talent to join its dynamic and diverse community.
Role Description
We are looking for a skilled individual to join our rapidly growing team at Bluelight Consulting. This position is ideal for someone who thrives in a fast-paced, dynamic environment where everyone's opinions and efforts are valued and appreciated. You will have the opportunity to contribute to challenging and meaningful projects, developing high-quality applications that stand out in the market. We value continuous learning, personal growth, and hard work, offering a collaborative environment that promotes professional development. If you are passionate about software development and eager to be part of a growing software consultancy, we invite you to apply and join us on this exciting journey.
Qualifications
- Experience with AWS services including but not limited to S3, Athena, EC2, EMR, and Glue.
- Ability to solve any ongoing issues with operating the cluster.
- Experience with the integration of data from multiple data sources.
- Experience with various database technologies such as SQLServer, Redshift, Postgres, and RDS.
- Experience with one or more of the following data integration platforms: Pentaho Kettle, SnapLogic, Talend OpenStudio, Jitterbit, Informatica PowerCenter, or similar.
- Knowledge of best practices and IT operations in an always-up, always-available service.
- Experience with or knowledge of Agile Software Development methodologies.
- Excellent problem-solving and troubleshooting skills.
- Excellent oral and written communication skills with a keen sense of customer service.
- Experience with collecting/managing/reporting on large data stores.
- Awareness of Data governance and data quality principles.
- Well-versed in Business Analytics including basic metric building and troubleshooting.
- Understand Integration architecture: application integration and data flow diagrams, source-to-target mappings, data dictionary reports.
- Familiar with Web Services: XML, REST, SOAP.
- Experience with Git or similar version control software.
- Experience with integrations with and/or use of BI tools such as GoodData (preferred), Tableau, PowerBI, or similar.
- Broad experience with multiple RDBMS: MS SQLServer, Oracle, MySQL, PostgreSQL, Redshift.
- Familiarity with SaaS/cloud data systems (e.g. Salesforce).
- Data warehouse design: star-schemas, change data capture, denormalization.
- SQL/DDL queries/Tuning techniques such as indexing, sorting, and distribution.
- BS or MS degree in Computer Science or a related technical field.
- 3+ years of Data Pipeline development such as SnapLogic or Datastage, Informatica, or related experience.
- 3+ years of SQL experience (No-SQL experience is a plus).
- Experience designing, building, and maintaining data pipelines.
Requirements
- Develop and maintain data models for core package application and reporting databases to describe objects and fields for support documentation and to facilitate custom application development and data integration.
- Monitoring execution and performance of daily pipelines, triage and escalate any issues.
- Collaborates with analytics and business teams to improve data models and data pipelines that feed business intelligence tools, increasing data accessibility and fostering data-driven decision-making across the organization.
- Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes.
- Writes unit/integration tests, contributes to engineering wiki, and documents work.
- Performs data analysis required to troubleshoot data-related issues and assist in the resolution of data issues.
- Work within AWS/Linux cloud systems environment in support of data integration solution.
- Works closely with a team of frontend and backend engineers, product managers, and analysts.
- Teamwork: Collaborate with team members. Share knowledge, provide visibility of interpersonal accomplishments and follow directions when provided.
Benefits
- Competitive salary and bonuses, including performance-based salary increases.
- Generous paid-time-off policy.
- Flexible working hours.
- Work remotely.
- Continuing education, training, conferences.
- Company-sponsored coursework, exams, and certifications.
Lead Data Engineer
VistaPrint
VistaPrint is a Cimpress company helping small business owners market themselves professionally with quality products and design tools. Through patented technologies, direct market
• Architect & Lead Operational Data Flows by designing and overseeing the implementation of an Operational Data Store (ODS)
• Build low-latency data streams using technologies like Kafka or Flink to power embedded analytics directly within customer-facing applications
• Establish "Data Contracts" with upstream engineering teams to ensure high availability and schema stability for all real-time operational flows
• Own the transition and scaling of our Analytical Data Store (e.g., Snowflake), ensuring it is optimized for both performance and cost-efficiency
• Modernize transformation layer by implementing robust ELT patterns and modular data modeling (using dbt and airflow)
• Champion Data Governance, ensuring that every dashboard and report is backed by high-quality, audited, and well-documented data
• Build the "Data Foundation" for Machine Learning, including development of Feature Stores and automated pipelines for model training and inference
• Mentor and grow a high-performing engineering team, fostering a culture of "DataOps" where automation, testing, and observability are the default
• Act as a strategic partner to Product and Executive leadership, translating complex technical roadmaps into clear business value
• Work closely with business areas to understand needs and translate requirements into scalable data solutions aligned with the organization’s objectives.
• Conduct requirements gathering and map business processes and data flows between systems.
• Define and validate business rules, metrics, KPIs and corporate indicators.
• Design and evolve data models for Data Warehouse, Data Lake and Lakehouse environments.
• Define dimensional modeling strategies, including facts, dimensions, granularity and relationships.
• Develop data architectures using Azure Data Factory, Databricks, Data Lake and Power BI.
• Ensure data quality, consistency and traceability across data pipelines.
• Define standards for data ingestion, transformation, quality and availability across Bronze, Silver and Gold layers.
• Serve as a technical reference for Data Engineering, BI and Analytics teams.
• Support solution implementation, ensuring adherence to architectural and business definitions.
• Produce documentation for data models, business rules and corporate indicators.



