Job Closed

This listing is no longer active.

LWSA

Integrating solutions & Driving business

Mid-level Data Engineer

Data Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 1998 | H1B No Sponsor

Location

Brazil

Posted

53 days ago

Salary

0

Seniority

Senior

Job Description

  • Develop and scale data ingestion pipelines, ensuring lineage and documentation for traceability;
  • Implement and maintain automated data tests (e.g., schema validation, business rules, consistency and completeness) throughout pipelines;
  • Optimize ETL processes with a focus on latency and cost efficiency;
  • Define and monitor data SLAs/SLOs, ensuring availability, freshness and reliability of information;
  • Implement data observability practices, including monitoring of volume, distribution, anomalies and pipeline failures;
  • Participate in data incident analysis, performing root cause identification and implementing preventive actions;
  • Contribute to data documentation, cataloging and lineage to facilitate traceability and trust in the data.

Job Requirements

  • Experience with data-engineering-focused programming, including manipulation of complex structures, exception handling and use of libraries such as Pandas, PySpark or Boto3;
  • Strong experience with Apache Airflow, capable of developing, versioning and troubleshooting complex DAGs while applying coding best practices;
  • Practical experience with the AWS ecosystem, with a focus on:
      • Processing and Storage: S3, EC2 and AWS Glue;
      • Migration and Ingestion: proficiency in AWS DMS (Database Migration Service) for data replication;
  • Proficiency with relational databases (MySQL and PostgreSQL) and experience with non-relational databases (MongoDB), understanding their use cases, data modeling and optimizations;
  • Knowledge of MLflow for experiment tracking and model lifecycle management;
  • Experience writing advanced queries and optimizing performance for large volumes of data.

Benefits

  • Health insurance;
  • Dental insurance;
  • Meal allowance / food allowance;
  • Childcare assistance;
  • Life insurance;
  • Profit-sharing program (PPR);
  • Day off during your birthday month;
  • Wellhub;
  • Férias&Co (travel benefit);
  • 6 months maternity leave and 20 days paternity leave;
  • Flexible working hours;
  • Partnerships with various establishments and institutions in the areas of education, health, leisure, entertainment, and others.

More Data Engineer Jobs

Lead, Data Engineer

L3HHCM20

L3Harris Australia excels as a prime defence contractor, providing integrated tech solutions for over four decades. Specialising in technology that connects and shapes operations spanning multiple domains: space, air, land, sea, cyber and first responders. Today, we employ over 500 professionals in all major cities who understand the region’s unique requirements.

Data Engineer | 53 days ago
Full Time | Remote | Team 10,001

Job Title: Lead, Data Engineer
Job Code: 35910
Job Location: Melbourne, FL or Remote Opportunity
Job Schedule: 9/80: Employees work 9 out of every 14 days – totaling 80 hours worked – and have every other Friday off

Job Description:

L3Harris Enterprise Data and AI team is seeking a Data Engineer with experience in managing enterprise-level data life cycle processes. This role includes overseeing data ETL/ELT pipelines, ensuring adherence to data standards, maintaining data frameworks, conducting data cleansing, orchestrating data pipelines, and ensuring data consolidation. The selected individual will play a pivotal role in maintaining ontologies, building scalable data solutions, and developing dashboards to provide actionable insights for the enterprise within Palantir Foundry.

This position will support the company’s modern data platform, Unified Data Layer, focusing on data pipeline development and maintenance, data platform design, documentation, and user training. The goal is to ensure seamless access to data for all levels of the organization, empowering decision-makers with clean, reliable data.

Essential Functions:

  • Design, build, and maintain robust data pipelines to ensure reliable data flow across the enterprise.
  • Maintain data pipeline schedules, orchestrate workflows, and monitor the overall health of data pipelines to ensure continuous data availability.
  • Create, update, and optimize data connections, datasets, and transformations to align with business needs.
  • Troubleshoot and resolve data sync issues, ensuring consistent and correct data flow from source systems.
  • Collaborate with cross-functional teams to uphold data quality standards and ensure accurate data is available for use.
  • Utilize Palantir Foundry to establish data connections to source applications, extract and load data, and design complex logical data models that meet functional and technical specifications.
  • Develop and manage data cleansing, consolidation, and integration mechanisms to support big data analytics at scale.
  • Build visualizations using Palantir Foundry tools and assist business users with testing, troubleshooting, and documentation creation, including data maintenance guides.

Qualifications:

  • Bachelor’s Degree and a minimum of 9 years of prior Palantir experience, or Graduate Degree and a minimum of 7 years of prior Palantir experience. In lieu of a degree, a minimum of 13 years of prior Palantir experience.
  • Minimum of 4 years of experience with Data Pipeline development or ETL tools such as Palantir Foundry, Azure Data Factory, SSIS, or Python.
  • Minimum of 4 years of experience in Data Integration.

Preferred Additional Skills:

  • Experience with designing and developing data pipelines in PySpark, Spark SQL, SQL or Code Build.
  • Experience in building and deploying data synchronization schedules and maintaining data pipelines using Palantir Foundry.
  • Strong understanding of Business Intelligence (BI) and Data Warehouse (DW) development methodologies.
  • Hands-on experience with the Snowflake Cloud Data Platform, including data architecture, query optimization, and performance tuning.
  • Proficiency in Python, PySpark, Pandas, Databricks, JavaScript, or other scripting languages for data processing and automation.
  • Experience with other ETL tools such as Azure Data Factory (ADF), SSIS, Informatica, or Talend is highly desirable.
  • Familiarity with connecting and extracting data from various ERP applications, including Oracle EBS, SAP ECC/S4, Deltek Costpoint, and more.
  • Experience with AI tools such as OpenAI, Palantir AIP, Snowflake Cortex or similar.

In compliance with pay transparency requirements, the salary range for this role in California, Massachusetts, New Jersey, Washington, and the Greater D.C., Denver, or NYC areas is $125,000-$232,000. The salary range for this role in Colorado state, Hawaii, Illinois, Maryland, Minnesota, New York state, and Vermont is $108,500-$201,500. This is not a guarantee of compensation or salary, as final offer amount may vary based on factors including but not limited to experience and geographic location.

L3Harris also offers a variety of benefits, including health and disability insurance, 401(k) match, flexible spending accounts, EAP, education assistance, parental leave, paid time off, and company-paid holidays. The specific programs and options available to an employee may vary depending on date of hire, schedule type, and the applicability of collective bargaining agreements.

All locations: United States | Australia
$108K - $232K / year

Intern, Data Science

HEALTHSTREAM INC

Are you passionate about enhancing healthcare outcomes and empowering healthcare professionals? Join the HealthStream team and become a HealthStreamer! Together, we can make a difference in the world of healthcare.

Data Engineer | 53 days ago
Internship | Remote | Team 501-1,000

Job Details

Job Location: USA Remote - Nashville, TN 37203
Position Type: Internship
Job Category: Intern

Position Overview

We are seeking a motivated and intellectually curious Data Science Intern with a strong interest in machine learning, artificial intelligence, and big data analytics. This role is ideal for students who are passionate about turning complex data into actionable intelligence and scalable data-driven solutions.

Key Responsibilities

  • Develop and document Machine Learning models for accuracy, performance, and reliability.
  • Work closely with the team to understand requirements and enhance model quality.
  • Support the development of data intelligence solutions, dashboards, or analytical reports for business and technical stakeholders.
  • Assist in the development and execution of test plans, test cases, and test scripts for the project.
  • Perform manual/automated testing on new features and functionalities.
  • Help identify, report, and track defects and issues found during ML model testing.

Qualifications

  • Currently enrolled in a related degree program.
  • Knowledge of programming languages such as Python, SQL, or R.
  • Familiarity with ML libraries like scikit-learn, PyTorch, or TensorFlow.
  • Previous project experience with time-series forecasting or large language models (LLMs) is preferred.

Requirements

  • Basic understanding of software development, Machine Learning and testing concepts.
  • Strong analytical and problem-solving skills.
  • Ability to manage time effectively and prioritize tasks with the team.

More Details

  • This role will be paid at $18 an hour.
  • Full time, 35-40 hours.
  • The program will run for 10-12 weeks depending on availability.

United States
$18 / hour

Intern, Business Systems & Data

HEALTHSTREAM INC

Are you passionate about enhancing healthcare outcomes and empowering healthcare professionals? Join the HealthStream team and become a HealthStreamer! Together, we can make a difference in the world of healthcare.

Data Engineer | 53 days ago
Internship | Remote | Team 501-1,000

Job Details

Job Location: USA Remote - Nashville, TN 37203
Position Type: Internship
Job Category: Intern

Position Overview

This internship offers hands-on experience supporting data preparation and operational workflows that enable accurate metering. The intern will work closely with Product Managers, Data Analysts, and Business Operations teams to ensure course-to-product mappings and subscription rules are accurate, complete, and consistently maintained across systems. Through this role, the intern will gain exposure to corporate applications, Salesforce-based workflows, data quality validation, and cross-functional collaboration. Depending on interest, skill set, and learning pace, the role may expand into light data analysis and reporting using tools such as Excel, SQL, or internal dashboards.

Key Responsibilities

  • Support data preparation and cleanup activities related to subscription products for automated metering
  • Cross-check and validate course-to-product mapping data across multiple data sources
  • Communicate with Product Managers and stakeholders to confirm mapping accuracy
  • Document data definitions, processes, and updates to ensure clarity and consistency
  • Collaborate with team members on ad hoc operational or data-related tasks as skills develop

Qualifications

  • Currently pursuing a bachelor’s or master’s degree in a relevant field
  • Academic or project-based exposure to data, systems, or business operations concepts
  • Familiarity with Excel spreadsheets and basic data organization concepts
  • Interest in learning how enterprise systems and workflows function in a corporate environment

Requirements

  • Strong Excel skills (formulas, filtering, data cleanup)
  • Ability to learn internal systems such as Salesforce and subscription management tools
  • Basic understanding of data structures or relational concepts is a plus
  • SQL or reporting tool exposure is a plus but not required
  • Strong attention to detail and data accuracy
  • Highly organized with the ability to track details across multiple systems, data sources, and stakeholders
  • Ability to follow defined processes and document work clearly
  • Comfortable asking questions and collaborating across teams
  • Ability to manage multiple tasks and prioritize effectively
  • Strong written and verbal communication skills

More Details

  • This role will be paid at $18 an hour.
  • Full time, 35-40 hours.
  • The program will run for 10-12 weeks depending on availability.

United States
$18 / hour
Full Time | Remote | Team 1,001-5,000 | Since 30+ years | H1B Sponsor

  • Contributing to team and department solution design architecture
  • Developing and implementing appropriate standards and strategies for all sources of data
  • Key contributor for data stewardship workgroups and teams in data management and governance activities
  • Champion of metadata and data quality practices across the organization
  • Perform analysis and research to support and/or develop glossary definitions, policies, programs, standards, guidelines, and workflows
  • Lead the development and maintenance of metadata catalogs
  • Identify and quantify data related pain points within the organization and assist in the development of remediation plans
  • Collaborate with SMEs to develop business rules that measure and assure data quality
  • Lead efforts to quantify and qualify reported data quality issues; develop and report program metrics to demonstrate progress and compliance

United States
$72.2K - $115.5K / year