Senior Data Engineer – Databricks, ADF, AI
Location
Brazil
Posted
7 days ago
Salary
0
Seniority
Senior
Job Description
Compass
• Perform data modeling, working with technologies such as Databricks, SAP Datasphere, Azure Data Factory, Azure Data Lake Storage (ADLS), Python, and databases (SQL Server, Oracle);
• Work on extracting data from files received monthly, building a data lake and tables;
• Structure RAG for the client's debt contracts and other documents related to loan agreements;
• Perform ETL from CXL APIs and transform and process them into Databricks databases to feed CXL quotes;
• Structure data tables in Databricks for Judicial Deposits, Financial Guarantees and Collateral;
• Develop ETL processes for analyzing structured and unstructured data, using the client's chat API to analyze files and validate data;
• Build the Position Manager data product and create reports to replace CXL's Crystal Reports;
Job Requirements
- Experience with Databricks, SAP Datasphere, Azure Data Factory, Azure Data Lake Storage (ADLS), and databases (SQL Server, Oracle)
- Ability to analyze requirements in order to assist Architects in structuring and detailing requests;
- Knowledge of ETL tools and data modeling;
- Ability to assist in defining requirements;
- Proven experience in Python;
- API development and publishing;
- Knowledge of DevOps;
- Minimum of 4 years of documented work experience (formal employment records), with an academic degree in IT or a postgraduate IT program of 360 accredited hours;
More Data Engineer Jobs
Senior/Staff Data Engineer
Incognia Confidential
Incognia is an innovative company in next-generation identity solutions that enable secure, frictionless digital experiences. With its persistent device identification solution, Incognia combines best-in-class, tamper-proof device recognition signals with location analysis for user verification and fraud prevention. Incognia's customizable risk assessments and actionable insights allow financial services companies, delivery and sharing-economy platforms, and marketplaces to protect their reputation, customer retention, and revenue. For more information, visit Incognia.com/pt.
Role Description
The Data Engineering team at Incognia is responsible for improving and maintaining the company's entire data infrastructure, from clusters, data services, and authentication to ETLs, notebooks, and other tools that let our employees extract metrics and intelligence from a vast volume of data ingested daily, in a way that is easy, reliable, secure, and respectful of our users' privacy. The team's work is essential to delivering a variety of solutions built on a large volume of collected data, with the goal of protecting the users of many clients from scams and fraud. We need someone who is familiar with a range of databases and distributed processing engines, who can understand the tools we use in depth, and who, when necessary, can contribute to the open source projects we run in production or build our own solutions to overcome our data challenges. We are looking for someone with an aptitude for identifying latent problems (technical or organizational) and a willingness to tackle large-scale data challenges that the industry has not yet properly solved.
Day to day of the role:
- Create, deploy, maintain, and monitor data infrastructure solutions for the whole company, including deployments of open source tools and internally developed solutions;
- Provide tools such as software, guidelines, and templates that make the data infrastructure smooth to use and easy for employees to access;
- Create and maintain high-performance recurring and/or real-time data pipelines to handle large data volumes;
- Continuously monitor our systems to ensure high availability, performance, and cost reduction;
- Create mechanisms that enforce data access policies;
Qualifications
- Bachelor's degree in Software Engineering or a related field;
- Experience with Kubernetes and deploying systems at scale;
- Experience with large-scale ETLs/pipelines;
- Experience with the Spark ecosystem;
- Experience with data warehousing, data lakes, and open table formats such as Apache Iceberg and Delta;
- Experience with observability tools;
- Familiarity with different databases;
- Solid knowledge of distributed systems, especially distributed query engines;
- Experience with exploratory work, demonstrating the ability to investigate and solve open-ended problems where the challenge itself is often not yet well specified;
- Familiarity with distributed systems;
- Experience with data modeling, analysis, and visualization;
- Ability to understand the tools you will use deeply enough to write patches that adapt them to our needs;
- Excellent communication: the ability to articulate technical concepts clearly, facilitate decision-making, and ensure knowledge is shared across teams;
- Advanced English.
Benefits
- See our benefits in our e-book: https://shorturl.at/fgW12
Data Engineer, Go to Market (Remote)
CrowdStrike
CrowdStrike is an award-winning, global provider of cloud-delivered security technology, threat intelligence, and next-generation endpoint protection. Founded i
As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.
About the Role:
We are seeking a seasoned Data Engineer to join our team. You will be responsible for architecting and building robust data frameworks that empower Operational Data Store / Enterprise Data Lake Platforms. This role requires a blend of high-level technical expertise in modern data stacks and the ability to collaborate independently with stakeholders to translate business needs into scalable data solutions.
What You’ll Do:
- Lead the full lifecycle of data engineering projects, from initial requirement gathering with stakeholders to production deployment and monitoring.
- Design, develop and maintain complex data transformations, ensuring high data quality and performance using scripting languages like Python, Airflow, DBT and databases such as Snowflake or similar Data Lakes.
- Build, scale, and maintain automated workflows using Apache Airflow to manage sophisticated data dependencies.
- Maintain high engineering standards through CI/CD implementation and rigorous version control using GitHub.
- Implement automated processes for data validation, ensuring high standards of data quality, accuracy, and integrity across all pipelines.
- Act as a technical partner to the Analytics, Sales, and Marketing teams, building curated datasets that drive strategic decision-making.
What You’ll Need:
- 3+ years' experience designing and developing complex automation frameworks, queries, and data models in SQL, Python, DBT, and Apache Airflow.
- Deep experience in scripting languages such as Python and cloud databases such as Snowflake, Redshift, etc. to facilitate rapid ingestion and dissemination of key data.
- Marketing Data Domain Expertise: Hands-on experience working with Marketing datasets including campaign performance data, lead and funnel stages, opportunity pipelines, and revenue attribution models.
- Expertise in architecting scalable DBT projects using advanced modeling techniques, custom macros, complex Jinja-templated logic, and modular project structures to enforce DRY (Don't Repeat Yourself) principles across the enterprise.
- Advanced proficiency in the DBT lifecycle, including CI/CD tools such as Jenkins and GitLab CI/CD and source control tools such as GitHub.
- Experience identifying and solving issues concerning data management to improve data quality, and clean, prepare and optimize data for ingestion and consumption. You will work with your Data Engineering teammates to review design, code, and test plans to increase knowledge and application of key frameworks and methodologies.
- Work with internal and external stakeholders to assist with data-related technical issues and support data infrastructure needs.
- Proven experience integrating and managing business data from enterprise applications into Semantic Layers to decouple complex logic from the BI layer to drive analytics and insights.
- Bachelor's Degree in Computer Science, Information Technology, Computer Engineering, or related IT discipline; or equivalent experience.
Bonus Points:
- Marketing Automation Tools Knowledge: Experience with platforms such as Salesforce, Marketo, People.ai, Outreach and CRM systems for data integration and processing.
- Understanding of machine learning concepts: Ability to collaborate with data science teams and support machine learning initiatives through data preparation, transformation and feature store support.
Benefits of Working at CrowdStrike:
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world class amenities
- Great Place to Work Certified™ across the globe
CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.
CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law.
We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.
If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.
Find out more about your rights as an applicant.
CrowdStrike participates in the E-Verify program.
Notice of E-Verify Participation
Right to Work
CrowdStrike, Inc. is committed to fair and equitable compensation practices. Placement within the pay range is dependent on a variety of factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location. The base salary range for this position for all U.S. candidates is $85,000 - $120,000 per year, with eligibility for bonuses, equity grants and a comprehensive benefits package that includes health insurance, 401k and paid time off. For detailed information about the U.S. benefits package, please click here.
Expected Close Date of Job Posting is: 07-28-2026
Role Description
This Senior Data Engineer will be part of the team responsible for developing and deploying Engineering and Integration solutions. Primary responsibility will be to work closely with the Data Architects and Machine Learning Team to implement data solutions for the organization using Python, Java, Kafka and other big data solutions, creating technical specification documents and test plans. Also provide support for the data solutions across the enterprise.
Job Roles and Responsibilities
- Understand business processes and how they are modeled in various systems.
- Work with business users, technology teams, and executives to understand their data needs to create innovative solutions to fulfill them.
- Design, organize, and implement data structures, workflows, and integrations between enterprise platforms to ensure the accurate and timely execution of business processes.
- Develop and maintain scalable data pipelines and build out new API integrations to support continuing increases in data volume and complexity.
- Guide decisions and establish best practices on data integration/engineering, as well as the future of our data infrastructure.
- Manage and improve the performance of our database, queries, tools, and solutions.
- Create and maintain data warehouse, databases, tables, SQL queries, and ingestion pipelines to power reports (Tableau), dashboards, predictive models, and downstream analysis.
- Write complex and efficient queries to transform raw data sources into easily accessible models for our teams and reporting platforms.
- Prepare data for predictive and prescriptive modeling.
- Identify and analyze data patterns.
- Identify ways to improve data reliability, efficiency, and quality.
- Work with analytics, data science, and wider engineering teams to help with automating data analysis and visualization needs, advise on transformation processes to populate data models, and explore ways to design and develop data infrastructure.
- Document high- and low-level technical design.
- Draw performance reports and strategic proposals from analysis results for senior data science leadership.
- Collaborate, coordinate, and communicate across disciplines and departments.
- Ensure compliance with HIPAA regulations and requirements.
- Demonstrate Company's Core Competencies and values held within.
- Please note that, due to the exposure to sensitive PHI data, this role is considered a High Risk and privileged role.
- The position responsibilities outlined above are in no way to be construed as all encompassing. Other duties, responsibilities, and qualifications may be required and/or assigned as necessary.
Qualifications
- Minimum high school diploma and five years of relevant experience within data engineering. Bachelor's degree in computer science, information technology or a similarly relevant field is highly preferable.
- Required licensures, professional certifications, and/or Board certifications as applicable.
- Minimum of 3 years of experience with OOP, SQL, schema design, data modeling, and designing, building, and maintaining data processing systems.
- Strong experience with advanced analytics tools for Object-Oriented/object function scripting using languages such as R, Python, Java, others.
- Database development experience using SQL, Spark, or BigQuery and experience with a variety of relational, NoSQL oriented databases like Hadoop, MongoDB, Cassandra.
- Big Data Development experience using Hive, Impala, Spark, and familiarity with Kafka (preferred).
- Extensive experience in triaging data issues, analyzing end-to-end data pipelines and working with business users in resolving issues.
- Experience in working with data governance/data quality and data security teams and specifically data stewards and security officers in moving data pipelines into production with appropriate data quality, governance and security standards and certification.
- Experience building infrastructure required for optimal extraction, transformation, and loading of data from diverse data resources.
- Adept in agile methodologies and capable of applying DevOps and increasingly DataOps principles to data pipelines.
- Exposure to containerization using Docker, Kubernetes etc.
- Exposure to machine learning, data science, computer vision, artificial intelligence, statistics, and/or applied mathematics.
- Excellent communication skills (verbal, listening and written).
- Strong ability to design, build and manage data pipelines for data structures encompassing data transformation, data models, schemas, metadata and workload management.
- Strong attention to detail when identifying data relationships, trends, and anomalies.
- Thinking through long-term impacts of key design decisions and handling failure scenarios.
- Ability to work with both IT and business in integrating analytics and data science output into business processes and workflows.
- An agile learner who brings strong problem-solving skills and enjoys working as part of a technical, cross functional team to solve complex data problems.
- A thought leader when approaching technical challenges.
- Ability to provide mentoring to a team of data engineers.
- Ability to prioritize and manage multiple projects and requests at any one time.
- Ability to effectively share technical information, communicate technical issues and solutions to all levels of business.
- Ability to meet strict deadlines, work on multiple tasks and work well under pressure.
- Individual in this position must be able to work in a standard office environment which requires sitting and viewing monitor(s) for extended periods of time, operating standard office equipment such as, but not limited to, a keyboard, copier and telephone.
Compensation
The salary range for this position is $145k-$155k.
Specific offers take into account a candidate’s education, experience and skills, as well as the candidate’s work location and internal equity. This position is also eligible for health insurance, 401k and bonus opportunity.
Benefits
- Medical, dental and vision coverage with low deductible & copay.
- Life insurance.
- Short and long-term disability.
- Paid Parental Leave.
- 401(k) + match.
- Employee Stock Purchase Plan.
- Generous Paid Time Off – accrued based on years of service.
- WA Candidates: the accrual rate is 4.61 hours every other week for the first two years of tenure before increasing with additional years of service.
- 10 paid company holidays.
- Tuition reimbursement.
- Flexible Spending Account.
- Employee Assistance Program.
- Sick time benefits – for eligible employees, one hour of sick time for every 30 hours worked, up to a maximum accrual of 40 hours per calendar year, unless the laws of the state in which the employee is located provide for more generous sick time benefits.
EEO Statement
Claritev is an Equal Opportunity Employer and complies with all applicable laws and regulations. Qualified applicants will receive consideration for employment without regard to age, race, color, religion, gender, sexual orientation, gender identity, national origin, disability or protected veteran status. If you would like more information on your EEO rights under the law, please click here.
Application Deadline
We will generally accept applications for at least 5 calendar days from the posting date or as long as the job remains posted.
Senior Data Engineer, People Analytics
Airbnb
Airbnb is a community based on connection and belonging.
• Collaborate with other team members and stakeholders to help understand data- and people-related business problems and translate them into scalable data solutions
• Build data pipelines and tables from HR systems such as Workday, Greenhouse, and other data sources
• Support Data Science team members in leveraging data for reporting, dashboard development, and other client-facing use cases
• Build, update, and maintain a production-grade data foundation that supports AI initiatives, including pipelines that feed LLM-powered tools, evaluation and feedback datasets, and the access controls and data models required to responsibly scale AI products from prototype to production
• Design and deliver data products, including dashboards and reporting tools (e.g., Streamlit visualization apps), that surface actionable insights for non-technical stakeholders
• Write and optimize queries across both a distributed query engine (Trino/Presto) and a private relational database (Postgres)
• Align on priorities and work from a roadmap, ensuring you are focusing on the highest-priority projects
• Assess data readiness for AI use cases, working with EX teams, Legal, and BizTech to ensure sensitive employee data is handled with appropriate governance, permissioning, and access controls
• Support the transition of AI prototypes to production by building the underlying infrastructure (automated pipelines, security controls, and stable data models) that prototypes require to scale
• Exercise traits of adaptability and good judgment to support organizational agility
• Be a constant learner, active listener, and teacher to advance data engineering, people analytics, and Airbnb


