Big Data, Machine Learning, Data Science, AI, MLOps and DataOps Services
Data Engineer, Spark
Location
Poland
Posted
61 days ago
Salary
0
Seniority
Senior
Job Description
Data Engineer, Spark
Addepto
- Develop and maintain a high-performance data processing platform for automotive data, ensuring scalability and reliability.
- Design and implement data pipelines that process large volumes of data in both streaming and batch modes.
- Optimize data workflows to ensure efficient data ingestion, processing, and storage using technologies such as Spark, Cloudera, and Airflow.
- Work with data lake technologies (e.g., Iceberg) to manage structured and unstructured data efficiently.
- Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
- Monitor and troubleshoot the platform, ensuring high availability, performance, and accuracy of data processing.
- Leverage cloud services (AWS) for infrastructure management and scaling of processing workloads.
- Write and maintain high-quality Python (or Java/Scala) code for data processing tasks and automation.
Job Requirements
- At least 4 years of commercial experience implementing, developing, or maintaining Big Data systems, data governance and data management processes.
- Strong programming skills in Python (or Java/Scala): writing clean code, OOP design.
- Hands-on experience with Big Data technologies like Spark, Cloudera, Kafka, Data Platform, Airflow, NiFi, Docker, and Iceberg.
- Excellent understanding of dimensional data and data modeling techniques.
- Experience implementing and deploying solutions in cloud environments.
- Consulting experience with excellent communication and client management skills, including prior experience directly interacting with clients as a consultant.
- Ability to work independently and take ownership of project deliverables.
- Fluent English (at least C1 level).
- Bachelor’s degree in technical or mathematical studies.
- Nice to have: Experience with an MLOps framework such as Kubeflow or MLFlow. Familiarity with Databricks and/or dbt.
Benefits
- Work in a supportive team of passionate enthusiasts of AI & Big Data.
- Engage with top-tier global enterprises and cutting-edge startups on international projects.
- Enjoy flexible work arrangements, allowing you to work remotely or from modern offices and coworking spaces.
- Accelerate your professional growth through career paths, knowledge-sharing initiatives, language classes, and sponsored training or conferences, including a partnership with Databricks, which offers industry-leading training materials and certifications.
- Choose your preferred form of cooperation: B2B or a contract of mandate, and make use of 20 fully paid days off.
- Participate in team-building events and utilize the integration budget.
- Celebrate work anniversaries, birthdays, and milestones.
- Access medical and sports packages, eye care, and well-being support services, including psychotherapy and coaching.
- Get full work equipment for optimal productivity, including a laptop and other necessary devices.
- Experience a smooth onboarding with a dedicated buddy, and start your journey in our friendly, supportive, and autonomous culture.
More Data Engineer Jobs
Senior Data Science Engineer
Avenga
A global IT engineering and consulting company specializing in custom software development.
Role Description
Within the telecommunications domain, we are actively seeking a Senior Data Scientist to strengthen our team focused on building AI-driven solutions that enhance customer experience and operational efficiency.
- You will be working on advanced Generative AI use cases, including conversational AI (chatbots and voice bots), as well as internal AI tools that support business teams in their daily operations.
- The role involves designing and scaling solutions such as RAG-based systems and intelligent assistants, enabling both customer-facing and internal automation scenarios.
Qualifications
- Master’s degree in AI, Machine Learning, Data Science, or a related field
- 3–6 years of experience as a Data Scientist (preferably at Senior level)
- Strong expertise in AI/ML and practical experience delivering business-impacting solutions
- Hands-on experience with Generative AI, especially LLMs
- Experience with NLP use cases (e.g. chatbots, voice bots, text processing)
- Solid programming skills in Python and experience working with APIs, SQL, and Git
- Experience designing and implementing production-ready solutions
- Ability to communicate complex technical concepts to non-technical stakeholders
Requirements
- Experience with telecom domain or customer-facing digital products (nice-to-have)
- Knowledge of Agentic AI, RAG systems, vector databases, and chunking strategies (nice-to-have)
- Familiarity with LLM frameworks such as LangChain, LlamaIndex, LangGraph, AutoGen, or similar (nice-to-have)
- Experience with AWS and/or on-premise deployments (nice-to-have)
- Knowledge of Docker, Airflow, CI/CD pipelines (e.g. GitHub Actions/Runners) (nice-to-have)
- Experience with Google AI/NLP ecosystem (nice-to-have)
Benefits
- At Avenga, everyone matters. We provide equal opportunities in recruitment, career development, and leadership, regardless of race, ethnicity, gender identity, sexual orientation, disability, age, religion, or any other characteristic.
- We are committed to fostering a work environment where our diverse community of employees, candidates, and business partners actively shapes our growth.
- By bringing together people from different backgrounds and experiences, we build a workplace where everyone feels free to be themselves while honoring the boundaries of others.
- Work with our partners to understand business and technology problems and/or opportunities.
- Partner closely with the head of data, the data team, and data analysts. The head of data and the data team will set many of the requirements but will stop short of technical design and physical modeling, which this role will need to elaborate on.
- Provide guidance and governance on standardized business logical data models, component re-use, and platform modernization, with an eye to designing business-friendly views of data that empower citizen data scientists and BI users to leverage the data platform.
- Lead the data design track; document and communicate solutions that accommodate the Agile principle of failing fast and provide the resilience necessary to handle changing and fluid requirements.
- In partnership with the data team, assist in the implementation of Data Governance Standards and Processes; partner with business stakeholders to establish a cadence in the upkeep of their information assets.
- Establish data domains for the Wealth/Global Investments business, building reuse, standardization, and security-by-design into your designs to simplify, secure, and enable accelerated solution delivery.
- Assist in the establishment of a self-governance operating model to manage data models and catalogs for the Wealth/Global Investments business.
- Bring a shift-left mentality to the maintenance of business data terms, critical data elements, and business rules.
- Introduce learning capsules to enable self-service capabilities for management of the business data catalog and allied assets.
- Be innovative in your design thinking and use your expertise to influence the MMC technology roadmap.
- Take your architectural designs through our onboarding processes to ensure they meet MMC's standards and conventions.
- Participate in architecture reviews as part of MMC's Architecture Assurance process.
- Support the implementation of architectural designs from infrastructure setup through to delivery as necessary.
- Empower the development teams to deliver your architectural designs with a light-touch approach.
- Be an involved member of our technology community, championing the latest technologies and trends within MMC and the wider industry, and sharing your knowledge, successes, and learnings with your colleagues.
- Run the software delivery Pod, ensuring there are smooth and efficient processes for getting software from requirements to working in production, focussing on value creation, growth, and serving customers.
- Drive outcomes that lead to project success, and work with the Product Owner and Business Analysts to help the project team prioritize and groom the backlog in order to deliver an Alpha early, then continue to deliver value often throughout the project lifecycle to staging and production environments.
- You will have an enterprise mind-set, able to foster collaboration, break down silos, and identify and work to remove blockers.
- You will be intimately familiar with the Agile processes and practices required for successful and sustainable delivery.
- Solid understanding of Software Development, Architecture, Test Engineering, and Security best practice in an Agile environment.
- Pods support the full lifecycle of a product, and we operate on the philosophy that the work comes to the team. You will need to consider tech debt, obsolescence, security, and modernisation.
- Work with the wider team to plan complex product releases.
- Ensure our standard metrics are visible and actively being used to draw conclusions.
- Own the Continuous Improvement backlog for the Pod. This is the key mechanism by which we deliver improvements in time to market, efficiency, and quality.
Principal Consultant, Data Architecture
Infosys
Founded in 1981, Infosys is an information technology and services company providing consulting, outsourcing, technology, and next-generation services to clients in over 50 countri
- As Principal Consultant, Data Architecture you act as the technical lead in complex data and analytics projects.
- You design and take responsibility for end-to-end enterprise data architectures.
- You provide technical leadership to teams and serve as a trusted technical advisor to clients and internal stakeholders.
- You ensure enterprise data and analytics solutions are scalable, secure, and operationally ready.
- You translate business requirements into robust technical target architectures and plan their implementation.
Senior Data Engineer
Nasstar
From cloud optimisation and application modernisation to connectivity and collaboration, we are Nasstar.
- Design, develop, and implement data pipelines and ETL processes to ingest, transform, and load data into the data platform.
- Collaborate with the Lead Architect and Cloud Engineer to ensure the seamless integration of data services within the cloud infrastructure.
- Develop data models and schemas to support efficient data storage and retrieval.
- Implement data quality and validation processes to ensure the accuracy and consistency of data.
- Optimize data processes and queries for performance and scalability.
- Collaborate with business stakeholders to understand their data requirements and provide insights and solutions for data-driven decision-making.
- Work closely with data scientists to provide them with the necessary data and tooling for developing revenue-generating insights and models.
- Implement data governance and security controls to ensure data privacy and compliance with regulations.
- Stay updated with the latest data technologies and trends and make recommendations for technology adoption and improvements.




