Soum

Reimagining recommerce in the MENA region and beyond

Data Engineer

Location

Pakistan

Posted

63 days ago

Salary

0

Seniority

Mid Level

Job Description

  • Design, build, and maintain scalable and reliable data pipelines to support analytics, ML models, and business reporting.
  • Collaborate with data scientists and analysts to ensure data is available, clean, and optimized for downstream use.
  • Implement data quality checks, monitoring, and validation processes.
  • Work with cross-functional teams to design efficient ETL/ELT workflows using modern data tools.
  • Integrate data from multiple sources (databases, APIs, third-party tools) into centralized storage solutions (data lakes/warehouses).
  • Support cloud-based infrastructure for data storage and retrieval.
  • Monitor, troubleshoot, and optimize existing data pipelines to handle large-scale, real-time data flows.
  • Implement best practices for query optimization and cost-efficient data storage.
  • Ensure data is available and accessible for business-critical operations.
  • Partner with product, engineering, and business stakeholders to understand data requirements.
  • Document data workflows, schemas, and best practices.
  • Support a culture of data reliability, governance, and security.

Job Requirements

  • Proficiency in **Python** and **SQL** for data engineering tasks.
  • Strong understanding of **ETL/ELT processes**, data warehousing, and data modeling.
  • Hands-on experience with **cloud platforms** (AWS, GCP, or Azure) and data storage solutions (BigQuery, Redshift, Snowflake, etc.).
  • Familiarity with **data orchestration tools** (Airflow, Airbyte) **is a must.**
  • Experience with **containerization & deployment tools** (Docker, Kubernetes) is a plus.
  • Knowledge of **data governance, security, and best practices** for handling sensitive data.
  • Familiarity with Git and GitHub.
  • **Dataform is a must.**
  • Strong skills in eliciting requirements from cross-functional stakeholders and translating them into actionable data engineering tasks.

More Data Engineer Jobs

Cloud Data Architect, AWS

Future Processing

Great software... because we put people first

Data Engineer | 63 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2000 | H1B No Sponsor

  • Consulting on solutions with clients, both existing and new. Advising on the choice of a technical solution suited to the client's business problem. Aiming for a solution that is cost-optimal and addresses the need behind the client's problem.
  • Taking part in early-stage client meetings: pitching our experience, process, and approach to technology; asking follow-up questions; gathering and refining requirements.
  • Contributing technical content to proposals: a diagram of the proposed architecture, and calculations of Total Cost of Ownership, Return on Investment, and cloud costs using the cloud provider's calculators.
  • Estimating projects and pricing the involvement of DS representatives in the project.
  • Building arguments that persuade the client to choose our solution, and showing its advantages over alternative approaches.
  • Taking part in proposal presentation meetings, answering client questions, and presenting the proposal that our part of the solution supports.
  • Research and development work analyzing the functionality and usefulness of new technologies and tools for the client's business solutions.
  • Regular contact with data decision-makers on the client side (VP, IT Director).
  • Building data PoCs to present the results of ongoing R&D work.
  • Designing and building the entire data processing platform, covering all of its components and its connections to other solutions (for example, BI and ML), with the cloud ecosystem taken into account.
  • Optimizing end-to-end data storage and analysis solutions and systems.
  • Maintaining relationships with the client's technical team.
  • Coordinating the work of the engineers involved in building the solution.
  • Overseeing the whole project from start to finish.
  • Contributing to the growth of the Data Solutions business line (recruiting employees, acquiring clients, training, conference participation, mentoring, etc.).

Poland
zł19.7K - zł29.3K / month

Cloud Data Architect – Azure

Future Processing

Great software... because we put people first

Data Engineer | 63 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2000 | H1B No Sponsor

  • Consulting on solutions with clients and advising on the choice of technical solution.
  • Taking part in early-stage client meetings: pitching our experience, process, and approach to technology.
  • Contributing technical content to proposals: a diagram of the proposed architecture and calculations of Total Cost of Ownership and Return on Investment.
  • Estimating projects and pricing the involvement of DS representatives in the project.
  • Building arguments that persuade the client to choose our solution.
  • Taking part in proposal presentation meetings and answering client questions.
  • Research and development work analyzing the functionality and usefulness of new technologies for the client's business solutions.
  • Regular contact with data decision-makers on the client side.
  • Building data PoCs to present the results of ongoing R&D work.
  • Designing and building the entire data processing platform.
  • Maintaining relationships with the client's technical team.
  • Coordinating the work of the engineers involved in building the solution.
  • Overseeing the whole project.
  • Contributing to the growth of the Data Solutions business line.

Poland
zł19.7K - zł29.3K / month

Remote Oncology Data Engineer - Precision Medicine - Dallas, Tx

The US Oncology Network

The US Oncology Network is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.

Data Engineer | 63 days ago
Full Time | Remote | Team 10,001

Overview

Texas Oncology is looking for a Remote Oncology Data Engineer to join our Precision Medicine team! This position is based out of the corporate office in Dallas, Texas.

Texas Oncology is the largest community oncology provider in the country and has 600+ providers in 300+ sites across Texas and southeastern Oklahoma. Our founders pioneered community-based cancer care because they believed in making the best available cancer care accessible to all communities, allowing people to fight cancer at home with the critical support of family and friends nearby. Our mission is still the same today: at Texas Oncology, we use leading-edge technology and research to deliver high-quality, high-touch, evidence-based cancer care to help our patients achieve “More breakthroughs. More victories.” ® in their fight against cancer. Today, Texas Oncology treats half of all Texans diagnosed with cancer on an annual basis.

Why work for us?

Come join our team that is responsible for helping lead Texas Oncology in treating more patients diagnosed with cancer than any other provider in Texas. We offer our employees a competitive benefits package that includes Medical, Dental, Vision, Life Insurance, Short-term and Long-term disability coverage, a generous PTO program, a 401k plan that comes with a company match, a Wellness program that rewards you for practicing a healthy lifestyle, and lots of other great perks such as Tuition Reimbursement, an Employee Assistance program, and discounts on some of your favorite retailers.

Join a Team That Invests in Your Future

At Texas Oncology, we recognize the long-term impact of our people and are committed to rewarding performance and potential. That’s why select roles may be eligible to participate in our Long-Term Incentive Plan (LTIP): an incentive program designed to attract, retain, and reward top talent.

What is the Long-Term Incentive Plan (LTIP)?

The LTIP is an incentive program that typically vests over a three-year period and is tied to both individual performance and the operational success of Texas Oncology. Awards are discretionary and based on your position, performance, and potential for future career growth at Texas Oncology. Awards are reviewed and approved during the annual compensation review. LTIP awards are subject to your continued employment through the award payment date, and are governed by the written terms and conditions of the LTIP document.

What does the Oncology Data Engineer do?

The Oncology Data Engineer will support Precision Medicine's data delivery team, design and build robust data pipelines, and implement new data architecture to support informatics decision-making. Leveraging a deep understanding of ETL methodologies and AI technologies, the Oncology Data Engineer will create scalable and efficient solutions using innovative technology, including SQL, OpenAI tools, and large language models (LLMs). The role supports and adheres to the US Oncology Compliance Program, including the Code of Ethics and Business Standards.

Responsibilities

The essential duties and responsibilities (including but not limited to):

Data Delivery Support
  • Design, develop, and maintain robust ETL pipelines for large-scale data ingestion and transformation from various sources such as Electronic Medical Records (EMRs), lab interfaces, and data warehouses.
  • Support data science initiatives with SQL coding from various data warehouses.
  • Implement new data architecture, drawing inspiration from existing pipelines.
  • Optimize ETL workflows for performance and accuracy, ensuring seamless data integration.

AI and LLM Integration
  • Integrate AI functionalities into data platforms using OpenAI tools and LLMs.
  • Collaborate with AI teams to implement AI-driven solutions within the data pipeline.
  • Stay updated on the latest advancements in AI and LLM technologies to enhance platform capabilities.

Collaboration and Support
  • Collaborate with cross-functional teams to understand requirements and translate them into technical solutions.

Monitoring and Maintenance
  • Implement monitoring and alerting systems to proactively identify and resolve platform issues.
  • Perform regular maintenance, updates, and upgrades to cloud infrastructure and associated services.

Documentation and Best Practices
  • Maintain comprehensive documentation of system architectures, processes, and procedures.
  • Advocate for and implement best practices in cloud engineering, SQL coding, ETL processes, and AI integration.

Qualifications

The ideal candidate will have the following background and experience:

Education
  • Bachelor’s or master’s degree in computer science, engineering, or a related field.

Healthcare & Oncology Domain Knowledge
  • Understanding of oncology workflows and clinical data types.
  • Familiarity with molecular/genomic data (e.g., NGS, variants, biomarkers).
  • Experience integrating laboratory, pathology, and molecular testing data.
  • Knowledge of healthcare data standards (HL7, FHIR, ICD-10, LOINC, SNOMED).
  • Experience working with EHR data (e.g., IKMg1/IKMg2, Epic, Copia).

Experience
  • 7–10 years of professional experience in data engineering with a focus on ETL processes.
  • Strong background in cloud platforms (e.g., AWS, Azure, GCP).
  • Experience with OpenAI tools and integrating AI functionalities, including LLMs, into data platforms.

Technical Skills
  • Strong scripting and automation skills (e.g., Python).
  • Strong experience with SQL required.
  • Experience with GitHub, Confluence, and Jira preferred.

Soft Skills
  • Excellent problem-solving abilities and attention to detail.
  • Effective communication and teamwork skills.
  • Ability to manage multiple priorities in a challenging environment.

Physical Demands

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations will be offered to enable individuals with disabilities to perform the essential functions. Requires sitting for long periods of time. Some bending and stretching are required. Adequate finger dexterity and feeling to perform keyboarding and substantial repetitive motions involving the wrists, hands, and/or fingers. Requires vision and hearing corrected to normal range. Must be able to view computer screens and printed material accurately. Occasionally lifts and carries items weighing up to 40 lbs.

Work Environment

The work environment characteristics described here are representative of those an employee encounters while performing the essential functions of this job. Reasonable accommodations will be offered to enable individuals with disabilities to perform essential functions. The work environment is typical of an office setting.

Texas
Job Closed

AWS Data Engineer

Scalepex

Helping our clients reach their peak

Data Engineer | 63 days ago
Full Time | Remote | Team 51-200 | H1B No Sponsor

  • Develop scalable, reliable data pipelines using AWS services (e.g., Glue, S3, Redshift) to process and transform large datasets from utility systems like smart meters or energy grids.
  • Use AWS Step Functions to orchestrate workflows across data pipelines; experience with Airflow is acceptable but Step Functions is preferred.
  • Implement ETL/ELT processes using PySpark, Python, and Pandas to clean, transform, and integrate data from multiple sources into unified datasets.
  • Leverage experience with complex distributed systems to ensure reliability, scalability, and performance in handling large-scale utility data.
  • Use AWS Lambda functions to build serverless solutions for automating data processing tasks.
  • Design data models tailored for utilities use cases (e.g., energy consumption forecasting) to enable advanced analytics.
  • Continuously monitor and improve the performance of data pipelines to reduce latency, enhance throughput, and ensure high availability.
  • Implement robust security measures to protect sensitive utility data and ensure compliance with industry regulations.

Texas