Ocean Technologies Group

Powering teams that deliver for people & planet, with maritime learning, crew and fleet management and GRC solutions

Data Engineering Team Lead

Data Engineer · Full Time · Remote · Senior · Team 201-500 · Since 2020 · H1B No Sponsor

Location

Philippines

Posted

10 days ago

Salary

0

Seniority

Senior

Job Description

  • Lead a team of data engineers, ensuring alignment on goals, quality and delivery timelines.
  • Mentor and coach team members to support their technical and professional growth.
  • Drive engineering excellence by promoting best practices in coding, architecture, testing and observability.
  • Plan and manage team capacity, sprints and milestones to ensure predictable delivery.
  • Own the design, evolution and operation of ingestion and transformation pipelines on Apache Airflow and the analytical serving layer on Apache Druid.
  • Make architectural calls on concurrency, partitioning, memory sizing and cost, including JVM heap and direct-memory tuning on the Druid cluster.
  • Collaborate closely with DevOps on the Kubernetes / EKS platform that hosts our Druid and Airflow workloads.
  • Ensure robust data validation, reconciliation and verification so that reporting is trustworthy.
  • Collaborate with other Team Leaders, Development Managers, Architects and Product Owners to align engineering execution with business objectives.
  • Contribute to the evolution of development processes, CI/CD pipelines and DevOps practices.
  • Foster a culture of continuous improvement, innovation and knowledge sharing.

Job Requirements

  • 10+ years of commercial experience delivering Data & Analytics solutions.
  • 5+ years leading a Data Engineering or BI team with a solid grasp of Agile methodologies (Scrum / Kanban).
  • Strong hands-on expertise in Apache Airflow — production DAG authoring in Python; hooks, sensors, XCom, callbacks, retries; dynamic task mapping.
  • Production-grade Python — type hints, packaging, pytest; comfortable reading and reviewing other people's DAGs.
  • Apache Druid (ingestion-side) — index_parallel specs, transformSpec / dimensionsSpec / granularitySpec, tuningConfig, supervisors, task lifecycle, segment management; familiarity with Coordinator / Overlord / Broker / Historical / MiddleManager roles.
  • Strong PostgreSQL — query tuning, JSONB, window functions, indexes, EXPLAIN / EXPLAIN ANALYZE.
  • SQL fluency across dialects (Postgres, T-SQL, Oracle); comfortable optimising queries.
  • AWS data services — S3 (Druid deep storage), EMR, Secrets Manager, IAM, VPC fundamentals; EC2 sizing for memory-bound workloads.
  • Production debugging instincts — reading YARN container logs, tracing failure from Airflow → EMR step → Spark driver → Python traceback.
  • Exceptional communication, teamwork, attention to detail, organisational and leadership skills.
  • Ability to inspire, mentor and lead by example.

Nice to have

  • JVM tuning — heap (Xms / Xmx), direct memory, GC choice (G1, ZGC), reading GC logs; distinguishing JVM OOM from cgroup OOMKilled.
  • Kubernetes operations — pods, Deployments / StatefulSets, ConfigMaps, resource requests vs limits, HPA, kubectl describe / exec / logs.
  • Helm / Helmfile — most production Druid on EKS is Helm-deployed.
  • PySpark — DataFrame API, Spark SQL, JDBC reads / writes; deploy-mode cluster vs client; executor / driver memory.
  • Delta Lake — MERGE semantics, time-travel, schema evolution, SCD Type 1 / 2.
  • CI/CD on Bitbucket Pipelines (or transferable: GitHub Actions / GitLab CI) — OIDC-to-AWS, deploy gates, artifact handling.
  • Observability — Prometheus / Grafana for Druid metrics, distributed tracing, log aggregation (CloudWatch / Loki / ELK).
  • Terraform / IaC.
  • Other Druid ingestion sources — Kafka, Kinesis, S3 / Parquet, batch SQL.
  • dbt, Spark or Beam — for source-side transformation outside Druid.
  • Druid row-level-security patterns (RLSGroupID / parse_json transformSpec) — multi-tenant Druid experience would stand out.
  • Maritime / personnel data domain (IMO, Installation, CustomerID, Personnel Pool).

Benefits

**Safeguard your tomorrow:**
  • Future Security with SSS & HDMF: Secure your financial future and home ownership dreams with contributions to Social Security and Home Development Mutual Fund.
  • Healthcare Assurance with Philhealth: Rest easy knowing your health is safeguarded by the Philippine Health Insurance Corporation.

**Extra Allowances for Daily Comfort:**
  • Boost Your Budget: Receive a monthly allowance of 2,500 PHP for your everyday expenses.

**Top-Tier Health and Wellness Coverage:**
  • Comprehensive Medical Insurance: After your initial 3 months, enjoy Maxicare's extensive medical insurance for you and one dependent, including up to 200,000 PHP per illness annually.
  • Around-the-Clock Teleconsult: Have 24/7 access to medical consultations, ensuring you and your family's health concerns are promptly addressed.
  • Annual Health Maintenance: Benefit from regular check-ups and dental care to maintain your health year-round.
  • Life Insurance Peace of Mind: Gain additional security with a life insurance policy valued at 250,000 PHP, protecting what's most important to you.

**Employee Support and Wellbeing:**
  • Employee Assistance Programme (EAP): Our EAP, provided by Health Assured and Comp Psych, offers confidential counseling, financial, and legal support via phone, ensuring your well-being is always a priority.

⭐ **OTG’s Guiding Stars:**
  • ⭐ **Pioneering:** Constantly charting new courses in innovation.
  • ⭐ **Caring:** Keeping the maritime community's safety and sustainability at the helm.
  • ⭐ **Collaborating:** Navigating together with clear communication and shared goals.
  • ⭐ **Optimizing:** Always in pursuit of excellence and constructive evolution.
  • 🚢 **Will You Navigate the Next Chapter with Us?** Join us on a journey that transcends a traditional job: it’s a mission to innovate at the intersection of technology and education. Discover more about our vision at OneOcean and see if your path aligns with our pioneering direction. We’re eager to welcome aboard our next visionary Data Engineering Team Lead, ready to make a significant impact on both tech and educational fronts. Cast your resume into our waters and embark on a journey that transcends a mere job: it’s an adventure in innovation and support. Visit us at OneOcean and see if your compass aligns with ours. We’re excited to welcome our next Oceaneer onboard.
  • ✊🏼 **All Hands on Deck:** We steer with equality and celebrate the diversity of our Oceaneers. OneOcean is a proud equal opportunity employer, where passion unites and differences are celebrated.
