Jalasoft

We provide the best software engineering solutions by investing in our people first.

Senior Data Engineer – AWS, RAG Pipelines

Data Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 2003 | H1B No Sponsor

Location

Colombia

Posted

2 days ago

Salary

Not specified

Seniority

Senior

Job Description

  • Design and operate the cloud data infrastructure powering AI initiatives.
  • Architect production-scale data lakes on AWS.
  • Build real-time ingestion and observability pipelines.
  • Own the vector search and embedding layers that feed RAG systems and autonomous agents.

Job Requirements

  • Overall Experience: 7+ years in Data Engineering, Distributed Systems, or Data Architecture
  • AWS & Infrastructure: 4+ years architecting production-scale data lakes, storage tiers, and event streaming
  • AI/LLM Pipelines: 2+ years building RAG systems, managing embeddings, and orchestrating foundation models
  • Proficiency in AWS Data Lake Architecture & Storage
  • Proficiency in Real-Time Observability & Log Analytics
  • Proficiency in Elasticsearch & OpenSearch Optimization, Vectorization, Embeddings
  • Proficiency in Amazon Bedrock & Generative AI Pipelines
  • Proficiency in Software Engineering & API Ingestion
  • Production-level proficiency in one or more of: C# (.NET Core), Java, Python, or Node.js
  • AWS S3 partitioning strategies, lifecycle policies, and columnar formats (Parquet, Iceberg)
  • AWS Glue Data Catalog and Lake Formation for multi-tenant, fine-grained access control
  • Query optimization over petabyte-scale datasets using Amazon Athena and Redshift Spectrum
  • Distributed OpenTelemetry (OTel) Collector configuration for capturing logs, traces, and metrics and routing them into S3
  • High-volume streaming of system logs, Datadog captures, and raw server events into S3
  • Real-time CDC from PostgreSQL using Debezium or AWS DMS
  • Amazon OpenSearch clusters with simultaneous lexical and high-dimensional vector search
  • OpenSearch index lifecycle management, sharding strategies, and dynamic mappings at scale
  • Amazon Bedrock foundation model APIs (Claude, Titan) for data enrichment, classification, and semantic parsing
  • Knowledge Bases for Amazon Bedrock for automatic chunking, metadata extraction, and vector index syncs from S3
  • ETL/ELT pipelines ingesting unstructured event data from SaaS APIs (e.g., Pendo, Hotjar, Google Analytics)
  • MCP server development to expose data lake context and utilities to AI agents

Benefits

  • Remote work.
  • 13 floating holidays.
  • 15 vacation days per completed year.
  • Good working environment.
