Only applicants who are eligible to work in the United States, both now and in the future, will be considered for this position.
Data Engineer
Location
United States
Posted
2 days ago
Salary
$95K - $130K / year
Seniority
Mid Level
Job Description
Data Engineer
Arva Intelligence
Role Description
The Data Engineer is responsible for building and scaling the data and computational backbone that supports Arva’s ecosystem modeling and measurement, reporting, and verification platforms. This role sits within a multidisciplinary Data Science team and focuses on designing reliable, auditable, and scalable data systems that enable biogeochemical modeling and optimization at production scale.

In this role, the Data Engineer will design and maintain production-grade data pipelines that integrate diverse datasets including field measurements, management practices, soils, and weather with process-based ecosystem models. The role plays a critical part in ensuring data quality, reproducibility, and traceability so that scientific outputs can be translated into trusted, credit-grade results with real-world impact.

Qualifications
- 3+ years demonstrated experience building and maintaining data pipelines for large, complex, and heterogeneous datasets
- Strong proficiency in Python and modern data engineering tools, with experience writing production-grade, testable code
- Experience working with cloud platforms, with AWS strongly preferred
- Familiarity with containerization tools such as Docker and version control systems such as GitHub
- Experience with relational and spatial databases, including PostgreSQL and PostGIS
- Experience working with geospatial data formats and spatial data processing
- Experience supporting scientific or ecosystem modeling workflows preferred
- Familiarity with workflow orchestration tools such as Airflow or Prefect preferred
- Bachelor’s or Master’s degree or equivalent experience in Data Engineering, Computer Science, Environmental Informatics, or a related field

Requirements
- Design, implement, and maintain scalable data pipelines supporting ecosystem and biogeochemical modeling
- Build reproducible workflows that generate standardized model inputs and manage outputs across space, time, and scenario analysis
- Integrate heterogeneous datasets, including field data, management data, soil data, and weather data, into modeling pipelines
- Develop and maintain cloud-based infrastructure to support modeling pipelines and optimization workflows
- Implement data storage solutions using relational, spatial, and object-based databases
- Support efficient data access and processing using platforms such as PostgreSQL, PostGIS, and cloud object storage
- Ensure data quality, versioning, traceability, and auditability to support measurement, reporting, and verification requirements
- Implement validation and monitoring processes to ensure reliability of model inputs and outputs
- Support transparent, repeatable workflows suitable for regulatory and credit market review
- Write clean, modular, and well-documented production code that supports maintainable and scalable data systems
- Apply software engineering best practices including testing, version control, and documentation
- Collaborate closely with Data Science and Technology teams to align data infrastructure with modeling, analytics, and production needs

Benefits
- $95k - $130k base salary range
More Data Engineer Jobs
Role Description
As a Data & AI Engineer, you create the essential technical prerequisite for every successful use of AI: a robust, structured data foundation. Your focus is on opening up heterogeneous data landscapes that have grown organically over time and making them usable for modern AI applications, especially retrieval systems. As one of our first dedicated engineering hires, you will help shape how we build our technical delivery capability, working closely alongside our AI Solutions Architect.

Your work covers two areas:
- In our consulting engagements, you analyze the existing data architecture, uncover gaps, and lay the foundation for the AI strategy.
- In parallel, you develop the data and retrieval pipelines for our own AI infrastructure and software products.

Your Responsibilities
- Data inventory & maturity: You create data maps across heterogeneous holdings and assess the maturity of the digital foundation. Your gap analyses of identifiers and metadata show precisely where the leverage lies.
- AI data ingestion (AI enablement): You make unstructured data sources (PDFs, reports, publications) usable in AI systems through text extraction, chunking strategies, and metadata generation, for example with tools such as LlamaParse or Unstructured.io.
- Data foundation & retrieval: You develop metadata and identifier concepts, data models, and embedding pipelines, and build the retrieval foundation for RAG applications, including populating and operating vector databases (e.g., Qdrant, Weaviate, pgvector). You systematically evaluate the quality of this foundation.
- Data protection & sovereignty: You handle sensitive data responsibly and coordinate closely on AI governance, data protection, EU AI Act, and sovereignty requirements. You consider data minimization and the need for protection from the outset.
- Pipelines for internal software development: Over time, you will build and operate the ingest and retrieval pipelines for our own software products, with a DataOps mindset (versioning, testing, observability) and an understanding of agentic patterns, including human-in-the-loop.

Qualifications
- Solid data engineering experience: Several years (3+) in data engineering or as a Data Platform Engineer, ideally in organically grown, heterogeneous data landscapes. Excellent Python and SQL, plus confident use of the modern data stack (e.g., dbt, Airflow, Dagster) and ETL/ELT processes.
- AI enablement: Hands-on experience with embedding pipelines and vector databases (e.g., Qdrant, Weaviate, Milvus, pgvector), a feel for retrieval strategies, and experience making unstructured data usable (e.g., LlamaParse, Unstructured.io).
- Data protection awareness: Experience handling sensitive and personal data responsibly, and knowledge of the relevant requirements (in particular GDPR and awareness of the EU AI Act).
- Pragmatism with real-world data: You are comfortable working with incomplete, organically grown data holdings and know that a usable data model is worth more than a perfect one. You prioritize where it counts.
- Communication skills & attitude: You translate the reality of the data into terms non-technical stakeholders can understand and communicate as an equal with business departments. Your German and English are business fluent. You are self-directed, think in terms of solutions, and share our values around a fair working world of tomorrow.

Benefits
- Real impact & values: A working environment that combines technological innovation with social responsibility and sustainable values. You shape the AI transformation at the forefront, in line with European, democratic values.
- Visibility & network: Insight into high-profile engagements from politics, business, and trade unions.
- High autonomy: Flat structures, a deliberate absence of micromanagement, and real responsibility for your accounts and topics.
- Flexible setup: Remote-first with a core team in Berlin, plus flexible working hours that fit your life.
- Standards: 30 days of vacation, a personal training budget, and state-of-the-art work equipment.
- Fair compensation: A transparent salary band of €80,000 to €95,000 gross per year, based on a 40-hour week (depending on experience).
- Long-term perspective: Because of our agile startup phase, the position is initially limited to one year. As we are building the company sustainably, long-term collaboration is our clear goal. An extension or conversion to a permanent contract is explicitly intended, subject to the development of our engagements and business.
Data Engineer
3 Oaks Gaming
3 Oaks Gaming is a fast-growing distributor of iGaming content & marketing tools for regulated markets across the globe
• Develop and maintain Python scripts to retrieve and process data from APIs;
• Clean and transform raw data into structured formats;
• Troubleshoot and debug issues related to API requests and data processing;
• Continuously improve the project/application by optimizing performance, enhancing features, and implementing best practices;
• Generate, analyze, and visualize reports using Tableau to support business decisions;
• Create and manage dashboards, filters, and data visualizations to provide insights;
• Collaborate with teams to ensure data accuracy and system efficiency.
Role Description
We are seeking a skilled and operationally minded Data Effects Cell (DEC) Coordinator to support U.S. Army Special Operations Command (USASOC) Tactical Mission Network (TMN) operations. The DEC Coordinator serves as the central coordinator for all data-related activities within the cell, ensuring that data collection, processing, exploitation, and dissemination efforts are synchronized, prioritized, and producing tangible effects in support of USASOC mission requirements.

This position sits at the intersection of data operations, mission support, and organizational coordination, requiring someone who understands both the technical realities of working with complex military datasets and the operational demands of a Special Operations customer. The ideal candidate brings prior military or DoD contracting experience, a working knowledge of SOF data environments, and the ability to translate raw data activities into mission-relevant effects that commanders can act on. Familiarity with the European theatre, EUCOM/SOCEUR priorities, and the unique political-military dynamics of Eastern Europe (EE) is a distinct advantage.

This position is Remote / Eastern Europe.
This position requires an active DoD Top Secret/SCI clearance, which requires U.S. citizenship for work on DoD contracts.
Application Deadline: June 29, 2026

Essential Duties & Responsibilities

Data Effects Cell Coordination
- Serve as the primary coordinator for all Data Effects Cell activities, synchronizing data collection, processing, analysis, and dissemination efforts across cell members and supported units.
- Maintain a current and accurate picture of all active data tasks, RFIs, and data products in work across the cell, ensuring nothing falls through the cracks.
- Develop and manage the DEC battle rhythm, including recurring meetings, reporting cycles, product delivery schedules, and coordination touchpoints with the supported unit.
- Track the status of all data taskings from receipt through delivery, maintaining accountability for quality, timeliness, and relevance of outputs.
- Identify and resolve coordination bottlenecks, resource conflicts, or prioritization disputes within the cell and escalate unresolved issues to the program or site lead.

RFI Management & Data Product Oversight
- Receive, triage, and assign incoming Requests for Information (RFIs) from supported units, ensuring each is properly scoped, resourced, and tracked to completion.
- Coordinate with data analysts, subcontractors, and subject matter experts to ensure RFI responses and associated data visualization review (DVR) products meet unit requirements and quality standards.
- Review data products for completeness, accuracy, and alignment with the originating RFI before delivery to the supported unit.
- Maintain an RFI log and product library to track historical requests, avoid duplication of effort, and support trend analysis across supported unit data needs.

Data Synchronization & Integration
- Synchronize data activities across the DEC to ensure outputs from different cell members and subcontractors are integrated, deconflicted, and mutually supporting.
- Coordinate with the program's data strategist, site leads, and TMN technical staff to ensure data workflows and pipelines are aligned with cell priorities and operational requirements.
- Facilitate the integration of multiple data sources (including geospatial, tactical, ISR, and open-source datasets) into cohesive, actionable analytical products.
- Ensure data products and deliverables are formatted and packaged appropriately for the intended audience, whether that is a staff planner, a commander, or a technical analyst.

Supported Unit Liaison
- Serve as the primary day-to-day point of contact between the Data Effects Cell and the supported USASOC unit staff, building and maintaining strong working relationships at the action officer and staff NCO level.
- Attend supported unit battle rhythm events as required (staff syncs, ISR syncs, planning sessions) to maintain situational awareness of evolving data requirements and emerging priorities.
- Proactively communicate cell capacity, current workload, and product status to the supported unit to manage expectations and prevent last-minute surprises.
- Solicit feedback from the supported unit on data product quality and utility, and incorporate lessons learned into cell processes and standards.

Data Operations Planning & Prioritization
- Assist in the development and maintenance of the TMN Data Strategy by providing ground-level insight into supported unit data demands, recurring gaps, and unmet requirements.
- Contribute to the planning and execution of data-related support to exercises, mission rehearsals, and operational events.
- Prioritize competing data tasks and allocate cell resources in coordination with the program or site lead, balancing immediate unit needs against longer-term data development efforts.
- Identify patterns in RFI volume, topic, and data source usage to inform proactive product development and reduce reactive demand on the cell.

Quality Control & Standards
- Enforce data product quality standards across the cell, ensuring all deliverables are accurate, clearly presented, appropriately classified, and operationally relevant before release.
- Develop and maintain standard operating procedures (SOPs) for DEC coordination processes, RFI management, product review, and data dissemination.
- Conduct after-action reviews (AARs) on significant data tasks and cell performance, documenting lessons learned and driving continuous improvement.
- Ensure all cell activities comply with applicable DoD data policies, classification handling requirements, and USASOC data governance standards.

Reporting & Communication
- Prepare and deliver regular status briefings and written reports on cell activities, RFI status, product delivery, and data operations to program leadership and the supported unit.
- Maintain accurate and current documentation of all cell activities, data tasks, and coordination actions in designated tracking systems.
- Draft and coordinate correspondence, product transmittals, and coordination packages on behalf of the DEC as required.

Qualifications
- Bachelor's degree in Information Systems, Intelligence Studies, Operations Research, Data Science, or a related field; equivalent military education and experience will be considered in lieu of degree.
- 4+ years of experience in a data operations, intelligence analysis, information management, or mission support coordination role within a DoD or defense contracting environment.
- Prior U.S. military service with experience in an operations, intelligence, or data/information management function strongly preferred.
- Demonstrated experience managing multiple concurrent tasks, products, or requests in a high-tempo operational environment.
- Familiarity with military RFI processes, intelligence production cycles, or data product development and delivery workflows.
- Strong organizational skills with meticulous attention to detail and a commitment to product quality and on-time delivery.
- Excellent written and verbal communication skills with experience briefing military staff and preparing professional-grade written products.
- Proficiency with common productivity and tracking tools (e.g., Microsoft Office Suite, SharePoint, or equivalent collaboration platforms).
- Active TS/SCI security clearance required.

Desired Skills/Experience
- Prior experience supporting USASOC, USSOCOM, or a Special Operations Task Force in an operations, intelligence, or data support role.
- Familiarity with Tactical Mission Network (TMN) architectures, SOF C2 systems, and the data environments that support Special Operations at the tactical edge.
- Experience with data visualization tools (e.g., Tableau, Power BI, ArcGIS) and the ability to critically assess the quality of geospatial and analytical data products.
- Background as a military intelligence analyst (35-series), information manager, or operations NCO/officer with exposure to data-intensive staff functions.
- Experience working with or managing subcontractor deliverables in a DoD program environment.
- Familiarity with ISR, GEOINT, SIGINT, or OSINT data types and their application in SOF operational contexts.
- Experience developing or enforcing SOPs, data standards, or quality control processes in an operational setting.
- Experience with force protection planning and emergency action procedures in an OCONUS environment.
- Proficiency in an Eastern European language, such as (but not limited to) Russian, Ukrainian, or Polish.

Benefits
- Health insurance
- Paid leave
- Retirement

Salary
The proposed salary for this position is: $195,000 - $228,000 USD.

Company Description
At SMX®, we are a team of technical and domain experts dedicated to enabling your mission. From priority national security initiatives for the DoD to highly assured and compliant solutions for healthcare, we understand that digital transformation is key to your future success. We share your vision for the future and strive to accelerate your impact on the world. We bring both cutting edge technology and an expansive view of what’s possible to every engagement. Our delivery model and unique approaches harness our deep technical and domain knowledge, providing forward-looking insights and practical solutions to power secure mission acceleration.

SMX is an Equal Opportunity employer including disabilities and veterans. Selected applicant may be subject to a background investigation and/or education verification.
SMX does not sponsor a new applicant for employment authorization or immigration related support for this position (i.e. H1B, F-1 OPT, F-1 STEM OPT, F-1 CPT, J-1, TN, E-2, E-3, L-1 and O-1, or any EADs or other forms of work authorization that require immigration support from an employer).
Enterprise Data Architect
Infosys
Founded in 1981, Infosys is an information technology and services company providing consulting, outsourcing, technology, and next-generation services to client
• Enterprise AI is forcing organisations to rethink their data estates.
• Data platforms designed mainly for reporting are often not enough for GenAI, semantic search, agentic workflows and AI-enabled decision-making.
• Help clients transform fragmented data estates into AI-ready foundations.
• Advise on architecture decisions across cloud data platforms, lakehouse and warehouse patterns, data products, semantic layers, metadata, lineage, governance, knowledge graphs and GenAI retrieval patterns.
• Diagnose ambiguous client problems, shape options, make trade-offs explicit, and translate complex data architecture issues into clear decisions for both technical teams and executive stakeholders.
• Work in cross-functional teams alongside product owners, data scientists, ML and GenAI engineers, data engineers, business analysts and client stakeholders.
• Typical outputs may include target-state architectures, maturity assessments, platform option appraisals, data product designs, governance models, lineage maps, ontology and semantic models, integration patterns, GenAI data-readiness assessments and implementation roadmaps.


