Stratus

Built Around People. Driven by Outcomes. Designed for P&C Insurance.

Senior Data Architect – Hands on

Data Engineer | Full Time | Remote | Senior | Team 501-1,000 | Since 2001 | H1B Sponsor

Location

United States

Posted

3 days ago

Seniority

Senior

Bachelor Degree | 8 yrs exp | English

AWS Azure Cloud Google Cloud Platform MongoDB

Job Description

• Own our canonical data architecture: the schema, contracts, tenancy, and governance.
• Make production data AI-ready: well-modeled, contract-enforced, lineage-tracked, and drift-detectable.
• Own the canonical data model: the normalized definition of the core business objects shared across our products.
• Define the multi-tenant data architecture: tenancy isolation, data residency posture, and per-tenant cost attribution.
• Lead staged modernization toward the right mix of stores and patterns.
• Own the architectural direction of the data pipeline and lake / lakehouse layer.
• Drive hands-on prototypes, reference implementations, and in-repo guardrails.
• Partner with database engineering on production data health while owning long-term architectural direction.

Job Requirements

8+ years in data architecture, data engineering, database administration, or analytics engineering, with 3+ years in senior / lead roles.
Demonstrated ownership of a canonical or enterprise data model / cross-product schema.
Hands-on MongoDB at production scale (Atlas M40+ ideal): document modeling, aggregation framework, indexing, change streams, sharding, replica sets.
Strong polyglot-persistence judgment.
Hands-on relational depth: schema design, indexing strategy, and query tuning.
Production experience making data AI/ML-ready: data architecture supporting RAG, semantic search, embeddings / vector pipelines, or agentic workloads.
Multi-tenant architecture experience: data residency and per-tenant cost attribution.
Pipeline / ELT / lake / lakehouse design at scale, with incremental migration strategies that minimize disruption.
Cloud-native data services (Azure, AWS, or GCP).
Strong grasp of data quality, testing, lineage, and monitoring.

Benefits

Comprehensive and competitive health benefits plan
Matching 401k contributions
20 days annual PTO
Primarily remote work with occasional annual team onsites

More Data Engineer Jobs

Senior Data Architect – Technology

Design and Build The Future | We are a Randoncorp company

Data Engineer | 3 days ago

Full Time | Remote | Team 501-1,000 | H1B Sponsor

• Define and document end-to-end data architectures, from source ingestion to analytical consumption layers, ensuring scalability, performance, and governance.
• Establish technical standards, development guidelines, and architectural decisions to promote consistency across projects and teams.
• Lead adoption and evolution of Lakehouse architecture (Medallion Architecture) in Databricks and Azure environments, including partitioning, clustering, and Delta Table optimization strategies.
• Define and guide data ingestion strategies for batch, incremental, CDC, and streaming scenarios.
• Provide technical leadership for data modeling decisions appropriate to the analytical context and consumption patterns.
• Ensure implementation of data governance practices (cataloging, lineage, access control at object/row/column levels).
• Act in a consultative capacity with clients, conducting technical discovery, gathering requirements, and presenting architectural proposals.
• Support and review the work of data engineers, ensuring adherence to defined standards and promoting development best practices.
• Collaborate with BI, Engineering, and AI teams to define the data layers that support analytical, semantic, and ML models.
• Evaluate and recommend technologies, tools, and integration patterns aligned with the Azure + Databricks ecosystem.
• Monitor the health of data platforms (quality, SLAs, compute and storage costs) and propose continuous improvements.
• Contribute to pre-sales activities and technical qualification of opportunities by developing reference architectures and effort estimates.

Azure PySpark Spark Unity

Brazil

Data Engineer

Stefanini LATAM

Co-creating solutions for a better future

Data Engineer | 3 days ago

Full Time | Remote | Team 10,001+ | Since 1987 | H1B No Sponsor

• Participate in defining and executing data migration strategies.
• Analyze source and target structures.
• Implement, administer, and validate database structures.
• Ensure the integrity, consistency, and quality of migrated data.
• Run data validation and reconciliation processes.
• Participate in functional, load, and performance testing.
• Manage database-related incidents during project execution.
• Prepare technical documentation and migration evidence.
• Support the definition and execution of historical data management strategies.
• Analyze and define the integration process with satellite and third-party systems.

SQL

Honduras

Data Engineer

Arva Intelligence

Only applicants currently, and in the future, eligible to work in the United States will be considered for this position.

Data Engineer | 3 days ago

Full Time | Remote

Role Description

The Data Engineer is responsible for building and scaling the data and computational backbone that supports Arva’s ecosystem modeling and measurement, reporting, and verification platforms. This role sits within a multidisciplinary Data Science team and focuses on designing reliable, auditable, and scalable data systems that enable biogeochemical modeling and optimization at production scale.

In this role, the Data Engineer will design and maintain production-grade data pipelines that integrate diverse datasets including field measurements, management practices, soils, and weather with process-based ecosystem models. The role plays a critical part in ensuring data quality, reproducibility, and traceability so that scientific outputs can be translated into trusted, credit-grade results with real-world impact.

Qualifications

- 3+ years demonstrated experience building and maintaining data pipelines for large, complex, and heterogeneous datasets
- Strong proficiency in Python and modern data engineering tools, with experience writing production-grade, testable code
- Experience working with cloud platforms, with AWS strongly preferred
- Familiarity with containerization tools such as Docker and version control systems such as GitHub
- Experience with relational and spatial databases, including PostgreSQL and PostGIS
- Experience working with geospatial data formats and spatial data processing
- Experience supporting scientific or ecosystem modeling workflows preferred
- Familiarity with workflow orchestration tools such as Airflow or Prefect preferred
- Bachelor’s or Master’s degree or equivalent experience in Data Engineering, Computer Science, Environmental Informatics, or a related field

Requirements

- Design, implement, and maintain scalable data pipelines supporting ecosystem and biogeochemical modeling
- Build reproducible workflows that generate standardized model inputs and manage outputs across space, time, and scenario analysis
- Integrate heterogeneous datasets, including field data, management data, soil data, and weather data, into modeling pipelines
- Develop and maintain cloud-based infrastructure to support modeling pipelines and optimization workflows
- Implement data storage solutions using relational, spatial, and object-based databases
- Support efficient data access and processing using platforms such as PostgreSQL, PostGIS, and cloud object storage
- Ensure data quality, versioning, traceability, and auditability to support measurement, reporting, and verification requirements
- Implement validation and monitoring processes to ensure reliability of model inputs and outputs
- Support transparent, repeatable workflows suitable for regulatory and credit market review
- Write clean, modular, and well-documented production code that supports maintainable and scalable data systems
- Apply software engineering best practices including testing, version control, and documentation
- Collaborate closely with Data Science and Technology teams to align data infrastructure with modeling, analytics, and production needs

Benefits

- $95k - $130k base salary range

Python Data Engineering AWS Docker/Containers Docker GitHub PostgreSQL PostGIS Airflow Observability/Monitoring

United States

$95K - $130K / year

Data & AI Engineer

Team Passerelle

Data Engineer | 4 days ago

Full Time | Remote

Role Description

As a Data & AI Engineer, you create the essential technical prerequisite for every successful AI deployment: a robust, structured data foundation. Your focus is on opening up historically grown, heterogeneous data landscapes and making them usable for modern AI applications, in particular retrieval systems. As one of our first dedicated engineering hires, you help shape the build-out of our technical delivery capability, working closely alongside our AI Solutions Architect.

Your work spans two areas:

- In our consulting engagements, you analyze the existing data architecture, uncover gaps, and lay the foundation for the AI strategy.
- In parallel, you develop the data and retrieval pipelines for our own AI infrastructure and software products.

Your Tasks

- Data inventory & maturity: You create data maps across heterogeneous data holdings and assess the maturity of the digital foundation. Your gap analyses of identifiers and metadata show precisely where the leverage lies.
- AI data ingestion (AI enablement): You open up unstructured data sources (PDFs, reports, publications) for use in AI systems: text extraction, chunking strategies, and metadata generation, e.g. with tools such as LlamaParse or Unstructured.io.
- Data foundation & retrieval: You develop metadata and identifier concepts, data models, and embedding pipelines, and build the retrieval foundation for RAG applications, including populating and operating vector databases (e.g. Qdrant, Weaviate, pgvector). You systematically evaluate the quality of this foundation.
- Data protection & sovereignty: You handle sensitive data responsibly and coordinate closely on AI governance, data protection, EU AI Act, and sovereignty requirements. You consider data minimization and the need for protection from the outset.
- Pipelines for internal software development: Going forward, you build and operate the ingest and retrieval pipelines for our own software products, with a DataOps mindset (versioning, testing, observability) and an understanding of agentic patterns, including human-in-the-loop.

Qualifications

- Solid data engineering experience: Several years (3+) in data engineering or as a Data Platform Engineer, ideally in organically grown, heterogeneous data landscapes. Excellent Python and SQL, plus confident use of the modern data stack (e.g. dbt, Airflow, Dagster) and ETL/ELT processes.
- AI enablement: Practical experience with embedding pipelines and vector databases (e.g. Qdrant, Weaviate, Milvus, pgvector), a feel for retrieval strategies, and experience opening up unstructured data (e.g. LlamaParse, Unstructured.io).
- Data protection awareness: Experience handling sensitive and personal data responsibly, and knowledge of the relevant requirements (in particular GDPR and EU AI Act awareness).
- Pragmatism with real-world data: You are comfortable with incomplete, organically grown datasets and know that a usable data model is worth more than a perfect one. You prioritize where it counts.
- Communication skills & attitude: You translate the data reality clearly for non-technical stakeholders and communicate as equals with business departments. Your German and English are business fluent. You manage yourself, think in a solution-oriented way, and share our values around a fair working world of tomorrow.

Benefits

- Real impact & attitude: A working environment that combines technological innovation with social responsibility and sustainable values. You shape the AI transformation at the forefront, in line with European, democratic values.
- Visibility & network: Insights into high-profile engagements from politics, business, and trade unions.
- High autonomy: Flat structures, a deliberate absence of micromanagement, and real responsibility for your accounts and topics.
- Flexible set-up: Remote-first with a core team in Berlin, plus flexible working hours that fit your life.
- Standards: 30 days of vacation, your own training budget, and state-of-the-art work equipment.
- Fair compensation: A transparent salary band of €80,000 to €95,000 gross per year based on a 40-hour week (depending on experience).
- Long-term perspective: Because of our agile startup phase, the position is initially limited to one year. Since we are building sustainably, long-term collaboration is our clear goal. An extension or conversion to a permanent contract is explicitly intended, subject to engagement and business development.

AI Weaviate Observability/Monitoring Data Engineering Python SQL dbt Airflow ETL

Germany

€80K - €95K / year
