Job Closed
This listing is no longer active.
Datadog provides cloud-scale monitoring and security for metrics, traces and logs in one unified platform.
Senior AI Engineer – APM Integrations
Location
Portugal
Posted
132 days ago
Salary
Not specified
Seniority
Senior
Job Description
Datadog
- Build agent workflows that take an integration need from plan to implementation and validation, with humans approving at the right checkpoints.
- Create systems that synthesize context from codebases, docs, specs, telemetry, and historical incidents to make changes that match Datadog conventions and customer expectations.
- Generate and evolve integration code and tests, including end-to-end scenarios that reflect real customer workloads and product features.
- Design evaluation harnesses that prevent silent regressions: golden sets, scenario baselines, semantic checks, performance thresholds, and release gating.
- Build portfolio-level automation: proactive updates for upstream breaking changes, tracer feature rollouts across the catalog, migrations to new schemas/semantics, and targeted coverage expansion.
- Partner tightly with PM, support engineers, and integration-owning teams to make the system adoptable, trustworthy, and embedded in daily engineering workflows.
Job Requirements
- 6+ years building backend systems (Go, Java, or .NET) with a strong focus on simplicity, correctness, and performance.
- Proven experience delivering LLM/agent features to production (prompting, tooling, evals, safety/guardrails).
- Comfortable navigating ambiguity, iterating from prototype to production, and measuring impact with clear metrics.
- Solid grasp of the ML lifecycle (task definition, dataset construction, modeling, evaluation, deployment, iteration) and statistics for experiments.
- Fluency with offline/online evals: golden sets, automated regressions, and evaluation harnesses that prevent silent quality drift.
- Experience with microservices performance: tracing, latency breakdowns, concurrency, resiliency patterns.
- Production operations mindset: monitoring, alerting, and participating in on‑call rotations where applicable.
Benefits
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, product training, and career pathing
- Intradepartmental mentor and buddy program for in-house networking
- An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)
- Access to Inclusion Talks, our internal panel discussions
- Free, global mental health benefits for employees and dependents age 6+
- Competitive global benefits
More AI Engineer Jobs
Staff AI Engineer, LLM
Tidio: AI-powered customer service management platform. We help businesses convert more leads and grow sales. Join us!
- Designing, implementing, and deploying end-to-end NLP and deep learning systems
- Building LLM-powered applications that interact with real users
- Developing and maintaining production Python services
- Exposing models and pipelines via REST APIs (FastAPI, Flask)
- Working on retrieval models and techniques (RAG, embeddings, ranking)
- Evaluating, monitoring, and continuously improving model and system quality
- Scaling systems to handle enormous volumes of requests
Senior AI Engineer, LLM
Tidio: AI-powered customer service management platform. We help businesses convert more leads and grow sales. Join us!
- Designing, implementing, and deploying end-to-end NLP and deep learning systems
- Building LLM-powered applications that interact with real users
- Developing and maintaining production Python services
- Exposing models and pipelines via REST APIs (FastAPI, Flask)
- Working on retrieval models and techniques (RAG, embeddings, ranking)
- Evaluating, monitoring, and continuously improving model and system quality
- Scaling systems to handle enormous volumes of requests
Mid AI Engineer, LLM
Tidio: AI-powered customer service management platform. We help businesses convert more leads and grow sales. Join us!
- Designing, implementing, and deploying end-to-end NLP and deep learning systems
- Building LLM-powered applications that interact with real users
- Developing and maintaining production Python services
- Exposing models and pipelines via REST APIs (FastAPI, Flask)
- Working on retrieval models and techniques (RAG, embeddings, ranking)
- Evaluating, monitoring, and continuously improving model and system quality
- Scaling systems to handle enormous volumes of requests
- You’ll work at the intersection of product engineering and ML engineering.
- Our mission is to create the platform that enables building accurate and reliable AI agents, delivering an outstanding experience to our internal Ops team for the creation, iteration, and monitoring of their agents, without requiring hard technical skills.
- You'll implement new capabilities for AI agents, as well as tooling to make Ops autonomous in the creation, evolution, monitoring, and evaluation of their agents.
- You'll work on making our system scale, going from a few dozen to tens of thousands of tasks automated daily.
- You'll have significant influence on system architecture and code quality.
- You'll mentor engineers, contribute to our product and engineering culture, and help build the team (including participating in hiring).


