Job Closed
This listing is no longer active.
We help you hear the voices that matter.
AI/LLM Evaluation & Alignment Software Engineer
Location
Texas
Posted
64 days ago
Salary
$135K - $160K / year
Seniority
Senior
Job Description
LEO Technologies, LLC
- Build and maintain evaluation frameworks for LLMs and generative AI systems tailored to public safety and intelligence use cases.
- Design guardrails and alignment strategies to minimize bias, toxicity, hallucinations, and other ethical risks in production workflows.
- Partner with AI engineers and data scientists to define online and offline evaluation metrics (e.g., model drift, data drift, factual accuracy, consistency, safety, interpretability).
- Implement continuous evaluation pipelines for AI models, integrated into CI/CD and production monitoring systems.
- Collaborate with stakeholders to stress-test models against edge cases, adversarial prompts, and sensitive data scenarios.
- Research and integrate third-party evaluation frameworks and solutions; adapt them to our regulated, high-stakes environment.
- Work with product and customer-facing teams to ensure explainability, transparency, and auditability of AI outputs.
- Provide technical leadership in responsible AI practices, influencing standards across the organization.
- Contribute to DevOps/MLOps workflows for deployment, monitoring, and scaling of AI evaluation and guardrail systems (experience with Kubernetes is a plus).
- Document best practices and findings, and share knowledge across teams to foster a culture of responsible AI innovation.
Job Requirements
- Bachelor's or Master's in Computer Science, Artificial Intelligence, Data Science, or related field.
- 3–5+ years of hands-on experience in ML/AI engineering, with at least 2 years working directly on LLM evaluation, QA, or safety.
- Strong familiarity with evaluation techniques for generative AI: human-in-the-loop evaluation, automated metrics, adversarial testing, red-teaming.
- Experience with bias detection, fairness approaches, and responsible AI design.
- Knowledge of LLM observability, monitoring, and guardrail frameworks (e.g., Langfuse, LangSmith).
- Proficiency with Python and modern AI/ML/LLM/Agentic AI libraries (LangGraph, Strands Agents, Pydantic AI, LangChain, HuggingFace, PyTorch, LlamaIndex).
- Experience integrating evaluations into DevOps/MLOps pipelines, preferably with Kubernetes, Terraform, ArgoCD, or GitHub Actions.
- Understanding of cloud AI platforms (AWS, Azure) and deployment best practices.
- Strong problem-solving skills, with the ability to design practical evaluation systems for real-world, high-stakes scenarios.
- Excellent communication skills to translate technical risks and evaluation results into insights for both technical and non-technical stakeholders.
Benefits
- 3 weeks of paid vacation, right out of the gate!
- Competitive salary.
- Generous medical, dental, and vision plans.
- Sick leave and paid holidays.




