The Next Chapter

IT & Technology recruitment - contingency or "Recruiter as a Service" - we're your recruiter

Solutions Architect – AI/ML, Training & GPU Infra

Solutions Engineer | Full Time | Remote | Senior | Team 1-10 | Since 2021 | H1B No Sponsor

Location

Netherlands

Posted

65 days ago

Salary

€300K / year

Seniority

Senior

Job Description

  • Join a fast-moving AI infrastructure team working on large-scale ML workloads
  • Design and validate production-grade distributed training and large-scale inference architectures on large GPU clusters
  • Work hands-on with customers to debug, optimize, and scale ML workloads across multi-node GPU environments
  • Act as a technical authority on GPU performance, networking, and schedulers
  • Collaborate closely with engineering, product, and R&D to influence roadmap decisions

Job Requirements

  • Hands-on experience designing and operating production-grade, multi-node GPU workloads for training or inference
  • Strong background in distributed deep learning (PyTorch Distributed, DeepSpeed) on GPU clusters
  • Deep understanding of GPU architecture and interconnects (H100/A100 class, NVLink, InfiniBand)
  • Experience with Kubernetes or Slurm and performance tuning using GPU profiling and monitoring tools
  • Must be fluent in English.

Benefits

  • Remote from anywhere in Europe

More Solutions Engineer Jobs

Contract | Remote | Team 51-200 | Since 1993 | H1B No Sponsor

  • Design and implement robust cloud architectures (AWS, GCP) and on-premises data center monitoring and automation solutions, ensuring they align with business requirements and provide seamless SAP landscape integration.
  • Craft comprehensive system monitoring strategies tailored to data center environments, covering both physical and virtual infrastructure.
  • Select and implement appropriate infrastructure monitoring tools (e.g., Dynatrace, Prometheus, Grafana, Datadog, CloudWatch, Nagios, Zabbix) with a focus on data center metrics and SAP application-specific KPIs.
  • Develop innovative CI/CD pipeline strategies incorporating Git branching and delta merging capabilities.
  • Assess resource management practices in alignment with both cloud and data center requirements.
  • Collaborate closely with development, SAP, and infrastructure teams to optimize deployments, troubleshoot issues, and ensure seamless integration across hybrid environments.
  • Design and implement SAP cloud strategies, ensuring optimal performance, security, and integration with data center components.
  • Act as a cloud, SAP, and monitoring evangelist, sharing best practices and mentoring team members.
  • Must have excellent design and documentation skills.

United States

Solutions Engineer

DigitalGenius

AI Customer Service Automation Platform for eCommerce

Full Time | Remote | Team 11-50 | Since 2015 | H1B No Sponsor

  • Leading the technical implementation of our AI solutions, ensuring best practices and quality standards.
  • Collaborating with customers to understand their requirements and tailor our solutions to match their business needs.
  • Providing technical expertise throughout the implementation process, including setup, configuration, and integration with third-party platforms.
  • Conducting training sessions and workshops to empower customers to utilize our solutions effectively.
  • Monitoring implementation progress and ensuring the timely delivery of projects.
  • Identifying and troubleshooting any technical issues that may arise during the implementation phase.
  • Continuously gathering feedback from customers to improve the implementation process and enhance our offerings.

Argentina
Job Closed
Contract | Remote | Team 201-500 | Since 2004 | H1B No Sponsor

  • Design, build, and maintain integrations between state eligibility systems and external partners such as SSA, CMS, IRS, DHS, and FNS.
  • Support data exchanges including SDX/SVES, MAGI, BEP, PARIS, FTI, and other federal hub transactions.
  • Implement eligibility workflows such as intake, verification, identity proofing, redetermination, and enrollment.
  • Develop RESTful APIs, microservices, and service-to-service integrations using .NET, Java, or similar frameworks.
  • Design and maintain API gateways (AWS API Gateway, Azure APIM) and enforce versioning, mapping, and access controls.
  • Implement resilient integration patterns (circuit breakers, retries, DLQs, idempotency).
  • Build event-driven pipelines using Kafka, RabbitMQ, or Azure Service Bus.
  • Support real-time eligibility verification and high-volume data processing.
  • Deploy, monitor, and secure services in AWS or Azure cloud environments.
  • Use CI/CD pipelines (Azure DevOps, Jenkins, GitLab) to automate builds, tests, and deployments.
  • Ensure compliance with MARS-E, HIPAA, NIST, and IRS Pub 1075 security frameworks.
  • Work closely with state HHS teams, policy analysts, business analysts, architects, and DevOps engineers.
  • Translate eligibility rules and business requirements into technical specifications.
  • Participate in architecture reviews, data mapping sessions, and testing cycles.

United States
$60 - $75 / hour
Full Time | Remote | Team 51-200 | H1B No Sponsor

  • Consult with customers and build advanced IT solutions in HPC/AI and enterprise
  • Design, build, recommend, and sell the latest technologies in the market
  • Determine customer needs and recommend the best platforms to solve challenges
  • Prescribe solutions averaging $1M+ annually to various sectors
  • Design highly optimized solutions for HPC, AI, and ML across disciplines
  • Consult on enterprise information technology solutions including data storage, data protection, and server/container virtualization
  • Represent the company in face-to-face and online meetings, technical demonstrations, and conferences
  • Validate customer deliverables including quotes, proposals, and statements of work
  • Build long-lasting relationships with vendors, clients, and IT distributors

Massachusetts
$90K - $175K / year