TechBiz Global is a leading IT recruitment and software development company
Senior AI DevOps, LLMOps
Location
Poland
Posted
1 day ago
Salary
0
Seniority
Senior
Job Description
- Automation of Build-to-Production: design and implement robust CI/CD pipelines tailored for AI
- Develop specialized workflows for PromptOps
- Automate the deployment of Agentic workflows
- Provision and manage high-performance compute environments (GPU clusters, TPU pods)
- Define and enforce Policy-as-Code for AI endpoints
- Maintain a consistent environment across Hybrid Infrastructure
- Architect Progressive Delivery strategies for AI
- Build 'Evaluation-in-the-Loop' gates within the pipeline
- Establish deep observability into Inference Endpoints
Job Requirements
- 10+ years in DevOps, SRE, or Cloud Engineering
- 2+ years of hands-on experience in MLOps or LLMOps
- Advanced Kubernetes (K8s) skills, specifically with KubeFlow, Ray, or NVIDIA Triton
- Expertise in GitHub Actions/GitLab CI, and Terraform or Pulumi
- Experience with Weights & Biases, MLflow, LangSmith, or Arize Phoenix
- Understanding of GPU virtualization, CUDA drivers, and on-premises hardware management
- Familiarity with Open Policy Agent (OPA) and secret management (Vault)
Benefits
- Health insurance
- Flexible working arrangements
- Professional development opportunities
More DevOps Engineer Jobs
Senior Application DevOps Engineer
Sedona Digital
Experts in software development and cloud technologies.
Role Description
Join Sedona Digital, a fast-growing scale-up organization with an ambition to be recognized as one of the leading technology companies servicing high-tech, global enterprises across the Technology, Finance, and Life Sciences sectors. Our global client base needs builders: engineers and developers who love technology, have deep expertise in software, engineering, and cloud technologies, and, importantly, have a passion for culture and customers.
- Collaborate with the development teams to plan, deploy, and administer applications running on AWS and Kubernetes
- Maintain the AWS infrastructure using Infrastructure as Code principles and Terraform
- Initiate and drive the adoption of technologies and the use of good patterns for development and operations
- Mentor and oversee other engineers
- Lead and manage specific technical domains or projects, ensuring architectural consistency, scalability, and alignment with business objectives
- Oversee and improve engineering processes and practices, identifying opportunities for automation, optimization, and enhanced system reliability
- Implement and oversee cloud migration projects
- Provide guidance for re-platforming and refactoring the current cloud infrastructure
- Communicate the progress of cloud migration projects to senior leadership
- Preserve business continuity 24/7 with minimum downtime and financial impact
- Investigate and perform regular assessments of cloud deployments for compliance with the company's standards and best practices
- Ensure the company's deployment standards and pillars are followed in cloud solutions and resources
- Stay up to date with the latest tools and trends in the industry
- Follow and provide training on new and current technologies and services used
Qualifications
- BSc/MSc in Computer Science, or a similar discipline
- Overall knowledge of solution design and deployment projects
- Experience with Docker
- Experience in or knowledge of CI/CD with GitLab and/or Jenkins
- Experience in AWS technologies, services, and ecosystem
- Experience with Kubernetes
- Knowledge of Infrastructure as Code (IaC) concepts and tools, preferably Terraform
- Experience with versioning tools such as Git
- Distinctive organisation and documentation skills
- Excellent time management skills
- Essential Linux and Windows admin, networking, and scripting skills
Benefits
- Remote/home working
- Opportunity to work in a rapidly growing scale-up organisation
- Exposure to complex, global client engagements
- Training on market trends and client needs
- Ongoing learning and development opportunities
- Competitive compensation package
Lead DevOps Engineer
EPAM Systems
EPAM Systems is an information technology (IT) company that has become a leading global digital and product design, digital platform engineering, and product de
Lead DevOps Engineer (GCP)
Location: Argentina, Chile, Colombia
Job Description
Remote in Argentina & 3 others
We are seeking a Lead DevOps Engineer with solid experience in designing, automating, and supporting cloud-based infrastructure on GCP. The ideal candidate should be comfortable collaborating with development teams to enhance deployment processes, infrastructure reliability, scalability, security, and operational efficiency. This position requires hands-on experience with CI/CD pipelines, Infrastructure as Code, cloud services, containerization, monitoring, and automation.
Responsibilities
- Design, build, and maintain cloud infrastructure using GCP
- Implement and enhance CI/CD pipelines for application delivery
- Automate infrastructure provisioning and configuration through Infrastructure as Code
- Support containerized applications and cloud-native deployments
- Collaborate with development, QA, security, and operations teams
- Enhance system reliability, scalability, performance, and security
- Monitor production systems and assist with troubleshooting and incident resolution
- Define and promote DevOps best practices across engineering teams
Requirements
- Strong background as a DevOps Engineer or Cloud Engineer
- Hands-on expertise in at least one major cloud platform: GCP
- Experience with CI/CD tools such as Jenkins, GitHub Actions, GitLab CI, Azure DevOps, or similar
- Familiarity with Infrastructure as Code tools such as Terraform, CloudFormation, ARM/Bicep, or Pulumi
- Practical experience with Docker and containerized environments
- Understanding of Kubernetes or cloud container services
- Hands-on work with monitoring, logging, and alerting tools
- Strong scripting skills in Bash, Python, PowerShell, or similar
- Solid understanding of networking, security, and cloud architecture fundamentals
- Capability to work independently and support production environments
Nice to have
- Hands-on experience administering Kubernetes clusters
- Practical knowledge of Helm, Argo CD, or GitOps practices
- Background in cloud security and compliance
- Exposure to serverless architectures
- Familiarity with cost optimization in cloud environments
We offer/Benefits
- Connectivity Bonus (25,000 ARS is paid with a salary receipt at the end of each month as a non-wage concept).
- Prepaid health plan (Medicina Prepaga; covers the collaborator and direct family group).
- Paternity Leave (two additional days are added to what is established by law, for a total of 4 days).
- Discounts card.
- English Training (English lessons twice per week).
- Training Program (access to multiple customized training plans according to the needs of each role within the company).
- Marriage bonus (the company doubles the allowance established by law that ANSES offers).
- Referral Program (a referral bonus is paid when a person referred by a collaborator joins the company).
- External Agreements and Discounts.
- Vacations: 14 calendar days a year
Site Reliability Engineer – Application Support
NBCUniversal
Here you can create the extraordinary. Join us.
- Design, implementation, and full-stack lifecycle support for digital asset delivery systems
- Delivery application tuning, performance optimization, and troubleshooting
- Assisting with scoping, design, and implementation of media delivery project initiatives
- Participating in incident cause analysis and assisting in remediation and design efforts
- Working closely with DevOps teams to identify and develop monitoring for key system health/performance metrics
- Writing code and scripts to automate everything possible

- Design, develop, and maintain reusable Terraform and Ansible modules for Azure and GCP.
- Build and operate scalable, secure Azure and GCP environments: AKS/GKE, App Gateway, VNets, NSGs, Key Vault, Managed Identities.
- Build and optimize Jenkins and GitHub Actions pipelines for automated build, test, and deployment across environments.
- Integrate vulnerability scanning, code analysis, and compliance checks into CI/CD workflows.
- Contribute to basic portal application code (modern JS/TS frontend, REST API endpoints, simple backend handlers) when the work calls for it.
- Implement proactive monitoring with Azure Monitor, Google Cloud Monitoring, Prometheus, Grafana, Log Analytics.




