Gcore

Powerful edge and cloud solutions for media business and the entertainment industry

DevOps Engineer – AI Inference

DevOps Engineer | Full Time | Remote | Senior | Team 201-500 | H1B No Sponsor

Location

Poland

Posted

4 days ago

Salary

0

Seniority

Senior

Job Description

  • Design, develop, and maintain infrastructure for AI inference workloads, including GPU scheduling, model deployment pipelines, and data access patterns in on-prem environments.
  • Build and manage monitoring and observability tools for AI inference platforms, including dashboards, alerts, and runbooks for model health and system performance.
  • Collaborate with ML engineers and platform teams to design system architecture for AI workloads, integrate inference runtimes, and test performance at scale.

Job Requirements

  • Hands-on experience deploying, operating, and troubleshooting Kubernetes clusters, including Helm, Docker, or CRI-O.
  • Strong understanding of Linux systems and networking concepts, including troubleshooting connectivity and performance issues.
  • Ability to develop automation and operational tooling using Python, Go, or Bash.
  • Experience provisioning and managing infrastructure with tools such as Terraform and Ansible.
  • Experience designing, implementing, and maintaining CI/CD pipelines using GitLab CI or GitHub Actions.

Preferred Qualifications

  • Experience operating or administering Slurm clusters.
  • Experience with Cluster API (CAPI) or other Kubernetes cluster lifecycle management ("Kubeception") technologies.
  • Deep understanding of Kubernetes internals, including CNI, CSI, Operators, and cluster architecture.

Nice to Have

  • Experience with Kubernetes ecosystem tools such as Argo CD and Helmfile.
  • Experience with Prometheus.
  • Familiarity with other Cloud Native technologies.

Benefits

  • Competitive compensation
  • Flexible working hours and hybrid or remote options, depending on your role
  • Work from anywhere in the world for up to 45 days per year
  • Private medical insurance for you and your family*
  • Extra paid vacation and sick leave days*
  • Support for life’s important moments and celebrations
  • Language courses to help you connect and grow
  • Modern, welcoming offices with snacks, drinks, and entertainment*
  • Team sports and social activities*

More DevOps Engineer Jobs

Senior DevSecOps Engineer

Direct Meds LLC

To learn more about Direct Meds, please visit https://directmeds.com Important! To apply and be considered as a candidate for this position, you must apply and complete the form using the application link on Zoho below: Application Link

DevOps Engineer | 4 days ago

Role Description

This DevSecOps role is foundational to our engineering department. We aren't looking for just another DevOps hand; we need a Security Guardian who understands that reliable software must first be secure, especially when handling Protected Health Information (PHI). You will own the full security lifecycle of our platform, turning complex regulatory requirements (like HIPAA) into simple, automated, and ironclad engineering solutions. This is where technical mastery meets legal compliance. If you thrive on bridging the gap between rapid development cycles and critical healthcare regulations, this role is for you.

What You Will Own:

  • Define and enforce our approach to handling PHI, making HIPAA adherence a non-negotiable part of every system we build or update.
  • Build robust CI/CD pipelines that don't just deploy code; they automatically inject security checks, from vulnerability scanning to compliance verification, and ensure least-privilege access at every single step.
  • Lead design and code reviews, proactively identifying architectural weak points or compliance risks before they become problems in production.
  • Keep our core platforms running smoothly by continually hardening them, establishing security baselines, and maintaining thorough documentation to ensure we are always audit-ready.

United States
$50K - $250K / year

DevOps Engineer – Platform

Dev Partners

Scale your dev team faster with our IT Staff Augmentation services. Hire 100% fully vetted and reliable developers.

DevOps Engineer | 4 days ago
Part Time | Remote | Team 201-500 | H1B Sponsor

  • Maintain and improve existing production environments.
  • Support deployment and release processes.
  • Improve authentication and access management.
  • Help implement security-conscious operational practices.
  • Configure and improve automation for operational workflows.
  • Monitor and improve production reliability.
  • Troubleshoot infrastructure and deployment issues.
  • Assist with environment configuration and maintenance.
  • Recommend operational improvements and best practices.
  • Work alongside developers to ensure smooth product releases.

Philippines

Senior Site Reliability Engineer

Mastercard

Founded in 1966, Mastercard is a worldwide transaction, payment-processing, and consulting company best known for its line of personal and business credit cards. As an employer, Ma

DevOps Engineer | 4 days ago
Full Time | Remote | Team 38,800 | Since 1966

Our Purpose

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary

Senior Site Reliability Engineer

Overview

Commerce Media is hiring a Senior Site Reliability Engineer to lead the reliability, scalability, and production operations of a greenfield application within our enterprise platform. This is a high-impact, individual contributor role with end-to-end ownership of system reliability, from design influence through production operations. You will partner across engineering and platform teams to ensure services are resilient, observable, and production-ready from day one.

About the Role

Design Influence & Production Readiness
  • Drive reliability-focused design in partnership with engineering and platform teams
  • Lead architecture and launch readiness reviews, including:
      o Capacity planning
      o Failure-mode and risk analysis
  • Define and enforce non-functional requirements (availability, latency, resilience)

Production Ownership & Incident Leadership
  • Own production reliability and service health
  • Act as incident commander, leading triage, mitigation, and communication
  • Lead blameless post-mortems with clear, actionable follow-ups
  • Proactively identify and reduce operational risk across the system

Observability & SLO Management
  • Define and manage SLIs, SLOs, and error budgets
  • Design and operate monitoring and alerting using:
      o Prometheus, Grafana
      o OpenSearch / Elasticsearch
      o Opsgenie
  • Build dashboards aligned to user impact and system health

Automation, Scalability & Platform Enablement
  • Drive automation-first operations to scale systems sustainably
  • Enhance CI/CD pipelines (GitHub Actions) with deployment gating and validation
  • Identify and resolve performance and reliability bottlenecks
  • Improve developer experience through operational tooling and best practices

Technology Environment
  • Kubernetes, Docker
  • GitHub Actions (CI/CD)
  • Prometheus, Grafana (observability)
  • OpenSearch / Elasticsearch (logging/search)
  • Opsgenie (incident management)
  • AWS or equivalent cloud platforms
  • Preferred: Spring Boot and/or Golang services

All About You
  • Years of professional experience operating distributed systems at scale in production
  • Strong expertise in:
      o Kubernetes and containerized environments
      o Observability (metrics, logging, tracing)
      o Spring Boot and/or Golang ecosystems
  • Hands-on across application, infrastructure, and release pipelines
  • Demonstrated ownership of service reliability, incident response, and operational strategy
  • Ability to influence system design through technical leadership and data-driven decisions
  • Pragmatic mindset, balancing automation, trade-offs, and system evolution
  • Experience navigating enterprise environments while maintaining delivery velocity
  • Leverages AI tools (e.g., Copilot, ChatGPT, Claude) to:
      o Accelerate design, coding, and testing
      o Improve code quality and operational outcomes
  • Integrates AI into workflows:
      o Architecture reviews, code generation, testing, and documentation
  • Applies strong judgment in production-critical, low-latency environments

Mastercard is a merit-based, inclusive, equal opportunity employer that considers applicants without regard to gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law. We hire the most qualified candidate for the role. In the US or Canada, if you require accommodations or assistance to complete the online application process or during the recruitment process, please contact reasonable_accommodation@mastercard.com and identify the type of accommodation or assistance you are requesting. Do not include any medical or health information in this email. The Reasonable Accommodations team will respond to your email promptly.

Corporate Security Responsibility

All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization. Every person working for, or on behalf of, Mastercard is therefore responsible for information security and must:
  • Abide by Mastercard's security policies and practices;
  • Ensure the confidentiality and integrity of the information being accessed;
  • Report any suspected information security violation or breach; and
  • Complete all periodic mandatory security trainings in accordance with Mastercard's guidelines.

In line with Mastercard's total compensation philosophy and assuming that the job will be performed in the US, the successful candidate will be offered a competitive base salary and may be eligible for an annual bonus or commissions depending on the role. The base salary offered may vary depending on multiple factors, including but not limited to location, job-related knowledge, skills, and experience.

Mastercard benefits for full time (and certain part time) employees generally include: insurance (including medical, prescription drug, dental, vision, disability, life insurance); flexible spending account and health savings account; paid leaves (including 16 weeks of new parent leave and up to 20 days of bereavement leave); 80 hours of Paid Sick and Safe Time, 25 days of vacation time and 5 personal days, pro-rated based on date of hire; 10 annual paid U.S. observed holidays; 401k with a best-in-class company match; deferred compensation for eligible roles; fitness reimbursement or on-site fitness facilities; eligibility for tuition reimbursement; and many more. Mastercard benefits for interns generally include: 56 hours of Paid Sick and Safe Time; jury duty leave; and on-site fitness facilities in some locations.

Pay Ranges

Remote - Utah: $96,000 - $163,000 USD

Job Posting Window

Posting windows may change based on the volume of applications received and business necessity. Candidates are encouraged to apply expeditiously.

Utah
$96K - $163K / year
Full Time | Remote | Team 10,001+ | Since 1983 | H1B Sponsor

Role Description

The Regional Site Start Up (SSU) role is responsible for leading and delivering site start-up and activation activities across clinical trials. This role will ensure timely site activation, maintain strong relationships with sites, and work cross-functionally with internal and external teams to efficiently achieve study site activation timelines. The role provides regional expertise, ensuring that geography-specific needs across large areas are addressed and that study milestone timelines are met. This role must possess excellent interpersonal skills, attention to detail, and the ability to collaborate across teams to ensure timelines are achieved.

CORE JOB RESPONSIBILITIES:

Site Start Up and Activation:
  • Accountable for delivering individual site activation timelines to plan for assigned sites.
  • Gather, organize and share, as appropriate, all required essential documents from clinical sites and Sponsor specific documents to ensure compliance with Regulatory and Sponsor requirements as part of the site activation process.
  • Collect site intelligence to inform site discussions and maintain site information in CTMS.
  • Ensure site regulatory packages meet country requirements, TMF standards and ICH-GCP compliance.
  • Assist with reviewing Informed Consent Forms (ICF) as requested.
  • Facilitate the translation of Essential Documents that may be required in languages other than English for purposes of submission to and approval from Regulatory Health Authorities and/or Independent Review Board/Ethics Committees.
  • Provide regional expertise, addressing specific geographic challenges to facilitate site activation. Serve as the primary point of contact and escalation point for sites: troubleshoot issues and provide strategic solutions to ensure activation timelines are achieved.
  • Update trackers with key study information, risks and mitigation strategies.
  • Ensure all site start-up documents are filed in the TMF and are inspection ready.
  • Support inspection readiness activities related to site start up documents.

Cross-Functional Collaboration:
  • Partner with internal and external stakeholders and clinical sites to ensure good communication and coordination through the site start-up phase.
  • Ensure alignment with all global and local regulatory requirements.

Process Optimization and Compliance:
  • Maintain accurate records of site activation progress, including updates on document collections, submissions statuses, and timelines.
  • Identify and escalate challenges or delays in document collection, regulatory submissions, or site activation processes for resolution.
  • Identify opportunities for process improvement in site start-up activities and implement best practices to enhance efficiency and effectiveness.

Qualifications
  • Demonstrated interpersonal & leadership skills.
  • A data driven approach to planning, executing, and problem solving.
  • Effective communication skills via verbal, written and presentation abilities.
  • Proactive and self-disciplined, ability to meet deadlines, effective use of time, and prioritization.
  • Demonstrated vendor management experience.
  • Technical proficiency in trial management systems (CTMS, TMF) and MS applications including (but not limited to) Project, PowerPoint, Word, Excel.
  • Experience in the clinical drug development process, including study start-up.
  • Knowledge of ICH/GCP and regulatory guidelines/directives.
  • Ability to understand and implement operational strategic direction and guidance for respective clinical trials, fostering a culture of collaboration and trust across diverse teams and stakeholders.
  • Support stakeholders by addressing concerns promptly and professionally, building positive relationships, and ensuring clear communication to maintain alignment with trial objectives.
  • Contribute to team productivity by maintaining open communication and supporting team members in their tasks.

Requirements
  • Education: Bachelor’s Degree, minimum.
  • Years of Experience: 3 - 4 years.
  • Must have experience working with US and Canada.

Company Description

Parexel is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to legally protected status, which in the US includes race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.

All locations: United States | Canada