IFS

Be your best when it really matters. At the #MomentOfService

DevOps Engineer

Full Time, Remote, Senior, Team 5,001-10,000, Since 1983, H1B Sponsor

Location

Japan

Posted

82 days ago

Salary

Seniority

Senior

Bachelor Degree, English, Japanese, Ansible, AWS, Azure, Docker, GCP, Linux, Microsoft SQL Server, Oracle Database, ServiceNow, SQL, Terraform, Unix

Job Description

• Designing, implementing, and maintaining automation and shared tooling within Cloud Operations
• Leading event, incident, case, and problem management, as well as service-request fulfilment
• Ensuring security, latency, performance, efficiency, monitoring, emergency response, and capacity planning of IFS Cloud services
• Demonstrating a strong commitment to service and process quality
• Taking proactive action to prevent issues and resolving them quickly when they do occur
• Contributing to knowledge management (KBAs, SOPs) and utilizing IFS support tools effectively
• Actively participating in training and mentoring, both receiving and occasionally providing guidance

Job Requirements

Solid working knowledge of cloud platforms (Azure/GCP/AWS)
Practical knowledge of Docker, containerization, and container orchestration tools
Proficient in Linux/Unix and Windows Server (2016 preferred but other versions will be considered)
Competent in Azure VPN/Express Route and Cloud Service Routers
Hands-on experience with Oracle DB and MS SQL Server
Scripting experience in two or more of the following languages – Terraform, PowerShell, Bash, Ansible
Good understanding of ITIL, ServiceNow, Jira Service Desk, and knowledge management tools
Solid understanding of information security concepts like authentication, access controls, and encryption
Fluency in English and Japanese is required

Benefits

IFS Referral Bonus

More DevOps Engineer Jobs

Senior DevOps Engineer

Software Mind

Software House focused on results since 1999

DevOps Engineer, posted 82 days ago

Full Time, Remote, Team 1,001-5,000, Since 1999, No H1B Sponsor

• Collaborating with engineering and development teams to evaluate and identify optimal solutions
• Modifying and improving existing systems
• Developing and maintaining solutions in accordance with best practices
• Identifying, analyzing, and resolving infrastructure and application vulnerabilities
• Performing system administration tasks including configuration, systems monitoring, troubleshooting, and support while innovating to automate as much as possible
• Regularly reviewing existing systems and making recommendations for improvements
• Performing software application installation, patching, and upgrades
• Troubleshooting and resolving issues with the existing systems
• Ownership of the custom build/deployment module

Azure Linux

Argentina

Senior SRE Engineer – AWS Cloud

AM53 Smart Solutions

The right technology. The ideal talent. At the exact moment.

DevOps Engineer, posted 82 days ago

Full Time, Remote, Team 11-50, Since 2010, No H1B Sponsor

• Develop, maintain and evolve CI/CD pipelines, ensuring continuous, stable and secure deliveries
• Automate infrastructure and application deployment processes, reducing toil and increasing reliability
• Continuously monitor and optimize the performance, availability and security of production environments
• Administer and support AWS cloud environments, ensuring resilience and scalability
• Serve as a technical reference for the development team, promoting best practices for continuous delivery
• Ensure end-to-end observability with robust practices for metrics, logs, tracing, versioning and rollback
• Manage and ensure availability and performance of MongoDB and PostgreSQL databases
• Act as a FinOps mentor and point of reference, fostering a culture of cloud cost efficiency and governance
• Lead the response to critical incidents: rapidly diagnose issues, coordinate resolution and ensure clear communication during crises
• Conduct blameless post-mortems, turning incidents into lessons and concrete improvements

AWS Docker Jenkins Kubernetes MongoDB PostgreSQL Python Terraform

Brazil

Job Closed

DevOps/MLOps Engineer

Kyivstar

Kyivstar.Tech is a Ukrainian hybrid IT company and a resident of Diia.City. We are a subsidiary of Kyivstar, one of Ukraine's largest telecom operators. Our mission is to change lives in Ukraine and around the world by creating technological solutions and products that unleash the potential of businesses and meet users' needs. More than 600 KS.Tech specialists work daily in various areas: mobile and web solutions, as well as design, development, support, and technical maintenance of high-performance systems and services. We believe in innovations that bring real improvements in quality, and we constantly challenge conventional approaches and solutions. Each of us embraces an entrepreneurial culture, which drives us to keep moving, to evolve, and to create something new.

DevOps Engineer, posted 82 days ago

Full Time, Remote, Team 1,001-5,000

Role Description

We are looking for a DevOps Engineer to design, build, and operate the infrastructure behind our LLM platform. You will be responsible for keeping our ML infrastructure reliable, scalable, and efficient, from data pipelines to training and inference. In this role, you will develop and maintain CI/CD pipelines, orchestration workflows, and observability for distributed ML workloads across GPU/TPU/CPU environments. This is a DevOps-first role with strong exposure to ML infrastructure. You will work closely with ML Engineers and Data Engineers, while focusing on building a robust, automated, and production-grade platform that accelerates model development and delivery.

Responsibilities

- Design, build, and operate scalable ML infrastructure on GCP (GKE), supporting both experimentation and production workloads for LLMs and NLP systems.
- Manage Kubernetes-based environments (GKE): deployment, scaling, upgrades, and reliability of training and inference workloads across GPU/TPU/CPU pools.
- Build and maintain CI/CD pipelines (GitHub Actions, Jenkins) to automate testing, training, and deployment of ML services and infrastructure.
- Implement infrastructure as code (Terraform, Ansible) to provision and manage cloud resources in a reproducible, secure, and cost-efficient way.
- Ensure observability of ML systems: monitoring, logging, and alerting for infrastructure, pipelines, and production inference workloads.
- Collaborate with ML Engineers and Data Engineers to design and support reliable training and inference pipelines.
- Optimize resource utilization and cost, improving efficiency of training and serving infrastructure.
- Troubleshoot and resolve issues across the ML platform, from data pipelines to distributed training and production deployments.
- Contribute to engineering best practices: code reviews, automation, and continuous improvement of platform reliability and developer experience.

Qualifications

- Experience: 4+ years in DevOps, Platform Engineering, or ML Infrastructure roles, with strong understanding of production systems and distributed workloads.
- Cloud & Infrastructure: Hands-on experience with GCP. Experience with other major cloud platforms is a plus. Strong understanding of cloud-native architectures and experience designing scalable systems for compute and data-intensive workloads.
- Kubernetes & Containers: Solid experience with Docker and Kubernetes (preferably GKE), including deploying, scaling, and operating production workloads. Familiarity with Helm and Kubernetes networking fundamentals.
- CI/CD & Automation: Experience building and maintaining CI/CD pipelines (GitHub Actions, Jenkins, or similar) to automate testing, deployment, and infrastructure changes.
- Workflow Orchestration: Experience with Airflow (or similar tools).
- Infrastructure as Code: Strong experience with Terraform (preferred) or similar tools for provisioning and managing infrastructure in a reproducible way.
- Programming: Strong hands-on experience with scripting languages (Bash and/or Python).
- Observability & Reliability: Experience with monitoring and logging systems (e.g., Prometheus, Grafana). Understanding of reliability, alerting, and debugging in distributed systems.
- ML Infrastructure Understanding: Familiarity with the ML lifecycle (training, evaluation, inference) and experience supporting ML workloads in production environments.
- Collaboration: Ability to work closely with ML Engineers and Data Engineers, translating ML requirements into reliable and scalable infrastructure solutions.

Benefits

- Office or remote: it's up to you.
- Remote onboarding.
- Performance bonuses.
- Employee training, with the opportunity to learn through the company's library, internal resources, and programs from partners.
- Health and life insurance.
- Wellbeing program and corporate psychologist.
- Reimbursement of expenses for Kyivstar mobile communication.

Worldwide

Job Closed

Senior Site Reliability Engineer

Coterie

A modern baby care brand changing everything about changing.

DevOps Engineer, posted 82 days ago

Full Time, Remote, Team 11-50, H1B Sponsor

• Manage and maintain cloud infrastructure on Azure, including Azure Kubernetes Service (AKS) clusters and supporting resources
• Build, improve, and maintain CI/CD pipelines using GitHub Actions to support reliable and repeatable deployments
• Own and enhance our Grafana implementation: designing dashboards, configuring alerts, and supporting incident management workflows
• Monitor system health, triage incidents, and drive root cause analysis to prevent recurrence
• Collaborate with development teams to define and track SLIs, SLOs, and error budgets that align with business goals
• Contribute to infrastructure-as-code practices using Pulumi
• Identify and resolve reliability risks through capacity planning, performance tuning, and proactive system improvements
• Participate in an on-call rotation to support production systems and respond to incidents
• Document runbooks, operational procedures, and architectural decisions to support team knowledge sharing

Azure Cloud DNS Grafana Kubernetes Prometheus Python

United States

$140K - $170K / year
