Job Closed

This listing is no longer active.

Xebia Poland

A place where experts grow.

Expert Azure DevOps Engineer

DevOps Engineer | Full Time | Remote | Lead | Team 1,001-5,000 | Since 2001 | H1B No Sponsor

Location

Romania

Posted

124 days ago

Seniority

Lead

Bachelor Degree | 8 yrs exp | English

Angular AWS Azure Distributed Systems JavaScript Kubernetes Node.js NoSQL Python React SQL Terraform Vue.js .NET

Job Description

• designing and evolving cloud platforms with a focus on Azure Cloud services
• leading on-premises to cloud migration initiatives
• ensuring scalability, security, and high availability
• implementing automated deployment pipelines and infrastructure provisioning solutions
• driving platform security testing and ensuring compliance with company-wide security standards
• partnering with Product, Architecture, and Software Engineering teams to deliver secure, cost-effective SaaS products
• contributing to platform engineering, operations, and security research to introduce innovative solutions
• supporting continuous improvement of operational processes critical for cloud platform and operations success
• aligning work with company-wide OKRs and strategic initiatives focused on cloud-native transformation

Job Requirements

8+ years of experience as a DevOps Engineer or in Platform Engineering
programming experience with modern frameworks and languages (e.g., React, Vue, Angular for frontend; Node.js, .NET, Python, Go for backend)
proven experience in building and deploying production web applications
experience in configuring, deploying, and maintaining large Kubernetes environments (including network topology, security, monitoring) — both on bare metal and using AKS — as well as cloud services on Azure
in-depth knowledge of Azure infrastructure and services
in-depth knowledge of Azure DevOps
practical experience with developing Infrastructure as Code, especially using Terraform, and automating application deployment
solid understanding of application and infrastructure security principles
comfortable working with database technologies (SQL and NoSQL) across development and operations
good verbal and written communication skills in English (min. B2)
nice to have:
• experience building internal developer platforms and self-service tools
• AWS knowledge and multi-cloud development experience
• background in observability tools and monitoring solutions
• familiarity with event-driven architectures and streaming platforms
• understanding of AI integration patterns and automation opportunities
• contributions to open-source projects
• experience with GitOps principles and implementation
• background in building highly available, distributed systems
• knowledge in areas such as chaos engineering, DevSecOps, or data pipeline development
• awareness of FinOps practices and cost optimization strategies

Benefits

Candidates must work from within the European Union region and hold a work permit
Candidates must have an active VAT status in the EU VIES registry: https://ec.europa.eu/taxation_customs/vies/

More DevOps Engineer Jobs

DevOps Engineer / Site Reliability Engineer

Leidos

Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.

DevOps Engineer | 125 days ago

Other | Remote | Team 10,001+ | Since 1969 | H1B Sponsor

• Design, develop, troubleshoot, and debug mission critical infrastructure
• Manage private and public cloud infrastructures via code, designing reusable infrastructure components for scalable, highly-available, secure architectures for cloud-based applications.
• Enable the continuous integration and continuous delivery of our diverse suite of software products by applying best practices for infrastructure provisioning, configuration and automated software deployments.
• Continually evaluate and apply best practices to facilitate continuous improvement that can be applied across teams.
• Ability to take high level requirements from senior engineers and program management and break the work down into smaller tasks for the team.
• Makes process and program recommendations to the project or program.
• Own entire projects or processes within a technical area.
• Responsible for coaching and reviewing the work of lower-level technical staff.

Virginia

$87.1K - $157.5K / year

Job Closed

DevSecOps Specialist

Tivita

Driving the success of clinics and medical practices ✨

DevOps Engineer | 125 days ago

Full Time | Remote | Team 11-50 | Since 2023 | H1B No Sponsor

• You will be Tivita's first dedicated DevSecOps, responsible for establishing the foundations of a secure, predictable, resilient, and scalable infrastructure.
• Create a DevSecOps foundation that enables structured growth without hindering the development team's velocity.
• Identify critical risks, define realistic priorities, and create roadmaps that balance speed, reliability, and security.
• Map the attack surface and assess the current state of infrastructure and security pipelines.
• Identify critical vulnerabilities, exposures of sensitive data, and inadequate access policies.
• Implement IAM and least-privilege policies in cloud environments.
• Validate, automate, and test backup and disaster recovery procedures.
• Integrate essential security stages into CI/CD pipelines (Secrets Scan, SAST, SCA, IaC Scan) without introducing blocking gates at the outset.
• Protect applications against the OWASP Top 10 using WAFs, rate limiting, and anti-bot measures.
• Implement encryption at rest and in transit, and deploy DLP controls on critical interfaces.
• Establish canary releases, automatic rollback, and secure deployment practices.
• Build a SIEM layer and create incident response playbooks.
• Review LGPD/privacy policies, onboarding/offboarding processes, and governance.
• Engage the engineering team with contextualized training and security testing.
• Implement the “paved road” concept to provide autonomy and reduce dependence on Infra/Sec.

Docker GCP Jenkins Kubernetes SQL Terraform

Brazil

Job Closed

Site Reliability Architect

HHAeXchange

Better Homecare, Better Health

DevOps Engineer | 125 days ago

Other | Remote | Team 501-1,000 | Since 2008 | H1B Sponsor

• Architect with a resiliency-by-design intent, for self-healing, fault-tolerant systems, focusing on proactive readiness rather than reactive correction.
• Operate within a secure high-volume, high-volatility application environment, utilizing advanced networking and compute structures, in cloud hosted environments (AWS/GCP).
• Move the organization from "firefighting" to a proactive culture through habits and systems supporting feature flagging, production readiness reviews, architectural decision records, and chaos engineering.
• Support the incident management practice, mentoring SREs and Software engineers alike in utilizing our monitoring and observability toolsets for effective troubleshooting.
• Define SLIs, SLOs, and error budgets that balance feature velocity with platform stability, supporting a shift to service ownership.
• Underscore an automation-first perspective using Terraform, CDK, and other cloud-formation infrastructure as code toolsets to ensure repeatable, audit-ready environments.

AWS DNS GCP Java Kubernetes Python TCP/IP Terraform

United States

$170K - $185K / year

Job Closed

Sr. Site Reliability Engineer (SRE)

Moonlite AI

Moonlite is building a cloud-native experience on-prem. Our software provides the control and customization enterprises need for AI.

Build Faster with Moonlite

Instantly download and deploy NIMS from NVIDIA or build your own applications with Hugging Face. Customize and deploy AI agents in one click or integrate your own with ease.

Total Control Over Your AI

Obtain the highest level of security by design for your private environments. Moonlite provides total visibility into all your resources, applications, and users.

Find Value with Your Use Case

Allocate resources in real-time as needed in your environment. Use the models that best align with your use cases. When a new model is released, test it out and power your applications with it.

DevOps Engineer | 125 days ago

Other | Remote | Team 10 | Since 2024

Moonlite delivers high-performance AI infrastructure for organizations running intensive computational research, large-scale model training, and demanding data processing workloads. We provide infrastructure deployed in our facilities or co-located in yours, delivering flexible on-demand or reserved compute that feels like an extension of your existing data center. Our team of AI infrastructure specialists combines bare-metal performance with cloud-native operational simplicity, enabling research teams and enterprises to deploy demanding AI workloads with enterprise-grade reliability and compliance.

Your Role

You will be instrumental in building and operating production-grade AI infrastructure with deep Kubernetes expertise at its core. Working closely with our systems engineers, network engineers, and platform engineering team, you’ll architect and operate the Kubernetes infrastructure that powers our control plane and orchestrates compute, storage, and networking at scale. This role requires deep understanding of Kubernetes internals, custom resource definitions (CRDs), storage and network integrations, and building production-grade clusters from the ground up (not just deploying in managed environments). You'll ensure enterprise-grade reliability while establishing the automation, observability, and operational practices.

Job Responsibilities

• Kubernetes Infrastructure Engineering: Design, build, and operate production Kubernetes clusters on bare-metal infrastructure – including cluster bootstrapping, control plane architecture, etcd management, and scaling strategies for high-performance compute workloads.
• Kubernetes Networking & CNIs: Implement and operate custom Kubernetes networking solutions with SR-IOV for high-performance GPU interconnects, multi-tenancy isolation and advanced networking policies. Configure CNI plugins and network segmentation for research workloads.
• Custom Operators & Controllers: Develop and maintain custom Kubernetes operators and controllers for bare-metal provisioning, infrastructure lifecycle management, and resource orchestration across compute, storage, and networking domains.
• GPU Infrastructure Integration: Deploy and optimize NVIDIA GPU operators, device plugins, and other custom scheduling logic for GPU workload placement and utilization optimization.
• Platform Integration & Storage: Build deep integrations between Kubernetes and underlying infrastructure including CSI drivers for storage, custom admission controllers for policy enforcement, and scheduling extensions for specialized hardware placement. Design and implement automation using Terraform, Ansible, Helm, and custom operators to orchestrate infrastructure workflows and enable deployments across multiple regions.
• Production Operations & Reliability: Manage production bare-metal infrastructure across multiple regions. Build systems ensuring high availability, fault tolerance, and graceful degradation – establishing SLIs, SLOs, and monitoring to meet enterprise reliability commitments.
• Observability & Incident Response: Build comprehensive monitoring, logging, and alerting using Prometheus, Grafana, and ELK stack. Lead incident response, conduct postmortems, and implement preventative measures to improve reliability and reduce MTTR.
• Performance & Capacity Planning: Identify and resolve performance bottlenecks across infrastructure domains. Monitor utilization trends, forecast capacity needs, and optimize resource allocation for various workloads.

Requirements

Preferred Qualifications

• Experience building custom Kubernetes operators or controllers for infrastructure orchestration
• Deep familiarity with Kubernetes networking (Calico, Cilium, Multus), service mesh technologies, and network policy management
• Experience with GPU workload orchestration including NVIDIA GPU Operator, MIG, time-slicing, and device plugins
• Background with advanced Kubernetes features including custom schedulers, admission controllers, and API server extensions
• Experience with Kubernetes cluster federation or multi-cluster management
• Knowledge of high-performance networking technologies (InfiniBand, RDMA, RoCE) and their integration with Kubernetes
• Experience with enterprise storage systems (VAST, Lightbits, Ceph, or similar)
• Familiarity with configuration management at scale and GitOps practices
• Understanding of security best practices for Kubernetes and bare-metal infrastructure
• Experience operating infrastructure in regulated industries or co-located data center environments
• Background supporting research institutions, technical computing environments, or enterprise AI infrastructure

Key Technologies

Kubernetes, Linux, Terraform, Ansible, Prometheus, Grafana, ELK Stack, Go, Python, Bash, NVIDIA GPU Technologies, High-Performance Networking, Enterprise Storage Systems

Why Moonlite

• Build Critical Research Infrastructure: Your work will directly enable quantitative research teams and AI practitioners to push the boundaries of what's possible in financial modeling and AI research.
• Enterprise Impact: Build and operate infrastructure that supports mission-critical research and AI workloads for leading financial institutions and research organizations.
• Technical Excellence: Join an infrastructure team focused on delivering enterprise-grade reliability while pushing the boundaries of high-performance computing capabilities.
• Hands-On Ownership: As part of our growing infrastructure team, you'll have significant ownership over critical systems and the autonomy to influence our operational practices and technology choices.
• Industry Leadership: Work alongside experienced infrastructure professionals who have built and operated systems for the most demanding computing environments.

We offer a competitive total compensation package combining a competitive base salary, startup equity, and industry-leading benefits. The total compensation range for this role is $165,000 – $225,000, which includes both base salary and equity. Actual compensation will be determined based on experience, skills, and market alignment. We provide generous benefits, including a 6% 401(k) match, fully covered health insurance premiums, and other comprehensive offerings to support your well-being and success as we grow together.

Ansible Shell ELK Stack Grafana Kubernetes Linux Prometheus Python Terraform

Indiana + 1 more

$165K - $225K / year
