DevOps Engineer

Full Time | Remote | Senior | Team 10,001+ | Since 1978 | H1B No Sponsor

Location

Mexico

Posted

2 days ago

Salary

0

Seniority

Senior

Bachelor Degree | 4 yrs exp | English | Docker | Java | Jenkins | Kubernetes | Python

Job Description

DevOps Engineer

Minor Hotels Europe and Americas

  • Design, implement, and maintain CI/CD pipelines to support embedded software development and deployment workflows.
  • Automate build, test, and release processes using modern DevOps tools and scripting.
  • Manage and optimize containerized environments using Docker.
  • Collaborate with Software Architects, Engineering Leads, and DevSecOps teams to ensure adherence to organizational standards and secure development practices.
  • Support continuous testing environments, including integration of unit tests and automated quality checks.
  • Monitor and improve system reliability, performance, and deployment efficiency.
  • Work within Agile teams and participate in sprint planning, reviews, and retrospectives.

Job Requirements

  • 4+ years of experience in a DevOps / CI-CD / Build & Release Engineering role.
  • Strong experience with Git, GitHub, GitHub Actions, and Jenkins (GitHub is mandatory).
  • Hands-on experience with Docker and Kubernetes for containerization and orchestration.
  • Proficiency in scripting using Bash, PowerShell, or Python.
  • Basic coding knowledge in C# or Java.
  • Experience working with Chocolatey packages.
  • Familiarity with Agile frameworks and tools like Jira.
  • Experience with code quality and review tools such as SonarQube, PC-Lint, etc.
  • Experience supporting continuous testing environments and automation integration.

Benefits

  • Flexible working hours
  • Professional development opportunities

More DevOps Engineer Jobs

Junior SRE – Google Cloud Platform

Calriz

Digital innovation with cloud technology

DevOps Engineer | 2 days ago
Full Time | Remote | Team 201-500 | Since 2021 | H1B No Sponsor

  • Support the administration and evolution of environments on Google Cloud Platform (GCP).
  • Automate tasks and processes using scripts.
  • Work on monitoring and observability of applications and infrastructure.
  • Assist in incident management and improving system reliability.
  • Collaborate with development and operations teams.

Brazil

Role Description

The Site Reliability Engineering discipline at Noctua Technology, LLC is a strategic force driving digital transformation. We treat operations as a software engineering challenge, focusing on the seamless integration, scalability, and long-term reliability of cloud native systems. Our SREs don’t just manage infrastructure; they build it using Infrastructure as Code (IaC), monitor it through advanced observability stacks, and protect it by engineering for failure. We work closely with clients to bridge the gap between development and operations.

We are seeking a motivated Site Reliability Engineer (SRE) to join our dynamic team. As a key contributor, you will apply software engineering principles to operations, focusing on the reliability, scalability, and performance of production systems. You will play a crucial role in reducing toil through automation, defining and monitoring Service Level Objectives (SLOs), and implementing best practices for system stability and incident response. This role requires working with modern cloud technologies to ensure the high availability and efficiency of applications and infrastructure.

Location: Primarily Remote. Candidates must be based in CA or DC Metro Area for proximity to project and client teams.

Security Clearance Requirement: Applicants must be US citizens and eligible to obtain and maintain an active Secret security clearance or above.

Key Responsibilities

Site Reliability Engineering

  • Define, measure, and report on Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to ensure system reliability and uptime.
  • Develop and deploy Infrastructure as Code (IaC) using Terraform, CloudFormation, or similar tools, with an emphasis on repeatability and change management.
  • Implement and manage containerized and serverless architectures using Docker, Kubernetes, and cloud-native services, focusing on performance and error budgets.
  • Build and maintain reliable and self-healing CI/CD pipelines to automate deployments and improve development workflows.

Toil Reduction and Incident Management

  • Implement and refine comprehensive monitoring, alerting, and logging to detect and address performance and availability issues proactively.
  • Eliminate toil by extensively automating operational tasks, including provisioning, patching, and deployments, using scripting and configuration management tools such as Python, Bash, or Ansible.
  • Conduct post-incident reviews (blameless postmortems) to drive continuous improvement in system reliability and operational processes.

Testing and Service Resiliency

  • Implement cloud security best practices, including identity and access management (IAM), encryption, and compliance controls.
  • Proactively identify and address system weaknesses and ensure performance under stress.
  • Support disaster recovery and high availability strategies through backup and failover planning.

Collaboration and Knowledge Sharing

  • Collaborate with development teams to improve the operability and production readiness of applications from design through deployment.
  • Create and maintain documentation for cloud architectures, deployment processes, and best practices.
  • Contribute to internal knowledge-sharing initiatives, ensuring continuous learning within the team.

Stakeholder Communication

  • Provide technical guidance and support to clients and internal teams on cloud infrastructure and reliability best practices, with a focus on defining Service Level Agreements (SLAs).
  • Act on client feedback to refine and enhance cloud solutions.
  • Conduct training and knowledge-sharing sessions to help clients manage their cloud environments effectively.

Continuous Learning and Innovation

  • Stay updated on the latest developments in cloud infrastructure and technology trends.
  • Drive innovation by proposing and implementing new techniques and technologies.

Qualifications

  • 1-5 years of experience in site reliability engineering, cloud engineering, or related fields.
  • Strong software engineering skills with an emphasis on writing clean, modular, and maintainable code, specifically for automation and system management.
  • Proficiency in Infrastructure as Code (IaC) tools like Terraform or CloudFormation.
  • Experience with containerization and orchestration tools like Docker and Kubernetes.
  • Knowledge of networking concepts, cloud security best practices, and identity management.
  • Experience with programming or scripting languages such as Python, Bash, or Go.
  • Familiarity with CI/CD pipelines and DevOps methodologies.
  • Strong problem-solving skills and the ability to troubleshoot complex cloud environments.
  • Effective communication skills and a willingness to learn and collaborate.

Preferred Qualifications

  • Bachelor's or advanced degree in Computer Science or a related field.
  • Any of the below cloud certifications:
      • Google Cloud Professional Cloud Architect
      • Google Cloud Professional Cloud DevOps Engineer
      • AWS Certified Solutions Architect
      • AWS Certified Developer
      • AWS Certified SysOps Administrator
      • Azure Solutions Architect Expert
  • CompTIA Security+ certification or an equivalent DoD 8140/8570 IAT Level II baseline certification.

Salary Range: $106,500 - $177,500

United States
$106.5K - $177.5K / year

Role Description

The Site Reliability Engineering discipline at Noctua Technology, LLC is a strategic force driving digital transformation. We treat operations as a software engineering challenge, focusing on the seamless integration, scalability, and long-term reliability of cloud native systems. Our SREs don’t just manage infrastructure; they build it using Infrastructure as Code (IaC), monitor it through advanced observability stacks, and protect it by engineering for failure. We work closely with clients to bridge the gap between development and operations.

We are seeking a highly experienced and autonomous Senior Site Reliability Engineer (SRE) to join our dynamic team. As a technical leader, you will:

  • Define the strategy and apply advanced software engineering principles to operations.
  • Focus on the architecture, reliability, and long-term performance of large-scale production systems.
  • Play a crucial role in reducing toil through automation.
  • Define and monitor Service Level Objectives (SLOs).
  • Implement best practices for system stability and incident response.

This role requires working with modern cloud technologies to ensure the high availability and efficiency of applications and infrastructure.

Location: Primarily Remote. Candidates must be based in CA or DC Metro Area for proximity to project and client teams.

Security Clearance Requirement: Applicants must be US citizens and eligible to obtain and maintain an active Secret security clearance or above.

Qualifications

  • 5+ years of experience in site reliability engineering, cloud engineering, or related fields.
  • Strong software engineering skills with an emphasis on writing clean, modular, and maintainable code, specifically for automation and system management.
  • Deep experience with Infrastructure as Code (IaC) tools like Terraform or CloudFormation.
  • Deep experience with containerization and orchestration tools like Docker and Kubernetes.
  • Deep knowledge of networking concepts, cloud security best practices, and identity management.
  • Experience with programming or scripting languages such as Python, Bash, or Go.
  • Experience with CI/CD pipelines and DevOps methodologies.
  • Strong problem-solving skills and the ability to troubleshoot complex cloud environments.
  • Demonstrated ability to influence technical decision-making across organizational boundaries.

Preferred Qualifications

  • Bachelor's or advanced degree in Computer Science or a related field.
  • Any of the below cloud certifications:
      • Google Cloud Professional Cloud Architect
      • Google Cloud Professional Cloud DevOps Engineer
      • AWS Certified Solutions Architect
      • AWS Certified Developer
      • AWS Certified SysOps Administrator
  • CompTIA Security+ certification or an equivalent DoD 8140/8570 IAT Level II baseline certification.

Requirements

  • Drive the definition and adoption of SLIs and SLOs across multiple services or entire platforms, ensuring alignment with business goals.
  • Design and architect Infrastructure as Code (IaC) solutions for large-scale, complex environments, establishing standards and best practices.
  • Implement and manage containerized and serverless architectures using Docker, Kubernetes, and cloud-native services, focusing on performance and error budgets.
  • Build and maintain reliable and self-healing CI/CD pipelines to automate deployments and improve development workflows.
  • Implement and refine comprehensive monitoring, alerting, and logging to detect and address performance and availability issues proactively.
  • Lead the strategic effort to eliminate toil, identifying and championing major automation projects that deliver significant organizational efficiency.
  • Lead high-severity incident response and coordinate blameless postmortems for major outages, driving the resulting remediation and systemic improvements.
  • Implement cloud security best practices, including identity and access management (IAM), encryption, and compliance controls.
  • Proactively identify and address system weaknesses and ensure performance under stress.
  • Support disaster recovery and high availability strategies through backup and failover planning.
  • Serve as a primary SRE liaison for development teams, influencing application architecture and design to meet reliability and scalability targets from inception.
  • Create and maintain documentation for cloud architectures, deployment processes, and best practices.
  • Contribute to internal knowledge-sharing initiatives, ensuring continuous learning within the team.
  • Act as a subject matter expert and trusted advisor to clients and internal leadership on cloud infrastructure, reliability strategy, and Service Level Agreement (SLA) negotiations.
  • Act on client feedback to refine and enhance cloud solutions.
  • Conduct training and knowledge-sharing sessions to help clients manage their cloud environments effectively.
  • Stay updated on the latest developments in cloud infrastructure and technology trends.
  • Drive innovation by proposing and implementing new techniques and technologies.

Benefits

  • Salary Range: $149,400 - $202,000

United States
$149.4K - $200K / year

Senior SRE

Gorillas Group

Gorillas Group is a world leader in innovation, with a strong focus on pioneering advances in the dynamic fields of iGaming.

DevOps Engineer | 2 days ago
Full Time | Remote | Team 51-200 | Since 2024

  • Maintain platform reliability in production, focusing on availability, incidents, monitoring, deployments, costs, security, and capacity.
  • Help structure and evolve alerts, runbooks, incident response rituals, SLOs, SLIs, metrics, and operational processes.
  • Improve platform observability by separating useful alerts from noise and creating visibility for engineering, operations, and the business.
  • Work with cloud, containers, pipelines, databases, networks, CDN/WAF, APIs, and automations, without listing the full architectural details here.
  • Support decisions on scaling, resilience, isolation, rollback, security, cost, and technical evolution.
  • Automate repetitive tasks and turn operational knowledge into simple, useful, and actionable documentation.
  • Collaborate with development, security, product, support, and operations so that reliability is part of the product, not an island at the end of the delivery pipeline.

Brazil