Job Closed
This listing is no longer active.
Corporate Alumni Engagement & Management Platform For The Enterprise
DevOps, Cloud & Infrastructure Engineer
Location
Argentina
Posted
108 days ago
Salary
C$18 - C$23 / hour
Seniority
Senior
Job Description
EnterpriseAlumni
• Design, build, and maintain infrastructure on AWS (ECS, EKS, RDS)
• Manage and scale Kubernetes clusters (EKS) using Helm
• Develop and maintain infrastructure as code using Terraform / Terragrunt
• Improve and maintain CI/CD pipelines (Jenkins)
• Automate operational tasks using Bash and Python
• Work with Docker to build and optimize containerized workloads
• Implement and maintain observability solutions (Prometheus, Grafana, OpenSearch)
• Ensure system reliability, scalability, and security (Linux hardening, OS-level tuning)
• Troubleshoot production issues across infrastructure, networking, and applications
• Collaborate with engineering teams and participate in architectural decisions
Job Requirements
- Senior-level experience (5+ years in DevOps / SRE / Platform roles)
- Upper-intermediate or fluent English
- Strong experience with Kubernetes (EKS) + Helm
- Solid hands-on experience with AWS
- Experience with Terraform (preferably Terragrunt)
- Strong knowledge of Docker and containerization
- Experience with CI/CD (Jenkins or similar)
- Good scripting skills (Bash and/or Python)
- Strong Linux/system administration background
- Good understanding of networking (TCP/IP, DNS, routing, load balancing)
- Experience with observability tools (Prometheus, Grafana, OpenSearch)
Benefits
- Equal opportunity employer
- Committed to creating an inclusive environment for all employees
More DevOps Engineer Jobs
DevOps Cloud Engineer, Open LMS
Learning Technologies Group plc
LTG is a leader in corporate digital learning and talent management.
• Using automation and Infrastructure as Code (IaC) to continuously improve reliability, scalability, and performance of services deployed on AWS.
• Performance tuning and configuration of both Linux system and application parameters supporting highly concurrent web stacks.
• Managing infrastructure through code using configuration management and IaC templating software such as Terraform and Puppet.
• Documenting procedures and knowledge base articles throughout problem resolution and architecture development processes.
• Monitoring the availability, performance, and health of production systems to meet service level objectives using monitoring systems such as Icinga, Prometheus, Grafana, CloudWatch, and Loki.
• Participating in emergency incident response on-call rosters.
• Practicing blameless postmortems that lead to improvements in resiliency and reductions in alert fatigue.
Senior IAM Operations – Reliability Engineer
Genesys
Orchestrating billions of remarkable experiences in more than 100 countries – through cloud, digital and AI technology.
• Resolve IAM-related incidents through hands-on troubleshooting and remediation, serving as an escalation point for other junior engineers on the team.
• Monitor observability, AIOps, and event management platforms to identify anomalies, authentication failures, provisioning delays, and emerging IAM-related incidents.
• Perform incident triage and correlation to determine probable cause and appropriate routing for deeper investigation.
• Validate automated remediation workflows and assist in identifying repeated manual IAM tasks that could be automated.
• Participate in early-stage automation and AI-readiness activities by documenting remediation steps, key patterns, and operational edge cases related to identity services.
• Reduce alert noise by suggesting adjustments to IAM-related thresholds, suppression logic, or detection rules.
• Support post-incident reviews by providing relevant data, timelines, and insights related to identity service behavior.
• Collaborate with Cloud, Network, Security, Endpoint, and ServiceNow teams to support incident resolution and improve IAM operational processes.
• Assist with access lifecycle, certification, or remediation workflows by troubleshooting failures, validating outcomes, and performing manual intervention when automation isn’t available.
• Ensure accuracy of identity event data, alerts, and service mappings to support effective correlation within monitoring and CMDB systems.
• Troubleshoot and resolve IAM-related incidents across authentication, authorization, provisioning, deprovisioning, and access lifecycle workflows.
• Analyze logs, events, and telemetry from IAM platforms (e.g., Okta system logs, Microsoft Entra ID (formerly Azure Active Directory) sign-in logs, directory events) to determine service impact and root causes.
• Support correlation of IAM events with dependencies across cloud applications, SaaS platforms, endpoints, and network access paths.
• Participate in validating IAM automation workflows such as Joiner/Mover/Leaver processes, access provisioning, and deprovisioning flows and apply fixes and minor enhancements to automation or AIOps capabilities.
• Assist in identifying IAM-related automation opportunities by documenting repeated failure modes and manual remediation steps.
• Support certificate, trust, and integration troubleshooting for IAM-connected applications and services.
• Maintain IAM-focused dashboards and alerts, ensuring clear signals and early detection of user-impacting identity issues.
• Provide knowledge-sharing to team members and peers regarding common IAM troubleshooting patterns and operational best practices.
• Participate in IAM readiness activities for new application onboarding, platform changes, or lifecycle process updates by reviewing operational and monitoring requirements.
• Spend your days working to automate and improve reliability and continue to push FlightAware's infrastructure forward, ensuring it is resilient and reproducible.
• Be responsible for service availability, performance, monitoring, incident response, and capacity planning.
• Create, improve, and manage environments to ensure decisions on resource allocation, problem identification, and capacity planning are based on accurate, data-driven insights.
• Maintain a physical infrastructure using Kubernetes, Linux, and Ceph, and a cloud infrastructure in AWS as part of the Site Reliability Engineering team.
• Influence technology decisions and direction to grow and support the FlightAware platform.
• Collaborate closely with fellow SREs on your team and extend your collaboration across other FlightAware teams and disciplines to design dependable and scalable solutions and services.
• Identify, implement, and champion process improvements to enhance productivity, collaboration, and delivery efficiency, while ensuring alignment with company goals and industry best practices.
DevOps Engineering Manager
Roadpass Digital
Our brands help inspire, educate, and empower millions of RVers and roadtrippers to enjoy camping and the open road.
• Drive the delivery of company-level features and initiatives while prioritizing work alignment with product and business goals
• Continuously monitor deadlines and remove any blockers to ensure delivery at a team level
• Be a hands-on member of the team by contributing to code as an independent contributor at a senior level
• Design, develop, and deliver scalable and automated services and architecture
• Architect, design, and implement solutions with native AWS services and other cloud/managed services as necessary
• Ensure solutions are architected and delivered using best practices and technologies
• Communicate the benefits and drawbacks of infrastructure choices to technical and non-technical stakeholders
• Create and apply reusable automation libraries across the company
• Manage centralized monitoring and alerting for infrastructure, and enable developers to extend it with application-level monitoring
• Set up infrastructure for easy reporting and accountability across products
• Troubleshoot production issues and perform on-call duties
• Plan and coordinate infrastructure and operations for new projects and acquisitions
• Monitor, advise on, and implement solutions to address security and risks for company infrastructure and operations
• Evaluate DevOps priorities for the company
• Manage and prioritize day-to-day tasks for the DevOps team
• Hold regular 1:1 meetings with direct reports, allowing for two-way feedback
• Lead, mentor, and manage a team of DevOps engineers, fostering a culture of collaboration, accountability, and continuous improvement
• Own performance reviews and career growth for direct reports, and actively participate in hiring processes.




