Omegro

A people-first buy-and-grow acquirer of software companies seeking a permanent and safe home to continue their legacy.

DevOps Specialist

DevOps Engineer | Full Time | Remote | Mid Level | Team 501-1,000 | H1B No Sponsor

Location

Worldwide

Posted

86 days ago

Salary

C$95K - C$120K / year

Seniority

Mid Level

Job Description

Role Description

As a DevOps Specialist, you are responsible for the reliable, secure, and scalable operation of our cloud infrastructure. You will own your work from initial design to deployment and beyond, playing a key role in supporting our development and customer-facing environments. You’ll work closely with developers and QA to build the systems and tools that help us ship code faster and safer, always with a focus on automation, resilience, and visibility.

Make your impact with a rapidly growing Marine Software business!

- Manage and maintain cloud infrastructure, with a focus on AWS services.
- Design, implement, and maintain Infrastructure as Code (IaC) for production and non-production environments.
- Ensure scalability, performance, and security across systems.
- Participate in disaster recovery (DR) planning and testing, ensuring infrastructure is designed and maintained to support high availability, resiliency, and business continuity.

Automation and Deployment:

- Build and improve CI/CD pipelines to streamline and automate deployments.
- Create automation scripts and tooling to support code rollouts, environment setups, and recovery scenarios.
- Develop automation for large-scale test data migration to support customer sandbox environments.

Monitoring and Support:

- Design and implement monitoring, alerting, and logging solutions to detect and prevent issues proactively.
- Maintain on-call availability for urgent issues and production-impacting events on a rotational basis.
- Continuously evaluate and optimize system reliability, uptime, and health.

Collaboration and Documentation:

- Collaborate with developers, QA, and product teams to align infrastructure with application needs.
- Document infrastructure, deployment processes, and runbooks to ensure knowledge sharing across teams.

Qualifications

- Strong experience with AWS infrastructure and services.
- Proficiency with Terraform, CloudFormation, or similar IaC tools.
- Comfortable with scripting languages (e.g., Python, Bash, PowerShell) for automation.
- Solid understanding of CI/CD systems (e.g., GitHub Actions).
- Familiarity with observability tools like CloudWatch, Kibana, Prometheus, Grafana, or similar.
- Hands-on experience with containerization and orchestration technologies, ideally Docker and Kubernetes, for building, deploying, and managing scalable applications.

Requirements

- Passionate about automation and improving systems through tooling.
- Comfortable supporting production systems and handling incidents when needed.
- Previous exposure to security best practices and cloud governance.
- Contributes to resiliency planning and incident preparedness, collaborating with cross-functional teams to ensure recovery processes are tested and effective.

Benefits

- The role is remote.
- Flexible hours with overlap in core business time.
- Competitive salary ($95,000 – $120,000 CAD) + bonus.
- Paid vacation, 7 floater days, and your birthday day off.
- Comprehensive benefits (health, dental & vision) from day 1.
- Fitness reimbursement.
- Employee Stock Purchase Plan (after 6 months).
- Learning & professional development pathways available.
- Inclusive culture & remote-team events.

More DevOps Engineer Jobs

Senior DevOps Platform Engineer

VetsEZ

Agile | Adaptive | Ardent

DevOps Engineer | 86 days ago

Full Time | Remote | Team 201-500 | H1B No Sponsor

• Design, implement, and maintain CI/CD pipelines using tools such as Jenkins and GitHub Actions.
• Develop reusable automation workflows for containerized InterSystems IRIS for Health environments.
• Build and maintain Docker-based development, testing, and deployment environments.
• Support automated deployment and installation of MUMPS routines and ObjectScript classes related to application artifacts.
• Integrate automated testing frameworks into CI/CD pipelines, including JUnit XML reporting and automated test execution workflows.
• Develop and maintain automation tooling, scripts, and console-based utilities used for testing and deployment orchestration.
• Support container lifecycle management, environment initialization, configuration management, and infrastructure automation.
• Collaborate with development teams to improve software delivery, developer workflows, and testing automation processes.
• Troubleshoot CI/CD pipeline failures, container runtime issues, deployment problems, and infrastructure-related defects.
• Document automation processes, deployment workflows, and operational procedures.
• Help establish reusable DevOps and platform engineering standards for future development teams and projects.

Azure | Docker | Jenkins | JUnit | Linux | Shell Scripting

United States

DevOps Team Lead, Core Foundation

AlpacaDB

AlpacaDB, Inc., also known as Alpaca and Alpaca Securities, is an API stock and crypto brokerage platform that enables services to embed investing and developer

DevOps Engineer | 86 days ago

Full Time | Remote

• Lead, mentor, and foster a healthy, high-performing globally distributed engineering team.
• Own the execution and delivery of highly critical, complex yearly roadmap items centered around large-scale foundational infrastructure upgrades, high availability, and platform resilience.
• Own and drive the change management processes across engineering and product domains. You will orchestrate the smooth delivery of major systemic changes, ensuring alignment, mitigating friction, and breaking down silos between diverse technical groups to deliver cohesive infrastructure solutions.
• Design, implement, and refine robust support workflows, agile planning methodologies, and deployment/rollout strategies to ensure operational excellence.
• Manage and optimize the global on-call rotation to ensure team well-being while maintaining high availability. Lead incident response (via Rootly), establishing clear communication, rapid resolution processes, and blameless post-mortems.

Grafana | Kubernetes | PostgreSQL | Prometheus | Terraform

Europe

Job Closed

Senior Site Reliability Engineer (Calgary, Canada)

Syndio

Syndio builds expert-backed technology that helps companies measure, achieve, and sustain workplace equity.

DevOps Engineer | 86 days ago

Full Time | Remote | Team 51-200 | H1B Sponsor

Do you want to empower organizations to build smarter compensation strategies while ensuring fair pay for all employees? Syndio is a Series C technology company leveraging advanced technology and responsible AI to accelerate decision-making, streamline compliance, and democratize insights for consistent, equitable compensation practices at scale. Backed by $83M in investments from Bessemer Venture Partners, Voyager Capital, and Emerson Collective, we are expanding our team and products to help companies align their rewards strategies with their business goals.

Our customers are our greatest asset. Syndio partners with many of the world’s most recognized and respected enterprises, helping them implement leading-edge compensation solutions with expert guidance. We analyze pay for over 10 million employees across dozens of countries, ensuring fair, defensible rewards that drive better business outcomes. Join us in our mission to help companies make smarter pay decisions they can trust!

About the Role

We are looking for a skilled and motivated Senior Site Reliability Engineer (SRE) who will design, implement, maintain, and evolve solutions that increase the reliability and availability of our applications and systems. We strive for close collaboration, shared ownership, and constant learning. As a Senior SRE, you will apply software principles to ensure our applications are designed to eliminate single points of failure and provide maximum observability, and that failures are resolved quickly and efficiently. We are looking for someone who is passionate about SRE practices and methodologies to solve business requirements, constantly seeking new techniques, tools, and solutions to enhance our systems.

As a startup in a fast-paced, high-growth environment, we are looking for an engineer who isn't bound by principles assigned to traditional engineering roles. You’ll be exposed to, develop skills in, and be responsible for work that may at times fall into the traditional world of platform, data, security, and software engineering. We use Kubernetes and Terraform almost exclusively in a 100% cloud-based environment, and are looking for an SRE who is still growing, comfortable in their knowledge of these technologies, and has relevant experience managing Kubernetes applications in an SRE role.

Why this job is exciting

- Be a critical member in our engineering organization designing, implementing, managing, and reviewing infrastructure and application operational and architectural state
- Design, implement, and operate production systems using best practices in automation, monitoring, and observability
- Collaborate with developers, SREs, and other engineers to ensure smooth deployments, minimize downtime, maximize observability, share knowledge, and contribute to continuous improvement initiatives
- Experiment with cloud infrastructure environments and services
- Technologies you will work with: GCP, Kubernetes, Terraform, Helm, Datadog, Python, Go, and more
- Participate in 24/7 on-call rotation (https://status.synd.io/)

About You

- 5+ years of experience in Site Reliability Engineering or similar role operationalizing and maintaining cloud services
- Strong experience with Infrastructure as Code (IaC) tools such as Terraform
- Strong experience with Linux, Kubernetes, Helm and public cloud platforms such as GCP
- Experience with monitoring and alerting tools such as Datadog
- Experience with CI/CD pipelines and GitOps model for deployment
- Ability to diagnose technical problems, debug code, and automate routine tasks
- Experience in Python and/or Go programming language is a plus
- Experience with security best practices for cloud deployments is a plus
- You are self-disciplined, self-motivated, and have a strong sense of ownership, urgency, and drive
- You are available and willing to step in for emergency response and incident management
- You assume positive intent, are humble and eager, expect the best from yourself, value partnership over perfection, and provide grace and understanding in stressful situations

Why you'll love it here

- 💰 Competitive Compensation. For this role our base salary is targeted at $130k-145k CAD. Final offer amounts are determined by factors such as experience and expertise
- 🏆 Syndio Equity. So you can share in Syndio’s success
- 🏝 20 days annually. We encourage our team to recharge when they need to, plus paid sick & safe time, compassion leave, and voting leave
- 🏦 Pension Contribution
- 📍 Remote-First with opportunities for local meet-ups in Calgary

Role Progression

- Within 1 month, you’ll complete a comprehensive and supportive onboarding process and be able to make isolated contributions to the product, developer tooling, and infrastructure
- Within 3 months, you’ll have a grasp of the complete set of components (services, tools, configuration, etc.) that make up the product and infrastructure. You will continue to make contributions. You will be a full member of the on-call rotation.
- Within 12 months, you’ll be able to suggest and implement complex changes to the infrastructure, developer tooling, and capacity planning.

Interview overview

Below you'll find an outline of the interview plan for our Senior Site Reliability Engineer position. Please note that while this is what we expect the process to look like, we may ask you for supplemental information or require an additional step before making a final decision.

- Recruiter Screen - 25min
- Hiring Manager - 30min
- Take-home Skills Evaluation
- Loop - Three 30-45 min video calls with departmental peers and cross-functional team members - 2hrs max
- Final Interview - Engineering Leader - 30min

At Syndio, we're building a diverse team that values candor, curiosity, and community. If you share these values and are interested in joining us, we'd love to talk with you even if you don't 100% meet the "about you" listed here. We don't expect anyone to have all the answers, as long as you're willing to learn and grow with us.

Syndio is an Equal Opportunity Employer. We are building an inclusive and collaborative workplace as we grow, and we welcome team members regardless of gender/identity, sexual orientation, race or cultural background, religion, physical disability and age.

Canada

C$130K - C$145K / year

Principal DevOps Engineer

Zeta Global

We deliver better experiences for consumers and better results for your brand.

DevOps Engineer | 86 days ago

Full Time | Remote | Team 1,001-5,000 | Since 2007 | H1B Sponsor

• Design, build, and operate production-grade CI/CD pipelines enabling multiple developers on multiple teams to deploy concurrently to production, multiple times daily, with zero-downtime guarantees.
• Implement and optimize advanced deployment strategies including canary releases, blue/green deployments, rolling updates, incremental rollouts, and feature flag-gated releases via Statsig.
• Build self-service deployment tooling that empowers developers to own their release process while enforcing safety guardrails, automated rollback triggers, and automated compliance gates.
• Establish deployment observability with real-time canary analysis, automated health scoring, and progressive delivery metrics integrated with Grafana, Prometheus, and Honeycomb.
• Champion CI/CD workflows using GitLab CI/CD, Helm charts, and Terraform to ensure infrastructure and application deployments are version-controlled, auditable, and reproducible.
• Define and enforce SLOs/SLIs/SLAs across services, establishing error budgets that balance velocity with reliability.
• Lead incident response processes, including on-call rotations, runbook development, blameless postmortems, and incident command structure.
• Design and implement robust observability stacks leveraging Grafana, Prometheus, Loki, and Honeycomb for metrics, logging, tracing, and alerting at scale.
• Proactively identify and eliminate reliability risks through chaos engineering, load testing, capacity planning, and failure mode analysis.
• Reduce operational toil through automation, self-healing infrastructure patterns, and intelligent alerting to minimize mean time to detection (MTTD) and recovery (MTTR).