One platform for every conversation.
Senior Site Reliability Engineer
Location
Egypt
Posted
41 days ago
Salary
0
Seniority
Senior
Job Description
Unifonic
• Owning the reliability, uptime, and scalability of critical production services 24/7.
• Participating in the on-call rotation to respond to incidents, troubleshoot live production issues, and lead post-incident analysis.
• Building robust operational playbooks and escalation paths, and improving Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR).
• Ensuring operational excellence by proactively detecting and addressing reliability risks through SLO monitoring, chaos testing, and capacity planning.
• Automating operational tasks to minimize human intervention.
• Architecting, implementing, and managing infrastructure across AWS, Oracle Cloud Infrastructure (OCI), and OpenStack environments.
• Optimizing cloud resources to balance performance, security, and cost-efficiency.
• Managing Kubernetes clusters (EKS, OKE, Rancher RKE2) for scalability, availability, and performance.
• Managing and optimizing high-performance messaging and caching systems, including Kafka, RabbitMQ, and Redis.
• Managing and optimizing production-grade MySQL and PostgreSQL databases.
• Leading the planning and execution of comprehensive disaster recovery strategies.
• Implementing advanced observability solutions (Prometheus, Grafana, CloudWatch).
• Driving automation initiatives using Terraform, Helm, Jenkins, Tekton, or GitLab CI/CD.
• Integrating security best practices into infrastructure and applications.
• Collaborating with cross-functional teams to foster SRE culture, and mentoring junior engineers.
Job Requirements
- Bachelor's or master's degree in computer science, engineering, or a related technical field.
- 8+ years of hands-on production experience in SRE, DevOps, or cloud engineering roles.
- Strong expertise in AWS, OCI, and OpenStack environments.
- Deep understanding of Kubernetes ecosystems (EKS, OKE, Rancher RKE2).
- Proven experience with Kafka, RabbitMQ, Redis, and distributed messaging and caching systems.
- Solid experience managing MySQL and PostgreSQL in production environments.
- Expert-level scripting and automation skills (Python, Bash, Go).
- Advanced proficiency with Helm, Terraform, and modern CI/CD toolchains.
- Demonstrable experience with Linux system administration and troubleshooting.
- Must be available at night during the on-call schedule.
Benefits
- Competitive salary and bonus
- Unifonic share scheme (we are all owners!)
- 30 holiday days after the first anniversary
- Your Birthday off!
- Spend up to 25 days per year working from anywhere in the world!
- Paid leave and assistance for new parents
- LinkedIn learning license
More DevOps Engineer Jobs
This is a remote position. We are seeking a DevSecOps Engineer to support a large-scale Facets migration project within the healthcare payer environment. This role will focus on improving collaboration between development and operations while embedding security best practices across the software development lifecycle. The ideal candidate will help drive efficient deployments, resolve production issues, and build integrations that enhance system performance and user experience.
Key Responsibilities:
- Bridge development and operations teams to streamline software delivery and deployment processes
- Build, maintain, and optimize CI/CD pipelines to support product releases and updates
- Monitor applications and infrastructure, identifying and resolving production issues
- Perform root cause analysis and implement long-term solutions to improve system stability
- Integrate security best practices into the development lifecycle, including vulnerability management and secure coding standards
- Develop and support integrations across systems to improve user experience and operational efficiency
- Automate processes to reduce manual effort and improve reliability
- Participate in code reviews to ensure quality, performance, and security compliance
- Support deployment and integration efforts related to the Facets migration project
- Maintain documentation for processes, pipelines, and security protocols
Requirements
Required Qualifications:
- Experience in DevOps, DevSecOps, or a similar role
- Strong understanding of CI/CD pipelines and automation tools
- Experience with cloud platforms such as AWS or Azure
- Knowledge of scripting or programming languages (such as Python, Bash, or PowerShell)
- Experience with monitoring tools and incident management processes
- Strong understanding of security best practices in software development
- Strong problem-solving and analytical skills
- Excellent communication skills
• Ensure the reliability, availability, and scalability of the systems and services in the Product Areas (PAs) to which they are assigned.
• Develop and implement monitoring, observability, and alerting solutions integrated with the Agentic Engineering Platform.
• Support teams in defining and tracking SLIs, SLOs, and error budgets.
• Design and evolve on-call management across Product Areas: rotations, escalation, alerting tools, and incident management.
• Work closely with the Engineering Platform to ensure platform capabilities reach and are adopted by product teams.
• Actively contribute to the evolution of the Agentic Engineering Platform by bringing real feedback from Product Areas about frictions, gaps, and improvement opportunities.
• Participate in and influence the building of a reliability-oriented engineering culture (SRE) across the company.
• Support migrations of critical systems, environment segregation, and the deprecation of legacy technologies.
Senior DevOps Engineer
Location: Canberra ACT Australia
Job Description:
About this role
The DevOps Engineer will work within the Platform Engineering team and help provide the infrastructure, tooling, and automation that supports Karbon's engineering teams. This role focuses on improving deployment workflows, platform reliability, and operational visibility across our cloud platform.
Our Engineering Standards
Balance Speed and Quality
Engineers are expected to balance delivery speed with a strong commitment to quality, meeting agreed timelines while producing reliable, maintainable, and well-tested solutions. Sound judgment in making trade-offs between velocity and long-term sustainability is essential.
Collaborate Effectively
Engineering is collaborative by default. Team members are expected to contribute constructively in design discussions, reviews, and planning, communicate clearly about progress and risks, and support shared team outcomes in both hybrid and distributed environments.
Build and Maintain Systems
Engineers are responsible for building new capabilities while maintaining and improving existing systems. This includes designing scalable solutions, reducing technical debt, supporting operational stability, and contributing to continuous improvement.
Operate with Autonomy
A high degree of autonomy is expected. Given clear objectives, engineers should independently translate problems into actionable technical approaches, proactively identify improvements, and continuously expand relevant technical expertise.
Ownership and Accountability
Ownership is fundamental. Engineers are accountable for the quality, performance, and customer impact of their work from design through post-release support, and are expected to follow through on commitments.
AI-Enabled Engineering
AI is reshaping how software is built, and we are committed to leveraging it as a force multiplier for creativity, impact, and capability. Engineers are expected to confidently apply strong technical fundamentals while embracing AI tools and approaches to enhance productivity, problem-solving, and innovation. Curiosity, adaptability, and enthusiasm for integrating AI into meaningful product development are essential.
Contribute to Team Culture
Engineers contribute positively to a culture of professionalism, transparency, low bureaucracy, and mutual respect, strengthening team performance through authenticity, curiosity, and collaboration.
Some of your main responsibilities will include:
- Building, maintaining, and improving CI/CD pipelines and deployment workflows using GitHub Actions and Jenkins to enable reliable production releases
- Developing and maintaining Infrastructure as Code and automated provisioning using tools such as Terraform, ARM templates, CloudFormation.
- Supporting configuration management using Chef
- Supporting containerised workloads using Docker and Azure Container Apps
- Building and maintaining monitoring, alerting, and observability systems using Datadog and OpenTelemetry
- Monitoring the security, performance, and availability of infrastructure and applications
- Investigating and resolving infrastructure and platform-related issues
- Working with engineering teams to deploy infrastructure changes safely and efficiently
- Contributing to disaster recovery, environment management, and operational resilience
- Assisting with cloud cost optimisation
- Delivering operational support for the platform as part of a team on-call roster rotation
- Working closely with Engineering and Product teams to implement DevOps solutions and improve developer workflows
About You
Candidates with the following characteristics and experience are encouraged to apply:
- 5+ years of experience working in DevOps, Platform Engineering, or Site Reliability Engineering
- Experience supporting Azure cloud environments
- Good understanding of DevOps practices and infrastructure fundamentals (Linux, Windows, DNS, TCP/IP, networking)
- Experience building CI/CD pipelines using GitHub Actions, Jenkins, or similar tools
- Experience working with Infrastructure as Code tools such as Terraform
- Experience with Docker containerisation and cloud-native application deployments
- Experience using configuration management tools such as Chef
- Experience implementing monitoring and observability using tools such as Datadog and OpenTelemetry
- Experience automating workflows using PowerShell, Bash, Python, or similar scripting languages
- Experience working with Git-based repositories and modern development workflows
- Familiarity with high-availability systems, microservice architectures, and distributed systems
- Strong communication skills and ability to collaborate across engineering teams
- A proactive approach to problem-solving and improving platform reliability
AI is reshaping how software gets built, and at Karbon we're fully committed. We don't see AI as a replacement for engineers. We see it as a force multiplier that will elevate creativity, impact and capability to levels we've barely begun to imagine. We're looking for developers who are confident in their fundamentals, driven to grow, and excited to harness AI to build something meaningful. If you're energized by this future rather than cautious of it, you'll feel right at home here.
Our Core Technology Stack
We build modern, scalable software on a thoughtfully designed stack:
- Frontend: TypeScript and JavaScript across Ember (today), React, and React Native.
- Backend: .NET / C# (Web API, .NET Core) powering distributed services.
- Data: SQL Server with performance and integrity at scale.
- Cloud: Microsoft Azure.
- Observability: Metrics, logging, alerting, and dashboards in Datadog - because we believe you can't improve what you don't measure.
Our architecture continues to evolve as we scale. We invest in event-driven systems, well-defined microservices, and containerized deployments (Azure Container Apps) to build resilient, decoupled, and high-performing software. If you care about clean service boundaries, reliable systems, and shipping with confidence - you'll feel right at home here.
Why Work at Karbon?
- Gain global experience across the USA, Australia, New Zealand, UK, Canada and the Philippines
- 4 weeks annual leave plus 5 extra "Karbon Days" off a year
- Flexible working environment
- Work with (and learn from) an experienced, high-performing team
- Be part of a fast-growing company that firmly believes in promoting high performers from within
- A collaborative, team-oriented culture that embraces diversity, invests in development, and provides consistent feedback
- Generous parental leave
DevOps Engineer – Contract, Auto Test
Bespoke Labs
Bespoke Labs is a venture-funded startup creating AI tools for data curation and post-training LLMs. (We are hiring!)
• Contract DevOps Engineers for project-based work
• Responsibilities include infrastructure setup, CI/CD pipeline management, and cloud operations on AWS or GCP




