Naseej

Harnessing the Power of #Digital_Transformation in Learning & KM

Senior DevOps Engineer

DevOps Engineer | Contract | Remote | Senior | Team 501-1,000 | Since 1989 | H1B No Sponsor

Location

Egypt

Posted

36 days ago

Seniority

Senior

8 yrs exp | English | Docker, Grafana, Kubernetes, Linux, Prometheus

Job Description

• Support, operate, and troubleshoot applications in production (live) environments
• Monitor application and system performance to ensure high availability and reliability
• Analyze logs and proactively identify and resolve issues
• Automate operational tasks using Shell/Bash scripting
• Collaborate with development and infrastructure teams to resolve incidents and improve system performance
• Implement and maintain monitoring, logging, and alerting solutions
• Support CI/CD pipelines and test automation practices
• Deploy, manage, and optimize containerized applications
• Implement and maintain Infrastructure as Code (IaC) solutions

Job Requirements

8+ years of experience in application support and production environment operations
Strong experience with Linux (Ubuntu)
Proficiency in Shell/Bash scripting for task automation
Solid understanding of logging and performance monitoring fundamentals
Experience with test automation
Hands-on experience with container technologies: Docker and Kubernetes (K8s)
Experience with monitoring and observability tools: Prometheus, Grafana, ELK Stack
Experience with Infrastructure as Code (IaC)

More DevOps Engineer Jobs

DevOps Engineer

Orion Innovation

DevOps Engineer | 36 days ago

Full Time | Remote | Team 5,001-10,000 | Since 1992 | H1B No Sponsor

• Lead the design and implementation of secure AWS infrastructure, ensuring VPC patterns, peering, and transit gateways follow strict security segmentation.
• Architect and manage production-grade EKS clusters using Docker and Kubernetes, implementing advanced security controls including OPA/Gatekeeper and workload identity.
• Design and maintain secure automation pipelines using GitHub Actions, ensuring security checks are integrated into the deployment lifecycle.
• Build and maintain central identity and access systems using Keycloak, integrating OIDC/OAuth and LDAP across the enterprise.
• Develop modular, reusable Terraform templates and YAML configurations that incorporate automated compliance checks and security best practices.
• Manage and secure Postgres DB instances, including encryption strategies and secret management workflows (AWS KMS) to ensure zero-trust data handling.
• Develop custom Python-based tooling to automate infrastructure audits, remediation of drift, and security response workflows.

AWS Docker Kubernetes PostgreSQL Python Terraform

Canada

Job Closed

Site Reliability Engineer, AWS

HRM Group

Accelerating Digital Evolution

DevOps Engineer | 36 days ago

Full Time | Remote | Team 201-500 | H1B No Sponsor

• Apply Site Reliability Engineering principles in complex cloud environments on AWS, contributing to the automation of operational tasks, incident management, and continuous improvement of the platform.
• Oversee monitoring and observability, define SLOs/SLIs, support release processes, and collaborate with development teams to reduce toil and increase service resilience.

AWS Cloud Docker EC2 Kubernetes Linux Python Terraform Go

United States

Job Closed

Principal Infrastructure & Site Reliability Engineer

Oracle

Oracle, headquartered in Austin, Texas, is a global leader in computing solutions. The company specializes in database management systems, cloud-engineered systems, and enterprise

DevOps Engineer | 36 days ago

Full Time | Remote

Role Description

Join Oracle's Health Data Intelligence (HDI) team as a Software Engineer 4, focused on Site Reliability Engineering for large-scale healthcare analytics platforms. In this role, you will:
- Design, build, and operate highly reliable, scalable infrastructure and data pipelines that power mission-critical analytics globally.
- Contribute to the next evolution of cloud operations by advancing automation, observability, and AI-assisted reliability practices.
- Explore the use of Generative AI and intelligent automation to improve incident response, system resilience, and operational efficiency.
- Work within a collaborative team to deliver robust solutions that handle massive datasets with precision and performance.
- Continuously improve system reliability and operational excellence.

U.S. citizenship is required for this position, as the successful candidate will be required to obtain (and maintain) a U.S. government security clearance after hire.

Qualifications

- Experience building and operating high-availability, fault-tolerant systems.
- Strong understanding of distributed systems, performance monitoring, and resiliency patterns.
- Experience with incident response, root-cause analysis, and production troubleshooting.
- Hands-on experience applying Generative AI or Agentic AI (e.g., LangChain, AutoGPT, custom agents).
- Ability to design or integrate AI-driven workflows for operational efficiency and reliability.
- Familiarity with building or integrating autonomous agents for DevOps/SRE use cases.
- Strong experience with multi-cloud environments (OCI, AWS/Azure).
- Deep understanding of cloud infrastructure design, deployment, and resource optimization.
- Experience managing hybrid or cross-cloud architectures.
- Advanced competency in CI/CD pipelines (Jenkins, Kubernetes).
- Infrastructure as Code (Terraform).
- Observability tools (Prometheus, Grafana).
- Strong focus on automation-first operations.
- Proficiency in Data Warehousing platforms (e.g., Vertica, Snowflake).
- Experience with ETL frameworks and large-scale data processing.
- Understanding of columnar storage systems.
- Experience supporting or integrating BI tools (Tableau, Power BI, Oracle Analytics).
- Strong proficiency in Python, Java, or Go.
- Experience with Docker, Kubernetes, and shell scripting.
- Strong troubleshooting skills with ability to perform root-cause analysis.
- Experience resolving complex production issues in distributed systems.

Benefits

- Competitive benefits that support our people with flexible medical, life insurance, and retirement options.
- Encouragement for employees to give back to their communities through volunteer programs.

Company Description

Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. With AI embedded across our products and services, we help customers turn that promise into a better future for all. True innovation starts when everyone is empowered to contribute.

We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling 1-888-404-2494 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law.

United States

$86.4K - $199.5K / year

Senior DevOps Engineer

Monad Foundation

DevOps Engineer | 36 days ago

Full Time | Remote | Team 11-50 | H1B No Sponsor

• Build and maintain testnets and test automation for components of the Monad blockchain
• Prototype new deployment patterns across validator and full-node topologies
• Design and run performance benchmarks and load tests to surface bottlenecks before they reach production
• Drive observability, alerting, and security across the stack
• Define best practices for node operators & support them
• Develop tooling for internal & external use
• Participate in on-call rotation for production support

Ansible AWS Cloud Docker Grafana Kubernetes Node.js Prometheus Python Terraform Go

United States

Job Closed
