Job Closed

This listing is no longer active.

LineTen

LineTen is a cloud-based, SaaS technology platform that enables businesses to aggregate technical transactions.

Site Reliability Engineer

DevOps Engineer, Full Time, Remote, Senior, Team 51-200, H1B No Sponsor

Location

Malaysia

Posted

98 days ago

Seniority

Senior

Bachelor Degree, English, Cloud, Docker, Kubernetes

Job Description

• Ensure global coverage of our products.
• Responsible for ensuring all engineering teams have a first class development experience.
• Drive roll out of Docker/Kubernetes across all engineering workstations.
• Ensure top-notch observability setup is in place.
• Provide engineering support across all products.
• Work with the Architecture team for product development direction.
• Participate in post-incident reviews.

Job Requirements

Responsible for ensuring all engineering teams have a first class development experience via Tooling, Scripts and support
Drive the rollout of Docker/Kubernetes across all engineering workstations, regardless of operating system
Work with the SRE team and other Architects to ensure that the delta between workstation and cloud is minimised and that, where this is not possible, workarounds and solutions exist
Ensure that a top-notch observability setup is in place.
Act as the “go-to” person within the business for all things Docker/Containers
Responsible for ensuring knowledge base and setup guides are in place, active and maintained
Provide engineering support across all products
Work with the Architecture team to gain an understanding of likely direction of product development
Provide training, support, and resources for engineering teams
Provide the engineering team with details of any code changes required to support other cloud-based PaaS products
Provide support to IT manager for device procurement for engineering teams
Provide QA teams with additional tooling/support as may be required
Support the engineering team with workstation set up issues
Participate in product scrums as required
Work with the engineering team on code reviews
Ensure Vendor dependencies are recorded/scoped
Fix support escalation issues
Build software and scripts that automate tasks and, in general, help engineering, operations and support teams perform their duties
Participate in post-incident reviews
Improve the on-call process; reduce team burden while improving issue response times
Participate in knowledge transfer sessions so that the wider team can self-serve
Capture, analyse and update metrics (SLI, SLO, SLA)
Create monitoring to improve availability and detect anomalies.

Benefits

We Are a Home-First Team: LineTen is committed to our home-first policy, which means that we honour remote working first, and offer office space in London, England and Porto, Portugal.
We Believe in Having Fun: Our WellUs team organises monthly events like Pet Zoom Calls, 45-minute Yoga classes, and after-hours cocktail lessons.
We Want You to Take a Break: We believe it is the quality of work that matters, not the hours spent “on the clock”. We offer flexible working hours and unlimited vacation.

More DevOps Engineer Jobs

DevOps Mid

NEORIS

NEORIS is a Digital Accelerator that helps companies step into the future.

DevOps Engineer, posted 98 days ago

Full Time, Remote, Team 1,001-5,000, H1B No Sponsor

NEORIS, now part of EPAM, is a Digital Accelerator that helps companies step into the future, with more than 20 years of experience as a Digital Partner to some of the world's leading companies. We are more than 4,000 professionals in 11 countries, with a multicultural, startup-style culture in which we foster innovation, continuous learning, and high-impact solutions for our clients.

We are looking for: DevOps Semi Senior - Mid!

Main responsibilities:
• Implement, optimize, and maintain Cloud Native environments, ensuring scalability, security, and performance.
• Manage and automate CI/CD pipelines using GitHub, GitLab, or Jenkins.
• Design, deploy, and manage container-based solutions.
• Configure and maintain observability tools, ensuring comprehensive monitoring with Dynatrace, Prometheus, and Grafana.
• Collaborate with development and architecture teams to define DevOps best practices.
• Take part in the continuous improvement of processes, automation, and technical standards.

Requirements:

Required:
• A minimum of 2 to 4 years of experience in DevOps roles (semi-senior level).
• Solid knowledge of Cloud Native practices and container management.
• Experience applying observability tools such as Dynatrace, Prometheus, and Grafana.
• Proficiency with GitHub or GitLab and experience with Jenkins pipelines.
• Knowledge of quality and security tools such as Kiuwan.

Desirable:
• Experience with Kubernetes or other container orchestrators.
• DevOps or Cloud certifications.
• Experience in high-availability environments or digital transformation projects.
• Knowledge of advanced automation and Infrastructure as Code.

We offer:
• Permanent contract with a competitive salary.
• Flexible working arrangements with the option of remote work.
• Personalized career plan and continuous training (certifications, English, etc.).
• Participation in stable projects with a strong technical component.
• Flexible hours and a focus on work-life balance.
• Social benefits tailored to your needs.

We invite you to get to know us at http://www.neoris.com, or on Facebook, LinkedIn, Twitter, or Instagram: @NEORIS.

Spain

Senior DevOps Engineer

Tekhqs

TekHQS is a global technology and AI-driven solutions company delivering scalable SaaS, Cloud, AI/ML, Blockchain/Web3, DevOps, and enterprise software solutions to startups and enterprise clients worldwide. With a team of 300+ professionals across the USA, UK, UAE, Qatar, Pakistan, and India, we specialize in building high-performance digital products across Logistics, FinTech, Healthcare, and emerging technology sectors. At TekHQS, we foster a culture of innovation, ownership, and continuous growth, empowering our teams to build impactful technology that drives real business transformation.

DevOps Engineer, posted 98 days ago

Full Time, Remote, Team 201-500

About the Role

We are seeking a skilled DevOps Engineer to strengthen our infrastructure, automation, and CI/CD capabilities across multiple projects. The ideal candidate will drive automation, streamline CI/CD pipelines, and ensure reliable deployments across development and production environments. This role requires strong hands-on experience with AWS and/or Azure, containerization, orchestration tools, infrastructure as code, and continuous integration systems.

Key Responsibilities

Infrastructure & Cloud Management
- Design and manage cloud infrastructure on AWS and Azure.
- Deploy and maintain services such as EC2/VMs, S3/Blob Storage, RDS/Azure SQL, VPC/VNet, IAM/Azure AD, Load Balancers, and related services.
- Ensure high availability, scalability, and security of production systems.

Containerization & Orchestration
- Build and manage Docker containers.
- Deploy and maintain Kubernetes clusters (EKS/AKS preferred).
- Optimize container orchestration for performance and cost efficiency.

CI/CD & Automation
- Design and maintain CI/CD pipelines using Jenkins (experience with Azure DevOps is a plus).
- Automate build, test, and deployment processes.
- Implement Infrastructure as Code using Terraform.
- Automate configuration management using Ansible.

Monitoring & Reliability
- Implement monitoring, logging, and alerting mechanisms (CloudWatch, Azure Monitor, Prometheus, Grafana, etc.).
- Troubleshoot production issues and ensure minimal downtime.
- Improve system reliability and deployment velocity.

Security & Compliance
- Implement security best practices in infrastructure and pipelines.
- Manage IAM roles, access controls, and secrets securely across AWS/Azure environments.
- Conduct regular system audits and vulnerability checks.

Collaboration
- Work closely with development, QA, and product teams.
- Support release planning and environment readiness.
- Document processes, workflows, and infrastructure architecture.

Preferred Qualifications
- AWS and/or Azure Certifications (Associate or Professional level).
- Experience with microservices architecture.
- Experience with monitoring tools (Prometheus, Grafana, CloudWatch, Azure Monitor, etc.).
- Knowledge of security best practices and DevSecOps concepts.

Soft Skills
- Strong analytical and troubleshooting skills.
- Ability to work independently and within cross-functional teams.
- Strong documentation and communication skills.
- Proactive and ownership-driven mindset.
- Ability to manage multiple environments and deadlines efficiently.

Job Details
Experience: 5 years
Job Type: Fully Remote
Location: 9 pm to 5 am

Pakistan

Senior Engineer, Site Reliability

Zensar

At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better futures. Whether for our clients, our people, or the world around us, this belief powers everything we do. At the heart of our culture is ONE with Client - a set of four core values that reflect who we are and how we work: One Zensar, Nurturing, Empowering, and Client Focus. Part of the $4.8 billion RPG Group, we’re a community of 10,000+ innovators across 30+ global locations, including Milpitas, Seattle, Princeton, Cape Town, London, Zurich, Singapore, and Mexico City. We believe the best work happens when individuality is celebrated, growth is encouraged, and well-being is prioritized. We are an equal employment opportunity (EEO) and affirmative action employer, committed to creating an inclusive workplace. All qualified applicants will be considered without regard to race, creed, color, ancestry, religion, sex, national origin, citizenship, age, sexual orientation, gender identity, disability, marital status, family medical leave status, or protected veteran status.

DevOps Engineer, posted 98 days ago

Full Time, Remote, Team 10,001

Role Description

The Software Engineer / Site Reliability Engineer (SRE) will play a critical role in driving reliability, scalability, and performance for the Banking Solutions, Payments, and Capital Markets platforms. This role blends core SRE principles, performance engineering, and service health management to support large-scale, mission-critical systems. The ideal candidate will help modernize platforms through automation-first practices, data-driven reliability metrics, and proactive performance optimization, ensuring exceptional customer experience and business continuity in a highly regulated environment.

What You Will Be Doing

Core SRE & Reliability Engineering
- Design, implement, and operate highly available, resilient, and scalable systems aligned with SRE best practices.
- Define and manage Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets to balance reliability and delivery velocity.
- Build and maintain service health dashboards to provide real-time visibility into platform stability and customer experience.
- Reduce toil through extensive automation of operational workflows, alerts, and remediation activities.

Monitoring, Observability & Service Health
- Design and maintain end-to-end monitoring and observability solutions covering infrastructure, applications, APIs, and user journeys.
- Implement advanced alerting strategies to reduce noise and improve mean time to detect (MTTD) and mean time to resolution (MTTR).
- Leverage metrics, logs, and traces to drive root cause analysis and proactive incident prevention.
- Enable reliability reporting for stakeholders using SLO compliance and service health metrics.

Performance Engineering & Testing
- Lead performance engineering initiatives, including load testing, stress testing, endurance testing, and capacity validation.
- Identify performance bottlenecks across application, middleware, database, and infrastructure layers.
- Conduct capacity planning and performance tuning to support business growth and peak traffic scenarios.
- Partner with development and QA teams to embed performance testing into CI/CD pipelines.

Incident Management & Operations
- Lead and participate in incident response activities, including triage, mitigation, recovery, and post-incident reviews.
- Drive blameless post-mortems and ensure corrective actions are tracked to completion.
- Participate in on-call rotations, providing 24x7 support for critical production systems.
- Continuously improve operational readiness and resilience.

Automation, CI/CD & Cloud Operations
- Design and manage deployment pipelines, configuration management, and environment consistency across lower and production environments.
- Implement Infrastructure as Code (IaC) practices for repeatable and secure cloud provisioning.
- Collaborate with DevOps teams to improve deployment reliability, rollback mechanisms, and release safety.
- Develop and test disaster recovery plans, backup strategies, and failover mechanisms.

Collaboration & Governance
- Work closely with Development, QA, DevOps, Security, and Product teams to align on reliability and performance goals.
- Ensure platforms meet security, compliance, and regulatory requirements common in financial services.
- Act as a reliability and performance advocate throughout the SDLC.

Qualifications
- Strong experience in Core SRE practices, including reliability engineering, incident management, and automation.
- Proven hands-on experience in Performance Engineering / Performance Testing for large-scale distributed systems.
- Deep understanding and implementation experience with SLI / SLO / Error Budget frameworks.
- Proficiency in cloud platforms (AWS, Azure, or Google Cloud).
- Hands-on experience with containerization and orchestration (Docker, Kubernetes).
- Strong background in monitoring, observability, and logging tools such as Prometheus, Grafana, Datadog, Splunk, ELK Stack.
- Experience with CI/CD pipelines (Jenkins, GitLab CI/CD, Azure DevOps).
- Proficiency in scripting and automation using Python, Bash, Terraform, Ansible.
- Strong troubleshooting skills across application, infrastructure, and network layers.
- Experience designing and running incident response and post-mortem reviews.
- Ownership mindset with accountability for service reliability and customer outcomes.
- Excellent communication, collaboration, and stakeholder management skills.

Nice to Have (SRE+ Skills)
- Experience with Keptn or similar tools for automated SLO-based quality gates and continuous delivery.
- Programming experience in Java, especially for debugging, performance profiling, or building automation tools.
- Familiarity with chaos engineering practices and tools.
- Experience working in banking, payments, or capital markets domains.
- Knowledge of security best practices and regulatory compliance in enterprise environment.

India

Job Closed

Site Reliability Engineer, SRE Team

Semrush

Your competitors' favorite marketing platform used by 10,000,000 marketers

DevOps Engineer, posted 98 days ago

Full Time, Remote, Team 1,001-5,000, Since 2008, H1B Sponsor

• Collaborate with development teams to design and implement scalable, reliable, and efficient system architectures
• Establish and refine SLOs in partnership with stakeholders to guarantee service reliability and performance
• Read and write code in Python/Go
• Induce application failure and work to recover it from that state
• Debug applications using metrics and add traces/metrics as needed
• Participate in on-call duties to provide constant support
• Lead the changes in common engineering practices in the Company
• Possible night shifts (on-call)

Cloud, Google Cloud Platform, Kubernetes, Python, Go

Cyprus
