Job Closed
This listing is no longer active.
Senior Database Reliability Engineer (DBRE) & Architect (worldwide remote)
Location
Armenia
Posted
36 days ago
Salary
Not specified
Seniority
Senior
Job Description
CloudLinux
CloudLinux is transforming the Linux infrastructure market by ensuring security and stability for over 500,000 servers worldwide. Our products - CloudLinux OS, TuxCare, and Imunify360 - are the de facto standard in the hosting industry and Enterprise segment.
We are seeking a visionary engineer to lead the evolution of our data platform. In 2025, we are shifting from classic database administration to an Internal Database-as-a-Service (DBaaS) model. We need a specialist who doesn’t just "configure backups," but designs resilient distributed systems, writes code to automate infrastructure, and transforms databases into a reliable service for product teams. If you are tired of endless tickets and want to build platforms capable of processing petabytes of data, this role is for you.
Your Challenges & Responsibilities:
- DBaaS Architecture: Design and implement a self-service platform based on Terraform and Ansible, enabling the deployment of HA clusters (PostgreSQL, ClickHouse, MongoDB, Redis) in a heterogeneous environment (Bare Metal + OpenNebula + Kubernetes + Public Clouds). You will turn infrastructure into a product.
- Scaling ClickHouse: Manage exponentially growing analytics clusters (12+ clusters, tens of terabytes of data). You will tackle sharding, table engine optimization (ReplicatedMergeTree), and building reliable S3 backup pipelines under high load.
- Data Platform & Analytics Support: Maintain and scale the infrastructure for Apache Airflow and Redash. You will ensure the reliability of ETL pipelines and visualization tools, bridging the gap between raw infrastructure and the data analytics team.
- Reliability as Code: Implement SRE practices in data management. Replace manual incident response with automated self-healing mechanisms. Define and implement SLO/SLI for all databases.
- Stack Modernization: Lead the migration process from legacy solutions to modern cloud patterns. Participate in decision-making regarding the implementation of Kubernetes operators for stateful workloads.
- Expertise & Mentorship: Serve as the technical authority for product teams, helping them optimize data schemas and SQL queries for high-load systems.
Our Tech Stack:
- Databases: PostgreSQL 15+ (Patroni, PgBouncer), ClickHouse (Sharded/Replicated), MongoDB, Redis, Kafka.
- Data & Analytics: Apache Airflow, Redash (Infrastructure & Integration).
- Infrastructure: Own 3+ DC colocation (OpenNebula, Kubernetes, Bare Metal), AWS, Google Cloud, Azure, DO – Hybrid Cloud.
- Automation & IaC: Terraform, Ansible, Python/Go, GitLab, Jenkins, Gerrit.
- Observability: VictoriaMetrics, Grafana, Loki.
Why CloudLinux?
- Culture: A remote-first company with an "Employees First" principle. We value results, not hours in the office.
- Impact: Your architectural decisions will determine the stability of services used by thousands of companies around the world.
- Growth: We support professional development and pay for training and conferences.
Job Requirements
What We Expect From You:
- AI-Augmented Engineering: You don't view AI as a replacement for deep technical fundamentals, but as a high-leverage tool. We actively use AI agents (Claude, Codex, Gemini, etc.) to automate boilerplate, analyze complex logs, and speed up research. We expect you to be open to modern workflows and integrate AI into your day-to-day operations, allowing you to focus your brainpower on the true architectural challenges.
- Deep PostgreSQL Expertise (5+ years): You know MVCC internals, understand locking mechanics, can configure Patroni and PgBouncer "with your eyes closed," and have experience with seamless major version upgrades under load.
- ClickHouse Mastery: Experience operating large clusters; an understanding of ZooKeeper/ClickHouse Keeper, sharding, and replication internals; and the ability to diagnose performance issues at the data-part level.
- Engineering Mindset (SRE/DevOps): You hate doing the same task twice by hand. Experience writing complex Terraform modules and Ansible roles is mandatory. Programming skills in Python or Go for automation are a huge plus.
- Hybrid Environment Experience: You understand the differences between running DBs on Bare Metal vs. Kubernetes vs. Cloud and know how to optimize TCO and disk subsystem performance (NVMe, Network Storage).
- Systems Approach: You see the big picture - from the network packet to the application business logic. You understand the importance of security (FIPS, Audit logs) and Disaster Recovery.
Nice to Have:
- Experience building an Internal Developer Platform (IDP).
- Experience operating databases in Kubernetes (CloudNativePG, Altinity Operator).
- Experience working at cloud or hosting providers on similar services.
Benefits
What's in it for you?
- A focus on professional development.
- Interesting and challenging projects.
- Fully remote work with flexible working hours, which allows you to schedule your day and work from any location worldwide.
- 24 days of paid vacation per year, 10 days of national holidays, and unlimited sick leave.
- Compensation for private medical insurance.
- Co-working and gym/sports reimbursement.
- Budget for education.
- The opportunity to receive a reward for the most innovative idea that the company can patent.
By applying for this position, you agree to the CloudLinux Privacy Policy and consent to our storing and processing your personal data in accordance with it. Please read our Privacy Policy for more information.
More DevOps Engineer Jobs
Senior DevOps Engineer for T-Cloud Public + Sign on Bonus 3000 eur
Deutsche Telekom IT Solutions Slovakia
Growing bigger, getting better. An IT company that creates value for its customers and helps its region improve.
Company Description
Our brand, Deutsche Telekom IT Solutions Slovakia, entered the life of the Košice region in 2006 under the name T-Systems Slovakia and has been inextricably linked with the region ever since, becoming one of the founding members of Košice IT Valley. We have grown from scratch to become the second largest employer in the eastern part of the country, with more than 3900 employees. Our goal is to proactively find new ways to improve and to continuously transform into a company that provides innovative information and communication technology services.
Job Description
Purpose
As a Senior DevOps Engineer on the T-Cloud Public Marketplace team, you will be the cornerstone of our infrastructure, designing and building the robust, scalable platform that powers our ecosystem. You will take full ownership of our cloud infrastructure, leveraging your deep expertise to solve complex challenges in security, automation, and performance. This is a high-impact role where your work will directly influence the success of a key Deutsche Telekom product.
WHAT WILL YOU DO?
- Own our IaC Foundation: Architect, implement, and manage our core infrastructure using Terraform and Terragrunt, ensuring it is scalable, reliable, and cost-effective.
- Master CI/CD & Automation: Design, build, and optimize sophisticated CI/CD pipelines in GitLab to enable rapid, safe, and reliable deployments for development teams.
- Champion Kubernetes & Cloud-Native Tech: Design, deploy, and maintain complex Kubernetes workloads, including Helm charts, sidecars, and batch jobs, to support a microservices architecture.
- Drive Security & Reliability: Implement and enhance security controls, replication strategies, and monitoring services to meet strict Deutsche Telekom compliance standards.
- Collaborate & Elevate: Partner directly with development teams to provide expert guidance on building and configuring cloud-native applications, including optimizing Dockerfiles and application runtimes.
Qualifications
YOU WILL SUCCEED IF YOU HAVE:
- 5+ years of professional DevOps or SRE experience in a cloud-native environment.
- Expert-level, hands-on proficiency with Terraform and Terragrunt for managing production-grade infrastructure.
- Deep, practical knowledge of Kubernetes (including its API and object model) and a proven track record of designing and packaging applications with Helm.
- Extensive experience building and maintaining complex CI/CD pipelines, preferably in GitLab CI.
- Strong scripting or programming skills in Python or Go for automation and tooling.
- A solid understanding of software development lifecycles and build processes.
- Proficiency with Git and experience working in a collaborative, agile environment.
- Professional programming/scripting experience in Python or Go.
- In-depth understanding of cloud environments (e.g., AWS, GCP, Azure) and microservices.
Additional Information
Benefits
We believe in balance between work and personal life. An attractive and extensive work-life balance portfolio guarantees lasting motivation for employees and thus a better quality of life, promotes physical and mental well-being, and contributes to a positive work environment. All this aims to provide more freedom in reconciling work, career growth, private life, and individual lifestyle. We therefore offer our employees over 25 different benefits to improve their personal and professional lives in these areas:
- Financial benefits
- Benefits with a focus on learning and development
- Benefits with a focus on health and sport
- Benefits with a focus on family and work-life balance
- Other benefits
For more information about our benefits, see our Benefits page.
Salary
Final salary is negotiable. We offer a base salary that depends on the candidate's seniority level and previous experience. In addition to the base salary, we provide a variable part and other financial benefits. The base salary will not be lower than 2600 € gross.
Additional information
* Please note that remote work is only available within Slovakia due to European taxation regulation.
- Location: Kosice
- Company: Deutsche Telekom System Solutions Slovakia
- Language: English
- Job category: Technical positions
- Compensation: from EUR 2600 - monthly

