Job Closed
This listing is no longer active.
Revolutionising financial crime prevention for banks and fintechs through privacy-protected federated learning
Site Reliability Engineer
Location
India
Posted
75 days ago
Salary
Not specified
Seniority
Senior
Job Description
Site Reliability Engineer
Tookitaki
- Ensure high availability, performance, and scalability of our platforms.
- Collaborate with engineering, DevOps, and client success teams.
- Build and maintain monitoring, alerting, and logging systems using tools like Prometheus, Grafana, and ELK.
- Respond to incidents and outages, conduct post-mortems, and implement corrective actions.
- Automate infrastructure provisioning and application deployment using Terraform, Ansible, or Helm.
- Contribute to CI/CD pipelines and improve the reliability and speed of software delivery (GitLab CI, Jenkins, etc.).
- Manage and troubleshoot Docker containers and Kubernetes clusters.
- Operate within AWS (preferred) or GCP environments.
- Implement and monitor TLS/SSL, RBAC, SSO, and secure API practices.
- Work closely with developers, infra engineers, and support teams to ensure production readiness.
Job Requirements
- Bachelor’s degree in Computer Science, Engineering, or related technical field.
- 3–6 years of experience in Site Reliability Engineering, DevOps, Platform Engineering, or a related role.
- Experience with production environments and live system debugging.
- Experience deploying and scaling services with Kubernetes, Docker, and Helm.
- Linux administration and command-line debugging.
- Hands-on experience with AWS (preferred) or GCP cloud platforms.
- Scripting in Bash and Python for automation and monitoring tasks.
- Experience with monitoring and alerting tools like Prometheus, Grafana, ELK, or Datadog.
- Familiarity with databases (e.g., MariaDB, ScyllaDB) and SQL/CQL querying.
- Strong problem-solving and debugging skills.
- Ability to work in on-call rotations and high-pressure production environments.
- Excellent communication and documentation abilities.
Benefits
- Competitive compensation
- Work on a globally recognized RegTech platform transforming financial crime prevention.
- Exposure to cutting-edge AI and big data infrastructure (Spark, Kafka, ScyllaDB, Flink).



