Fast, Fresh, Actionable Insights at Scale!
Site Reliability Engineer
Location
India
Posted
67 days ago
Salary
0
Seniority
Senior
Job Description
StarTree
- Leverage various monitoring and alerting services to solve intricate programming problems at scale
- Manage and tune multiple critical customer-facing Apache Pinot clusters
- Monitor availability, read/write latencies, and other key telemetry to proactively identify SLO misses and help mitigate issues
- Build rapport and work closely with customers to mitigate and resolve incidents
- Execute disaster recovery strategies with minimal downtime
- Collaborate with other engineers to understand and troubleshoot systems, and use the experience gained to influence the roadmap of other teams
Job Requirements
- 5+ years of experience as an engineer (SRE, SDET, or development)
- Experience managing highly available, production-facing distributed systems and in-depth knowledge of Java are a plus
- Experience with cloud platforms such as AWS, GCP, or Azure
- Experience with Kubernetes and container orchestration
- Familiarity with streaming systems, such as Kafka, Pulsar, Flume, Flink, Spark, or similar
- Knowledge of standard methodologies related to security, performance, and disaster recovery
- Strong troubleshooting and critical thinking skills
Benefits
- Health insurance
- Flexible work arrangements
- Professional development


