Job Closed
This listing is no longer active.
DevOps Engineer
Location
India
Posted
106 days ago
Salary
Not specified
Seniority
Senior
Job Description
DevOps Engineer
Jetbro
• Audit Prometheus scrape targets, exporters, and metric endpoints
• Review Grafana dashboards, alert rules, and data sources
• Assess log coverage across Kibana and Loki
• Map monitoring coverage across application, infrastructure, database, ingress, and platform layers
• Identify missing exporters, stale dashboards, broken panels, and alert gaps
• Analyze historical metrics to establish performance baselines
• Define SLOs, KPIs, warning thresholds, and breach thresholds
• Suggest Prometheus alert rules and Alertmanager routing strategies
• Implement KPI and SLO alerts within Grafana alert management
• Evaluate Kubernetes cluster topology and infrastructure usage patterns
• Recommend architecture optimizations based on observed load and behavior
• Document findings in structured audit and advisory reports
• Participate in weekly syncs and structured handover sessions
Job Requirements
- 3+ years of experience in DevOps or Platform Engineering
- Strong hands-on experience with Kubernetes production environments
- Experience working with Prometheus for metrics collection and alerting
- Experience configuring and reviewing Grafana dashboards and alerts
- Exposure to log management systems such as Kibana or Loki
- Strong understanding of observability across application, infra, DB, and ingress layers
- Experience defining or working with KPIs and SLOs
- Experience analyzing historical performance data
- Ability to troubleshoot production-level monitoring gaps
- Strong documentation and communication skills
Benefits
- A chance to work on a greenfield project and influence architectural decisions.
- Competitive compensation and benefits.
- Flexible work environment (remote or hybrid options available).
- A collaborative and innovative team culture.
More DevOps Engineer Jobs
DevOps Manager
BlastPoint
A.I.-driven customer intelligence tools that give companies the power to discover & engage the humans in their data.
• Ensure high availability, fault tolerance, and scalability of cloud services
• Optimize performance and cost efficiency across AWS environments
• Lead and mentor a small team of DevOps engineers, fostering a culture of innovation, collaboration, and accountability
• Balance hands-on contributions with strategic leadership, leading by example to ensure smooth execution of DevOps initiatives
• Design, deploy, and maintain BlastPoint’s AWS-based infrastructure using Terraform
• Own the SOC 2 certification and compliance monitoring process
• Implement security best practices, including IAM policies, encryption, vulnerability management, and incident response
• Enhance and maintain CI/CD pipelines using GitHub Actions to improve developer productivity and deployment speed
• Collaborate with software engineers to streamline build, testing, and release processes
• Implement observability, logging, and monitoring solutions to proactively detect and resolve issues
• Establish best practices for disaster recovery, data backup, and infrastructure resilience
• Proactively explore and implement AI tools, LLM integrations, and MCP (Model Context Protocol) to reduce routine database toil, optimize query performance, and accelerate incident resolution.
• Support our data warehouse ecosystem by optimizing Snowflake performance, including application packaging and testing.
• Own the deep-level optimization of MSSQL (crucial for on-call stability) and PostgreSQL at the server, database, and query levels.
• Forecast resource utilization across platforms. Identify cost-saving opportunities, optimize Snowflake credit usage, and right-size AWS infrastructure.
• Automate all data infrastructure using Terraform, AWS, Docker, and Kubernetes. You will manage containerized data services and stateful workloads.
• Manage and optimize deployment pipelines using GitLab and Octopus Deploy, ensuring safe, repeatable database schema changes.
• Create technical documentation, including runbooks, "how-to" guides for developer self-service, and clear architectural diagrams.
• Serve as the subject matter expert for SQL Server, Postgres, and Snowflake in a 24/7/365 on-call rotation.
• Deliver the ADO Environment Current-State Assessment Report identifying gaps in configurations, pipelines, and workflow structures (Deliverable A1)
• Develop and execute the ADO Configuration Modernization Plan; implement updated ADO configurations including work item hierarchies, custom fields, sprint boards, and Kanban views (Deliverables A2/A3)
• Design and deploy reusable CI/CD pipeline templates for Azure Databricks notebook deployment, data validation, and automated reporting (Deliverable A5)
• Configure end-to-end DataOps integration: ADO Repos → Databricks notebooks → automated Power BI dashboard refresh, reducing manual effort by 80%+ through workflow automation
• Build Power Automate workflows for governance approvals, policy triggers, and document routing integrated with ADO and SharePoint
• Design and deploy GMCB's SharePoint Knowledge Management Library including architecture, document taxonomy, metadata schema, and content migration (Deliverables C1/C2)
• Develop ADO Analytics dashboards for sprint velocity, governance compliance, data quality, and operational KPIs
• Implement traceable work item linkage between ADO Epic-Feature-Story-Task structures and Azure Databricks development artifacts
• Develop the ADO Integration Deployment Package including configuration documentation, runbooks, and administrator guides (Deliverable A5)
• Support Agile pilot sprints by configuring and validating ADO Board workflows; provide hands-on technical support during adoption phase
Senior DevOps Engineer
Slingshot Aerospace
We build space simulation and analytics solutions to bring clarity to complex environments and create a safer world.
• Partner with offshore and onshore engineering teams to design, implement, and scale cloud-native infrastructure supporting a new customer portal and ongoing platform refactoring efforts
• Architect, build, and maintain Kubernetes-based environments that power production systems, ensuring scalability, resilience, and security
• Lead Infrastructure as Code initiatives (primarily Terraform) to automate provisioning, configuration, and environment consistency across AWS
• Design, implement, and optimize CI/CD pipelines to improve deployment velocity, reliability, and developer experience
• Integrate and operationalize MLOps practices, enabling efficient deployment, monitoring, and lifecycle management of machine learning workflows
• Embed DevSecOps best practices across the platform, incorporating security controls, compliance requirements, and monitoring into the development lifecycle
• Drive automation initiatives that reduce manual processes and increase system reliability and repeatability
• Collaborate closely with Platform, Engineering, and cross-functional stakeholders to gather requirements, troubleshoot issues, and continuously improve system architecture
• Monitor system performance, identify bottlenecks, and proactively implement improvements to optimize availability and cost efficiency
• Support incident response and root cause analysis efforts, driving long-term fixes and ensuring lessons learned translate into system improvements




