Job Closed
This listing is no longer active.
Orchestrating billions of remarkable experiences in more than 100 countries – through cloud, digital and AI technology.
Senior Operations Reliability Engineer – Microsoft 365, Enterprise Tools
Location
India
Posted
102 days ago
Salary
Not specified
Seniority
Senior
Job Description
Genesys
- Resolve Microsoft 365 and enterprise tool incidents through hands-on troubleshooting and remediation, escalating complex issues when needed to senior analysts or platform engineering teams.
- Monitor observability, AIOps, and event management platforms to identify anomalies, service degradations, and emerging incidents affecting collaboration and productivity services.
- Perform incident triage and correlation to determine probable cause and appropriate routing for deeper investigation.
- Validate automated remediation workflows and assist in identifying repeated manual operational tasks that could be automated.
- Participate in early-stage automation and AI-readiness activities by documenting remediation steps, key patterns, and operational edge cases.
- Reduce alert noise by suggesting adjustments to thresholds, suppression logic, or detection rules related to collaboration and enterprise tools.
- Support post-incident reviews by providing relevant data, timelines, and insights related to service behavior and user impact.
- Collaborate with Cloud, Network, IAM, Endpoint, Messaging, and ServiceNow teams to support incident resolution and improve operational processes.
- Ensure accuracy of event data, alerts, and service mappings to support effective correlation within monitoring and CMDB systems.
Job Requirements
- Bachelor’s degree in an IT-related field or equivalent experience.
- 5+ years of experience supporting Microsoft 365, collaboration platforms, or enterprise productivity tools in an operational role.
- Strong working knowledge of Microsoft 365 services, including Exchange Online, Teams, SharePoint, and OneDrive.
- Experience troubleshooting user-impacting issues related to mail flow, authentication, collaboration performance, and service availability.
- Familiarity with Microsoft 365 admin tools, service health dashboards, and diagnostic logs.
- Experience supporting additional enterprise tools or SaaS platforms beyond Microsoft 365.
- Solid understanding of incident management, event correlation, and operational troubleshooting methodologies.
- Experience collaborating with engineering teams and working within cross-functional operational environments.
- Ability to interpret service telemetry and logs to identify symptoms, underlying issues, and potential root causes.
- Strong written and verbal communication skills to explain findings to both technical and non-technical audiences.
- Motivated to develop deeper skills in automation, AIOps, and proactive reliability engineering.
Benefits
- Health insurance
- Flexible working hours
- Professional development opportunities




