In neighborhoods and communities everywhere, we deliver the promise of home.
Data Center Ops Resilience Engineer
Location
United States
Posted
44 days ago
Salary
$94.9K - $136.1K / year
Seniority
Senior
Job Description
Guild Mortgage
- Support the design, administration, monitoring, and continuous improvement of enterprise resilience capabilities across data center and infrastructure environments.
- Maintain and enhance solutions supporting high availability, disaster recovery, backup, replication, and recovery operations.
- Identify, assess, and help remediate cybersecurity risks, resilience gaps, and operational weaknesses through the application of sound engineering practices and established standards.
- Partner with infrastructure, security, and application teams to ensure resilience and recoverability requirements are incorporated into technical solutions and operational processes from the beginning.
- Participate in resilience testing activities, including failover exercises, disaster recovery validation, backup recovery testing, and operational readiness reviews.
- Support monitoring, alerting, incident response, escalation processes, and service restoration efforts related to resilience platforms and technologies.
- Develop, maintain, and improve technical documentation, operational procedures, standards, and runbooks.
- Contribute to process improvement initiatives that strengthen platform stability, recoverability, and overall security posture.
- Participate in troubleshooting, root cause analysis, and corrective action planning for resilience-related incidents and service disruptions.
- Support cross-platform resilience efforts involving compute, storage, backup, network, and platform services.
- Perform other duties as assigned.
Job Requirements
- Bachelor’s degree directly related to the position preferred, or an equivalent computer-related degree from a technical school or similar training.
- A combination of education and experience may be considered in lieu of the Bachelor’s degree.
- Minimum of five years of experience supporting one or more of the following areas: high availability, disaster recovery, backup, replication, recovery, cybersecurity, or infrastructure operations.
- Strong analytical, troubleshooting, and problem-solving skills.
- Strong technical aptitude with the ability to work across multiple infrastructure platforms in on-premises, hybrid, and cloud operational domains.
- Ability to collaborate effectively across technical teams and work within established operational processes.
- Strong interpersonal and customer service skills, with the ability to build trusted relationships across technical and business teams, communicate complex concepts clearly, and respond to stakeholder needs with professionalism and urgency.
- Demonstrates active listening, empathy, and a solutions-oriented mindset to ensure a high-quality service experience and positive outcomes.
- Possesses an “Engineering Spirit”: the ability to identify legacy and inefficient practices, challenge outdated operational approaches, and drive modernization through the adoption of industry best practices and proven real-world technology solutions.
- Ability to evaluate the environment through both a security and a resilience lens, make gaps visible, and ensure remediation is driven through the appropriate operational, risk, and governance channels.
- Demonstrated ability to identify technical or operational inefficiencies and contribute to sustainable improvements.
- Experience with resilience-related technologies is strongly desired; prior IBM i knowledge is a plus.
- Knowledge of and exposure to Active Directory/Entra, enterprise backup solutions, enterprise replication solutions, and server virtualization.
- Self-starter with the demonstrated ability to learn and adapt to new technologies and techniques.
- Ability to organize and manage multiple priorities simultaneously in a fast-paced, deadline-driven environment.
- Passionate about delivering excellence in customer service within a team environment.
- Patience and the ability to train less experienced team members, respond to questions, and build capability.
- Ethical, with a commitment to company values.
- Excellent verbal and written communication skills.
- Highly organized and detail-oriented; ability to work in a fast-paced, metrics-driven environment.
- Proficiency in Microsoft Office Suite (including Word and Excel), wikis, collaborative cloud-based programs, and third-party software applications required.
Benefits
- medical
- dental
- vision
- life insurance
- AD&D
- LTD
- 401(k) with employer match


