We create honest financial products that improve lives.
Manager, Software Engineering – Resilience Engineering
Location
Canada
Posted
42 days ago
Salary
$178K - $228K / year
Seniority
Senior
Job Description
Affirm
- Define and drive the vision for resilience engineering at Affirm, with a focus on production load testing and chaos engineering as first-class engineering practices.
- Lead and mentor a team of engineers building platforms and tooling for safe production experimentation.
- Partner with infrastructure, product, and security leadership to embed resilience validation into the software development lifecycle.
- Establish best practices for safely testing system limits and failure scenarios in production.
- Own the design and evolution of platforms that enable safe, controlled production load testing and fault injection.
- Ensure strong safeguards are in place, including isolation boundaries, approval workflows, and automated rollback mechanisms to protect real users.
- Build systems that provide end-to-end observability, traceability, and auditability for all resilience experiments.
- Drive reliability improvements by systematically identifying weaknesses through load testing and chaos experiments.
- Establish monitoring, alerting, and incident response practices tailored to proactive resilience validation.
- Work closely with engineering teams to design and execute production load tests and chaos experiments safely.
- Partner with infrastructure teams to build guardrails around tests and experiments.
- Enable teams to adopt resilience practices by providing reusable tooling, frameworks, and standardized workflows.
- Identify systemic weaknesses and lead cross-functional efforts to improve reliability and fault tolerance.
- Evangelize a culture of “test failure before failure tests you” across the organization.
Job Requirements
- Proven experience leading engineering teams in reliability, infrastructure, or distributed systems.
- Hands-on experience with production load testing, chaos engineering, or large-scale system validation.
- Experience using a chaos engineering vendor such as Gremlin, Harness, or a similar tool.
- Strong understanding of failure modes in distributed systems, including latency, partial failure, and cascading outages.
- Experience building or operating systems with strong safety guarantees (isolation, rate limiting, guardrails, auditability).
- Familiarity with cloud-native environments (AWS, Kubernetes) and observability tooling.
- Strong programming background (e.g., Python, Kotlin, Java, or similar).
- Excellent problem-solving skills and the ability to balance long-term resilience investments with immediate business needs.
- Strong communication and leadership skills, with a track record of influencing engineering practices across teams.
Benefits
- Health care coverage - Affirm covers all premiums for all levels of coverage for you and your dependents
- Flexible Spending Wallets - generous stipends for spending on Technology, Food, various Lifestyle needs, and family forming expenses
- Time off - competitive vacation and holiday schedules allowing you to take time off to rest and recharge
- ESPP - An employee stock purchase plan enabling you to buy shares of Affirm at a discount
More Engineering Manager Jobs
Senior Engineering Manager
MasTec Inc
MasTec Utility Services is a proud subsidiary of MasTec (NYSE: MTZ), a Fortune 500 Company ranked by Energy News-Record as one of the leading contractors in the country. MUS is part of the MasTec Power Delivery segment. We are certified as a minority-controlled company by the National Minority Suppliers Development Council (NMSDC). Our rich diversity of people and ideas makes us a stronger, more innovative organization.
Role Description
The Senior Engineering Manager is responsible for leading and directing a team of Engineering Managers or discipline Engineers on large scale multi-disciplined projects to ensure a high level of communication, coordination, quality, and delivery from design conception through completion. The Senior Manager will act as the subject matter expert and key interface point and liaison between the Engineering Team and the Project Operations team, as well as between the Engineering team and the Client. This onsite position can be located at one of our office locations: Fargo, ND; Indianapolis, IN; Clinton, IN; or Phoenix, AZ. The position may be open to remote with 20% travel but preference will go to someone able to be in the office.
Qualifications
- Bachelor's degree in Engineering or related field or equivalent and direct renewable energy construction experience
- Professional Engineer certification preferred
- Seven to ten years of engineering experience/knowledge of construction techniques, estimating and construction management
- Five years of prior experience leading a Project Design Team during the Proposal, Design, Permit and Construction Phases
Requirements
- Take reasonable care of your own and others’ health and safety and of those who may be affected by the day-to-day delivery of this role by taking personal responsibility for working toward the Company’s Zero Injury principles
- Actively support MasTec Renewables Key Results and lead by example
- Strong data analysis and advanced problem-solving skills
- Ability to read and interpret engineering and technical documents
- Advanced understanding of terms, nomenclature and business models used in the construction industry
- Ability to interact and collaborate professionally with all levels within the organization and with clients, subcontractors and suppliers
- Proficient in Microsoft Office, Excel, Primavera, Viewpoint, Procore, InEight and other Construction Software
- Write reports, business correspondence and document project activities
- Effectively present information and respond to questions from project managers, construction managers, clients, customers and the general public
- Must be prepared to demonstrate/document past experience leading efforts to identify and evaluate constructability and risk issues, assisting in the preparation of the project schedule, evaluation of subcontractor bids and negotiating contracts with Design Professionals
Benefits
- Compensation $141,600-$178,000 / year, commensurate with experience
- Competitive pay with ongoing performance review and merit increase
- 401(k) with company match & Employee Stock Purchase Plan (ESPP)
- Flexible spending account (Healthcare & Dependent care)
- Medical, Dental, and Vision insurance (plan choice) - coverage for spouse, domestic partner, and children
- Diabetes Management, Telehealth Coverage, Prescription Drug Plan, Pet Insurance, Weight Management Drug Discount
- Discounted National Gym Membership Network
- Paid Time Off, Paid Holidays, Bereavement Leave
- Military Leave, including Differential Pay and Benefits Continuation
- Employee Assistance Program
- Short and long-term disability, life insurance, and accidental death & dismemberment
- Voluntary life insurance, accident, critical illness, hospital indemnity coverage
- Emergency Travel Assistance Program
- Group legal plan
Director of Software Engineering
Flosum
The only Salesforce Continuous Deployment tool that's easy to set up, 100% secure, requires no code & keeps all metadata
- Lead, coach, and grow a team of 10 full-stack developers across Node.js services and Salesforce customizations.
- Drive measurable improvements in developer velocity using DORA metrics: lead time, deployment frequency, change failure rate, and MTTR.
- Compress delivery schedules by breaking epics into small, independently shippable slices and challenging inflated estimates with data.
- Architect end-to-end solutions spanning Node.js microservices, APIs, event-driven patterns, and Salesforce (Apex, Lightning, integrations).
- Own solution performance: latency, throughput, scalability, and cost; set SLOs and hold the team accountable to them.
- Establish estimation, sprint, and release discipline; remove blockers; enforce accountability without burning the team out.
- Partner with Product, QA, and Infrastructure to align roadmap, capacity, and dependencies.
- Recruit, onboard, and performance-manage engineers; build a culture of ownership and speed.

