The world’s most trusted WordPress technology company, powering your freedom to create on WordPress.
Senior Production Engineer
Location
Australia
Posted
4 days ago
Seniority
Senior
Job Description
WP Engine
- Curate an engineering culture that is biased toward shipping while maintaining very high standards of stability, performance, security, and scalability.
- Monitor, alert, investigate, and resolve production infrastructure challenges across public cloud environments (including GCP and AWS).
- Actively identify opportunities and implement automated, scalable solutions to optimize code and operational tasks, reducing toil across the platform.
- Serve as a technical point of contact during critical issues, facilitate cross-functional communication, track alert trends, and perform root cause analysis (RCA) to drive structural improvements.
- Execute production change and security requests, safely managing code deployments and patching pipelines.
- Partner closely with Product Management and Engineering Management to align technical solutions and execution strategies with business goals and roadmap priorities.
- Further an ecosystem of learning and professional development by coaching, mentoring, and leveling up exceptional engineers across the organization.
Job Requirements
- 5+ years of software engineering, DevOps, or Site Reliability Engineering (SRE) experience building and running production-ready configurations at internet scale.
- Strong history of system architecture, distributed microservices, and self-healing systems with efficient resource and network utilization.
- Advanced expertise executing containerized workloads using Kubernetes orchestration and Docker across public cloud hosting providers.
- Proficient in structural and automated scripting using object-oriented languages, with strong experience in Python and Go preferred.
- In-depth understanding of the modern web serving stack and infrastructure automation tools (e.g., Linux, Nginx/Apache, MySQL, PHP, Ansible, Terraform).
- Hands-on familiarity implementing metrics, monitoring, and alerting systems (such as Prometheus or EFK stacks) to maximize infrastructure visibility.
- A forward-thinking learner with superb analytical, troubleshooting, and root cause analysis capabilities who loves a team-first environment.
Benefits
- Company Stock Options (Every employee is an owner in the company)
- Superannuation Program
- Employee Assistance Program
- Supplemental Maternity & Paternity Pay
- Generous Vacation Time (Who doesn't like time off?)
- One-time AUD $745 Home Office Stipend
- Company Wellness Days
More Production Engineer Jobs
Staff Site Reliability Engineer – Production Engineering
Dropbox
Dropbox is the one place to keep life organized and keep work moving.
- Define and evolve Dropbox’s company-wide technical reliability strategy to support the changing engineering environment created by AI-assisted and agentic software development.
- Set multi-year reliability goals, standards, and roadmaps across observability, debugging, incident management, service health, and operational readiness.
- Lead cross-team initiatives that reduce reliability risk as software delivery velocity, pull request volume, service complexity, and incident volume increase.
- Partner with engineering leaders and platform teams to improve monitoring, alerting, debugging, SLOs, SLAs, and incident response systems at company scale.
- Identify emerging reliability risks introduced by AI-enabled development workflows and design scalable systems, processes, and guardrails to mitigate them.
- Provide technical leadership and mentorship to engineers across teams, raising engineering quality, reliability judgment, and operational excellence.
- Drive clear communication and alignment with senior stakeholders on reliability priorities, tradeoffs, risks, and execution progress.
Senior Production Engineer
Veeam Software
Your Single Backup and Data Management Platform for Cloud, Virtual and Physical
- Own the reliability, performance, and operability of complex, business-critical production services and workflows.
- Own complex and escalated production issues from support, and drive long-term fixes in collaboration with engineering, including code, configuration, and architecture changes.
- Proactively identify and address systemic risks that are identified during the problem-solving process, and convert them into long-term engineering improvements.
- Lead production efficiency initiatives, and define, develop, and maintain processes, runbooks, and knowledge base integrity across multiple services or domains.
- Define, build, and maintain production monitoring systems for critical services, ensuring deep visibility into system health and user experience.
- Continuously improve alerting to minimize noise and ensure actionable, well-documented runbooks with clearly owned responses.
- Define and maintain SLIs/SLOs for key services, and use error budgets to guide operational and product decisions, influencing priorities where necessary.
- Turn manual processes into robust automation, and champion automation patterns and tooling adoption across teams.
- Own and drive the post-mortem review process and actions arising from incident analysis, ensuring high-quality follow-up and measurable reliability improvements.
- Collaborate with the support organization as a senior escalation point and systematically feed back knowledge, tooling enhancements, and improvement recommendations.
- Collaborate with developers throughout the lifecycle of changes, from design through rollout and patch delivery, ensuring safe deployments and efficient incident mitigation.
- Lead or significantly contribute to design reviews to ensure services are operable with minimal manual intervention in production (automation, safe deployments, clear runbooks, resilience patterns), and share learnings through documentation and feedback.
- Mentor and coach other engineers in production engineering practices (observability, incident handling, automation, design for failure), helping to raise the operational bar across the organization.
Product Owner — AI Reliability Engineering
General Dynamics Mission Systems
We develop mission critical solutions for those that lead, serve and protect the world we live in.
Role Description
Role and Position Objectives
What You'll Own:
- The backlog: Define, prioritize, and maintain the pod's work backlog based on business value, stakeholder input, and technical feasibility. Every item has a clear definition of done.
- The value case: Articulate why each initiative matters (cost savings, efficiency gains, risk reduction, user impact). You quantify value, not just describe it.
- Stakeholder alignment: Manage expectations across executive leadership, functional organizations, and the pod team. When priorities conflict, you make the call and own the decision.
- Risks and milestones: Identify risks early, escalate what you can't resolve, and ensure the pod hits its commitments. You track progress through outcomes delivered, not activities completed.
- Acceptance and validation: You are the final voice on whether what the pod builds meets the business need. You work with Domain SMEs to validate, but the decision is yours.
What You Won't Own:
- Technical architecture or engineering decisions; that's the Lead Architect's job.
- Day-to-day task management or sprint mechanics; the team self-organizes.
- People management, performance reviews, or HR administration.
What Makes This Role Different:
- You are not a proxy or a facilitator. You have real decision authority over what the pod builds and in what order.
- You are working on enterprise-scale AI modernization: replacing legacy ERP, HRM, CRM, and manufacturing systems with AI-native applications. These are hard, consequential problems.
- You will work directly with ELT-level stakeholders and have the backing of the CDAIO organization.
- The pod model is new. You will help define how it works, not just operate within a playbook someone else wrote.
Qualifications
- Bachelor’s degree plus 8 years of experience in product ownership, product management, or business analysis in a technology-driven environment.
- Demonstrated experience owning a product backlog and making prioritization decisions, not just writing user stories for someone else to prioritize.
- Experience working directly with engineering teams on software delivery; you understand what it takes to ship software and can have credible conversations with engineers.
- Strong communication skills; you can present to executives and translate between business needs and technical reality without losing fidelity in either direction.
- Experience with enterprise systems (ERP, CRM, HRM, MES, or similar); you understand the complexity of business processes these systems support.
- U.S. citizenship required. Department of Defense Secret security clearance is required at time of hire.
Requirements
- Experience in a product owner or product manager role during a system modernization or legacy replacement effort.
- Familiarity with AI/ML concepts; you don't need to build models, but you should understand what AI can and cannot do so you can make informed trade-off decisions.
- Experience in manufacturing, defense, or complex enterprise environments where process compliance and change management matter.
- Track record of managing stakeholders who have competing priorities; you have made unpopular decisions and stood behind them.
Benefits
- Remote: 100% telework.
- 9/80 schedule.
- Defense industry experience is not required.
- Salary Note: This estimate represents the typical salary range for this position based on experience and other factors (geographic location, etc.). Actual pay may vary. This job posting will remain open until the position is filled.
- Combined Salary Range: USD $124,397.00 - USD $138,003.00 /Yr.
Company Description
General Dynamics Mission Systems (GDMS) engineers a diverse portfolio of high technology solutions, products and services that enable customers to successfully execute missions across all domains of operation. With a global team of 12,000+ top professionals, we partner with the best in industry to expand the bounds of innovation in the defense and scientific arenas. Given the nature of our work and who we are, we value trust, honesty, alignment and transparency. We offer highly competitive benefits and pride ourselves in being a great place to work with a shared sense of purpose. You will also enjoy a flexible work environment where contributions are recognized and rewarded. If who we are and what we do resonates with you, we invite you to join our high-performance team!



