Atlassian

Atlassian is a publicly-traded computer software business specializing in collaboration, development, and issue-tracking software for teams. As an employer, Atlassian maintains a t

Senior Engineering Manager, SRE

DevOps Engineer, Other, Remote, Lead, Team 11,000, Since 2012

Location

Worldwide

Posted

33 days ago

Salary

0

Seniority

Lead

Job Description

Role Description

We're looking for a Senior Engineering Manager to lead a team of Site Reliability Engineers who are supporting the build of an exciting new infrastructure platform.

- Your team will be responsible for using software engineering principles to reliably scale the Cloud infrastructure that underpins some of our products as well as the products themselves.
- You’re an experienced manager, coaching engineers and technical leaders who report to you, supporting them in their professional development to unlock their potential, and encouraging them to step outside their comfort zone to grow and excel.
- You roll up your sleeves and aren’t afraid to get hands-on to help your team, when the right opportunity calls.
- You’ll also play an important role in the organization's leadership team, working with other engineering managers, architects, and technical program managers to steer the organization by contributing to the strategy and helping determine the right problems for the teams to invest in solving.

Qualifications

- Experience managing & growing technical leaders and teams.
- A drive for operational excellence and experience with teams responsible for running mission-critical production services.
- A passion for driving cultural change in technical excellence, quality and efficiency.
- Familiarity with agile software development methodologies.

Requirements

- 5+ years experience implementing reliability & scale principles and practices.
- 3+ years experience influencing teams outside your own organization with data and insights.
- A demonstrated ability to foster an innovation culture in your teams.
- 5+ years experience with large scale distributed systems and microservices.
- 3+ years experience developing and implementing a long term strategy for a team.

More DevOps Engineer Jobs

SRE Partner – Affirmative Action Position for Persons with Disabilities

Jusbrasil

💻 We simplify access to legal information through technology

DevOps Engineer, 33 days ago
Full Time, Remote, Team 201-500, H1B No Sponsor

• Ensure reliability, availability and scalability of systems and services in the Product Areas (PAs) where assigned.
• Develop and implement monitoring, observability and alerting solutions integrated with the Agentic Engineering Platform.
• Support teams in defining and tracking SLIs, SLOs and error budgets.
• Structure and evolve on-call management in the PAs: rotation, escalation, alerting tools and incident management.
• Work closely with the Engineering Platform to ensure platform capabilities reach and are adopted by product teams.
• Actively contribute to the evolution of the Agentic Engineering Platform by bringing real feedback from PAs about friction points, gaps and opportunities for improvement.
• Participate in and influence the building of a reliability-oriented (SRE) engineering culture across the company.
• Support migrations of critical systems, environment segregation and deprecation of legacy technologies.

Brazil

Operations Reliability Engineer - Automations

AlpacaDB

AlpacaDB, Inc., also known as Alpaca and Alpaca Securities, is an API stock and crypto brokerage platform that enables services to embed investing and developer

DevOps Engineer, 33 days ago

Role Description

As an Operations Reliability Engineer, you will embed directly within brokerage operations functions to systematically eliminate manual work and replace it with durable, auditable software systems. You start by immersing yourself in operational workflows: observing, documenting, and deeply understanding processes end-to-end before designing solutions. Every recurring manual process is treated as a system defect, and every fix you ship is measured by its real-world impact on efficiency and reliability. You will work closely with licensed brokerage staff, domain experts, and platform engineers to build automations and tooling that allow Alpaca's operations to scale globally without scaling headcount linearly. The ideal candidate is equally comfortable shadowing an operational process and architecting the backend service that replaces it.

Things You Get To Do

- Design, build, test, deploy, and monitor production automations and UIs that remove manual steps and reduce operation time.
- Partner with frontend engineers to productize ops tooling so global teams can run functions with predictable staffing.
- Execute operational procedures to surface painful manual processes prior to automation.
- Instrument and report baseline and outcome metrics (MTTC, manual-steps removed, queue sizes, ops satisfaction) and iterate based on measured impact.
- Produce Platform Opportunity Briefs / RFCs for higher-level platform tooling and automations.
- Collaborate with licensed BD leadership, Compliance, and Security to build auditable, safe automations with role-based access and clear runbooks.
- Own the full lifecycle of the systems you build, including automated deployment (CI/CD with tools like ArgoCD and Terraform), proactive monitoring, on-call support rotations and incident response, following a "you build it, you run it" philosophy.
- Build systems with auditability, traceability, and data lineage as a first-class concern to ensure transparency for our auditors and regulators.

Qualifications

- 5+ years of professional software engineering experience, with a proven track record of shipping and operating complex, large-scale systems in production.
- Strong business sense and understanding of operations.
- Deep, hands-on expertise in Golang, including a strong command of its concurrency models (goroutines, channels), memory management, and standard library.
- Proven track record of building user-facing features end-to-end with TypeScript/React.
- Proficient with SQL and relational databases, preferably PostgreSQL.
- Demonstrated ability to reason about human workflows as systems, not just software services.
- Experience with observability, tracing, continuous profiling.
- Exceptional analytical and problem-solving skills, with the ability to deconstruct complex requirements into clear technical components and excellent communication skills for working in a cross-functional environment.
- High ownership mindset with bias toward durable, structural fixes over tactical patches.

Requirements

- Knowledge of service oriented architectures.
- Experience with major cloud platforms (we primarily use GCP).
- Financial market (exchange, broker-dealers, clearing, etc.) knowledge.
- Experience with Docker and Kubernetes.
- A passion for financial markets or the desire to learn.
- Knowledge of Agile/Scrum methodologies.
- Demonstrable experience in designing, building, and reasoning about distributed systems, including a strong understanding of microservices architecture and API design patterns (e.g., REST, gRPC).
- Experience with capacity planning and benchmarking.

Benefits

- Competitive Salary & Stock Options.
- Health Benefits.
- New Hire Home-Office Setup: One-time USD $500.
- Monthly Stipend: USD $150 per month via a Brex Card.

Worldwide
Full Time, Remote, Team 1,001-5,000, H1B No Sponsor

• Collaborate closely with the development team to deploy and maintain application infrastructure.
• Assist in the development and support of tooling to streamline the deployment and maintenance of our products.
• Work with Kubernetes, Docker, Helm and ArgoCD to deploy applications from development through to production environments.
• Support both in-house and third-party applications, including handling deployments, upgrades, and troubleshooting.
• Write and manage automation pipelines for application deployment and maintenance.
• Provision and manage infrastructure using Terraform.
• Document processes and best practices clearly and concisely.

United Kingdom

Senior Site Reliability Engineer

PlayOn! Sports

The nation's leading high school media company providing live streaming and digital ticketing services.

DevOps Engineer, 33 days ago
Full Time, Remote, Team 201-500, H1B No Sponsor

• Contribute to system observability, i.e., implementing and improving metrics, alerting, and dashboards for better insight and faster recovery.
• Develop automation, tooling, and monitoring solutions to support high service availability.
• Partner with application and quality engineering teams to implement best practices in reliability, release automation, and testing.
• Drive operational excellence through proactive incident prevention, blameless postmortems, and capacity planning.
• Participate in on-call rotations to support critical services and ensure rapid response to incidents.

United States