PayPal

PayPal offers a fast, secure way for sellers and buyers to conduct transactions online and on the go. From its beginning in 1998, the financial technology compa

Senior Site Reliability Engineer

Location

California

Posted

73 days ago

Salary

$129,500 - $191,950 / year

Seniority

Senior

Job Description

Title: Senior Site Reliability Engineer
Location: San Jose, United States
Requisition ID: R0134918
Time Type: Full time

The Company

PayPal has been revolutionizing commerce globally for more than 25 years. Creating innovative experiences that make moving money, selling, and shopping simple, personalized, and secure, PayPal empowers consumers and businesses in approximately 200 markets to join and thrive in the global economy.

We operate a global, two-sided network at scale that connects hundreds of millions of merchants and consumers. We help merchants and consumers connect, transact, and complete payments, whether they are online or in person. PayPal is more than a connection to third-party payment networks. We provide proprietary payment solutions accepted by merchants that enable the completion of payments on our platform on behalf of our customers. We offer our customers the flexibility to use their accounts to purchase and receive payments for goods and services, as well as the ability to transfer and withdraw funds.

We enable consumers to exchange funds more safely with merchants using a variety of funding sources, which may include a bank account, a PayPal or Venmo account balance, PayPal and Venmo branded credit products, a credit card, a debit card, certain cryptocurrencies, or other stored value products such as gift cards, and eligible credit card rewards. Our PayPal, Venmo, and Xoom products also make it safer and simpler for friends and family to transfer funds to each other. We offer merchants an end-to-end payments solution that provides authorization and settlement capabilities, as well as instant access to funds and payouts. We also help merchants connect with their customers, process exchanges and returns, and manage risk. We enable consumers to engage in cross-border shopping and merchants to extend their global reach while reducing the complexity and friction involved in enabling cross-border trade.

Our beliefs are the foundation for how we conduct business every day. We live each day guided by our core values of Inclusion, Innovation, Collaboration, and Wellness. Together, our values ensure that we work together as one global team with our customers at the center of everything we do – and they push us to ensure we take care of ourselves, each other, and our communities.

Job Summary:

At PayPal, Senior Site Reliability Engineers (SREs) drive the reliability, performance, and availability of our global mobile and backend systems. As part of our new Mobile SRE team, you’ll bridge the gap between iOS and Android clients and the backend services that power them, delivering seamless experiences for millions of customers. In this hands-on role, you’ll implement reliability standards, build automation, and enhance observability across the stack. By developing actionable insights, automating key workflows, and advancing operational excellence, you’ll help ensure PayPal’s platforms deliver reliable, high-performance experiences customers can trust worldwide.

Job Description:

Essential Responsibilities:

- Take ownership of system performance monitoring, identify inefficiencies, and lead initiatives to improve the overall availability and reliability of digital platforms and applications.
- Lead and manage the response to complex, high-priority incidents, ensuring prompt resolution and a thorough root cause analysis to prevent future occurrences.
- Design and implement advanced automation frameworks to improve operational efficiency, streamline processes, and reduce human error.
- Lead reliability-focused initiatives, ensuring systems are highly available, resilient, and scalable, and promote best practices across engineering teams.
- Enhance the monitoring infrastructure by identifying key metrics, optimizing alerting, and improving system observability to ensure the reliability of large-scale systems.
- Forecast resource requirements and lead capacity planning activities to ensure systems can scale effectively to meet growing user demand.
- Ensure robust disaster recovery strategies are in place and conduct regular testing to ensure systems can recover quickly from failures.
- Partner with engineering and product teams to identify opportunities for improving system architecture, focusing on scalability, reliability, and fault tolerance.
- Provide mentorship and technical guidance to junior site reliability engineers, fostering skill development and knowledge sharing.
- Drive continuous improvement across operational workflows, identifying areas for optimization, cost reduction, and performance enhancement.

Minimum Qualifications:

- 3+ years relevant experience and a Bachelor’s degree OR any equivalent combination of education and experience.

Additional Responsibilities & Preferred Qualifications:

Additional Qualifications

- Proven experience in Site Reliability Engineering, software development, or systems engineering, with a focus on end-to-end system reliability and performance.
- Strong understanding of backend architectures, including APIs, data flows, and cross-system dependencies.
- Hands-on experience developing monitoring, observability, and alerting solutions using tools such as Datadog, Firebase Crashlytics, or Sentry.
- Skilled in automation and tooling development using Python, Go, or similar languages to reduce manual processes and improve efficiency.
- Experience implementing SLIs/SLOs and leveraging metrics to drive measurable improvements in reliability and availability.
- Solid foundation in distributed systems, cloud infrastructure (AWS, GCP, or Azure), and CI/CD pipelines for reliable software delivery.
- Strong debugging and problem-solving skills, capable of diagnosing and resolving complex issues across mobile, API, and backend systems.
- Effective collaborator and communicator, skilled at partnering across mobile, backend, and SRE teams to deliver cohesive reliability outcomes.
- Demonstrated ability to mentor engineers and foster a culture of observability, automation, and operational excellence.

Preferred Qualifications

- Understanding of mobile (iOS and Android) applications.
- Experience improving incident response workflows, postmortem processes, and on-call models.
- Background in performance optimization, fault tolerance, and disaster recovery for large-scale systems.
- Experience collaborating within distributed or global engineering teams.

Subsidiary: PayPal
Travel Percent: 0

The base pay for this role will depend on where you work and the relevant experience and expertise you bring. The expected range of pay for this role by location is:

Primary Location | Pay Range
San Jose, California | $129,500.00 - $191,950.00 Annually

Additional Location(s) | Pay Range
No other locations are assigned to this requisition currently.

Additional compensation for this role may include an annual performance bonus, equity, or other incentive compensation, as applicable.

PayPal does not charge candidates any fees for courses, applications, resume reviews, interviews, background checks, or onboarding. When you apply directly, we will never ask you to share passwords, one-time passcodes (OTP), or verification codes. Any such request is a red flag and likely part of a scam. All communication regarding your application will come from official PayPal email domains. If you suspect fraudulent activity, please report it immediately. To learn more about how to identify and avoid recruitment fraud, please visit https://careers.pypl.com/contact-us.

For the majority of employees, PayPal's balanced hybrid work model offers 3 days in the office for effective in-person collaboration and 2 days at your choice of either the PayPal office or your home workspace, ensuring that you have the benefits and conveniences of both locations.

Our Benefits:

At PayPal, we’re committed to building an equitable and inclusive global economy. And we can’t do this without our most important asset: you. That’s why we offer comprehensive, choice-based programs to support all aspects of personal wellbeing (physical, emotional, and financial), delivering meaningful value where it matters most. We strive to create a flexible, balanced work culture with a holistic approach to benefits, including generous paid time off, healthcare coverage for you and your family, and resources to create financial security and support your mental health.

Who We Are:

Click Here to learn more about our culture and community.

Commitment to Diversity and Inclusion

PayPal provides equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, pregnancy, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state, or local law. In addition, PayPal will provide reasonable accommodations for qualified individuals with disabilities. If you are unable to submit an application because of incompatible assistive technology or a disability, please contact us at paypalglobaltalentacquisition@paypal.com.

Belonging at PayPal:

Our employees are central to advancing our mission, and we strive to create an environment where everyone can do their best work with a sense of purpose and belonging.
Belonging at PayPal means creating a workplace with a sense of acceptance and security where all employees feel included and valued. We are proud to have a diverse workforce reflective of the merchants, consumers, and communities that we serve, and we continue to take tangible actions to cultivate inclusivity and belonging at PayPal.

For general requests for consideration of your skills, please join our Talent Community.

We know the confidence gap and imposter syndrome can get in the way of meeting spectacular candidates. Please don’t hesitate to apply.

When you become part of our Talent Community, we’ll keep you posted about future job opportunities that you may be a match for, as well as career-related events.

More DevOps Engineer Jobs

Staff DevOps Engineer

Bluefish AI

AI Marketing Suite for Brands

DevOps Engineer | 73 days ago
Full Time | Remote | Team 11-50 | Since 2024 | H1B No Sponsor

• Design and maintain AWS infrastructure using best practices (e.g., IaC, security, cost optimization, monitoring).
• Define and implement infrastructure standards, policies, and governance (e.g., IAM, group permissions, access control).
• Develop, operate, and improve CI/CD pipelines using GitHub Actions and other tools to support fast and safe deployment of our applications and services.
• Collaborate with engineering teams to define infrastructure requirements and help shape deployment strategies for frontend and backend applications.
• Proactively monitor system performance, availability, and reliability. Implement observability tooling and incident response processes.
• Automate provisioning, scaling, and recovery of infrastructure using tools like Terraform.
• Identify and resolve infrastructure bottlenecks (e.g., database scaling, system performance constraints).
• Lead initiatives around infrastructure security and compliance, including access control, vulnerability management, secrets management, encryption, and audit readiness.
• Ensure systems meet relevant security and compliance standards (e.g., ISO 27001, SOC 2, GDPR), and support internal and external audits.
• Mentor engineers on DevOps best practices, advocate for clean and scalable infrastructure design, and help build a strong engineering culture.

Germany

Data-DevOps Engineer

K2 Insurance Services

K2 Insurance Services is a specialty insurance platform headquartered in San Diego, California, that delivers innovative risk management solutions through a diversified portfolio o

DevOps Engineer | 73 days ago

• Develop, optimize, and maintain complex SQL queries, views, and Python-based notebooks
• Build and support ELT data pipelines from structured and unstructured data sources
• Work with Databricks to source data from relational databases to store, retrieve, and transform data efficiently
• Collaborate with analysts and stakeholders to interpret business requirements, translate them into efficient data models, and build scalable SQL logic that supports analytics, reporting, and downstream consumption
• Support production workloads, including triaging failed jobs, root-cause analysis, and implementing long-term stability improvements
• Use Azure DevOps for source control, build and release pipelines, and CI/CD workflows
• Maintain and expand Infrastructure as Code (IaC) to support consistent, repeatable environment provisioning and configuration
• Contribute to the enhancement, optimization, and ongoing maintenance of existing CI/CD pipelines
• Support deployment and release processes across development and production environments
• Create, maintain, and update environment and release documentation, including deployment notes, pipeline changes, and release announcements
• Partner with data engineers and analysts to understand data requirements
• Document data and release pipelines, database structures, and operational processes
• Participate in agile ceremonies such as sprint planning and daily standups
• Ensure high-quality delivery by following established engineering standards and contributing to the continuous improvement of workflows
• Contribute to knowledge sharing by updating internal documentation and code as systems evolve
• Assist with incident triage, troubleshooting, and post-release validation under guidance from senior team members

California
$105K - $125K / year
Job Closed

Senior DevOps Engineer – Cloud, ML Infrastructure

Kpler

Facilitating efficient and sustainable trade.

DevOps Engineer | 73 days ago
Full Time | Remote | Team 201-500 | Since 2014 | H1B Sponsor

• Design, operate, and improve Kpler’s cloud-native infrastructure (Kubernetes, networking, compute, storage).
• Contribute to Infrastructure as Code, CI/CD pipelines, and platform automation.
• Ensure high availability, reliability, and security of production systems.
• Improve observability, monitoring, alerting, and incident response processes.
• Reduce MTTR and failure rates through structured reliability improvements.
• Optimize infrastructure cost and performance, including compute-intensive workloads.
• Support and help standardize ML/GPU-based workloads within the existing platform model.
• Collaborate closely with ML engineers, data engineers, and backend teams to ensure production-grade deployments.
• Contribute to architectural decisions shaping the evolution of the platform.

Greece

DevOps Engineer

Deutsche Telekom IT Solutions

Deutsche Telekom IT Solutions, Hungary’s most attractive employer in 2025 (according to Randstad’s representative survey), is a subsidiary of the Deutsche Telekom Group. The company provides a wide portfolio of IT and telecommunications services with more than 5300 employees. We have hundreds of large corporate customers in Germany and in other European countries. DT-ITS received the Best in Educational Cooperation award from HIPA in 2019 and was acknowledged as the Most Ethical Multinational Company in 2019. The company continuously develops its four sites in Budapest, Debrecen, Pécs and Szeged and is looking for skilled IT professionals to join its team.

DevOps Engineer | 73 days ago
Full Time | Remote | Team 5,001-10,000

Role Description

Are you an expert in deploying, observing, and maintaining distributed fleets of devices? Do you build infrastructure that scales effortlessly and recovers automatically from mass reconnections? Join our team to oversee the operational backbone of our edge-to-cloud ecosystem. If you love automating complex deployments and diving deep into observability metrics, you are the right fit for us!

Our project, GroundOS, is not just another screen manager. It is a next-generation Universal Display System (UDS) built to power the future of global mobility. We are building an "Operating System for Reality" that orchestrates massive, data-driven signage networks across critical infrastructure, from major international airports to sprawling public transport systems. GroundOS moves beyond static displays; it uses a state-of-the-art digital twin to process and react to real-time operational data. To guarantee continuous operation, the platform features a resilient, offline-first edge architecture that ensures screens keep running smoothly even if the network fails. Join us to blend high-performance Rust edge computing with modern TypeScript cloud services and help us set a new global standard for how hundreds of millions of passengers experience their journey.

- Manage the deployment, observability, and lifecycle of thousands of remote mini-PCs alongside Cloud components.
- Execute Over-The-Air (OTA) updates reliably across a massive edge fleet.
- Configure and manage NATS JetStream, including Leaf Nodes for edge-cloud bridging, stream retention, and cluster HA.
- Set up and maintain tracing and metrics using OpenTelemetry to monitor cross-system health.
- Architect resilient systems capable of withstanding mass fleet reconnection events (thundering herd) without performance loss.
- Manage secrets, certificates, and secure mTLS communication between edge devices and the central control plane.
- Lead incident management and root-cause analysis for fleet-wide issues.
- Design scalable operations workflows to keep maintenance effort constant as the fleet grows.

Qualifications

- Extensive experience with infrastructure automation and remote fleet management.
- High proficiency in containerization (Docker), specifically optimized for edge devices (multi-arch builds, ARM/x64).
- Deep operational knowledge of NATS JetStream or similar high-throughput event brokers.
- Strong background in observability, tracing, and metric collection.
- Solid understanding of Zero-Trust security architectures and certificate management.
- Ability to remain calm and analytical during high-pressure incident response situations.
- Expert knowledge of agile development.
- Solid knowledge of Scrum.
- Experience working in agile projects and teams.
- Excellent English skills, both written and spoken (B2–C1).
- Excellent technical and analytical skills, as well as problem-solving abilities.
- Ability to handle stressful situations and work independently.

Advantages

- Experience with Google Cloud's GKE for the central cloud control plane.
- Prior experience with specific edge orchestration tools.

Additional Information

- Please note that remote work is only available within Hungary due to European taxation regulation.

Hungary