Described as the world's top internet television network, Netflix is a publicly-traded entertainment company offering video-on-demand and streaming media. As an
Site Reliability Engineer 4 - Live SRE
Location
United States
Posted
44 days ago
Salary
$250K - $413K / year
Seniority
Mid Level
Job Description
Netflix
At Netflix, our mission is to entertain the world. Together, we are writing the next episode - pushing the boundaries of storytelling, global fandom and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition and cutting-edge technology. Come be a part of what’s next.

About the role
In this role, you will support our live streaming events by focusing on cloud traffic (API Gateway, IPC between microservices). You will prepare and execute various load tests to ensure both individual critical applications and overall cloud infrastructure can handle sudden increases in API traffic, especially at the start of events. You will also implement end-to-end observability and visualize the data to achieve the desired availability at scale. You will impact multiple areas of the live event lifecycle, from the planning phase through testing and event launch days.

Responsibilities
- Drive continual improvement in observability, monitoring, and scalability with the primary goal to solve the thundering herd problem with cloud traffic (API gateway, IPC between microservices) for live streaming.
- Implement, automate, execute, and analyze the results from a broad range of live streaming delivery focused functional, performance, resilience, and fault injection testing.
- Write and review code, develop documentation, and debug complex problems between systems and components.
- Coordination, collaboration, and partnership across multiple stakeholders for the smooth execution of live-streaming events
- Participate in an on-call rotation and be able to work with flexible hours based on the live events schedule

Qualifications
- 5+ years service reliability/operational experience running large scale, high performance systems & internet services with focus on traffic at scale.
- Knowledge of and proven experience with L4 Load Balancer, HTTP cache, and reverse proxy technologies.
- Expert-level knowledge of Unix or Linux systems and TCP/IP network fundamentals.
- Proficient understanding of networking principles, transport, and application protocols, especially DNS, TLS, and HTTP(s) etc.
- Proficient in a programming language such as Go, Python, Rust etc.
- Experience with using real time and BigData analytics processing technologies (Kafka, time series database and Presto/Trino, Spark SQL etc)
- Ability to work in a highly collaborative environment and to communicate effectively with internal and external partners

Preferred
- B.S. in Computer Science, Electrical or Computer Engineering (or equivalent professional experience)

Generally, our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top of market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience to determine your compensation in the market range. The range for this role is $250,000.00 - $413,000.00. This compensation range will vary based on location.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days annually for paid time off to be used for vacation, holidays, and sick paid time off. Full-time salaried employees are immediately entitled to flexible time off. See more details about our Benefits here.

Netflix is a unique culture and environment. Learn more here.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates.
If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner. We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service. Job is open for no less than 7 days and will be removed when the position is filled.
More DevOps Engineer Jobs
DevOps Engineer
CSG
CSG delivers innovative customer engagement solutions that help you acquire, monetize, engage and retain customers.
• Operate and improve cloud infrastructure on AWS and Azure, ensuring availability, performance, and security
• Build and maintain infrastructure using Terraform and configuration management tools (Chef or equivalent)
• Develop automation and tooling using PowerShell, Bash, and Python
• Implement and maintain CI/CD pipelines and DevOps practices using Azure DevOps
• Troubleshoot and resolve issues across Linux and Windows systems
• Support infrastructure lifecycle activities: provisioning, patching, monitoring, and optimization
• Collaborate with engineering teams to improve reliability and operational efficiency
Cloud Operations Engineer
MongoDB
MongoDB, originally called 10gen, is a software development company. Since 2007, MongoDB has created an open-source, document-oriented database to help clients
MongoDB Atlas is the premier multi-cloud database-as-a-service built and operated by the makers of MongoDB. The Cloud Operations Engineering team at MongoDB is a worldwide team responsible for the consistent operational success of every MongoDB Atlas customer. As a Cloud Operations Engineer, you will help ensure the success of our Atlas customers, whether they are early startups or large multinational companies, cloud-native or just getting started with a digital transformation to the cloud.

You are excited about the core mission of MongoDB, and the opportunity to join the team responsible for operating Atlas, the fastest-growing multi-cloud database-as-a-service in the world. You are prepared to be one of the early members of a 24/7/365 global cloud operations team.

Cloud Operations Engineers will be responsible for day-to-day duties such as creating and monitoring system alert dashboards, reviewing critical events and system logs, accessing customer instances that underpin their production databases and performing server administration duties including performance troubleshooting. Applicants must be critical thinkers who are quick to detect, resolve, or escalate issues that are sometimes broad in scope and difficult to trace. At MongoDB you will grow your career and skills, wear multiple hats, and be part of an operations team that works at the frontier of Cloud services and database systems.

The Federal Risk and Authorization Management Program (FedRAMP) is a US government-wide program that provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services. Our FedRAMP program requires that anyone who is accessing customer data or metadata inside the Authorization Boundary be a US Person on US Soil. In order for us to triage and assign cases, it is necessary to be able to identify available resources at any given time.
For this reason the FedRAMP team is composed of three separate shifts: first shift, second shift, and third shift. This job posting is for the Third Shift, in which your working hours would be 11pm-8am ET.

We are looking to speak to candidates who are based in the United States for our remote or hybrid working models. New hires work Monday to Friday for the first 3-6 months, depending on ramping speed. Once considered ramped, they will transition to a permanent Wednesday-Sunday (preferred) or Saturday-Wednesday 11pm-8am ET work week to provide weekend coverage alongside other peers. Saturdays and Sundays are considered fully online workdays and not an on-call shift. Due to the 24/7 nature of our support organization, certain events throughout the year will require volunteering for coverage outside one’s normal work days or work hours (e.g. regional offsites, regional holidays, etc). These are typically announced weeks in advance with a sign-up system that considers equitability.

Responsibilities
- Successfully coordinate and collaborate with a global team of Cloud Operations Engineers who are tasked with ensuring our uptime guarantees to our Atlas customer base
- Help scale the worldwide Cloud Operations Engineering team with the strategic implementation and refinement of new processes and tools
- Assist in scoping, designing and deploying systems that reduce Mean Time to Resolve for customer incidents
- Monitor and detect emerging customer-facing incidents on the Atlas platform; assist in their proactive resolution
- Automate routine monitoring and troubleshooting tasks
- Diagnose live incidents, differentiate between platform issues versus usage issues, and take the next steps toward resolution
- Assist in performing root cause analysis after an incident is resolved, identifying any breakdowns in processes or workflows that contributed to the event and what changes need to be made to prevent similar events
- Contribute to documentation of corner case scenarios, troubleshooting workflows and SOPs.
- Work alongside our product management, cloud engineering and support organizations by identifying areas for improvement in the management applications powering the Atlas infrastructure
- Inform executive leadership and escalation management personnel of major outages
- Coordinate and participate in a weekly on-call rotation, where you will handle short term customer incidents (proactively from automated monitoring or through reactive alerts via our Technical Services team)

Requirements
- Experience with being an on call DevOps, SRE, or Cloud Operations engineer (at least 2 years)
- Expertise with Linux system administration, configuration, troubleshooting
- Experience in monitoring, system performance data collection and analysis, and reporting
- Knowledge of database operations and concepts
- Expertise with networking technologies like DNS, TCP/IP, etc.
- Familiarity with Amazon Web Services and other Cloud infrastructure platforms (e.g. GCP, Azure)
- Knowledgeable about a wide range of web and internet technologies
- Capability to write small programs/scripts to solve short-term systems problems
- A CS/CE degree or equivalent experience
- At least 1 of the following programming languages: Java, Go, Python, Javascript
- A keen interest in learning new things

Special Requirements
- Be a US Citizen

Nice To Have
- MongoDB
- Splunk
- Kubernetes

Benefits include
- Competitive salary, equity, pension and health insurance
- Regular performance, compensation and development reviews
- 20 weeks Maternity & Paternity leave to spend time with new arrivals

About MongoDB
MongoDB is built for change, empowering our customers and our people to innovate at the speed of the market. We have redefined the database for the AI era, enabling innovators to create, transform, and disrupt industries with software.
MongoDB’s unified database platform, the most widely available, globally distributed database on the market, helps organizations modernize legacy workloads, embrace innovation, and unleash AI. Our cloud-native platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available across AWS, Google Cloud, and Microsoft Azure. With offices worldwide and over 60,000 customers, including 75% of the Fortune 100 and AI-native startups, relying on MongoDB for their most important applications, we’re powering the next era of software.

Our compass at MongoDB is our Leadership Commitment, guiding how and why we make decisions, show up for each other, and win. It’s what makes us MongoDB. To drive the personal growth and business impact of our employees, we’re committed to developing a supportive and enriching culture for everyone. From employee affinity groups, to fertility assistance and a generous parental leave policy, we value our employees’ wellbeing and want to support them along every step of their professional and personal journeys. Learn more about what it’s like to work at MongoDB, and help us make an impact on the world!

MongoDB is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process. To request an accommodation due to a disability, please inform your recruiter.

MongoDB, Inc. provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type and makes all hiring decisions without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Req ID: 1273369056

MongoDB’s base salary range for this role is posted below.
Compensation at the time of offer is unique to each candidate and based on a variety of factors such as skill set, experience, qualifications, and work location. Salary is one part of MongoDB’s total compensation and benefits package. Other benefits for eligible employees may include: equity, participation in the employee stock purchase program, flexible paid time off, 20 weeks fully-paid gender-neutral parental leave, fertility and adoption assistance, 401(k) plan, mental health counseling, access to transgender-inclusive health insurance coverage, and health benefits offerings. Please note, the base salary range listed below and the benefits in this paragraph are only applicable to U.S.-based candidates.

MongoDB’s base salary range for this role in the U.S. is: $90,000 - $176,000 USD
Wonolo (Work Now Locally) is disrupting the $140BN temporary staffing industry. Founded in 2014, Wonolo's mission is to help people find consistent work. Through our two-sided tech marketplace, we connect hundreds of businesses in need of front-line workers with 2 million underemployed workers in local markets across the United States, within minutes. Wonolo has raised over $200M in funding which will continue to help us empower the in-demand workforce by democratizing access to flexible work, opportunities to learn new skills, a living wage, and comprehensive portable benefits and perks.

We are looking for an experienced Senior DevOps Engineer to join our team. In this role, you will help define and evolve the infrastructure architecture, operational standards, and engineering best practices that support our development teams. This role will have a broad impact across infrastructure, automation, CI/CD, observability, and developer enablement. You will work closely with engineers across the organization to improve system reliability, scalability, and efficiency, while helping build a strong foundation for sustainable growth.

The ideal candidate is a collaborative, self-driven problem solver who is comfortable leading technical initiatives, proposing pragmatic improvements, and implementing solutions with a high degree of ownership. You will work extensively with technologies such as Nomad, Kubernetes, Buildkite, Terraform, Ansible, and Datadog to build and operate resilient, scalable, and well-observed systems.

We welcome qualified candidates located anywhere in Canada.
What you'll do:
- Own the design and execution of projects that strengthen infrastructure, scalability, reliability, and operational maturity across the organization
- Establish and drive infrastructure standards, architecture decisions, and engineering best practices across teams
- Build and operate the automation that powers provisioning, configuration management, deployments, and core platform workflows
- Lead improvements across CI/CD, observability, and developer enablement to increase engineering velocity and system confidence
- Take ownership of complex infrastructure and production issues, driving both immediate resolution and sustainable long-term fixes

Who you are:
- Extensive hands-on experience with AWS
- Experience with Terraform and Ansible
- Programming or scripting experience in Python or Ruby and Bash
- Experience with CI/CD systems such as Buildkite, CircleCI, Jenkins, or similar
- Strong knowledge of Docker and container-based systems
- Experience with Kubernetes required; experience with Nomad is a strong plus
- Strong Linux fundamentals and comfort with shell scripting
- Familiarity with Datadog or similar monitoring and observability platforms
- Strong troubleshooting and systems thinking skills
- A strong preference for automation, standardization, and repeatable processes over manual workarounds
- A collaborative mindset and the ability to work independently with a high degree of ownership

If you have read up to this point, we hope you are excited about this opportunity to work at Wonolo! Even if your experience does not check every bullet point, we still highly encourage you to apply. The best hires do not always check off every box of a job description.
Nice to haves:
- Background in platform engineering or developer enablement
- Familiarity with Temporal or similar orchestration and workflow frameworks
- Hands-on work with PostgreSQL administration, operations, or performance management
- Exposure to AI observability, monitoring, or operational support for AI-driven systems

Pay Range:
The expected pay range for this position is $160,000 - $207,000 CAD per year. Please note that individual total compensation for this position will be determined at the Company's sole discretion and may vary based on several factors, including but not limited to, location, skill level, and years and depth of relevant experience. Additionally, this role is currently eligible to participate in Wonolo’s equity plan as well as a range of health and wellbeing, retirement savings, and other benefits within a holistic total rewards offering.

Benefits and Perks:
- The opportunity for growth in a mission-driven and well-funded start-up
- Meaningful equity
- We pay 100% of the medical/dental/vision insurance premiums for you
- Generous parental leave plan
- Cell phone reimbursement and company laptop
- Retirement plans as well as life and disability insurance
- Access to no-cost on-demand mental health support, including counselling, mindfulness and meditation, and wellbeing courses
- We encourage a healthy work-life balance and offer flexible schedules, an open vacation policy, and the ability to work from anywhere in Canada (no more commutes!)
- Team outings, happy hours, company off-sites, and more!

About Wonolo:
Wonolo is a two-sided job marketplace that serves over 2 million front-line workers, providing them access to flexible and consistent job opportunities across the United States within minutes, at companies such as Peloton, Coca-Cola, Neiman Marcus, Papa John's, and thousands more. We are a remote-first company with 200+ full-time employees, and quickly scaling our team within the United States, Canada, and Latin America.
We are well-funded and backed by leading investors including Sequoia Capital, Bain Capital, and Leeds Illuminate, among others.

Learn more about us:
- Wonolo raises $140M to continue supporting over 1 million laborers and front-line workers
- Wonolo is one of Glassdoor's best tech companies to work for in 2021
- Why G2 Venture Partners Invested in Wonolo
- Yong Kim (CEO) on why he's passionate about empowering the in-demand workforce

Commitment to Diversity, Inclusion, Equity, and Belonging
Wonolo welcomes you as you and celebrates our collective diversity. We work to serve the underserved, and we are built on the strength of our entire community. We know that diverse backgrounds, perspectives, and experiences make for the best teams, and help to drive high performance. We strive to ensure that our team represents different cultures, perspectives, and backgrounds, as these empower our team to come together to make the best decisions and the biggest impact.

Wonolo is an equal-opportunity employer. We work to ensure all people feel supported, empowered, and connected at work. A big part of this effort is through our support for members and allies of Employee Resource Groups. Individuals seeking to work at or with Wonolo are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, sexual orientation, gender identity, or any other protected status under all applicable laws, regulations, and ordinances.

Wonolo Privacy Statement
By providing your personal information and/or submitting your application, you agree that Wonolo may use your personal information for the purposes of carrying out its recruitment and hiring process, which may include, but is not limited to, reviewing your qualifications, verifying your information, communicating with you about the recruitment process, and retaining your personal data as otherwise needed for recruitment-related activities.
Information you provide Wonolo as part of the recruitment process is accessible only to those Wonolo employees and other third-party service providers involved with Wonolo's recruitment, interview, and onboarding process. Wonolo does not disclose your personal information to any third party in a manner that would be considered a sale under applicable laws. By providing your personal information as an applicant for this position or any other position at Wonolo, you agree that your personal data may be transferred and/or disclosed to Wonolo's third-party providers. This may include transfers to servers and databases outside the country where you provided Wonolo with your personal data.

Wonolo does not accept agency or consulting resumes. Please do not forward resumes to our job postings, email alias, Wonolo employees or any other organization location. Wonolo is not responsible for any fees related to unsolicited resumes.
• Maintain and develop the company's hybrid infrastructure: Google Cloud, Proxmox VE, Linux servers, and internal services
• Ensure stable operation of servers and systems: monitoring, alerting, backup, recovery, and participation in incident resolution
• Improve CI/CD processes and automate routine technical tasks
• Support Atlassian products: Jira, Bitbucket, Confluence
• Help employees with technical issues: VPN, access, accounts, and internal services
• Organize and automate access management processes through roles, groups, policies, and appropriate tools
• Document the infrastructure, services, and key technical processes



