Reddit, Inc.

Dive into anything

Staff Site Reliability Engineer – Site Experience

DevOps Engineer | Full Time | Remote | Lead | Team 501-1,000 | Since 2005 | H1B No Sponsor

Location

United Kingdom

Posted

9 days ago

Salary

0

Seniority

Lead

Postgraduate Degree | 8 yrs exp | English | Cloud | Distributed Systems | Linux | Python | Go

Job Description

  • Lead Reliability Engineering for User Experience
  • Drive reliability, scalability, and operational excellence for critical user-facing systems and services. Improve performance and resiliency across APIs, content delivery, feed generation, search, messaging, and real-time experiences.
  • Partner with product and infrastructure engineering teams to design systems that remain highly available and performant under massive global load. Guide architectural decisions around failover, redundancy, graceful degradation, traffic management, and capacity planning.
  • Identify systemic risks and reliability bottlenecks across services, dependencies, deployments, and infrastructure. Build proactive mitigation strategies and drive engineering improvements that reduce incidents and improve service health.
  • Eliminate repetitive operational work through automation and tooling. Build systems that improve deployment safety, incident response, remediation workflows, and reliability guardrails.
  • Lead complex incident response efforts across engineering teams. Drive blameless postmortems, identify root causes, and ensure sustainable long-term fixes are implemented.
  • Define and champion best practices around reliability engineering, SLIs/SLOs, capacity management, release engineering, and operational maturity across the company.
  • Provide technical leadership and mentorship to engineers across SRE and software engineering teams. Help shape reliability culture and raise the operational excellence bar across the organization.

Job Requirements

  • 8+ years of experience in Site Reliability Engineering, Infrastructure Engineering, or related roles operating large-scale distributed systems.
  • Strong collaboration and communication skills with the ability to influence technical direction across teams.
  • Strong experience supporting high-traffic, user-facing production environments.
  • Deep understanding of one or more of: distributed systems, networking, Linux systems, cloud-native architectures.
  • Experience designing highly available systems with strong operational and reliability practices.
  • Strong programming skills in languages such as Go, Python, or similar.
  • Strong understanding of observability systems including metrics, logging, tracing, and alerting.
  • Experience improving reliability through SLOs, automation, incident management, and performance optimization.
  • Demonstrated ability to troubleshoot complex issues across applications, infrastructure, networking, and services.

Benefits

  • Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
  • Family Planning Support
  • Gender-Affirming Care
  • Mental Health & Coaching Benefits
  • Group Personal Pension Scheme with Employer match
  • Private Medical and Dental Scheme
  • Income Replacement Programs
  • Bike to Work scheme
  • Flexible Vacation & Paid Volunteer Time Off
  • Generous Paid Parental Leave
