Site Reliability Engineer (SRE)

Location

United States

Posted

3 days ago

Salary

$100K - $150K / year

Seniority

Mid Level

Job Description

Bright Vision Technologies

Role Description

We are seeking an experienced Site Reliability Engineer to ensure the availability, performance, and operational excellence of large-scale distributed systems in production. As an SRE you will live at the boundary between development and operations, applying strong software engineering principles to infrastructure and operations problems, and continually pushing the platform toward higher reliability with lower operational toil. The ideal candidate will combine deep systems knowledge with strong programming skills, a measurement-driven mindset, and the discipline to design, automate, and operate complex services so that reliability becomes a first-class engineering deliverable rather than a reactive concern.

Key Responsibilities

- Define, instrument, and continually refine service-level objectives (SLOs), service-level indicators (SLIs), and error budgets for critical services.
- Lead incident response and resolution for production issues, ensuring high-quality post-incident reviews.
- Design and implement comprehensive monitoring, logging, and tracing strategies.
- Build and maintain robust on-call processes, runbooks, and escalation paths.
- Automate operational toil aggressively by writing production-grade tooling.
- Architect and operate large-scale Kubernetes clusters and container-based workloads.
- Design CI/CD pipelines that promote safe, frequent, and observable releases.
- Lead capacity planning and performance engineering activities.
- Partner closely with application development teams to embed reliability practices early in design.
- Strengthen the platform’s resiliency through chaos engineering and fault injection.
- Drive continuous improvement of security posture in collaboration with security teams.
- Contribute to the technical roadmap for reliability tooling and observability platforms.
- Mentor engineers across the organization on SRE practices.

Qualifications

- Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
- Five or more years of SRE, DevOps, or production engineering experience supporting large-scale distributed systems.
- Strong programming skills in at least one of Python, Go, or Java.
- Deep, hands-on experience operating Linux at scale.
- Production experience operating Kubernetes and container-based workloads.
- Strong working knowledge of observability tooling.
- Hands-on experience designing and operating CI/CD pipelines.
- Solid understanding of distributed system design.
- Demonstrated experience leading incident response and conducting effective post-incident reviews.
- Excellent communication and documentation skills.

Preferred Qualifications

- Experience defining and operationalizing SLOs and error budgets in real production environments.
- Exposure to chaos engineering practices and tools.
- Hands-on experience with at least one major cloud platform (AWS, Azure, or GCP).
- Background in capacity planning, performance engineering, or large-scale load testing.
- Familiarity with service mesh technologies.

How to Apply

For immediate consideration, please send your resume to [email protected] or contact us at (908) 505-3545. Learn more about Bright Vision Technologies at www.bvteck.com.

More DevOps Engineer Jobs

DevOps Engineer

Blend360

Optimizing business performance through people, data, tech & analytics

DevOps Engineer | 3 days ago
Full Time | Remote | Team 501-1,000 | H1B Sponsor

• Design and implement infrastructure as code (IaC) following established architecture patterns and standards.
• Develop reusable modules using Terraform and/or AWS CDK for services such as AWS Glue, S3, Lake Formation, IAM, Step Functions, and EventBridge.
• Build and maintain CI/CD pipelines for automated testing, linting, and deployments across development, staging, and production environments.
• Implement monitoring and observability solutions, including dashboards, alerts, and automated notifications using CloudWatch.
• Manage environment provisioning, release orchestration, and deployment automation.
• Collaborate with engineering and platform teams to ensure infrastructure reliability, scalability, and operational excellence.

Argentina

Cloud Operations Engineer

Unit4

The Next-Generation in Smart Enterprise Resource Planning.

DevOps Engineer | 3 days ago
Full Time | Remote | Team 1,001-5,000 | Since 1980 | H1B No Sponsor

• Problem-solving customer business processing issues
• Building better solutions for customers
• Learning market skills such as Azure, DevOps, troubleshooting, virtualization, operating systems, and the application stack
• Collaborating with a team of experts for mentorship and guidance

Portugal

DevOps, Cloud

Azapi Solutions

Azapi Solutions is a company passionate about challenges and committed to driving the professional success of its employees! We are a renowned consulting firm specializing in information technology, with a large and highly qualified team of consultants. We deliver wide-ranging projects in areas such as software development, Business Intelligence, SAP, and other advanced technologies. We offer a variety of solutions to meet the specific needs of the IT sector. We are looking for exceptional talent around the world! If you are a qualified person determined to achieve professional success, this is your opportunity to take off in your career!

DevOps Engineer | 3 days ago
Contract | Remote | Team 51-200 | Since 2018 | H1B No Sponsor

• Work on an international project focused on automation, CI/CD and containerized environments
• Build and maintain CI/CD pipelines
• Collaborate with teams to ensure the delivery of efficient solutions

Portugal
Full Time | Remote | Team 201-500 | H1B No Sponsor

Role Description

As a Senior DevOps Engineer II, you will play a crucial role in ensuring the reliability, scalability, and efficiency of our decoupled Drupal systems. You will collaborate with cross-functional teams to enhance content deployment processes, optimize system performance, and contribute to the overall success of our digital solutions.

What you’ll do:

- Proactively maintain and enhance decoupled Drupal systems to ensure optimal performance and reliability.
- Identify and implement improvements to streamline deployment processes.
- Rapidly respond to and troubleshoot system issues, ensuring minimal downtime and maximum availability.

Qualifications

- At least 8 years of experience as a DevOps Engineer or in a similar role.
- Proven experience leading a team of DevOps Engineers with a focus on maintaining and enhancing decoupled Drupal systems for large government organizations.
- Strong interpersonal and communication skills, with a proven ability to influence and build consensus across a broad range of backgrounds, organizational levels, and personalities.
- A developer background and an insider's understanding of how DevOps integrates with developers.
- A strong track record of incubating and building DevSecOps capabilities and API platform strategies, driving platform execution with engineering, and launching cloud solutions.
- Strong background in Site Reliability Engineering (SRE) principles and practices.
- Expertise in AWS services and infrastructure optimization.
- Experience scaling very large statically generated websites using technologies like Next.js, Gatsby, or Hugo.
- Experience integrating statically generated websites with database-driven backends.
- Experience working with Drupal as a decoupled CMS.
- A strong background in the challenges of integrating many disparate systems with each other.
- Ability to obtain and maintain a Public Trust, which requires United States citizenship.

Requirements

- SALARY RANGE: $165,000 - $170,000
- The salary range for this position is determined based on qualifications, skills, and relevant experience.
- The final salary offered will be determined based on several factors, including:
  - The candidate's professional background and relevant work experience.
  - The specific responsibilities of the role and organizational needs.
  - Internal equity and alignment with current team compensation.
- This role is also eligible for additional compensation, subject to the terms and policies of MetroStar, which may include:
  - Performance-based bonuses.
  - Company-paid training and/or certifications.
  - Referral bonuses.

Benefits

- Health, dental, and vision insurance.
- 401(k) retirement plan with company match.
- Paid time off (PTO) and holidays.
- Parental leave and dependent care.
- Flexible work arrangements.
- Professional development opportunities.
- Employee assistance and wellness programs.

Application Process

To apply for this position, please submit your resume via the form below or through our careers page: MetroStar Careers.

Application Deadline: Applications will be accepted on a rolling basis until the position is filled; candidates are encouraged to apply as early as possible for full consideration.

Commitment to Non-Discrimination

All qualified applicants will receive consideration for employment based on merit and without regard to sex, race, ethnicity, age, national origin, citizenship, religion, physical or mental disability, medical condition, genetic information, pregnancy, family structure, marital status, ancestry, domestic partner status, sexual orientation, gender identity or expression, veteran or military status, status as a protected veteran, or any other status protected by applicable federal, state, local, or international law.

Additional Information

In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

United States
$165K - $170K / year