Senior Site Reliability Engineer

DevOps Engineer, Full Time, Remote, Senior, Team 10,001+, Since 2013, H1B Sponsor

Location

Illinois

Posted

26 days ago

Salary

$109.5K - $208.5K / year

Seniority

Senior

Job Description

AbbVie

  • Architect new and existing systems to enhance performance, reliability, and scalability
  • Build, implement, and iterate on CI/CD pipelines
  • Assist with the management, development, design, and deployment of microservice and containerized applications
  • Implement strong security controls in distributed systems/agents
  • Coordinate with engineers and developers to automate deployments and configurations across various platforms
  • Abstract the complexity of Observability implementation by writing scalable automation
  • Identify opportunities for improvement around Observability and process
  • Standardize and develop alerts/notifications and responses to monitoring tools
  • Work alongside application teams to implement Observability in day-to-day operations
  • Contribute to post-mortems and provide root cause analysis and implementation of resulting action items
  • Promote DevOps best practices within the team
  • Participate in and promote Agile/Scrum
  • Contribute to the hybrid cloud production containerization service offering
  • Design and implement standards, policies, and procedures for automation and integrations
  • Working alongside application subject matter experts, learn our toolsets and suggest/implement new features to streamline operations

Job Requirements

  • Bachelor’s Degree with 7 years’ experience; Master’s Degree with 6 years’ experience; PhD with 2 years’ experience
  • Treat best practices for security as a requirement, not an afterthought
  • Knowledge of Cloud Platform administration (AWS, GCP, Azure)
  • Familiarity with Observability pillars
  • Experience in working in high-scale environments and understanding of distributed architectures
  • Knowledge of Agile / DevOps methodologies
  • Experience with CI/CD tools (GitHub Actions, Bamboo, Jenkins, Azure DevOps)
  • Familiarity with running Docker workloads using orchestration tools (Kubernetes / Amazon ECS)
  • Ability to work both independently without direction and within a group for day-to-day activities
  • Passion for learning new concepts and processes quickly, and adapting to a changing environment
  • Comfortable working in and administering Linux and Windows environments
  • Preferred: Exposure to and implementation of SPIRE/SPIFFE
  • Direct experience with Terraform/Crossplane
  • Proficiency working with development tools and scripting languages (git / mercurial / subversion; Python / Elixir / Go)
  • Integrating MCP Servers with authorization controls
  • Knowledge of database management systems (NoSQL, Relational Databases, and associated query languages)
  • AWS Cloud Practitioner / Azure AZ-900 Certification
  • Deep experience in implementation and design of serverless architecture solutions
  • Demonstrated experience in deployment of containerized applications (Kubernetes, etc.)
  • Experience with data management and pipeline technologies (Apache Storm, Kafka, Flink, Spark, Hadoop, etc.)
  • Prior experience working in an Agile team
  • Solid understanding of observability solutions using OpenTelemetry, Prometheus/Grafana, or similar tools
  • Excellent understanding of distributed system architectures and telemetry
  • Extensive experience deploying and managing large distributed Kubernetes platforms
  • Proficiency in GitOps practices and Infrastructure as Code systems (such as Terraform, ArgoCD, Helm)

Benefits

  • paid time off (vacation, holidays, sick)
  • medical/dental/vision insurance
  • 401(k) to eligible employees
  • short-term incentive programs

More DevOps Engineer Jobs

Contract, Remote, Team 5,001-10,000, H1B No Sponsor

  • Design, deploy, and manage scalable infrastructure solutions on Google Cloud Platform (GCP), ensuring high availability and performance
  • Develop and maintain CI/CD pipelines to automate application deployment and testing processes
  • Implement Infrastructure as Code (IaC) solutions using tools such as Terraform to manage cloud resources efficiently
  • Manage containerized applications using Kubernetes and Docker, optimizing deployment strategies
  • Monitor system performance, troubleshoot issues, and implement proactive solutions to prevent downtime
  • Establish and maintain comprehensive logging, monitoring, and alerting systems for cloud infrastructure
  • Collaborate with development teams to understand application requirements and provide technical guidance on cloud best practices
  • Implement and enforce cloud security policies, compliance standards, and disaster recovery procedures
  • Automate routine operational tasks using scripting languages such as Python, Bash, or Go
  • Document infrastructure architecture, processes, and procedures to ensure organizational knowledge retention
  • Participate in on-call rotation to provide timely incident response and resolution
  • Continuously analyze and optimize cloud costs and resource utilization

Portugal
DevOps Engineer, 26 days ago
Full Time, Remote, Team 5,001-10,000, Since 1995, H1B No Sponsor

  • Implement and evolve Azure infrastructure using automation and scalability best practices.
  • Work with Infrastructure as Code using Terraform, ensuring standardization and reuse.
  • Build and maintain CI/CD pipelines using Azure DevOps and YAML.
  • Manage Azure services such as Container Apps, APIM, Key Vault, Application Insights, and Data Lake.
  • Apply FinOps best practices, including cost control, tagging, and resource optimization.
  • Implement security policies using Managed Identity, RBAC, and zero-trust principles.
  • Support teams in using the cloud platform efficiently, promoting best practices and standardization.
  • Monitor and optimize environments to ensure performance, reliability, and security.

Brazil
Full Time, Remote, Team 1,001-5,000

Role Description

We are seeking a strategic and results-oriented Site Reliability Engineer (Golden Signals Lead) to define and drive the observability roadmap across all platforms.

  • Job Title: Site Reliability Engineer
  • Location: Remote, In-office, or Hybrid
  • Department: IT Operations
  • Reports To: Manager of Observability & Reliability
  • Job Type: Full-Time Employee (FTE)

This role is responsible for establishing a consistent and scalable approach to monitoring and alerting, leveraging golden signals to enhance system reliability and operational efficiency. The successful candidate will collaborate closely with the ZEIT SRE team, engineering leads, and India-based resources to build a unified observability strategy aligned with organizational goals.

Key Responsibilities

Observability Roadmap Development:

  • Define a unified vision for observability across all platforms, with golden signals as the foundation for monitoring and alerting.
  • Develop and maintain a comprehensive roadmap to improve observability, reduce tool redundancy, and standardize practices across platforms.
  • Establish and track key performance indicators (KPIs) to measure progress and ensure accountability for roadmap milestones.

Collaboration and Alignment:

  • Partner with the ZEIT SRE team and engineering leads to break down silos and promote consistent observability practices.
  • Drive cross-platform collaboration to reduce operational inconsistencies and define a 'north star' approach for observability.
  • Facilitate knowledge sharing to ensure alignment on current and future observability initiatives.

Monitoring and Alerting:

  • Standardize the implementation of golden signals across applications to improve system reliability and incident detection.
  • Optimize alerting tools and reduce redundant or ineffective monitoring interfaces ('panes of glass').
  • Lead efforts to enhance observability while minimizing operational overhead for platform teams.
  • Maintain and enhance observability dashboards, delivering actionable insights into application health and performance.

Operational Support and Improvement:

  • Identify and address gaps in existing observability practices, prioritizing long-term scalability and reliability.
  • Collaborate with India-based resources to execute observability build-outs efficiently and with high quality.
  • Reduce client, provider, and print facility-raised issues through proactive monitoring and early detection.

Reporting and Continuous Improvement:

  • Measure and report on observability success metrics, including actionable alert volume and reduced issue escalations.
  • Continuously evaluate and refine observability strategies based on stakeholder feedback and evolving organizational needs.

Qualifications

Educational Background:

  • Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience).

Experience:

  • Minimum of 5 years of experience in Site Reliability Engineering, DevOps, or a related role with a strong focus on observability.
  • 5+ years of hands-on experience with .NET (C#), including advanced knowledge of ASP.NET Core, Web APIs, and performance optimization.
  • Demonstrated success in designing and implementing monitoring and alerting solutions across complex IT environments.

Technical Skills:

  • Deep understanding of SRE principles and golden signals for system monitoring.
  • Proficiency with observability tools such as Prometheus, Grafana, Splunk, New Relic, or Datadog.
  • Familiarity with cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
  • Advanced proficiency in scripting languages such as PowerShell.
  • Experience in front-end development using React.js.
  • Advanced knowledge of .NET.

Soft Skills:

  • Strong leadership and collaboration abilities, with a proven ability to align diverse teams toward common goals.
  • Excellent analytical and problem-solving skills, with a proactive approach to identifying and resolving issues.
  • Clear and effective communication skills, capable of conveying technical concepts to stakeholders at all levels.

Preferred Qualifications

  • Experience with building observability roadmaps and scaling solutions in enterprise environments.
  • Certifications in cloud or DevOps-related disciplines (e.g., AWS Certified DevOps Engineer, Kubernetes Administrator).

Location and Workplace Flexibility

We have offices in Atlanta GA, Boston MA, Morristown NJ, Plano TX, St. Louis MO, St. Petersburg FL, and Hyderabad, India. We foster a hybrid and remote friendly culture, and all our employees' work locations are based on the needs of the position and determined by the Leadership team. In-office work and activities, if applicable, vary based on the work and team objectives in accordance with Company policies.

Equal Employment Opportunity

Zelis is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. We welcome applicants from all backgrounds and encourage you to apply even if you don’t meet 100% of the qualifications for the role. We believe in the value of diverse perspectives and experiences and are committed to building an inclusive workplace for all.

Accessibility Support

We are dedicated to ensuring our application process is accessible to all candidates. If you are a qualified individual with a disability or a disabled veteran and require a reasonable accommodation with any part of the application and/or interview process, please email TalentAcquisition@zelis.com.

Disclaimer

The above statements are intended to describe the general nature and level of work being performed by people assigned to this classification. They are not to be construed as an exhaustive list of all responsibilities, duties, and skills required of personnel so classified. All personnel may be required to perform duties outside of their normal responsibilities, duties, and skills from time to time.

Florida
Senior DevOps AI Tools Engineer

Celestica International LP

Celestica (NYSE, TSX: CLS) enables the world’s best brands. Through our recognized customer-centric approach, we partner with leading companies in Aerospace and Defense, Communications, Enterprise, HealthTech, Industrial, Capital Equipment and Energy to deliver solutions for their most complex challenges.

  • Leader in design, manufacturing, hardware platform and supply chain solutions
  • Global expertise and insight at every stage of product development
  • Headquartered in Toronto, with talented teams spanning 40+ locations in 13 countries

DevOps Engineer, 27 days ago
Full Time, Remote, Team 10,001

Role Description

The DevTestOps engineer handles daily requests from the engineering team. This includes answering engineering support questions, granting tool access, and customizing Azure DevOps pipelines. DevTestOps engineers may also investigate infrastructure issues, manage VMs, explore new AI tools, and develop new automated engineering pipelines.

  • CI/CD Management: Architecting and managing robust, scalable, and secure pipelines to support diverse applications.
  • Automation: Developing high-quality Python scripts and tools to automate build, testing, and deployment processes.
  • Monitoring: Designing custom dashboards to provide real-time visibility across different tools.
  • Troubleshooting: Resolving complex technical issues across the full stack, including application bugs and infrastructure failures.
  • Collaboration: Working with software development, hardware engineering, QA, IT, manufacturing, and other cross-functional teams to streamline workflows.

Qualifications

  • Experience: 5 to 10 years of relevant work.
  • Technical Skills: Strong coding proficiency in at least one programming language, such as Python, SQL, or a scripting language.
  • Knowledge: Google Enterprise Tools, Azure DevOps (or Git/Jenkins), Jira, qTest, VM Management, Black Duck, and build tools.
  • Foundations: Solid understanding of data structures, algorithms, operating systems, and large-scale software development processes.
  • Soft Skills: Analytical mindset with a service-oriented mentality focused on keeping customers happy.

Requirements

  • Duties of this position are performed in a normal office environment.
  • Duties may require extended periods of sitting and sustained visual concentration on a computer monitor or on numbers and other detailed data.
  • Repetitive manual movements (e.g., data entry, using a computer mouse, using a calculator, etc.) are frequently required.
  • Occasional travel may be required.

Benefits

  • Salary range: 101K - 150K, which includes Base Salary and target Short-Term Incentive (STI) compensation.
  • A comprehensive benefits package is offered in addition to this range.

Company Description

Celestica, Inc. (NYSE: CLS; TSX: CLS) is a technology leader dedicated to driving customer success and market advancements. With deep expertise in design, engineering, manufacturing, supply chain, and platform solutions, Celestica enables critical data center infrastructure for AI, cloud, and hybrid cloud and advances technologies in high-growth markets.

  • ATS: Serves customers in complex, regulated and high-reliability markets such as Industrial & Smart Energy, Aerospace & Defense, Semiconductor Capital Equipment, and HealthTech.
  • CCS: Focuses on high-performance technology solutions and services for the data center, serving hyperscalers, digital native customers and enterprises.

United States
$101K - $150K / year