Job Closed

This listing is no longer active.

Founded in 1969, ICF is a global advisory and technology services company headquartered in Reston, Virginia. It delivers data-driven solutions across energy, en

Site Reliability Engineer

DevOps Engineer, Other, Remote, Senior

Location

Virginia

Posted

135 days ago

Salary

$108.5K - $184.4K / year

Seniority

Senior

Bachelor Degree, 5 yrs exp, English, Airflow, Groovy, Jenkins

Job Description

• Define and maintain SLIs, SLOs, and SLAs for the Internet-based Quality Improvement and Evaluation System (iQIES) application
• Tune performance by modeling load scenarios, forecasting capacity, and optimizing scaling strategies
• Design and optimize the observability stack through New Relic, CloudWatch, and Jenkins CI/CD pipelines
• Participate in root cause analysis for operational issues and improve the incident response process
• Participate in creating, monitoring, and optimizing actionable alerts to respond to issues in a timely manner
• Develop tools and scripts
• Develop and maintain Jenkins CI/CD pipelines, using declarative Jenkinsfiles and foundational Groovy for pipeline logic and enhancements
• Deploy services to Fargate, EKS, Lambda, Airflow, and databases
• Manage security groups and access controls
• Thoroughly understand fundamentals such as security groups, IAM, and RDS management
• Apply patch management and hardening practices
• Align with DevOps and Technical Leads on overall strategy
• Actively participate in releases and product launches, with the expectation of being online during release windows

Job Requirements

5+ years experience in a software development environment and a Bachelor’s degree; OR 3+ years experience in a software development environment and a Master’s degree
5+ years supporting a high-availability production environment (cloud or on-prem)
3+ years working in an SRE role in a large-scale cloud environment, implementing high availability and scalability
3+ years of experience focused on SRE, DevOps, or Platform Engineering
Must be able to obtain and maintain a public trust clearance
Candidate must reside in the US, be authorized to work in the US, and work must be performed in the US
Must have lived in the US 3 full years out of the last 5 years

Benefits

Reasonable Accommodations are available
Health insurance
401(k) matching

More DevOps Engineer Jobs

Senior DevOps – Platform

Saipos | Sistema para Restaurante

Making your restaurant's day-to-day simpler, more agile, and smarter. 🐿️

DevOps Engineer, 135 days ago

Full Time, Remote, Team 51-200, Since 2017, H1B No Sponsor

• Plan, implement, and maintain scalable, reliable, and secure infrastructure on AWS (Lambda, ECS, RDS, ElastiCache, CloudWatch, S3, IAM);
• Manage and continuously improve CI/CD pipelines using tools such as Bitbucket Pipelines and Jenkins;
• Automate infrastructure provisioning and management using tools such as Terraform, CloudFormation, or AWS CDK;
• Ensure effective system observability and monitoring with tools like CloudWatch, Prometheus, and Grafana;
• Proactively implement infrastructure security practices (DevSecOps), IAM policies, audits, and continuous vulnerability analysis;
• Lead technical incident response, conduct root cause analyses, corrective actions, and preventive measures (post-mortem);
• Mentor the team and promote a DevOps culture, automation, and continuous improvement of processes and tools used;
• Collaborate directly with internal teams to define development and deployment standards aligned with market best practices.

AWS Docker Grafana Jenkins Kubernetes Prometheus Terraform

Brazil

Job Closed

Site Reliability Engineer

Linus Health

Bringing earlier detection to brain health.

DevOps Engineer, 135 days ago

Other, Remote, Team 51-200, H1B No Sponsor

• Leverage infrastructure as code (Terraform) to build and maintain complex production and analytics workflows, including networking and containerized services.
• Rapidly diagnose and resolve faults in system services as part of a 24/7 on-call rotation focused on actionable alerting and eliminating toil.
• Improve speed of delivery by developing and maintaining CI/CD pipelines.
• Develop infrastructure automation leveraging Terraform, Python, and TypeScript.
• Improve system availability, security, compliance, cost effectiveness, and performance.
• Estimate work, prioritize tasks, track dependencies, report progress, and highlight blockers.
• Participate in continuous improvement initiatives, advocate for SRE best practices, and stay current with emerging technologies and trends.
• Be part of a team where your focus will be on building, measuring, and refining the systems infrastructure that runs our software.

AWS Java Python Terraform TypeScript

United States

Job Closed

Dev Ops Engineer, Level 5

Scratch Financial

Scratch Financial is the world's simplest patient financing solution.

DevOps Engineer, 135 days ago

Other, Remote, Team 11-50, Since 1912, H1B Sponsor

• Participates as a technical expert providing advanced knowledge in vendor devices and management systems
• Plans and directs development teams and troubleshoots internal application issues
• Provides technical solutions for network engineering and operational problems
• Interfaces with vendors and engineering organizations
• Provides leadership to Network Engineers and the CIEC Development team

AWS Azure Java Perl Python Ruby

Maryland

$92.1K - $216.0K / year

Job Closed

Engineering Manager – Infrastructure/SRE

Dealfront

DevOps Engineer, 135 days ago

Full Time, Remote, Team 201-500, H1B No Sponsor

• Build, lead, and develop a high-performing team of Site Reliability Engineers responsible for our hybrid cloud infrastructure in AWS, with an on-premise extension in Hetzner.
• Design, document, and lead the implementation of reliable and secure infrastructure solutions following industry best practices.
• Oversee technical analysis, cost estimation and optimization, platform and system design, architectural compliance, resource planning, and delivery milestones.
• Engage in hands-on technical work alongside the team to maintain deep understanding of the infrastructure, and lead incident response during critical issues.
• Define team goals and strategy, building strong relationships with internal stakeholders across the organisation.
• Manage and coordinate the on-call rotation, including escalation processes, across infrastructure and software engineering teams.
• Champion engineering best practices and drive continuous improvement in production environment quality and reliability.

AWS Kubernetes Terraform

Germany

Job Closed
