Job Closed

This listing is no longer active.

ICF

Founded in 1969, ICF is a global advisory and technology services company headquartered in Reston, Virginia. It delivers data-driven solutions across energy, en

Site Reliability Engineer

Location

Virginia

Posted

135 days ago

Salary

$108.5K - $184.4K / year

Seniority

Senior

Bachelor Degree, 5 yrs exp, English, Airflow, Groovy, Jenkins

Job Description

  • Define and maintain SLIs, SLOs, and SLAs for the Internet-based Quality Improvement and Evaluation System (iQIES) application
  • Tune performance by modeling load scenarios, forecasting capacity, and optimizing scaling strategies
  • Design and optimize the observability stack using New Relic, CloudWatch, and Jenkins CI/CD pipelines
  • Participate in root cause analysis for operational issues and improve the incident response process
  • Participate in creating, monitoring, and optimizing actionable alerts to respond to issues in a timely manner
  • Develop tools and scripts
  • Develop and maintain Jenkins CI/CD pipelines, using declarative Jenkinsfiles and foundational Groovy for pipeline logic and enhancements
  • Deploy services to Fargate, EKS, Lambda, Airflow, and databases
  • Manage security groups and access controls
  • Thoroughly understand fundamentals such as security groups, IAM, and RDS management
  • Apply patch management and hardening practices
  • Align with DevOps and Technical Leads on overall strategy
  • Actively participate in releases and product launches, with the expectation of being online during release windows

Job Requirements

  • 5+ years experience in a software development environment and a Bachelor’s degree; OR 3+ years experience in a software development environment and a Master’s degree
  • 5+ years supporting a high-availability production environment (cloud or on-prem)
  • 3+ years working in an SRE role in a large-scale cloud environment, implementing high availability and scalability
  • 3+ years of experience focused on SRE, DevOps, or Platform Engineering
  • Must be able to obtain and maintain a public trust clearance
  • Candidate must reside in the US, be authorized to work in the US, and work must be performed in the US
  • Must have lived in the US 3 full years out of the last 5 years

Benefits

  • Reasonable Accommodations are available
  • Health insurance
  • 401(k) matching

More DevOps Engineer Jobs

Senior DevOps – Platform

Saipos | Sistema para Restaurante

Making your restaurant's day-to-day simpler, more agile, and smarter. 🐿️

DevOps Engineer, 135 days ago
Full Time, Remote, Team 51-200, Since 2017, H1B No Sponsor

  • Plan, implement, and maintain scalable, reliable, and secure infrastructure on AWS (Lambda, ECS, RDS, ElastiCache, CloudWatch, S3, IAM)
  • Manage and continuously improve CI/CD pipelines using tools such as Bitbucket Pipelines and Jenkins
  • Automate infrastructure provisioning and management using tools such as Terraform, CloudFormation, or AWS CDK
  • Ensure effective system observability and monitoring with tools like CloudWatch, Prometheus, and Grafana
  • Proactively implement infrastructure security practices (DevSecOps), IAM policies, audits, and continuous vulnerability analysis
  • Lead technical incident response, conduct root cause analyses, corrective actions, and preventive measures (post-mortem)
  • Mentor the team and promote a DevOps culture, automation, and continuous improvement of processes and tools used
  • Collaborate directly with internal teams to define development and deployment standards aligned with market best practices

Brazil
Job Closed

Site Reliability Engineer

Linus Health

Bringing earlier detection to brain health.

DevOps Engineer, 135 days ago
Other, Remote, Team 51-200, H1B No Sponsor

  • Leverage infrastructure as code (Terraform) to build and maintain complex production and analytics workflows, including networking and containerized services.
  • Rapidly diagnose and resolve faults in system services as part of a 24/7 on-call rotation focused on actionable alerting and eliminating toil.
  • Improve speed of delivery by developing and maintaining CI/CD pipelines.
  • Develop infrastructure automation leveraging Terraform, Python, and TypeScript.
  • Improve system availability, security, compliance, cost effectiveness, and performance.
  • Estimate work, prioritize tasks, track dependencies, report progress, and highlight blockers.
  • Participate in continuous improvement initiatives, advocate for SRE best practices, and stay current with emerging technologies and trends.
  • Be part of a team where your focus will be on building, measuring, and refining the systems infrastructure that runs our software.

United States
Job Closed

Dev Ops Engineer, Level 5

Scratch Financial

Scratch Financial is the world's simplest patient financing solution.

DevOps Engineer, 135 days ago
Other, Remote, Team 11-50, Since 1912, H1B Sponsor

  • Participates as a technical expert providing advanced knowledge in vendor devices and management systems
  • Plans and directs development teams and troubleshoots internal application issues
  • Provides technical solutions for network engineering and operational problems
  • Interfaces with vendors and engineering organizations
  • Provides leadership to Network Engineers and the CIEC Development team

Maryland
$92.1K - $216.0K / year
Job Closed
Full Time, Remote, Team 201-500, H1B No Sponsor

  • Build, lead, and develop a high-performing team of Site Reliability Engineers responsible for our hybrid cloud infrastructure in AWS, with an on-premise extension in Hetzner.
  • Design, document, and lead the implementation of reliable and secure infrastructure solutions following industry best practices.
  • Oversee technical analysis, cost estimation and optimization, platform and system design, architectural compliance, resource planning, and delivery milestones.
  • Engage in hands-on technical work alongside the team to maintain deep understanding of the infrastructure, and lead incident response during critical issues.
  • Define team goals and strategy, building strong relationships with internal stakeholders across the organisation.
  • Manage and coordinate the on-call rotation, including escalation processes, across infrastructure and software engineering teams.
  • Champion engineering best practices and drive continuous improvement in production environment quality and reliability.

Germany
Job Closed