Job Closed

This listing is no longer active.

AgilityFeat

Nearshore Staff Augmentation & Software Development

Senior SRE DevOps Engineer

DevOps Engineer | Other | Remote | Senior | Team 11-50 | Since 2010 | H1B No Sponsor

Location

Virginia

Posted

106 days ago

Salary

$5K - $7K / month

Seniority

Senior

Job Description

  • Implement SLI/SLO frameworks with error budgets, driving data-informed reliability decisions across the platform
  • Design release strategies including blue/green deployments, canary releases, automatic rollback, and version tracking
  • Lead incident response, author post-mortems, and build automated runbooks that reduce MTTR
  • Develop internal tooling, automation frameworks, and self-service platforms in TypeScript/Python to improve developer productivity and operational efficiency
  • Write reliability-focused services: health checkers, auto-remediation controllers, capacity managers, deployment orchestrators, and chaos testing frameworks
  • Build and maintain production AWS infrastructure using IaC (Terraform/CloudFormation), with focus on ECS, EKS/Kubernetes, and microservices orchestration
  • Build and maintain end-to-end CI/CD pipelines for backend services, mobile apps (iOS/Android), and IoT firmware across on-prem and AWS cloud environments
  • Define and enforce security policies: network segmentation, IAM, secrets management, encryption, compliance auditing, vulnerability management, and incident response
  • Build comprehensive observability with OpenTelemetry, distributed tracing, custom metrics exporters, and alerting across WebSocket connections, message delivery pipelines, and real-time communication services
  • Manage PostgreSQL (RDS), Redis/ElastiCache, SQS, S3, and NLB/ALB configurations including Elastic IPs for SIP/RTP traffic

Job Requirements

  • 7+ years in SRE/DevOps/Platform Engineering with a strong software development background
  • Proficiency in at least one backend language (TypeScript/Node.js, Python, or Go) for building internal tools, CLIs, operators, and automation services
  • Deep AWS expertise: ECS, EKS, RDS, ElastiCache, SQS, VPC networking, IAM, CloudWatch
  • Strong IaC proficiency (Terraform, CloudFormation, or Pulumi) including module design, state management, and drift detection
  • Proven CI/CD pipeline design on both on-prem and cloud (GitHub Actions, CodeBuild/CodePipeline, self-hosted runners)
  • Container orchestration at scale: Docker, ECS task definitions, Kubernetes, Helm, with experience writing custom controllers or operators
  • Solid security background: network security, secrets management, compliance, incident response
  • Experience implementing SLI/SLO frameworks, error budgets, and toil reduction strategies
  • Production PostgreSQL, Redis, and message queue operations (SQS, Redis Streams)
  • Strong understanding of distributed systems patterns: circuit breakers, retries, backpressure, graceful degradation

Benefits

  • A role where engineering and operations merge: you'll ship code that keeps the platform running
  • Technically challenging environment spanning cloud, IoT, telecom, and satellite systems
  • Full ownership of the infrastructure stack with direct impact on reliability and scale
  • Competitive compensation, flexible remote work and a great work environment

More DevOps Engineer Jobs

DevOps Software Engineer

RSM US LLP

Experience the power of being understood.

DevOps Engineer | 106 days ago
Other | Remote | Team 10,001+ | Since 1926 | H1B Sponsor

  • Orchestrates public and private cloud infrastructure utilizing automation and continuously improving the process.
  • Automate and accelerate the testing, release, and deployment cycles through authored scripts for configuration and provisioning.
  • Achieve maximum system automation and integration through Infrastructure as Code (IaC), Web Services and scripting technologies and tools.
  • Develop and employ continuous delivery system practices via cloud services and infrastructure.
  • Execute and automate Continuous Integration pipelines for various development projects using a core suite of tools.
  • Monitors, scales, and optimizes distributed services in the cloud infrastructure.
  • Integrates closely with enterprise solution development teams on identifying, problem solving and resolving issues that impact software releases and service delivery.
  • Develops and implements technical standards, procedures, and techniques for the resolution of Enterprise IT system problems to ensure maximum application availability and performance.
  • Develop proof-of-concept architecture for application and automation initiatives.
  • Drives new ideas and innovative solutions to resolve problems.
  • Engages with other engineering teams to improve the lifecycle of services on our platforms.
  • Collaborates with other IT and non-IT related professionals such as Developers, Architects, Project Managers, Business Analysts, and business leaders.
  • Provides direct support of enterprise infrastructure including cloud computing solutions, Enterprise Service Integrations, Azure Service Bus and Azure Data Factory.
  • Orchestrates compute legacy environments.
  • Configures and integrates custom and 3rd party applications and add-ons.
  • Regular review of alerts, logs, and performance.
  • Works with end-users, Microsoft Support, and other vendors in resolution of support issues as needed.
  • Participates in scheduled and unscheduled weekend/after-hours system maintenance and support.
  • Performs rotational on-call duty.

United States
$72.1K - $118.8K / year
Job Closed
Infrastructure Support Network Engineer

Core4ce Careers

Core4ce is a team of innovators, self-starters, and critical thinkers—driven by a shared mission to strengthen national security and advance warfighting outcomes. Got a big idea? At Core4ce, The Forge gives every employee the chance to propose bold innovations and help bring them to life with internal backing. Join us to build a career that matters—supported by a company that invests in you.

DevOps Engineer | 106 days ago

The primary responsibility for the iSupport Network Engineer position is to plan, design, deploy, and continue to support a mission critical project for the Military Health Systems under the Defense Health Administration. Secondary responsibilities for the position include the deployment and continued support of enterprise network infrastructure systems in the areas of Route/Switch, Security, and Load Balancing.

Technologies utilized by the Network Engineer SME include:

  • Ruckus 7750/7450/6610
  • Cisco Catalyst 4500/9300/9500
  • Cisco Nexus
  • Cisco ASR/ISR Routers
  • Palo Alto Firewall, IDS, IPS
  • F5 BIG-IP
  • Citrix NetScaler
  • Cisco ISE

The position also requires demonstrated competence in LAN/WAN technologies, BGP, OSPF, Spanning Tree, and Software Defined Networking. Last, the Network Engineer (SME) is responsible for the coordination of cross-functional teams in relation to the deployment and support of the physical/virtual network infrastructure, which includes technologies such as VMware, Linux, Windows Server, physical servers, and enterprise storage arrays.

  • Attending planning sessions to understand business and technical requirements.
  • Recommending cost-effective, technically feasible network solutions that meet initial capacity requirements, provide long-term scalability, and anticipate future growth and performance needs.
  • Adhering to and following change management policies that facilitate a stable, well-documented network infrastructure that meets organizational requirements.
  • Documenting projects, support actions and other administrative artifacts to serve as reference material for fellow engineers and management staff.
  • Ensuring incidents and service requests adhere to established service level agreements (SLAs).
  • Pursuing professional development opportunities through certifications, training, or conferences that enhance both technical and non-technical skillsets relevant to their role within the organization.
  • Submitting written and verbal status reports as requested to management staff.
  • Demonstrating a solid understanding of the OSI and TCP/IP network models and providing examples.
  • Supporting and maintaining various network platforms with focus on Cisco (switches and routers), Palo Alto (firewalls), UpLogix (remote console), and F5 (Load Balancing and Packet Broker).
  • Understanding of BGP, OSPF and LISP routing technologies.
  • Demonstrating experience as a Systems Engineer to include network design, documentation, installation, operational support, and configuration of network devices such as servers, routers, switches, workstations, associated software tools, and cabling in a LAN environment.

This position is designed to be flexible, with responsibilities evolving to meet business needs and enable individual growth.

United States
Job Closed
Senior DevOps Engineer, Elastic Suite

Emerald

We grow our customers’ businesses 365 days a year through Connections, Content, and Commerce

DevOps Engineer | 106 days ago
Other | Remote | Team 501-1,000 | Since 2014 | H1B Sponsor

  • Deploy, manage and optimize AWS infrastructure for development, QA and production application environments
  • Manage AWS Cloud deployments for the Elastic B2B Sales and eCommerce Platform
  • Work on design and implementation of service orchestration solutions
  • Maintain and improve automated deployments through the use of CI/CD tools
  • Manage and utilize tools to monitor application uptime and performance
  • Collaborate with developers to resolve application and performance issues
  • Manage access to applications for various teams and their members
  • Manage logging, monitoring and alerting tools
  • Collaborate with company Security and Compliance leadership to establish and maintain SOC II, ISO, PCI, and GDPR security standards and compliance
  • Collaborate with the DevOps and other teams to improve application performance and processes
  • On-call rotation for weekend issues

United States
$110K - $140K / year
Job Closed
Other | Remote | Team 51-200 | Since 2015 | H1B No Sponsor

  • Design, build, and maintain cloud infrastructure in GCP (VPC, GKE Autopilot/Standard, Cloud SQL, AlloyDB, Memorystore, Cloud Storage, Cloud Run, Artifact Registry, Cloud NAT, Cloud Armor, etc.)
  • Manage hybrid environments (remaining on-prem + GCP)
  • Deploy and operate production-grade Kubernetes clusters (GKE + on-prem k8s)
  • Manage all infrastructure as code using Terraform (mandatory) + Helm + Kustomize
  • Configuration management with Ansible (existing playbooks) while evolving to more modern practices
  • Ensure high availability and disaster recovery for databases and queues (Cloud SQL, AlloyDB, Memorystore Redis, managed Kafka/RabbitMQ, Elasticsearch/OpenSearch on GKE)
  • Build a modern observability stack: Prometheus + Grafana + Loki/Tempo + OpenTelemetry, integrated with Cloud Monitoring and Cloud Logging
  • Design and implement CI/CD pipelines (GitLab CI, GitHub Actions, Cloud Build)
  • Participate in security & compliance processes (IAM, KMS, Secret Manager, VPC Service Controls, Security Command Center, hardening)
  • Join the on-call rotation (we are building an SRE culture)
  • Mentor mid/junior engineers and participate in architecture reviews

United States
Job Closed