Shuru

Give wings to your ideas!

Senior DevOps Engineer

DevOps Engineer | Other | Remote | Senior | Team 51-200 | Since 2021 | H1B No Sponsor

Location

United States

Posted

127 days ago

Salary

Not specified

Seniority

Senior

Job Description

  • Kubernetes platform engineering (EKS-first): design, build, and operate production-grade Kubernetes clusters (multi-nodegroup, autoscaling, upgrades, cluster add-ons).
  • Implement intelligent autoscaling using real metrics (queue depth, consumer lag, service latency) via tools like KEDA/Karpenter.
  • Own AWS environments end-to-end (VPC, IAM, EKS/ECS/EC2, ALB/ELB, S3, Route53, CloudWatch, RDS, SQS, Lambda).
  • Build reproducible infrastructure using Terraform, with strong review and change management practices.
  • Implement backup/DR patterns (e.g., snapshots, retention, automation) and safe rollouts.
  • Design infrastructure for data-intensive workloads: high-throughput ingestion, batch processing, and real-time streaming.
  • Understand and operate distributed systems at scale: consensus, partitioning, replication, and failure modes.
  • Build and maintain infrastructure for data pipelines and vector databases.
  • Design for horizontal scalability, ensuring systems handle growing data volumes and user traffic gracefully.
  • Build and own monitoring and logging from scratch and make it actionable (Prometheus/Grafana, ELK/EFK, alerting).
  • Define or partner on SLIs/SLOs and incident response practices; improve reliability with data-driven changes.
  • Establish performance testing and production-like load testing environments.
  • Continuously reduce AWS spend via right-sizing, Spot strategies, reserved capacity planning, and architecture improvements.
  • Partner with engineering teams to diagnose bottlenecks (DB queries, caching, queueing) and propose scalable solutions.
  • Optimize infrastructure costs for data-heavy workloads (storage tiering, compute scheduling, GPU utilization).
  • Improve cloud and cluster security posture (IAM, network policies, secrets management, least privilege).
  • Support SOC2 readiness/execution (controls, evidence automation, operational hardening).
  • Implement access management patterns.

Job Requirements

  • 7+ years in DevOps / SRE / Cloud Infra roles operating production systems.
  • Deep hands-on experience with Kubernetes in production.
  • Strong AWS fundamentals across compute/networking/storage/identity, including VPC, IAM, EC2/EKS, ALB, S3, Route53, CloudWatch, RDS, SQS.
  • Proven ability to build infra using Terraform (and strong IaC practices).
  • Production-grade observability experience: Prometheus + Grafana, and centralized logging (ELK/EFK or similar).
  • Experience scaling product infrastructure — you've grown systems from thousands to millions of requests, and understand capacity planning, bottleneck identification, and scaling patterns.
  • Solid understanding of distributed systems concepts: CAP theorem, consistency models, partitioning strategies, distributed consensus, and failure handling.
  • Strong understanding of databases and performance fundamentals.
  • CI/CD experience building reliable pipelines (Jenkins, Spinnaker, GitHub Actions, or equivalents), with safe deployment strategies.
  • Scripting/automation ability in Python and/or Bash (Go is a plus).

Benefits

  • Competitive salary and benefits package.
  • Opportunity to work with a team of experienced product and tech leaders.
  • A flexible work environment with remote working options.
  • Continuous learning and development opportunities.
  • Chance to make a significant impact on diverse and innovative projects.

More DevOps Engineer Jobs

Devops Engineer

ghSMART

We help CEOs, boards and investors develop winning executive teams and make high-stakes leadership decisions.

DevOps Engineer | 128 days ago
Other | Remote | Team 51-200 | Since 1995 | H1B No Sponsor

As the industry pioneer behind Content Performance Marketing, BrightEdge has thoroughly redefined the concept of search engine optimization (SEO) by developing an award-winning platform that precisely measures and optimizes marketing content across online channels. Our cloud-based platform is powered by big data analysis that allows our customers to plan, optimize, and measure campaigns based on real-time content performance. BrightEdge has emerged as the leading international provider of cloud-based SEO Enterprise solutions due to its dynamic and results-oriented entrepreneurial culture.

We're currently seeking a motivated Senior DevOps Engineer to join our growing team. Our goal is to ensure high performance and availability of the BrightEdge S3 platform through operations monitoring and automation of processes. In this role, you'll have the opportunity to be involved with continuous integration and delivery of the top, industry-leading SEO platform. You will be responsible for maintaining and scaling our cloud infrastructure, as well as troubleshooting and developing solutions for complex problems.

Core Responsibilities

  • Ensure up-time, data availability, and performance of the BrightEdge S3 platform, 24/7/365.
  • Debug critical problems in the production environment with minimal turnaround time.
  • Automate the operations of thousands of machines that are constantly collecting/crunching/querying data.
  • Build and enhance the in-house monitoring and maintenance software.
  • Provide tools to monitor performance, deploy code, and provision machines.
  • Auto-scale our global cloud network to cover 100+ countries.
  • Support Fortune 500 companies that leverage our platform for their SEO success.
  • Manage our hybrid environment of both cloud (AWS, GCP) and data center.
  • Create innovative solutions to tackle new challenges.

What it Takes to Be Successful

  • B.S. in Computer Science or related field.
  • 4+ years of experience in DevOps.
  • UNIX/Linux systems knowledge and diagnostic skills.
  • MySQL, NoSQL, and database experience required.
  • Scripting language experience in Python, shell, Terraform, Ansible is a plus.
  • Google Cloud Platform or other cloud technology experience is a plus.

Benefits and Perks

  • Competitive salary.
  • PTO and paid holidays.
  • Medical, dental, and vision insurance.

About BrightEdge

BrightEdge is widely recognized as a global leader in SEO and digital marketing. The most innovative customers across more than 80 countries trust BrightEdge to modernize their digital marketing stack for today’s digital world. We are helping thousands of organizations, including many of the world’s largest companies, transform their businesses and drive more revenue. The continuous innovation of our product is supported by what we believe to be our most valuable assets: our people. Our employees are industry experts at the forefront of digital transformation. Come join us and help us shape the future of SEO.

United States

Lead Site Reliability Engineer

Centene Corporation

Transforming the health of the communities we serve, one person at a time.

DevOps Engineer | 129 days ago
Other | Remote | Team 10,001+ | Since 1984 | H1B No Sponsor

  • Lead projects from end-to-end focused on managing and maintaining optimum platform infrastructure performance, reliability, and security using SRE practices
  • Automate monitoring activities and provide critical information to facilitate response and resolution of performance and availability issues and incidents
  • Develop and deliver complex services and software tools to ensure systems operate without interruption at optimum performance
  • Troubleshoot and analyze service disruptions to determine root causes and develop solutions for improved reliability
  • Drive decisions around system validation, monitoring, and standing up new services/tools
  • Conduct post-incident reviews and document findings for future informed decision making
  • Coach and mentor teams, design and implement key performance indicators

All locations: California | Florida | Missouri
$102.9K - $190.5K / year
Job Closed

Deployment Engineer

Prelude

Know with certainty that your defenses will protect you against the latest threats.

DevOps Engineer | 129 days ago
Other | Remote | Team 11-50 | H1B Sponsor

  • Build and maintain deployment automation infrastructure: packaging pipelines, update distribution systems, and orchestration tooling for seamless customer rollouts
  • Manage MDM integrations: configure and optimize Origin deployments via Intune, Jamf, SCCM, and other endpoint management platforms
  • Package and maintain installation artifacts: build MSI/EXE packages for Windows, PKG/DMG for macOS, ensuring compatibility across diverse customer environments
  • Orchestrate customer onboarding: manage phased deployments from pilot to production scale, ensuring successful expansion across thousands of endpoints
  • Troubleshoot production deployment issues: debug complex problems in customer environments, provide Tier 3 support for deployment and compatibility challenges
  • Handle edge cases and non-standard environments: make Origin work in legacy infrastructure, air-gapped networks, security-restricted environments, and custom configurations
  • Optimize deployment performance and reliability: instrument deployment pipelines, identify bottlenecks, and continuously improve success rates
  • Document deployment best practices: create runbooks, deployment guides, and knowledge base articles for both internal teams and customers
  • Collaborate with engineering teams: translate field deployment learnings into product improvements and automation opportunities

United States
$170K - $270K / year
Job Closed

Senior DevOps Engineer

FICO

FICO is an analytics company helping businesses make better decisions that drive higher levels of growth and success.

DevOps Engineer | 129 days ago
Other | Remote | Team 1,001-5,000 | Since 1956 | H1B No Sponsor

  • Lead infrastructure and operations management for FICO’s Scores Business Unit, including analytics platforms, credit scoring systems, and decision management solutions
  • Design, build, and maintain scalable and secure on-premises and cloud infrastructure
  • Serve as a central technical resource helping diagnose and remediate technical issues, optimize performance, and implement security best practices
  • Manage technical support operations by triaging incoming support tickets, prioritizing requests based on business impact, and resolving them in a timely manner
  • Effectively collaborate and partner with various stakeholders across the FICO organization to deliver technical solutions that support strategic Scores initiatives
  • Ensure a high level of technical support and execution for high-visibility, business-critical projects
  • Set up and maintain AWS IAM, AWS IDC, AWS CloudWatch
  • Administration of software repositories, such as Git/BitBucket
  • Support of MFT, Connect:Direct (NDM) and SFTP used for file transfers with clients

United States
$116K - $182K / year
Job Closed