Job Closed

This listing is no longer active.

Fusable

Stronger data. Smarter decisions. Greater impact.

Senior Dev Ops Engineer

DevOps Engineer, Full Time, Remote, Senior, Team 201-500, H1B No Sponsor

Location

North Carolina

Posted

69 days ago

Seniority

Senior

Job Description

  • Use Terraform to define, deploy, and manage infrastructure as code across multiple environments (development, staging, production).
  • Implement and maintain containerized applications using Docker, ECS, and Kubernetes to improve scalability and deployment efficiency.
  • Design, develop, and maintain continuous integration and continuous deployment (CI/CD) pipelines that automate the testing, building, and deployment of code.
  • Manage and optimize infrastructure to provide a robust, secure, and scalable environment for application deployment.
  • Work with AWS services such as CodePipeline, Elastic Beanstalk, EC2, RDS, Load Balancing, and Autoscaling Groups to maintain and optimize infrastructure.
  • Ensure the efficient and secure integration of APIs with backend systems, leveraging AWS services.
  • Implement security measures using AWS WAF to protect applications and data from common web threats.
  • Apply networking and routing protocols to ensure seamless connectivity, load balancing, and high availability across cloud-based environments.
  • Collaborate with development, QA, and other teams to troubleshoot and resolve issues related to infrastructure, application deployment, and cloud architecture.
  • Proactively monitor infrastructure performance, optimize resource usage, and maintain uptime through continuous improvement.
  • In addition to Terraform, knowledge of other cloud providers and cloud-agnostic design is appreciated.

Job Requirements

  • Strong expertise in Terraform for infrastructure as code.
  • Proficient in containerization technologies like Docker, ECS, and Kubernetes.
  • Experience with CI/CD pipelines (AWS CodePipeline, Jenkins, GitLab CI).
  • In-depth knowledge of AWS services including EC2, RDS, Elastic Beanstalk, CodePipeline, Autoscaling Groups, and WAF.
  • Experience with Git/Bitbucket for version control and collaboration.
  • Strong troubleshooting and debugging skills.
  • Familiarity with networking and routing principles in a cloud environment.
  • Strong accountability for delivering projects on time, within scope, and to a high standard of quality.
  • Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.
  • Experience with full-stack observability tools (e.g., Grafana, Splunk, Prometheus).

Benefits

  • Flexible work arrangements
  • Professional development

More DevOps Engineer Jobs

Full Time, Remote, Team 1,001-5,000

Role Description

The Site Reliability Engineering team at iCapital is fundamental to ensuring our platform delivers consistent, reliable service to our client base. As a Site Reliability Engineer, you'll work at the intersection of software engineering and operations, applying engineering principles to infrastructure challenges. You'll be responsible for designing and implementing systems that scale efficiently, architecting observability solutions that provide actionable insights, and building automation that enhances our platform's reliability. This role requires someone who thinks systematically about reliability, can translate business requirements into technical implementations, and thrives on making complex systems more robust.

  • Define, implement, and iterate service level objectives (SLOs) and service level indicators (SLIs) that reflect customer and business expectations.
  • Lead monitoring and alerting standardization through “monitors as code” (Terraform preferred), including quality gates such as severity, ownership, and runbook links.
  • Develop observability standards across metrics, logs, and traces, including instrumentation and dependency mapping patterns (OpenTelemetry where applicable).
  • Lead technical evaluations and PoCs for observability platforms and integrations; define success criteria and migration approach for adoption.
  • Define and implement reliability and operability standards for Kubernetes-based services, including scaling patterns, resource constraints, rollout safety, and baseline dashboards and alerts as part of service onboarding.
  • Drive automation to eliminate toil, improve repeatability, and accelerate recovery (incident workflows, runbooks, and remediation where appropriate).
  • Serve as Incident Commander for high-severity incidents, lead postmortems, and drive systemic improvements through action items and measurable follow-through using established tooling workflows.
  • Participate in on-call rotations with a focus on improving reliability, reducing alert noise, and increasing signal quality over time.

Qualifications

  • 7+ years in SRE or related roles, with evidence of technical seniority across multiple services and teams.
  • Strong experience with AWS and container orchestration (Kubernetes) in production environments.
  • Demonstrated experience defining SLOs/SLIs and using them to drive operational and engineering decisions.
  • Proven ability to design and implement observability solutions that produce actionable insights while reducing alert fatigue and operational noise.
  • Strong IaC skills (Terraform preferred) and the ability to build reusable automation and standards (monitoring as code, configuration patterns).
  • Familiarity with common data stores and managed services (e.g., Postgres, MongoDB, DynamoDB) and how they fail in distributed systems.
  • Experience with at least two observability stacks (Prometheus/Grafana, New Relic, Splunk, CloudWatch, ELK, etc.) and driving standardization across them.
  • Strong incident response skills, including leading retrospectives/postmortems and improving reliability through systematic follow-up.
  • Strong debugging skills across distributed systems and production environments, including performance and reliability investigations.
  • Clear written and verbal communication skills with the ability to influence engineering teams through standards, tooling, and practical guidance.

Benefits

  • Comprehensive benefits package including competitive salary, annual performance bonus, and equity for all full-time employees.
  • Healthcare with 100% employer-paid health and dental insurance.
  • Generous paid time off (PTO).

Worldwide

Senior Database Site Reliability Engineer

TherapyNotes, LLC

TherapyNotes™ is the industry-preferred online EHR for behavioral health. Try one month free!

DevOps Engineer, 69 days ago
Full Time, Remote, Team 51-200, Since 2010, H1B No Sponsor

  • Design, implement, and maintain high-availability, high-throughput, data- and compute-intensive critical database systems running PostgreSQL, which support a growing 24x7 SaaS platform.
  • Define and improve database service reliability through monitoring/alerting, SLO-oriented metrics, and operational readiness.
  • Participate in and help drive incident response, root cause analysis, and post-incident corrective actions for database-related production events.
  • Partner with other technical leaders to ensure all newly introduced systems are supportable and maintainable by both development and operations.
  • Provide escalated technical guidance and support to other technology teams throughout the organization.
  • Provide on-call coverage for production support and other duties as required.
  • Comply with HIPAA security policies within the database platform.
  • Ensure all solutions and operational activities adhere to the security and operating policies established by the organization.
  • Own and continuously improve our Datadog database observability by building actionable dashboards, alerts, and service-level views using an observability stack (e.g., Prometheus, Grafana, New Relic, or equivalent). Familiarity with pganalyze or Percona is a plus.
  • Automate system maintenance tasks using Bash, PowerShell, Python, or Ansible. Manage infrastructure as code (IaC) by writing Ansible playbooks. Some exposure to Terraform is a plus.
  • Experience writing and designing ETL pipelines using Python is a plus.
  • Experience understanding and maintaining PostgreSQL ecosystem components such as PgBouncer, pgBackRest, HAProxy, and repmgr is a plus.
  • Excellent communication and interpersonal skills.

United States
$120K - $160K / year

Senior Cloud DevOps Network Engineer – FedRAMP, Azure, Advanced Networking

OneStream Software

A comprehensive cloud-based platform to modernize the Office of the CFO.

DevOps Engineer, 69 days ago
Full Time, Remote, Team 1,001-5,000, H1B No Sponsor

  • Lead the design, continuous monitoring, implementation, and security operations of Azure cloud solutions, ensuring they meet industry best practices and comply with FedRAMP High and IL4 requirements.
  • Lead the team in developing modular Infrastructure-as-Code using Terraform, PowerShell, ARM, Bicep, and YAML.
  • Lead projects of moderate complexity to completion.
  • Sustain a high level of reliability for key automated systems.
  • Lead teams to define, estimate, and implement requirements for new automations or services of moderate complexity that need development.
  • Stay up to date with the latest Azure and FedRAMP regulatory changes and industry trends, advising teams on potential impacts and necessary adjustments.
  • Update technical documentation, workflows, and knowledge base articles.
  • Provide feedback in pull requests and peer code reviews.
  • Solid knowledge in focused areas of OneStream Software.
  • Participate in the on-call rotation to support production systems.
  • Help debug problems that arise in production.
  • Ability to mentor others in several technical areas.
  • Understand the practical use of FedRAMP/SOC controls to assist the Compliance and Security teams.

United States
$140K - $172.3K / year

Senior DevOps

journaway

Bucket list Moments in over 100 countries

DevOps Engineer, 69 days ago
Full Time, Remote, Team 51-200, H1B No Sponsor

  • Design, implement, and manage cloud infrastructure for high availability and performance.
  • Improve pipelines, automate workflows, and drive innovation in infrastructure setup.
  • Monitor platform performance and manage incident responses.
  • Apply and enforce security best practices and ensure compliance.
  • Collaborate with cross-functional teams to optimize application performance and deployments.

Germany