OneStream Software

A comprehensive cloud-based platform to modernize the Office of the CFO.

Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | H1B No Sponsor

Location

United States

Posted

1 day ago

Salary

$114K - $148K / year

Seniority

Senior

Job Description

  • Implement application/infrastructure observability solutions to ensure desired application availability, reliability, and performance
  • Participate in regular On-Call rotations and share details related to incidents and their resolution through post-mortem reports and regular review meetings
  • Proactively partner with Product and Engineering teams to identify, develop, deploy, and maintain reliable systems and services
  • Influence and create new designs, architectures, standards, and methods for large-scale systems
  • Sustain a high level of reliability for key services and automated systems
  • Automate processes to improve reliability, performance, and availability
  • Update technical documentation, workflows, and knowledge base articles
  • Provide feedback in pull requests and peer coding reviews
  • Implement codified automated solutions that build integrations between Dynatrace, Azure DevOps and Jira
  • Solid knowledge in focused areas of OneStream Software
  • Ability to mentor others in several technical areas
  • Understanding practical use of SOC/FedRAMP controls to assist Compliance and Security teams

Job Requirements

  • BS/BA in computer science, engineering, or technology-related field (or equivalent work experience)
  • Proven work experience as a Site Reliability Engineer or in a similar role
  • 6+ years of cloud infrastructure and software development experience
  • 2+ years of hands-on experience with Azure Kubernetes Service (AKS) and container-based deployments, or with comparable platforms such as OpenShift, GKE, or EKS
  • Advanced understanding of APM and observability tools such as Dynatrace, AppInsights, DataDog, Log Analytics, New Relic, Prometheus and Grafana
  • Advanced understanding of Infrastructure-as-Code (IaC) concepts and tooling (Terraform, CloudFormation templates, Bicep or ARM templates) on Microsoft Azure, Amazon Web Services (AWS), or Google Cloud Platform (GCP)
  • Deep knowledge of Configuration Management/Orchestration utilities such as Ansible, PowerShell DSC, Chef, and Puppet
  • Advanced understanding of cloud concepts including elasticity, security, and identity management
  • Familiarity with Agile development methodologies using Jira or Azure DevOps Boards
  • 6+ years of hands-on experience with the following technologies, tools, and concepts: Automating processes using PowerShell, Bash, CLI, REST APIs, Python, ARM templates, or other scripting languages
  • Comfortable leveraging source control tools such as Git, Azure DevOps, or GitHub
  • Knowledge of container orchestration platforms and tooling such as Kubernetes, OpenShift, AKS, GKE, or Helm
  • Microsoft Azure, Amazon Web Services (AWS) or Google Cloud (GCP)

Benefits

  • Vision
  • Medical
  • Life
  • Dental
  • 401K
