Job Closed

This listing is no longer active.

GoDaddy is a web services platform that helps individuals and businesses worldwide start, grow, and manage their online presence. GoDaddy employs team members a

Site Reliability Engineer - Storage Engineer

DevOps Engineer, Full Time, Remote, Mid Level

Location

United States

Posted

59 days ago

Salary

$98.5K - $192K / year

Seniority

Mid Level

Job Description

Role Description

GoDaddy is seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our dynamic team. This role centers on automating and maintaining our storage infrastructure, with a focus on Ceph, to ensure the reliability, scalability, and performance of our systems.

- Automate and maintain day-to-day operations of storage systems to support application demands
- Develop and maintain tools and automation scripts to streamline storage operations and improve efficiency
- Monitor system performance, identify issues, and implement solutions to ensure high availability and reliability
- Participate in agile practices such as daily stand-up meetings, task tracking boards, design and code reviews, automated testing, continuous integration, and deployment
- Continuously improve system reliability, performance, and capacity through proactive monitoring, automation, and optimization

Qualifications

- 2+ years of professional experience with Ceph in a production environment, including deployment, configuration, and management of Ceph clusters and systems
- 2+ years of experience in site reliability engineering or a similar role
- Experience working on Linux/Unix systems, with a focus on automation and operating at scale
- Proficiency in Python or Bash
- Experience with Ansible, Terraform, or SaltStack
- Experience with Nagios-based monitoring tools, such as Icinga2
- Experience with observability tooling, such as Prometheus, Grafana, Mimir, and Loki
- Solid understanding of core networking concepts and protocols, particularly in relation to Linux/Unix systems

Requirements

- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes)
- Exposure to and experience working with compute platforms (e.g., OpenStack, AWS)
- Familiarity with, and ability to contribute to, CI/CD pipelines and automation workflows

Benefits

- Competitive pay
- Generous time off
- Parental and wellness leave
- Healthcare
- Retirement savings program
- Comprehensive benefits package, including:
  - Medical, dental, and vision insurance
  - 401(k) retirement plan
  - Paid sick time
  - Paid flexible time off
  - Paid parental leave
  - Life insurance
  - Short- and long-term disability
  - AD&D insurance
  - Mental health or EAP programs
  - Remote or hybrid work options
  - Paid holidays
  - Paid wellness days
  - Tuition assistance
  - Adoption, surrogacy, and fertility benefits
  - Dependent daycare and backup care benefits
  - Employee stock purchase plan
  - Financial education and advice

Compensation

- Bay Area (Santa Clara, San Francisco) and Los Angeles: $128,000 - $192,000 USD
- Austin, D.C. Metro, CA (non-Bay Area), HI, IL, MA, NH, OR, VA, WA: $110,500 - $165,500 USD
- New York City Metro, Kirkland/Seattle: $117,200 - $175,800 USD
- All other US locations not previously listed: $98,500 - $147,500 USD

More DevOps Engineer Jobs

Junior DevOps Engineer

OpenVPN Inc.

OpenVPN® helps businesses of all sizes create secure, virtualized, reliable networks that scale with your team.

DevOps Engineer, posted 59 days ago

Full Time, Remote, Team 51-200, Since 2002, H1B No Sponsor

• Assist in designing, implementing, and maintaining scalable, fault-tolerant systems that leverage cluster orchestration and containerization technologies, with guidance from senior engineers.
• Work alongside the Software Engineering and QA teams to support deployment processes for microservices-based architectures.
• Help build and maintain CI/CD pipelines that support container-based application deployment and rollback.
• Support system availability by running health checks and contributing to zero-downtime deployments.
• Participate in a supported on-call rotation, escalating critical system issues appropriately as you build incident-response experience.
• Collaborate with information security teams to follow industry best practices and compliance requirements.
• Use AI-assisted tooling (e.g., LLM-powered coding assistants, chat-ops agents, scripted automations) to accelerate routine tasks such as log triage, runbook execution, ticket drafting, and code review.

AWS Azure Cloud DNS Docker Google Cloud Platform Grafana Jenkins Kubernetes Microservices Prometheus Python TCP/IP Terraform Go

Bosnia And Herzegovina

Job Closed

Senior Systems Site Reliability Engineer, B2B

Jamf

The Standard in Apple Enterprise Management

DevOps Engineer, posted 59 days ago

Full Time, Remote, Team 1,001-5,000, Since 2002, H1B Sponsor

• Site Reliability Engineers are responsible for helping balance development velocity against customer-centric stability of systems through the use of SRE best practices and the creation of new processes with automation.
• The Senior Site Reliability Engineer is responsible for creating and leading projects around how services and other workflows should be measured, and for participating in the observability of production systems and services through day-to-day operational responsibilities, with the aim of gaining the production insight needed to decide what toil should be automated next.
• The Senior Site Reliability Engineer is expected to operate with a DevOps mindset at the convergence point of Cloud Operations, Engineering, and Technical Support within the framework of the Agile process.
• Identify improvements in both the platform and processes by implementing established SRE concepts with the goal of improving product and system reliability.
• Proactively engage and collaborate with other individuals and teams as issues arise by serving as an escalation point for customer issues to ensure successful outcomes.
• Perform root cause analysis for customer-impacting issues, clearly document the solution, and advise others based on those findings.
• Create technical documentation based on new technology proofs of concept, project work, root cause analysis, and identification of alerting patterns, and proactively share this knowledge with other teams as part of the Continuous Improvement Model.
• Participate in team ceremonies to identify and refine potential work, communicate findings, and drive opportunities to collaborate.
• Assign and communicate the business value and benefit hypothesis of new projects, initiatives, and strategies while being able to break down the technical work required to achieve a successful outcome.
• Lead cross-team and cross-department technical collaboration in critical customer escalations.
• Advise stakeholders and senior leadership on critical customer escalations.
• Occasionally provide off-hours support for deployments and customer escalations.

AWS Cloud EC2 Grafana Java Jenkins Kubernetes Prometheus SQL Terraform

Poland

Job Closed

Senior DevOps Engineer

Smartsheet

Founded in 2005, Smartsheet offers collaborative work management and process automation to empower greater enterprise productivity. A leading cloud-based platfo

DevOps Engineer, posted 59 days ago

Full Time Remote

• Own and evolve the edge proxy platform: Maintain, upgrade, and extend a high-performance reverse proxy, including maintaining the proxy binary and its configuration tooling, writing Go and Python automation, managing the full container image lifecycle on hardened Linux base images, and working across the broader edge layer, including CDN, WAF, and traffic management capabilities.
• Build and maintain cloud infrastructure as code: Design and implement Terraform/Terragrunt modules and live environment configurations managing EKS clusters, load balancers, IAM roles, VPC networking, ECR registries, and supporting AWS services across multiple regions including GovCloud.
• Operate Kubernetes clusters at scale: Manage multi-region, multi-cluster EKS deployments via FluxCD GitOps workflows and Helm charts, including node AMI rotation, add-on lifecycle management, and horizontal pod autoscaling.
• Build and own CI/CD pipelines: Design, maintain, and improve shared GitLab CI/CD pipeline templates used across all team repositories; build and operate alternative pipeline workflows for isolated government cloud environments.
• Automate operational toil: Build and maintain tooling for tasks such as container image patching, EKS AMI rotation, air-gapped ECR image sync to GovCloud, and automated MR creation for monthly version-bump patching cycles.
• Manage observability and on-call: Provision and maintain Datadog SLOs, monitors, and dashboards via Terraform; participate in the team's on-call rotation responding to edge proxy incidents across production and GovCloud environments.
• Support FedRAMP/GovCloud operations: Operate the GovCloud environment with its unique constraints: air-gapped image distribution, infrastructure automation in isolated networks, and alert management with compliance-aware data handling.
• Evaluate and adopt internal developer tooling: Research, prototype, and drive the adoption of internal tools that improve engineering productivity across the company, including developer portals, platform self-service capabilities, and other tooling that raises the bar for the developer experience at Smartsheet.
• Mentor and collaborate: Share knowledge across the team through code reviews, architecture discussions, and runbook authorship; foster a culture of engineering excellence and operational rigour.
• Strategically apply AI tools: Apply and champion AI tools within your team's domain to improve project execution, infrastructure design, quality, and debugging, leading adoption of AI best practices.

AWS Cloud HAProxy Kubernetes Linux NGINX Node.js Python Terraform Go

Bulgaria

AI Solutions - Agent Engineer

Marco Technologies

This is a remote-eligible position; however, Marco Technologies requires employees to reside within one of the following states: DE, FL, IA, IL, IN, KY, MD, MI, MN, MO, ME, NE, ND, NJ, PA, RI, SD, TX, WI.

DevOps Engineer, posted 59 days ago

Full Time Remote

Role Description

The AI Solutions Engineer is a hands-on technical role focused on building, deploying, and supporting AI-driven solutions, including AI and Digital Employee offerings, that address real-world client and internal business needs. Operating at the intersection of emerging AI technologies and practical implementation, this role is responsible for executing on defined solution architectures, developing proof-of-concept and production-ready systems, and integrating AI capabilities into scalable, supportable environments. This individual works closely with Innovation leadership, AI Architects, engineering, and customer-facing teams to bring AI solutions to life.

Responsibilities include:

- Develop and support AI-based managed services and Digital Employee offerings from prototype through production deployment.
- Engineer agentic systems with memory, tools, and orchestration.
- Help define the future Digital Workforce platform.
- Build and implement AI-driven solutions with a focus on LLMs, generative AI, AI agents, and automation platforms.
- Execute on defined solution architectures by developing, integrating, and deploying AI systems.
- Design and deliver proof-of-concept implementations to validate AI use cases.
- Collaborate with AI Architects, Innovation, engineering, and operations teams.
- Integrate AI solutions into client environments.
- Ensure AI solutions align with client business objectives and governance requirements.
- Support client engagements including technical discussions and ongoing optimization.
- Translate complex AI concepts into clear, practical guidance.
- Contribute to documentation, enablement materials, and internal training.
- Continuously monitor emerging tools, frameworks, and best practices.
- Participate in cross-functional planning sessions to align AI initiatives.

Qualifications

- Bachelor’s degree in Computer Science, Information Technology, Data Science, or a closely related discipline.
- 3–6+ years of experience in software engineering, solution implementation, automation, or related technical roles.
- Hands-on experience building or implementing artificial intelligence, machine learning, generative AI, or large language model (LLM)-based systems.
- Experience developing and integrating APIs, automation workflows, or distributed systems in production environments.
- Relevant certifications related to cloud platforms, AI/ML, or software engineering are preferred but not required.

Requirements

- Hands-on experience building and deploying AI solutions, including large language models (LLMs), agent-based systems, and automation workflows.
- Proficiency in software development and integration (e.g., Python, APIs, SDKs, or equivalent modern development tools).
- Experience working with cloud platforms and modern application architectures (cloud-native or hybrid environments).
- Ability to implement and integrate AI solutions into business processes and existing technology ecosystems.
- Strong problem-solving skills with the ability to move from concept to working solution quickly.
- Effective communication skills, with the ability to explain technical concepts to both technical and non-technical audiences.
- Proven ability to collaborate across engineering, innovation, sales, and delivery teams.
- Demonstrated curiosity and commitment to continuous learning in a rapidly evolving AI landscape.
- Ability to balance rapid experimentation with building stable, supportable solutions.

Benefits

- Pay Range: $109,855 - $175,768 annually.
- The pay range listed for this position is based on the candidate's skill level, experience, relevant licenses, and educational background.
- For detailed information about our benefits, please visit our careers page at www.marconet.com/careers.

Location

This is a remote-eligible position; however, Marco Technologies requires employees to reside within one of the following states: DE, FL, IA, IL, IN, KY, MD, MI, MN, MO, ME, NE, ND, NJ, PA, RI, SD, TX, WI.

United States

$109.9K - $175.8K / year
