Home Depot is a Fortune 500 company and the world's largest specialty retailer of home-improvement products. Founded in 1978 with its first two stores in Atlant
Senior Software Engineer – Site Reliability
Location
United States
Posted
60 days ago
Salary
$80K - $180K / year
Seniority
Senior
Job Description
Home Depot
• Drives the platform's stability, scalability, and performance.
• Enhances product reliability by engineering automated solutions for complex infrastructure and operational challenges.
• Champions application availability and efficiency through proactive monitoring, performance tuning, and strategic improvements.
• Leads post-mortems, creates automation to reduce operational toil, and partners with product owners and developers.
• Participates in tool selection, assists with capacity planning, and builds the monitoring and alerting to meet business-defined Service Level Objectives (SLOs).
• Mentors less experienced engineers to foster a culture of operational excellence.
Job Requirements
- Must be eighteen years of age or older.
- Must be legally permitted to work in the United States.
- GCP - Cloud Infrastructure
- Observability - Grafana, Prometheus, Loki, Tempo
- Litmus Chaos - Destructive Testing
- K6 - Performance Testing
- Terraform Enterprise - Infrastructure as Code
- GitHub - Source Control Management (SCM)
- cdk8s - Kubernetes Manifests as Code
- GitHub Copilot - AI dev acceleration
- SRE Practices - Production Readiness Review, Capacity Planning, Change Validation, Prod Support
- 3+ years of experience in software development.
Benefits
- Health insurance
- 401(k) matching
- Flexible work hours
- Paid time off
- Remote work options
More DevOps Engineer Jobs
Role Description
Join DuoKey as a DevOps Engineer to ensure our services are reliable, scalable, and efficient. At DuoKey, we specialize in cutting-edge key management and encryption solutions for enterprise cloud platforms like Microsoft 365, Salesforce, Google, and AWS. As a DevOps Engineer, you will play a key role in bridging the gap between development and operations, applying software engineering principles to resolve problems impacting service uptime and performance. If you are passionate about innovation, integrity, and driving growth in the data security industry, DuoKey offers an inclusive and challenging environment where your creativity and expertise can thrive.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience.
- Solid understanding of the security landscape, from cloud and infrastructure security to vulnerability management, monitoring, and incident response.
- Strong background in Linux/Unix administration and scripting languages (Python, Bash, etc.).
- Experience with cloud services (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes).
- Knowledge of infrastructure as code and configuration management tools (Terraform, Helm charts).
- Familiarity with compliance standards and frameworks like ISO 27001 and GDPR, ensuring systems adhere to necessary regulations.
- Familiarity with continuous integration and deployment methodologies.
- Problem-solving mindset with a focus on reliability and availability.
Requirements
- Develop, scale, and maintain our infrastructure using automation and configuration management tools.
- Work closely with development and security teams to design and implement robust, scalable services.
- Monitor service health, respond to incidents, and ensure high availability and performance.
- Implement continuous integration and delivery pipelines for efficient and reliable deployments.
- Conduct post-incident reviews and implement preventative measures to avoid recurrence.
- Optimize system performance, applying capacity planning techniques.
Benefits
- Competitive salary with performance-based bonuses to reward your contributions and achievements.
- Opportunities for professional development and career growth, including certifications in key technologies.
- The chance to work with a cutting-edge technology company in a growing industry.
- Dynamic team environment with supportive colleagues.
- Private medical insurance.
- Flexible home-office working conditions.
- Provision of all necessary work equipment.
- A Work From Home (WFH) allowance to support remote working needs, along with a monthly allowance for electricity and internet expenses.
Company Description
DuoKey is a cloud security leader that specializes in robust key management and advanced encryption for enterprise cloud environments, including Microsoft 365, Salesforce, Google, and AWS. The company offers a comprehensive suite of key management and encryption products for platforms such as Microsoft 365, Amazon S3, Salesforce, and AWS XKS, featuring Multi-Party Computation (MPC) encryption. This technology provides secure, distributed encryption that maintains a high level of control, confidentiality, and security over sensitive data and encryption keys, even in the event of unauthorized access or breach. Trusted by leading corporations in the automotive, financial, and health industries, DuoKey helps businesses worldwide safeguard their confidential information and comply with ever-evolving regulations and industry standards, while maintaining full control over their encryption keys in multi-tenant and vault solutions powered by MPC or Hardware Security Modules (HSMs).
This role supports the U.S. Air Force Cloud One Architecture and Common Shared Services contract and currently has an opening for a DevOps Engineer. SES is seeking a DevOps Engineer supporting AWS, Azure, GCP, and Oracle Clouds. This is an exciting opportunity to use your experience to modernize a leading, global-scale multi-cloud environment in support of a critical mission, supporting USAF system resiliency, security, and cost effectiveness. Location: This position will be remote.
DevOps Manager
Lexipol LLC
Lexipol is an Aliso Viejo, California-based provider of risk management resources for public safety organizations. Founded in 2003, Lexipol provides resources f
At Lexipol, our mission is to create safer communities and empower the individuals on the front lines with market-leading content and technology. Our top-notch team works closely with law enforcement, fire, EMS, corrections, and local government professionals to tailor our solutions to better address today’s challenges and keep first responders coming home safely at the end of each shift. Working at Lexipol means making a difference – day in and day out.
Lexipol is searching for a hands-on DevOps Manager to lead and support our DevOps initiatives, with a strong focus on security, SOC2 compliance, and reliable infrastructure across both cloud and physical data center environments. This role blends leadership with execution, requiring someone who is equally comfortable guiding a team and rolling up their sleeves to solve complex technical challenges. The DevOps Manager will work closely with engineering and cross-functional teams to build, deploy, and maintain scalable, secure infrastructure while driving operational excellence.
This is a remote position with some travel to the corporate office in Frisco, TX.
Key Responsibilities:
Hands-On DevOps Leadership
- Lead by example as a player-coach, actively contributing to infrastructure design, automation, and incident response.
- Partner with engineering teams to improve system reliability, scalability, and deployment velocity.
- Support and guide day-to-day DevOps operations while mentoring team members.
Infrastructure & Cloud Management
- Design, build, and maintain scalable infrastructure across development, QA, and production environments in AWS and co-location facilities.
- Collaborate with developers and architects to select appropriate cloud resources, balancing performance, cost, and security.
- Continuously improve infrastructure through automation and best practices.
CI/CD & Automation
- Build, optimize, and maintain CI/CD pipelines to enable fast, reliable releases.
- Drive automation efforts across infrastructure provisioning, configuration management, and deployments.
- Identify opportunities to reduce manual processes and improve efficiency.
Site Reliability & Operations
- Monitor system performance and availability, ensuring high uptime and rapid issue resolution.
- Participate in on-call rotations and lead incident response efforts when needed.
- Develop and maintain runbooks, documentation, and post-incident reviews to improve operational maturity.
- Help reduce MTTR through proactive monitoring and process improvements.
Security & Compliance
- Implement and maintain security best practices across infrastructure and systems.
- Support SOC2 compliance efforts, including audit readiness, logging, and traceability.
- Manage and enhance SIEM tools and monitoring solutions to ensure visibility into system health and security.
Team Leadership & Collaboration
- Mentor and develop DevOps engineers, fostering a culture of ownership and continuous improvement.
- Collaborate cross-functionally with Engineering, QA, Data, and Business Systems teams to align priorities and solutions.
- Act as a trusted partner to stakeholders, helping translate business needs into technical solutions.
Duties listed are not intended to be exhaustive or exclusive; other duties may be assigned. Management retains the discretion to add to or change the duties of the position at any time.
Qualifications:
- Educational Background: Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field (or equivalent experience).
- Professional Experience: Proven experience in a DevOps or SRE role with hands-on expertise in cloud infrastructure, particularly AWS (API Gateway, Lambda, MySQL).
- Compliance and Security Expertise: Experience supporting SOC2 environments and applying strong security practices.
- Leadership Skills: Experience leading or mentoring engineers, with the ability to balance hands-on work and team development.
- Technical Proficiencies: Strong experience with DevOps and SRE practices, cloud platforms (AWS preferred; Azure/GCP a plus), infrastructure as code, and automation tools.
Additional Preferred Experience:
- Experience implementing performance and reliability testing practices (including chaos testing).
- Experience building or supporting internal tools and platforms to improve developer productivity.
- Experience with incident management across L1–L3 support models.
- Track record of improving system reliability while reducing operational costs.
Compensation and Benefits
Lexipol offers a competitive base salary; a monthly, quarterly, or annual incentive; and a comprehensive benefits package including 401(k) with Company match and a flexible paid time off plan.
About Lexipol
Lexipol empowers first responders and public servants to best meet the needs of their residents safely and responsibly. We are the experts in policy, training, and wellness support, committed to improving the quality of life for all community members. Our solutions include state-specific policies, online learning, behavioral health resources, grant assistance, and industry news and information offered through the websites Police1, FireRescue1, EMS1, Corrections1 and Gov1. Lexipol serves more than 2 million public safety and government professionals in over 12,000 agencies and municipalities. For additional information, visit www.lexipol.com.
Lexipol Is an Equal Opportunity Employer (EOE)
Lexipol, LLC provides equal employment opportunities (EEO) to all team members and applicants for employment without regard to race, color, religion, gender, national origin, age, sex, pregnancy, disability, sexual orientation, gender identity or expression, veteran status, genetic information, or any other non-job-related characteristic. Lexipol complies with applicable federal, state, and local laws governing nondiscrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including hiring, placement, promotion, termination, layoff, recall, transfers, leave of absence, compensation, and training.
• You will be responsible for supporting, implementing, and optimizing CI/CD pipelines and infrastructure within a client platform.
• Address the needs of development teams.



