Job Closed
This listing is no longer active.
Innovative Solutions to Complex Problems
Senior Site Reliability Engineer
Location
New Mexico
Posted
77 days ago
Salary
0
Seniority
Senior
Job Description
ARA
• Partner with software developers, platform engineers, and IT staff to improve system design, operability, deployment safety, and production support readiness.
• Define and maintain operational standards, runbooks, support procedures, escalation paths, and service-level objectives.
• Evaluate system architecture and changes to ensure they balance functional requirements, service quality, reliability, security, and compliance needs.
• Drive continuous improvement in platform stability, maintenance, and availability.
• Provide advanced technical support and troubleshooting for complex platform and service issues affecting internal users and stakeholders.
Job Requirements
- 8+ years of experience in Site Reliability Engineering, DevOps, Platform Engineering, Systems Engineering, or related infrastructure roles supporting production services.
- Strong experience with Linux systems administration and troubleshooting in enterprise environments.
- Strong experience operating and maintaining on-prem Kubernetes platforms and all related components including CRI, CNI, and CSI plugins.
- Experience deploying and maintaining applications on Kubernetes using Helm, Kustomize, and similar tooling.
- Experience supporting DevOps tooling such as GitLab, Artifactory, Jira, Confluence.
- Experience with GitOps tools such as FluxCD or ArgoCD.
- Proficiency scripting with at least one of Python, Go, or Bash.
- Strong experience designing, maintaining, and maturing observability tooling including monitoring, dashboards, logging and tracing, and supporting SLOs.
- Strong understanding of reliability engineering concepts:
  - Service health indicators
  - High availability design, failure reduction, and testing
  - Operational readiness practices, including developing documentation, runbooks, and architectural descriptions
  - Incident response, root cause analysis, remediation/recovery
- Ability to obtain a security clearance, which includes U.S. citizenship.
Preferred:
- Experience with multiple Linux distributions including Ubuntu.
- Experience with at least one of the following: Tanzu Kubernetes, Nutanix Kubernetes Platform, Canonical Kubernetes.
- Experience with cloud platforms such as AWS and Azure.
- Experience with infrastructure automation and configuration management.
- Experience managing AI tooling on Kubernetes including MCP Servers, LLM platforms (vLLM, Ollama), Kubeflow.
- Experience with security and compliance considerations in regulated environments.
- DoD experience.
- Active or inactive Secret Security Clearance.
Benefits
- Remote work options
- Health insurance
- 401(k) retirement plan