Job Closed
This listing is no longer active.
Based in Foster City, California, Visa is a global payments technology organization. Visa was founded in 1958, coinciding with Bank of America’s launch of the
Staff Software Engineer - Backend - DevOps
Location
Washington
Posted
57 days ago
Salary
0
Seniority
Senior
Job Description
Staff Software Engineer - Backend - DevOps
Visa
More DevOps Engineer Jobs
Senior DevOps Engineer
Clarivate
Headquartered in Philadelphia, Pennsylvania, Clarivate offers a patent search and analytics platform to help users worldwide discover, protect, and commercializ
Title: Senior DevOps Engineer
Location: Ann Arbor, United States
Job Description: IT Services
Remote: Hybrid
Job ID: JREQ135263

We are looking for a Senior DevOps Engineer to join our team in Ann Arbor, MI. The DevOps team is responsible for everything between the hardware and the application. If you are an expert in Linux and have experience supporting larger applications, then we would love to speak with you!

About You: experience, education, skills, and accomplishments
- Bachelor's degree or higher in Computer Science, IT, or a related technical field, or equivalent relevant experience.
- At least 5 years managing systems in a large-scale production environment.
- At least 5 years of experience working in datacenter facilities and with datacenter-grade hardware. Understanding of physical computer fundamentals: CPUs, memory, storage, and so on. Ability to troubleshoot hardware issues.
- At least 5 years of experience with Bash/Python scripting.
- At least 5 years of experience managing a JVM-based application stack. Understanding of Java memory management, Jetty/Tomcat configuration, and monitoring.
- At least 5 years of experience with virtualization and containerization technologies.
- At least 5 years of experience using and configuring monitoring tools such as Nagios and Kibana.

It would be great if you also had...
- Understanding of database systems and the ability to perform basic administration of them.
- Deep understanding of DevOps concepts, configuration management (Puppet, Chef, etc.), and deployment tools (Jenkins).
- Deep understanding of networked storage. Ability to work with iSCSI and NFS mounts.
- Deep understanding of networking and various communication protocols and standards: TCP/IP networking, DNS, HTTP/REST, SSL.
- Experience managing public cloud services: AWS, Azure, Google Cloud.
- Software engineering experience in a higher-level language such as Java or C++.
- Experience managing database systems such as MySQL, MongoDB, or Cassandra, including the ability to write MySQL queries, diagnose performance issues, and configure replication.
- Experience working with third-party vendors and contractors to set up meetings, product evaluations, and product purchasing.
- Ability to create project plans, budgets, presentations, and documentation involving different products and teams.

What will you be doing in this role?
- Ensure the day-to-day functionality, performance, and security of our products.
- Work closely with the R&D and Infrastructure Engineering teams on:
  * Infrastructure sizing, provisioning, and capacity management.
  * Monitoring performance and troubleshooting application issues, resolving or escalating as necessary.
  * Helping diagnose hardware, networking, and other infrastructure issues and working with Engineering teams on resolutions.
  * Updates, upgrades, and migrations of everything from server firmware to databases and application code.
  * Backups and disaster recovery.
- Create, update, and maintain automation of the application ecosystem using a multitude of tools.
- Keep learning! Assess new technologies, and suggest and implement improvements to existing tools and processes.
- Independently manage long-term projects and act as an SME/owner of application and automation components.
- Help manage capacity, costs, and budgets.
- Act as an escalation point for R&D and Product teams.
- Act as a thought leader, architect, and mentor to junior team members and other teams.

About the Team
Our DevOps team is responsible for the hybrid cloud infrastructure that powers Clarivate’s suite of Academic & Government (A&G) products, with a primary focus on our academic library platforms. We own everything between the hardware and the application, ensuring our systems are secure, reliable, scalable, and able to support mission-critical products used by institutions around the world.

You’ll join a highly experienced DevOps team that is globally distributed, collaborating daily with partners across the US, EU, Israel, and India. The team works closely with R&D, Infrastructure Engineering, and Product to design and operate complex production environments, automate at scale, and continuously improve how our platforms perform and grow. We value deep technical expertise, strong ownership, and thoughtful collaboration across time zones and cultures.

Hours of Work
- Full-time, permanent.
- Hybrid working model of 2-3 days/week on-site.
- Must live within a commutable distance of our Ann Arbor, MI office.

At Clarivate, we are committed to providing equal employment opportunities for all qualified persons with respect to hiring, compensation, promotion, training, and other terms, conditions, and privileges of employment. We comply with applicable laws and regulations governing non-discrimination in all locations.
Senior Site Reliability Engineer
Apex Systems
Apex Systems, an IT staffing and workforce solutions firm, provides recruiting and staffing services to large and small companies alike. Founded in 1995 by three Virginia Tech clas
Title: Senior Site Reliability Engineer - NC, TX
Location: Charlotte, NC and Irving, TX (Hybrid)
Employee Type: Contract (18 months)
Pay Range: $61.00 - $65.00 per hour
Job#: 3028557

Role Overview
We are seeking a Senior Site Reliability Engineer (SRE) with a background in software engineering and a passion for solving complex problems at scale. This role supports large-scale production systems for regulated communication archives critical for compliance and eDiscovery. The position blends software engineering with operational expertise to deliver stable, scalable, and resilient services while reducing manual work through automation.

Key Responsibilities
- Design and implement automated tooling to eliminate manual toil and optimize operations.
- Build and enhance monitoring, alerting, and overall system observability.
- Champion SRE best practices by modeling standards, mentoring peers, and collaborating with platform SRE teams.
- Enhance system availability and resiliency patterns in a multi-cloud environment.
- Introduce and scale AIOps, including self-healing and autonomic systems using AI/ML and RPA.
- Automate key SRE metrics such as SLO/SLI adherence, error budgeting, and incident response.
- Support critical applications, lead agile-based remediation efforts, and conduct blameless postmortems to perform root cause analysis.
- Implement and guide Non-Functional Requirements (NFRs) during modernization initiatives.
- Participate in production support rotations, which may include weekend on-call work.

Required Qualifications
Experience: 5+ years of Systems Engineering or Technology Architecture experience, including at least 2 years of direct experience leading or operating within an SRE team.
Technical Skills:
- Expertise in container platforms, specifically Kubernetes and OpenShift (OCP).
- Proficiency in Python and/or Java/J2EE.
- Experience with observability tools such as Grafana, Prometheus, Splunk/ELK, AppDynamics, Aternity, or ThousandEyes.
- Strong database knowledge, including Oracle, DB2, SQL, or MongoDB.
- Experience with REST APIs, microservices, and messaging technologies like Kafka or MQ.
- Familiarity with CI/CD tools such as Jenkins, GitLab, SonarQube, Artifactory, or Ansible.

Preferred Qualifications
- Experience with AutoSys scheduling.
- Background in AIOps platforms like Moogsoft or experience with AI/ML frameworks for predictive alerting and self-healing automation.
- Exposure to corporate banking or financial crimes environments (KYC, AML).
- Familiarity with ITSM tools such as ServiceNow or Remedy.
- Knowledge of foundational AI concepts, including classification, regression, and anomaly detection.

Compensation & Benefits
A competitive compensation package is offered for this position. The pay rate is between $61.00 and $65.00 per hour. A benefits package is available to eligible employees.

We are an equal opportunity employer and welcome applications from all qualified candidates regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.

Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package.

Apex Benefits Overview
Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401K program to which you can typically contribute within 30 days of starting, with a company match after 12 months of tenure. Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program, and other discounts.

In terms of professional development, Apex hosts an on-demand training program and, once you have 6+ months of tenure, provides access to certification prep and a library of technical and leadership courses, books, and seminars, along with certification discounts and other perks through associations that include CompTIA and IIBA. Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach.
Systems Engineer - DevOps (Senior)
LA Desenvolvimento Humano e Empresarial
This position is strategic for the company's operations, as it works directly on organizing processes and tracking the demands that are the business's main current challenges. We are looking for someone who brings control, consistency, and quality to execution, ensuring that no request is lost along the workflow and that the customer has a positive experience at every point of contact.
Role Description
LA Desenvolvimento Humano e Empresarial is running a strategic selection process for a high-seniority position in technology. We are looking for a professional with strong technical command and a systems view, able to work directly on sustaining, evolving, and securing the technology infrastructure, ensuring the stability, performance, and scalability of the environments. This position is essential to business continuity and requires technical maturity, autonomy, and a high level of responsibility.

Purpose of the role:
- Sustain and evolve the technology infrastructure, with a focus on automation, reliability, information security, and continuous improvement of the environments, directly affecting the stability and growth of the operation.

Main responsibilities:
- Ensure the availability, security, and performance of systems in the production environment.
- Structure and evolve scalable, standardized architectures.
- Work on process automation and the maturity of CI/CD pipelines.
- Perform advanced troubleshooting in complex environments (network, containers, proxy, backend).
- Manage and validate certificates, TLS flows, and environments with PKI.
- Respond to critical incidents with speed, precision, and responsibility.
- Propose continuous improvements to architecture, processes, and security.
- Collaborate with multidisciplinary teams (development, product, and leadership).

Qualifications
- Senior professional with hands-on experience in complex production environments.
- Strong technical command (not just theoretical).
- Ability to explain architecture and technical decisions clearly.
- Structured, methodical, quality-oriented profile.
- Experience with real incidents and critical environments.
- Strong ability to prioritize and make decisions.

Requirements
- Advanced command of Linux and networking.
- Solid experience with Kubernetes.
- Deep knowledge of TLS, PKI, and infrastructure security.
- Experience with scalable architectures and high-availability environments.
- Experience with CI/CD pipelines and automation.
- Hands-on experience with advanced troubleshooting.

Benefits
- Contract model: CLT or PJ.
- Limit: up to 170 hours per month.
- 100% remote work.

Important
This is not a position for early-career professionals. We are looking for someone with a high degree of autonomy, consolidated technical command, and the ability to work in critical environments with direct responsibility for operations. If you are looking for a challenging environment with autonomy and room for strategic work in technology, this opportunity is for you.
Senior Site Reliability Engineer
Dispensed Global
Dispensed: Your alternative therapy journey to wellness starts here.
About The Role
Dispensed delivers prescriptions and clinical consultations to patients across Australia, New Zealand, and the UK, and the reliability of that platform is not an abstract engineering concern: when it degrades, patients lose access to healthcare. As a Senior SRE, you will own the operational health of a platform in active transition, consolidating a legacy Django system into a modern Next.js and Supabase architecture on AWS, and you will have genuine influence over how reliability is designed into that new foundation from the start. This is a role where you will define SLO frameworks, shape observability architecture, and lead the kind of post-incident work that produces lasting systemic change rather than tactical patches. If you want to work at the intersection of serious engineering craft and meaningful patient outcomes, and to build practices that a growing team will rely on for years, this is that role.

What You'll Own
- Define and maintain SLO and error budget frameworks across multiple services, working directly with product engineers to make reliability expectations concrete and actionable rather than aspirational.
- Design and evolve the observability architecture across the platform, ensuring the engineering team has genuine insight into system behaviour during the Django-to-Next.js migration and beyond.
- Identify systemic gaps in monitoring, alerting, and incident response before they surface as patient-facing incidents, and drive the work required to close them.
- Lead post-incident reviews that go beyond immediate fixes, producing changes to architecture, runbooks, on-call processes, or delivery practices that reduce the likelihood and impact of recurrence.
- Write infrastructure-as-code and automation that sets a quality bar for the team, reviewing infrastructure contributions from product engineers and junior SREs with direct, specific feedback.
- Keep product engineering teams unblocked on reliability concerns by being a visible, proactive partner in delivery: attending design conversations, raising reliability risks early, and pushing back constructively when decisions create patient risk without a conscious trade-off.
- Improve how the team operates on reliability over time, including on-call processes, reliability review checkpoints in the delivery cycle, and the quality of documentation product engineers use to understand what is expected of their services.

What You’ll Need
Required:
- 6+ years in SRE, DevOps, or backend roles with production ownership.
- Experience operating and improving reliability of distributed, customer-facing systems.
- Strong cloud and infrastructure-as-code experience (AWS, Terraform, or similar).
- Hands-on experience with SLOs, SLIs, and error budgets.
- Solid observability experience (metrics, logging, tracing).
- Experience leading incidents and post-incident reviews that drive systemic change.
- Strong scripting/programming skills (e.g. Python, Go, TypeScript).
- Ability to identify risks early and influence cross-team engineering decisions.
- Clear communication and documentation skills.
Highly Valued:
- Experience supporting system migrations or major architectural changes.
- Experience in regulated or high-availability environments.
- Experience improving on-call practices or mentoring engineers.

What We Offer
- Work From Anywhere in Australia. 🌍
- A competitive salary and awesome benefits package. 💰
- A supportive and positive work environment. 🌟
- Opportunities to grow and develop your career. 📈
- Opportunity to transform lives through alternative medicine. 💡



