Job Closed

This listing is no longer active.

Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Mid Level | Team 10,001+ | Since 1903 | H1B Sponsor

Location

United States

Posted

38 days ago

Salary

$85.4K - $192.9K / year

Seniority

Mid Level

Job Description

Site Reliability Engineer

Ford Motor Company

Role Description

Enterprise Technology is the engine driving the future of transportation. If you’re looking for the chance to leverage advanced technology to redefine the mobility landscape, enhance the customer experience and improve people’s lives, this is the opportunity for you.

Ford is seeking an experienced and passionate Site Reliability Engineer (SRE) to join our team in developing, enhancing, and expanding our global monitoring and observability platform. You'll blend software and systems engineering to ensure the uptime, scalability, and maintainability of our critical cloud services. You'll be at the intersection of SRE and Software Development, building and driving the adoption of our global monitoring capabilities. If you're passionate about using your IT expertise and analytical skills to shape the future of transportation, this is your opportunity to make a real impact. Join us and be part of a team that's building the future of mobility!

- Write, configure, and deploy code in Go and Javascript that improves service reliability for existing or new systems; set standard for others with respect to code quality.
- Work within Google Cloud Platform (GCP) infrastructure, optimizing performance and cost, and scaling resources to meet demand.
- Provide helpful and actionable feedback and review for code or production changes.
- Drive repair/optimization of complex systems with consideration towards a wide range of contributing factors.
- Lead debugging, troubleshooting, and analysis of service architecture and design.
- Participate in on-call rotation.
- Write documentation: design, system analysis, runbooks, playbooks. Provide design feedback and uplevel design skills of others.
- Implement and manage SRE monitoring application backends using Golang, Postgres, and OpenTelemetry. Develop tooling using Terraform and other IaC tools to ensure visibility and proactive issue detection across our platforms.
- Collaborate with development teams to enhance system reliability and performance, applying a platform engineering mindset to system administration tasks.
- Develop and maintain automated solutions for operational aspects such as on-call monitoring, performance tuning, and disaster recovery.
- Troubleshoot and resolve issues in our dev, test, and production environments.
- Participate in postmortem analysis and create preventative measures for future incidents.
- Implement and maintain security best practices across our infrastructure, ensuring compliance with industry standards and internal policies. Participate in security audits and vulnerability assessments.
- Participate in capacity planning and forecasting efforts to ensure our systems can handle future growth and demand. Analyze trends and make recommendations for resource allocation.
- Identify and address performance bottlenecks through code profiling, system analysis, and configuration tuning. Implement and monitor performance metrics to proactively identify and resolve issues.
- Develop, maintain, and test disaster recovery plans and procedures to ensure business continuity in the event of a major outage or disaster. Participate in regular disaster recovery exercises.
- Contribute to internal knowledge bases and documentation.

Qualifications

- Bachelor’s degree in Computer Science, Engineering, Mathematics or equivalent work experience.
- 3+ years of experience as an SRE, Software Engineer, DevOps Engineer or similar role.
- Solid programming skills in Golang and scripting languages, with a good understanding of software development best practices.
- Proficient with monitoring and observability tools, particularly OpenTelemetry, Dynatrace or other tools.
- Proficient with cloud services, with a strong preference for Kubernetes and Google Cloud Platform (GCP) experience.
- Experience with relational and document databases.
- Ability to debug, optimize code, and automate routine tasks.
- Strong problem-solving skills and the ability to work under pressure in a fast-paced environment.
- Excellent verbal and written communication skills.

Benefits

- Immediate medical, dental, vision and prescription drug coverage.
- Flexible family care days, paid parental leave, new parent ramp-up programs, subsidized back-up child care and more.
- Family building benefits including adoption and surrogacy expense reimbursement, fertility treatments, and more.
- Vehicle discount program for employees and family members and management leases.
- Tuition assistance.
- Established and active employee resource groups.
- Paid time off for individual and team community service.
- A generous schedule of paid holidays, including the week between Christmas and New Year's Day.
- Paid time off and the option to purchase additional vacation time.

More DevOps Engineer Jobs

Senior Site Reliability Engineer

Funded.club

Funded.club is a global recruitment firm specializing in building high-performing teams for startups and scale-ups, offering a streamlined process that delivers

DevOps Engineer | 38 days ago

Role Description

We are looking for a Site Reliability Engineer to help us tame DNS. Responsibilities include:

- Linux system administration and troubleshooting
- Network configuration and troubleshooting
- Working with Salt configuration management software to admin hosts
- Write Python scripts for automation
- Use your comprehensive understanding of DNS to troubleshoot and resolve issues across different operating systems and networking environments
- Solve infrastructure projects and help developers launch new software into production
- Join our critical incident response team in an established on-call rotation
- Build comprehensive and beautiful documentation
- Improve operational checklists, processes, and validations making them error-proof

Qualifications

- Proven track record of completing complex work independently
- Solid understanding of the Linux Operating System
  - Filesystem and permissions
  - Firewall (iptables) and networking setup
  - System management (systemd)
- Infrastructure as Code (IaC) experience (Salt / Ansible / Terraform / Opentofu / Chef / Puppet)
- Develop automation using Python and Bash
- Experience with network troubleshooting is a must
  - Command line tools (dig, dog, delv, ping, nping, mtr)
  - Packet Capture (tcpdump, wireshark, termshark)
- Solid understanding of networking concepts (OSI model, TCP/IP, UDP/TCP, IP addresses and Subnetting, Routing, unicast and anycast, BGP)
- Domain Name System protocol on lock (Record Types, Resolution paths and server types, DNSSEC extensions)

Requirements

- Bonus if you have DNS experience (Worked with BIND / PowerDNS / Unbound or worked closely with organizations deeply in DNS administration and service, Cloudflare, ARN, RIPE, etc.)
- Skill with some hardware (Juniper / JunOS / VyOS)
- Skill with VPN protocols (OpenVPN / Wireguard / StrongSwan)
- Passion for anti-censorship / VPNs / DNS

Benefits

- Salary: 140K - 180K CAD per annum
- This position is REMOTE in Canada only but you MUST reside in the Eastern Time Zone
- Toronto office welcomes all to get together for face time and social gatherings with the team regularly

Canada
C$140K - C$180K / year

DevOps Analyst (Mid-Level)

Solo Network

Solutions that add value to and drive your business

DevOps Engineer | 38 days ago
Full Time | Remote | Team 201-500 | Since 2002 | H1B No Sponsor

• Work in the Technology area with a focus on developing intelligent automation solutions.
• Use software engineering, systems integration, and Artificial Intelligence capabilities to automate processes.
• Support decision-making, reduce manual activities, and increase operational efficiency.
• Perform other related activities inherent to the role.

Brazil
DevOps Engineer | 38 days ago
Full Time | Remote | Team 51-200 | Since 2014 | H1B No Sponsor

• Design and build secure CI/CD pipelines for multiple target systems.
• Develop and manage pipelines primarily using GitHub Actions, along with other tools such as Jenkins, Bamboo, and Travis.
• Automate manual processes and develop tools that increase operational efficiency.
• Collaborate with development teams to support the adoption of DevOps tools and practices.
• Ensure the use of secure standards and best practices in pipelines and infrastructure.
• Write and maintain infrastructure as code (IaC), preferably with Terraform (knowledge of Ansible is considered a plus).
• Work in partnership with infrastructure teams to deliver robust, reliable solutions.
• Troubleshoot and optimize the development, staging, and production environments.
• Contribute to technical strategy and support the evolution of the platform.
• Share knowledge and act as a technical reference for other team members.

Brazil

Director of DevOps, IT, Security

Zippy

At Zippy, we provide manufactured home loans in a Zip!

DevOps Engineer | 38 days ago
Full Time | Remote | Team 51-200 | H1B No Sponsor

• Grow and lead a high-performing organization across DevOps, IT, and Security.
• Define and execute strategic roadmaps aligned with company objectives, technology priorities, and regulatory requirements.
• Establish a results-driven culture focused on measurable outcomes, operational excellence, and continuous improvement.
• Partner with senior leadership (VP+) to align on long-term strategy.
• Guide and oversee DevOps architecture.
• Drive adoption of modern DevOps practices with a focus on developer self-service.
• Ensure high availability and disaster recovery capabilities across all systems.
• Optimize cost, performance, system resilience, and developer productivity.
• Direct IT operations including end-user support, device management, SaaS administration, and corporate systems.
• Ensure efficient, secure, and scalable IT service delivery across the organization.
• Standardize processes, implement tools, and automate for quality and efficiency.
• Own and evolve the company’s security posture.
• Ensure compliance with applicable regulatory frameworks and internal policies.
• Lead risk management and vulnerability management programs.
• Partner with product engineering teams to embed security practices in their work.
• Report key metrics (e.g. core infra uptime, security posture, incidents).
• Use data to drive continuous improvements.
• Identify and solve complex operational and technical challenges autonomously.
• Collaborate with other teams to support company initiatives with a helpful spirit.
• Foster a culture of transparency, accountability, and constructive communication across departments while advising on infrastructure, IT, and security matters.
• Set clear priorities aligned with department and company strategy.
• Mentor and develop contributors and leaders.
• Provide regular, actionable feedback and performance management.
• Promote a structured, results-oriented way of working across teams.
• Serve as a role model in problem-solving, collaboration, and conflict resolution.
• Anticipate hiring needs and advocate for headcount, providing justification and job descriptions, and implementing hiring practices for your area within the Technology department hiring and interviewing framework.

Arizona + 22 more
All locations: Arizona | Connecticut | Florida | Illinois | Louisiana | Montana | Nebraska | Nevada | New York | North Carolina | Ohio | Oklahoma | Oregon | Maryland | Michigan | Missouri | Pennsylvania | South Carolina | South Dakota | Tennessee | Texas | Utah | Wisconsin