Elastic

Self-described as the leading platform for search-powered solutions, Elastic helps organizations, their customers, and their employees find what they need faster while protecting a

Senior Site Reliability Engineer (Resilience) - Platform Resilience

Location

United States

Posted

46 days ago

Salary

$154K - $195K / year

Seniority

Senior

Job Description

Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale, unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data, and securing and protecting private information more effectively, Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.

What is The Role:

As part of the Platform Engineering department, the SRE team is designing, building, scaling and maturing the multi-cloud platform for hosting internal and external services such as Elastic Cloud Hosted and Serverless. We develop and extend new software and tools that support the rest of the infrastructure, so that we can rapidly deploy products from all corners of Elastic. We want your experience and recommendations to offer a truly exceptional customer experience!

What you will be doing:

- Taking an engineering approach in leading technical initiatives for automating system engineering efforts to guarantee the reliability of the global Elastic infrastructure.
- Growing our global Platform infrastructure to meet the increasing scaling demands by developing and maintaining software, tooling and automations.
- Using an inclusive approach to champion an environment focused on collaboration, operational excellence, and uplifting others.
- Responding to and preventing repeated customer impact in response to major incidents and prioritised problem management. Our on-call rotation uses a follow-the-sun model in which everyone participates, (mostly) during their working hours.

What you bring:

- Success and lessons from experience in striving for 'progress not perfection' in the name of Platform reliability. We want to hear about your customer-first approach to solving operational problems with an SRE perspective.
- A background in software engineering to collaborate with engineers to expertly identify, implement and deliver solutions. Experience in public cloud and managed Kubernetes services is advantageous.
- Passion for developing solutions that involve inclusive communication methods to grow and strengthen partner and team relationships. Examples of working in distributed teams or working remotely are desirable.

Bonus Points:

You don't need to have all of these items, but these represent the types of work you will do as a Site Reliability Engineer at Elastic.

- You have operated a SaaS product in a public cloud, ideally built using Infrastructure-as-Code tooling such as Crossplane or Terraform.
- You have built or operated a Kubernetes-at-scale infrastructure, ideally across multiple cloud providers, and the vital automation to support it.
- You have written non-trivial programs in Golang or other programming languages.
- You have worked with containerized services (such as Docker).
- You have proven experience in leading and improving alerting and major incident management standard processes, and in using metrics systems (e.g. Elastic Stack, Graphite, Prometheus, Influx) to diagnose issues and quantify impacts to present to others at varying levels of the organization.
- You have experience in system administration with professional skills in Linux on distributed systems at scale.
- You have diagnosed or designed, implemented and created solutions with the Elastic Stack.
- You are experienced in thriving and sharing in a self-organizing, globally distributed team environment.
- You strengthen team members in bringing out the best of each other by uplifting others with coaching and mentoring.

Compensation for this role is in the form of base salary. This role does not have a variable compensation component.

The typical starting salary range for new hires in this role is listed below. In select locations (including Seattle WA, Los Angeles CA, the San Francisco Bay Area CA, and the New York City Metro Area), an alternate range may apply as specified below. These ranges represent the lowest to highest salary we reasonably and in good faith believe we would pay for this role at the time of this posting. We may ultimately pay more or less than the posted range, and the ranges may be modified in the future. An employee's position within the salary range will be based on several factors including, but not limited to, relevant education, qualifications, certifications, experience, skills, geographic location, performance, and business or organizational needs.

Elastic believes that employees should have the opportunity to share in the value that we create together for our shareholders. Therefore, in addition to cash compensation, this role is currently eligible to participate in Elastic's stock program. Our total rewards package also includes a company-matched 401k with dollar-for-dollar matching up to 6% of eligible earnings, along with a range of other benefits offered with a holistic emphasis on employee well-being.

The typical starting salary range for this role is: $154,800 to $195,600 USD

The typical starting salary range for this role in the select locations listed above is: $154,800 to $195,600 USD

Additional Information - We Take Care of Our People

As a distributed company, diversity drives our identity. Whether you’re looking to launch a new career or grow an existing one, Elastic is the type of company where you can balance great work with great life. Your age is only a number. It doesn’t matter if you’re just out of college or your children are; we need you for what you can do.

We strive to have parity of benefits across regions and while regulations differ from place to place, we believe taking care of our people is the right thing to do.

- Competitive pay based on the work you do here and not your previous salary
- Health coverage for you and your family in many locations
- Ability to craft your calendar with flexible locations and schedules for many roles
- Generous number of vacation days each year
- Increase your impact: we match up to $2000 (or local currency equivalent) for financial donations and service
- Up to 40 hours each year to use toward volunteer projects you love
- Embracing parenthood with a minimum of 16 weeks of parental leave

Different people approach problems differently. We need that. Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. Qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, disability status, or any other basis protected by federal, state or local law, ordinance or regulation.

We welcome individuals with disabilities and strive to create an accessible and inclusive experience for all individuals. To request an accommodation during the application or the recruiting process, please email candidate_accessibility@elastic.co. We will reply to your request within 24 business hours of submission.

Applicants have rights under Federal Employment Laws; view the posters linked below: Family and Medical Leave Act (FMLA) Poster; Pay Transparency Nondiscrimination Provision Poster; Employee Polygraph Protection Act (EPPA) Poster; and Know Your Rights (Poster).

Elasticsearch develops and distributes technology and information that is subject to U.S. and other countries’ export controls and licensing requirements for individuals who are located in or are nationals of the following sanctioned countries and regions: Belarus, Cuba, Iran, North Korea, Syria, or Russia, including the Ukrainian territories annexed by Russia (The Crimea region of Ukraine, The Donetsk People's Republic (DNR), The Luhansk People's Republic (LNR), Kherson or Zaporizhzhia). If you are located in or are a national of one of the listed countries or regions, an export license may be required as a condition of your employment in this role. Please note that national origin and/or nationality do not affect eligibility for employment with Elastic.

Please see here for our Privacy Statement.

More Platform Engineer Jobs

Platform Engineer

Exiger

Making the world a safe and transparent place to prosper

Full Time | Remote | Team 501-1,000 | Since 2013 | H1B Sponsor

• Contribute to internal platform tooling and shared cloud services used by engineering teams.
• Support day-to-day operation of shared infrastructure and services, including KTLO work and issue follow-up.
• Build automations that reduce friction for application development teams.
• Create and maintain CI/CD workflows and Terraform-based infrastructure using established team patterns.
• Develop basic internal APIs, services, and tooling in Python or Go.
• Participate in a limited team on-call rotation with clear escalation paths and second-tier support.

United States
Senior Platform Engineer

Mayo Clinic

Headquartered in Rochester, Minnesota, Mayo Clinic is a nonprofit medical institution ranked first in more specialties than all other hospitals in America. The company employs arou

The Mayo Clinic Platform AI team is seeking an experienced Senior Platform Engineer to join our innovative efforts in developing and implementing cutting-edge generative AI solutions. In this role, you will lead the design and development of state-of-the-art generative AI models, establish comprehensive safety guardrails for responsible AI deployment, and drive the creation of autonomous AI agents. You’ll collaborate closely with a diverse team of data scientists, product managers, and engineers as we shape the future of AI applications while ensuring our systems remain safe, ethical, and scalable.

Key Responsibilities

- Generative AI Model Development: Architect, design, and implement advanced generative AI models and architectures that support varied departmental applications and cutting-edge research initiatives.
- GenAI Safety & Ethics: Develop comprehensive safety guardrails and ethical guidelines to ensure responsible AI development and deployment, incorporating best practices in AI alignment and security.
- Cross-Functional Collaboration: Partner with cross-functional teams to integrate AI solutions seamlessly within the Mayo Clinic Platform, translating business needs into robust technical implementations.
- Autonomous AI Agents: Lead the creation and optimization of intelligent AI agents designed for autonomous decision-making, leveraging techniques in prompt engineering and model fine-tuning.
- System Enhancement: Evaluate and enhance existing generative AI deployments across departmental applications, continually iterating to improve performance, safety, and scalability.
- Performance Optimization: Identify bottlenecks in AI/ML pipelines and propose solutions to improve system performance, efficiency, and scalability.
- Monitoring & Troubleshooting: Develop and maintain observability tools, including logging, monitoring, and alerting, to diagnose and resolve production issues.
- Documentation: Create and maintain technical documentation, including architectural diagrams, API specifications, and onboarding guides for internal and external stakeholders.
- Thought Leadership: Stay updated with the latest trends and advancements in federated learning, distributed computing, and machine learning frameworks to continually enhance the platform.

This vacancy is not eligible for sponsorship; we will not sponsor or transfer visas for this position. Also, Mayo Clinic DOES NOT participate in the F-1 STEM OPT extension program.

Why Mayo Clinic

Mayo Clinic is top-ranked in more specialties than any other care provider according to U.S. News & World Report. As we work together to put the needs of the patient first, we are also dedicated to our employees, investing in competitive compensation and comprehensive benefit plans to take care of you and your family, now and in the future. And with continuing education and advancement opportunities at every turn, you can build a long, successful career with Mayo Clinic.

Benefits Highlights

- Medical: Multiple plan options.
- Dental: Delta Dental or reimbursement account for flexible coverage.
- Vision: Affordable plan with national network.
- Pre-Tax Savings: HSA and FSAs for eligible expenses.
- Retirement: Competitive retirement package to secure your future.

Just as our reputation has spread beyond our Minnesota roots, so have our locations. Today, our employees are located at our three major campuses in Phoenix/Scottsdale, Arizona; Jacksonville, Florida; and Rochester, Minnesota; at Mayo Clinic Health System campuses throughout Midwestern communities; and at our international locations. Each Mayo Clinic location is a special place where our employees thrive in both their work and personal lives. Learn more about what each unique Mayo Clinic campus has to offer, and where your best fit is.

Equal Opportunity

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender identity, sexual orientation, national origin, protected veteran status or disability status. Learn more about the "EOE is the Law". Mayo Clinic participates in E-Verify and may provide the Social Security Administration and, if necessary, the Department of Homeland Security with information from each new employee's Form I-9 to confirm work authorization.

United States
$127K - $185K / year
Software Engineer - Cross-platform C++ - Multipass

Canonical

Enterprise open source, secured and delivered by the publisher of Ubuntu.

Full Time | Remote | Team 501-1,000 | Since 2004 | H1B Sponsor

Role Description

Use your deep C++ skills on Windows and macOS to build an amazing open source developer experience with Multipass, the workstation mini-cloud at your fingertips which provides Ubuntu and appliance VMs on demand for build, test and prototyping.

The Multipass team is hiring a Software Engineer to join our distributed team. We greatly value quality in our code, and great user experience. Multipass is published for macOS, Windows, and also Linux. Think of it as a workstation mini-cloud. At its simplest you can simply say multipass launch and you will get a new VM on your workstation. You can feed that VM data, just as you would on a public cloud like AWS, Azure or GCP. The goal is not to be a full cloud of course. The goal is to give developers a local cloud on their workstation, which they can use to run builds in the background, or to try cloud appliances, or to test their own cloud deployments and cloud-init scripts, free of charge. People sometimes use it as a build farm on a shared server, for example.

As a Software Engineer you are expected to play a leadership role designing, mentoring, reviewing and of course coding.

Location: This is a remote position available in the EMEA region only.

Qualifications

- Cross-platform development experience on macOS and/or Windows
- Expertise in modern C++ development
- Experience with software testing and test-driven development
- Extremely high personal standards for code quality, testing and design
- Knowledge of hypervisor technologies such as Hyper-V, VirtualBox, KVM, and QEMU
- Open source experience and involvement
- Knowledge of CI systems a plus
- Capacity to learn quickly about new systems and techniques
- Excellent communication skills in English - both verbal and written
- Bachelor’s or equivalent in Computer Science, STEM or similar degree

Requirements

- Ensure Multipass is easy and intuitive to use
- Architect new features and design the user experience
- Write high-quality code to create new features and fix bugs
- Review code and architecture as part of Canonical’s engineering process
- Collaborate proactively with a distributed team
- Debug, track down and fix issues encountered by our users
- Foster the open source community and support customers when needed
- Travel internationally for up to two weeks, twice a year, for company events

Benefits

- Distributed work environment with twice-yearly team sprints in person
- Personal learning and development budget of USD 2,000 per year
- Annual compensation review
- Recognition rewards
- Annual holiday leave
- Maternity and paternity leave
- Employee Assistance Programme
- Opportunity to travel to new locations to meet colleagues
- Priority Pass, and travel upgrades for long haul company events

Company Description

Canonical is a pioneering tech firm at the forefront of the global move to open source. As the company that publishes Ubuntu, one of the most important open source projects and the platform for AI, IoT and the cloud, we are changing the world on a daily basis. We recruit on a global basis and set a very high standard for people joining the company. We expect excellence - in order to succeed, we need to be the best at what we do.

Canonical has been a remote-first company since its inception in 2004. Working here is a step into the future, and will challenge you to think differently, work smarter, learn new skills, and raise your game. Canonical is an equal opportunity employer. We are proud to foster a workplace free from discrimination. Diversity of experience, perspectives, and background create a better work environment and better products. Whatever your identity, we will give your application fair consideration.

EMEA
Full Time | Remote | Team 51-200

Senior Cloud Engineer

ST6 (Seal Team Six) is an elite team of battle-hardened software operators dedicated to building enduringly great software companies. Our focus is on professionalizing and scaling software businesses from $100 million to $500 million. We partner with top-tier private equity software firms such as TA, Hg, Insight Partners, and Genstar to acquire and build one platform company per year. Our companies are not the largest or flashiest, but they are among the best-run software businesses, creating value for customers and shareholders at an accelerated pace. To date, our team has built six platform companies, each culminating in multiple liquidity transactions with multi-billion-dollar valuations.

The Senior Cloud Engineer will be instrumental in maintaining and enhancing the cloud infrastructure. This role focuses on optimizing cloud operations to drive cost efficiency, ensure high performance, and maintain scalability and reliability of the company's services. The position will work closely with the development and operations teams to automate processes, manage cloud resources, and ensure strict compliance with security standards.

Key Responsibilities

- Maintain 99.9% availability for customer-facing hosted solutions, ensuring high reliability.
- Reduce cloud operating costs consistently by utilizing automation and innovative tools.
- Ensure scalability of systems without compromising on performance.
- Achieve and maintain compliance with key security standards, such as ISO 27001.
- Maintain a minimum of 90% CSAT rating with a 10% response rate for all cloud service-related cases.

Key Qualities

- Involves a meticulous approach to work, prioritizing accuracy and thoroughness to ensure high-quality outcomes
- Encapsulates taking full responsibility for one's actions and their outcomes, emphasizing accountability and learning from experiences

Skills

- Experience in risk assessment
- Experience in disaster recovery
- Experience in cloud / SaaS

United States