Job Closed

This listing is no longer active.

Hopper is an accredited, mobile-only travel agency using big data to analyze and predict airfare and accommodations. A fully remote employer, Hopper strives to

Senior Site Reliability Engineer

DevOps Engineer | Other Remote

Location

United States

Posted

101 days ago

Job Description

This description is a summary of our understanding of the job description.

Role Description

We are looking for a senior site reliability engineer to join the Cloud FinOps team at Hopper. We manage a large infrastructure in Google Cloud that is used by hundreds of engineers to provide a first-class experience to millions of end users around the world. You are passionate about automating everything possible and ensuring systems remain optimized. You also like the infrastructure to be as scalable, reliable, secure, and optimized as possible. You like to solve problems in a practical way, building solutions that are simple, reliable, cost-effective, and easy to use.

What would your day-to-day look like:

- Work on projects that will drive higher cost efficiency, such as:
  - Reduce our network egress costs by removing unnecessary headers.
  - Ensure that our warehouse data is in use and select the most efficient storage for it (e.g., cold storage for buckets with infrequent retrieval).
  - Ensure that autoscaling for both databases and compute is well optimized.
- Work on improving the current cost attribution to ensure all teams have clear visibility into their costs.
- Participate in providing support for incidents and be part of the on-call rotation for platform incidents.
- Help resolve questions and problems engineers might face with our infrastructure, and approve PRs that require Platform supervision.
- Be part of a small and highly efficient team of SREs.

Qualifications

- Strong background in SRE, DevOps, Software Engineering or Systems engineering
- Troubleshooting skills
- System design with good analytical capabilities
- Good communication skills
- Knowledge of major cloud providers, preferably Google Cloud
- SQL knowledge
- Containers, Kubernetes, and related tooling like Kustomize and Helm
- Service Mesh, preferably with Istio
- Networking knowledge (DNS, TLS, certificates, ingresses, etc.)
- Observability with log collection, metrics, APM, etc., preferably Datadog
- Security knowledge (IAM, RBAC, network security, etc.)
- Knowledge of authentication and authorization technologies
- CI/CD
- Database technologies
- Competent in scripting with Bash and Python or other scripting languages

Benefits

- Well-funded and proven startup with large ambitions, competitive salary and the upsides of pre-IPO equity packages.
- Unlimited PTO.
- Carrot Cash travel stipend.
- Access to co-working space on demand through FlexDesk AND Work-from-home stipend.
- Very generous parental leave, much above industry standards.
- Entrepreneurial culture where pushing limits and taking risks is everyday business.
- Open communication with management and company leadership.
- Small, dynamic teams = massive impact.
- 100% employer-paid Medical, Dental and Vision coverage for employees.
- Access to Disability & Life insurance.
- Health Reimbursement Account (HRA).
- DCA/FSA and access to 401k plan.

Company Description

At Hopper, we are on a mission to become the leading travel platform globally – powering Hopper’s mobile app, website and our B2B business, HTS (Hopper Technology Solutions). By leveraging massive amounts of data and advanced machine learning algorithms, Hopper combines its world-class travel agency offering with proprietary fintech products to bring transparency, flexibility and savings to travelers globally.

- The Hopper platform serves hundreds of millions of travelers globally and continues to capture market share around the world.
- The Hopper app has been downloaded over 120 million times and has become widely popular among younger travelers, with 70% of its users being Gen Z and millennials.
- Hopper has been named the #1 most innovative company in travel by Fast Company.
- Hopper has raised over $750 million USD of private capital and is backed by some of the largest institutional investors and banks in the world.

More DevOps Engineer Jobs

Sales Deployment Engineer – USSF, USAF

Vantor

DevOps Engineer | 101 days ago

Other Remote | Team 1,001-5,000 | H1B No Sponsor

• Act as a trusted technical advisor for USSF and USAF customers, translating mission needs into innovative, mission-ready solutions leveraging Vantor products and capabilities
• Partner with Product, Engineering, and Account Leads to align customer requirements with Vantor capabilities and product roadmaps
• Translate customer requirements into actionable product feature requests and participate in decomposition, scoping, and prioritization discussions with Product teams
• Lead and support technical discussions, solution workshops, product demonstrations, and technology validations that showcase Vantor’s value
• Ensure high levels of customer technical satisfaction throughout the pre-sales engagement process
• Provide customer and market feedback to inform product development, competitive positioning, and proposal strategies
• Represent Vantor as a technical subject matter expert in customer engagements and relevant Air Force and Space Force industry forums
• Support business development initiatives including technical contributions to white papers, RFP responses, and solution strategies for USSF/USAF opportunities
• Maintain awareness of the competitive landscape and ensure Vantor solutions are effectively positioned across customer missions and operational use cases

AWS JavaScript Python TypeScript

California + 3 more

$150K - $250K / year

Job Closed

Senior Cloud Operations Engineer

DigitalOcean

The cloud ☁️ of choice for developers, startups, and growing digital businesses around the world.

DevOps Engineer | 101 days ago

Other Remote | Team 1,001-5,000 | Since 2011 | H1B Sponsor

• Ensuring maximum uptime for our global infrastructure
• Automating processes and building tools to improve operational efficiency
• Coordinating operational work across teams to improve the platform with minimal impact

Linux Python Ruby

California

$123.6K - $154.5K / year

Job Closed

EOP - System Reliability Engineer - TS/SCI Required

cFocus Software Incorporated

DevOps Engineer | 101 days ago

Other Remote | Team 11-50

cFocus Software seeks a System Reliability Engineer to join our program supporting the Executive Office of the President. This position is remote. This position requires a TS/SCI clearance.

Qualifications:

- 5+ years and Bachelor's Degree in Computer Programming, Science, Engineering or a related technical discipline, or the equivalent combination of education, technical training, or work/military experience, including:
- 3+ years of related systems programming experience
- Experience maintaining an operational environment and use of monitoring tools and dashboard interfaces (e.g., Kibana, Grafana)
- Experience working with container images and platforms (Kubernetes/Docker)
- Strong understanding of DevOps and software/application development processes
- Understanding of GitLab, Jenkins, ArgoCD, and other DevOps/Continuous Integration tools for Kubernetes
- Understanding of microservice design and architectural pattern best practices
- Understanding of Python, Bash, and Shell scripting
- Knowledge of network technologies, common infrastructure components, load balancers, firewalls, virtual and physical infrastructure design
- Problem-solving and troubleshooting skills
- Communication and interpersonal skills
- Must possess excellent time management skills and the drive to work unsupervised
- Experience with deploying to on-prem/data center infrastructure
- Experience using Jira and Confluence on a daily basis
- Experience in building processes for deploying to a Kubernetes-based environment using GitLab and Helm
- Understanding of access management and security groups (e.g., IAM, S3 bucket, SSH, VPN)
- Ability to write and use unit and functional testing
- Technical Skills: Proficiency in programming languages (such as Python, Go, or Bash) is essential for scripting and automation tasks. Knowledge of Linux/Unix systems is also crucial, as SREs often work in these environments.
- Problem-Solving: Analytical and problem-solving skills are necessary to diagnose and resolve complex system issues effectively.
- Understanding of SRE Principles: Familiarity with key SRE concepts such as Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets is important for measuring and maintaining system reliability.
- Reliability and Availability: SRE practices help ensure that services are consistently available and reliable, which is critical for user satisfaction and business success.
- Scalability: SREs implement strategies that allow systems to scale efficiently as demand increases, ensuring that performance remains optimal even under heavy load.
- Cost Management: By optimizing resource usage and reducing downtime, SREs contribute to cost savings for organizations.
- Programming and Scripting: Proficiency in languages like Python, Go, or Ruby is crucial for automating tasks and managing infrastructure.
- Operating Systems: A strong understanding of Linux/Unix systems is essential for troubleshooting and managing servers.
- Cloud Computing: Familiarity with cloud platforms like AWS, Azure, or Google Cloud is vital for deploying and managing applications in distributed environments.
- Containers & Orchestration: Understanding containerization tools like Docker and managing containerized workloads with Kubernetes is crucial for cloud-native applications.
- Monitoring and Logging: Proficiency in tools like Prometheus, Grafana, or the Elasticsearch, Logstash, and Kibana (ELK) Stack is necessary for tracking metrics, setting up alerts, and analyzing logs.
- Networking: Knowledge of networking protocols and configurations is essential for maintaining system health and performance.
- Configuration Management: Skills in managing and maintaining system configurations are critical for ensuring system reliability.
- Incident Response: Ability to respond quickly and effectively to incidents, including documenting and learning from them.
- Security Best Practices: Understanding security protocols and best practices to protect systems from vulnerabilities.
- These skills are essential for SREs to maintain high availability and performance, balancing the demands of development and operations.
- Support required during core business hours of 8am – 5pm, Monday through Friday.
- On-call for evenings or weekends, if needed for outages, application upgrades, security patches or other unplanned activities.

Duties:

- Monitor system health, availability, and performance using centralized monitoring and logging tools.
- Administration of accounts (role-based access and rights).
- Manage accessibility to the application through EOP’s authentication systems.
- Manage the workflow templates to ensure consistent and predictable task flows.
- Configure workflow management for new or adjustments based on user requests, while adhering to EOP template standards.
- Maintain configurations and configurable fields for users and workflows.
- Maintain the test environment to mimic production and conduct test and evaluation in the environment prior to deployments.
- Design and maintain a secure and reliable form of backups, ensuring High Availability (HA) and resiliency.
- Develop a Disaster Recovery (DR) or Incident Response (IR) plan for specific applications and services in the event of a disaster or unexpected downtime.
- Maintain unique instances that support various offices.
- Configure and support integrations with complementary systems.
- Establish and improve system monitoring while maintaining established security protocols within development, test, and production systems.
- Architect, build and maintain on-premises and/or cloud infrastructure to support team and customer initiatives.
- Maintain and improve existing infrastructure (build out autoscaling, support new services, optimize for cost efficiencies/authentication/search, etc.).
- Administer production, staging and development environments.
- Manage and aggregate server logs and monitor for security and system-related incidents.
- Monitor and analyze system performance, such as server load and resource usage.
- Maintain and improve existing build and deployment processes using CI/CD tools.
- Apply configuration management disciplines to maintain software revisions, security patches, hardening, and documentation.
- Enforce best practices for security and reliability, and drive security initiatives, like access control and vulnerability testing.
- Maintain up-to-date documentation of designs/configurations, ensuring team members have continuity of recurring tasks.
- Maintain status of operations at all times: perform after-action reporting on all outages and work with engineering teams to determine solution and root cause analysis. Present findings to management for prioritization and tasking.
- Create and determine required metrics for dashboards and service health.
- Follow up on engineering tasks for operational solutions, and validate completion.
- Manage operational readiness board: present at weekly meetings and determine if development services are ready for automation based on best practices and maintainability.
- Track and ensure routine operations maintenance tasks are completed in a timely manner.
- Align to the customer's strategies for configuration of workflows, without compromising the integrity of the workflow tool and templates.
- Build, maintain, and utilize the customer's enterprise Development, Security, and Operations (DevSecOps) pipeline.
- Work with other service providers to support areas of common interest.
- On-call support may be required.

United States

Job Closed

Google Cloud Platform Solution Leader

Unisys

Unisys is proud to be an equal opportunity employer that considers all qualified applicants without regard to age, caste, citizenship, color, disability, family medical history, family status, ethnicity, gender, gender expression, gender identity, genetic information, marital status, national origin, parental status, pregnancy, race, religion, sex, sexual orientation, transgender status, veteran status or any other category protected by law. This commitment includes our efforts to provide for all those who seek to express interest in employment the opportunity to participate without barriers. If you are a US job seeker unable to review the job opportunities herein, or cannot otherwise complete your expression of interest, without additional assistance and would like to discuss a request for reasonable accommodation, please contact our Global Recruiting organization at GlobalRecruiting@unisys.com or alternatively Toll Free: 888-560-1782 (Prompt 4). US job seekers can find more information about Unisys’ EEO commitment here.

DevOps Engineer | 101 days ago

Other Remote | Team 10,001+ | Since 1980 | H1B Sponsor

What success looks like in this role:

• Leads the Google Cloud Platform (GCP) Solutions program and initiatives that may be global in nature.
• Defines the strategy and growth plan for business associated with the hyperscaler.
• Drives revenue expansion across new business and existing accounts.
• Advances hyperscaler partnership levels/tiers.
• Advances pre-sales collateral and supporting material.
• Educates and trains pre-sales teams on the solution, offering, capabilities, partnership, etc.
• Operates across delivery and engineering enabling a future-state operating and service model.
• Supports GTM and pre-sales client conversations.
• Supports Marketing in advancing our market presence.
• Supports Analyst Relations in promoting our hyperscaler business.
• Develops and maintains strong relationships with executives in client organizations.
• Leads the development of innovative ideas and principles related to solution development, client service and project management.
• Assists leadership in developing expertise and capabilities within the Solutions Architecture organization.
• Applies deep understanding of industry / target market trends and client needs to identify and propose out-of-scope or new opportunities to leverage and/or drive innovation for company products and services.

You will be successful in this role if you have:

• BA/BS degree and 8+ years’ relevant experience OR equivalent combination of education and experience
• Master’s degree preferred
• Must be certified in GCP

This role may require access to export-controlled commodities and technology. Therefore, to conform to U.S. export control regulations, applicant should be eligible for any required authorizations from the U.S. Government.
United States + 3 more
