At Cloudera, we believe that data can make what is impossible today, possible tomorrow.
Staff Site Reliability Engineer
Location
Czechia
Posted
2 days ago
Salary
Not specified
Seniority
Lead
Job Description
Cloudera
- Improve reliability and scalability of the platform
- Build systems to manage platform infrastructure and applications (platform engineering)
- Optimize existing systems and eliminate toil through simplification and automation
- Provide operational support and engineering assistance to the whole of Cloudera engineering
- Monitor availability, latency and overall service health
- Practice sustainable incident response and blameless postmortems
- Participate in an on-call rotation
Job Requirements
- BSc/MSc in a related field or equivalent experience
- Have 5+ years of industry experience in an SRE, DevOps or related role
- Enjoy collaborating and are a strong communicator
- Strong Linux and systems administration experience
- Amazon Web Services (AWS) expertise, especially EKS, networking, security and scaling
- Experience with container technology and microservices architectures, including Kubernetes
- Experience with observability, logging and monitoring tools
- Experience with Terraform and related technologies
- Experience with CI/CD tools, such as Spinnaker, Jenkins, Flux CD, Argo CD
- Experience with GitOps and Git-based automation
- Programming experience in Python, Go or similar languages
- Experience with compliance programs such as SOC, FedRAMP, HITRUST CSF
- Experience with database systems, including Postgres and MySQL
- Experience with Microsoft Azure or Google Cloud Platform
Benefits
- Generous PTO Policy
- Support for work-life balance with Unplugged Days
- Flexible WFH Policy
- Mental & Physical Wellness programs
- Phone and Internet Reimbursement program
- Access to Continued Career Development
- Comprehensive Benefits and Competitive Packages
- Paid Volunteer Time
- Employee Resource Groups
More DevOps Engineer Jobs
Role Description
As a Senior Software Engineer, DevOps at Upstart, you will help evolve our Ephemeral Infrastructure platform: the Kubernetes-based environments, cloud infrastructure, automation, and developer tooling that enable engineers to rapidly develop and validate software. You will own meaningful technical areas across platform engineering, infrastructure automation, and developer experience while helping improve reliability, scalability, and engineering productivity across Upstart.
- Design, build, and operate Kubernetes-based ephemeral environments that enable engineers to develop, test, and validate software efficiently.
- Improve the reliability, scalability, performance, and usability of the Ephemeral Infrastructure platform through automation and platform enhancements.
- Partner with product engineering, platform, security, and reliability teams to integrate infrastructure capabilities and improve developer workflows.
- Build infrastructure automation, tooling, and self-service capabilities that reduce operational toil and accelerate software delivery.
- Enhance observability, incident response, and operational practices to improve platform health and engineer productivity.
- Contribute to the long-term architecture and technical direction of Upstart’s developer platform and cloud infrastructure ecosystem.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field (or equivalent practical experience) and 4+ years of software engineering experience.
- 4+ years of experience designing, deploying, and operating production Kubernetes environments.
- Experience building and operating cloud infrastructure on AWS, including services such as EKS, EC2, IAM, and networking components.
- Experience with Kubernetes operators, controllers, and cloud-native platform architecture.
- Experience developing software and automation using Go or a comparable programming language.
- Experience implementing infrastructure-as-code, CI/CD pipelines, and automated operational workflows in production environments.
Requirements
- Certified Kubernetes Administrator or Certified Kubernetes Application Developer (CKA/CKAD), or an equivalent certification.
- Experience with Terraform, Helm, GitOps practices, and tools such as Argo CD.
- Experience operating distributed systems with a focus on observability, reliability, and incident response.
- Ability to influence platform adoption and collaborate effectively across multiple engineering teams.
- Experience building internal developer platforms, ephemeral environment solutions, or developer productivity tooling.
Benefits
- Competitive compensation, including base pay, bonus opportunities, and annual equity grants that vest quarterly.
- Retirement benefits to help you plan for the future, including a 401(k) or Group Retirement Savings Plan with a company match of $2 for every $1 contributed, up to $15,000 annually (USD in the US, CAD in Canada).
- Employee Stock Purchase Plan (ESPP) with discounted stock purchase options for eligible employees (US only).
- Comprehensive health coverage designed to support you and your family, including medical, dental, vision, and wellness resources in the US and supplemental health coverage in Canada.
- Health Savings Account contributions from Upstart for eligible plans (US only).
- Income protection benefits, including life insurance and disability coverage for added financial security.
- Paid time off, sick leave, and company holidays, in line with local requirements.
- Paid family and parental leave to support caregiving and major life moments (duration varies by country).
- Family-centered benefits to support fertility, parenthood, and caregiving needs.
- Employee Assistance Program (EAP) offering mental health support and life-centered resources.
- Financial wellness resources, including access to financial planning tools and a financial concierge service (US only).
- Annual wellness allowance to support your physical and emotional well-being and personal development, based on what matters most to you.
- Annual productivity allowance to invest in the tools and resources you need to do your best work, no matter where you work from.
- Connection and community through team events, all-company updates, and employee resource groups (ERGs).
- Onsite perks, including catered lunches and fully stocked micro-kitchens when working from one of our offices in the Bay Area, Austin, Columbus, and New York City (opening Summer 2026!).
- Implement and deploy complex robotic order fulfillment solutions at customer sites.
- Create and configure SLAM-based maps of warehouse fulfillment centers.
- Manage onsite customer relationships, coordinating effectively with internal teams and external stakeholders.
- Provide on-site customer support following system installation.
Site Reliability Engineer
Wesco
Wesco is a global wholesale distributor of communications, electrical, and utility solutions and supply chain services. As an employer, the company strives to f
Role Description
As a Lead Engineer – Technical Services, you will provide guidance, training, and technical support to associate engineers and engineers on the technical service team. You will lead teams in the design and development of information technology architecture for internal and/or external clients. You will use your expertise to coach the team to support customers effectively, including product design and configuration based on architectural requirements and infrastructure needs. You will act as the liaison between engineering, applications engineering, marketing, product development, and project management teams.
- Interfaces with internal and external clients to provide field support for successful project completion.
- Provides feedback to program leadership regarding system performance and end-user comments.
- Serves as a technical expert on solutions design, development, and implementation requirements to address business needs.
- Requires broad and deep technical knowledge and experience across varying infrastructure requirements, development, design, and re-engineering.
- Provides expertise for resolving technical problems, troubleshoots products, and modifies products to meet customer requirements.
- Provides training to customers.
- Coordinates, schedules, and provides work direction to engineers, test specialists, test technicians, and external personnel. Oversees safety of assigned personnel in these activities.
- Defines scope of major projects and consults with contractors and/or vendors.
- Coordinates maintenance activities.
- Responsible for the effective and efficient use of assigned personnel, equipment, and material resources for assigned projects or maintenance work.
Qualifications
- Bachelor’s degree in an engineering discipline.
- Licenses/Certificates/Designations: IT industry networking certifications such as CCNA or CCDA; or CTS, CTS-I, or CTS-D.
- 3+ years of technical services experience.
- 3+ years of experience troubleshooting network security configurations and protocols, plus experience with third-party control integration (AMX, Crestron, etc.).
- Knowledge of network security concepts (802.1X, RADIUS, security certificate management).
- Possess a customer-centric mindset.
- Possess strong computer skills, including proficiency with Microsoft Office Outlook, Word, Excel, and PowerPoint.
- Demonstrate strong leadership skills and the ability to make decisions within the designated area of responsibility.
- CompTIA certifications such as Network+ and Security+ preferred.
- Ability to travel up to 25%.
Requirements
This amount is what we reasonably believe we will pay for the position; however, offer amounts may vary based on factors such as geographic location, relevant education, experience, qualifications, skills, shift, or any collective bargaining agreements. For eligible positions, compensation may include participation in a bonus or sales incentive plan, subject to the terms and conditions of the applicable plan documents. For certain sales roles, Wesco also offers a commission structure that provides additional compensation based on sales results, as defined by the applicable commission plan.
Benefits
- Paid time off.
- Medical, dental, and vision coverage.
- Retirement savings plans.
Company Description
At Wesco, we build, connect, power, and protect the world. As a leading provider of business-to-business distribution, logistics services, and supply chain solutions, we create a world that you can depend on. Our Company’s greatest asset is our people. Wesco is committed to fostering a workplace where every individual is respected, valued, and empowered to succeed. We promote a culture that is grounded in teamwork and respect. With a workforce of over 20,000 people worldwide, we embrace the unique perspectives each person brings. Through comprehensive benefits and active community engagement, we create an environment where every team member has the opportunity to thrive. Founded in 1922 and headquartered in Pittsburgh, Wesco is a publicly traded (NYSE: WCC) FORTUNE 500® company.
Product Reliability Engineer
Pinpoint Applicant Tracking System
Pinpoint is the ATS that makes complex hiring simpler.
- Own the full lifecycle of issues: triage, diagnose, fix, and prevent high-impact problems across the product. Roughly half your time is reactive (the escalation queue), half is proactive (stopping the next ticket before it's raised)
- Build internal tooling that makes other teams self-sufficient, especially our Technical Success team (part of R&D, there to resolve technical complexity on behalf of every customer-facing team). The goal: they get what they need without waiting on engineering. Bulk operations, config changes, diagnostics, automation of anything manual and painful
- Build a world-class feedback mechanism back to the product squads. You'll proactively and visibly feed what you're seeing (product pain, recurring issues, what users are actually reporting) back to roadmap teams and PMs, so it's reliably ingested and the same problems stop coming back
- Make the application meaningfully faster and more performant: instrumentation, logging, monitoring, and hands-on performance work. (This matters even more right now as we scale our infrastructure function)
- Dent the backlog: fix root causes, not symptoms, and remove repeat issues so net throughput keeps improving cycle over cycle
- Reach for AI to make all of the above faster and better: embed AI into the triage and feedback loop, automate toil, and unlock the context buried in eight years of codebase and docs
- Work a light weekday escalation rotation, and ship to production independently within your first year
- The scope of this team is growing fast: as the tooling gets easier to build, what we can take on keeps expanding. There's a lot of room here for someone who wants it



