Bloomreach is a computer software company that is on a mission to empower its clients to seamlessly personalize their customer experience and, in turn, successf
Senior DevOps / Infrastructure Engineer
Location
Slovakia
Posted
53 days ago
Salary
€4.5K / month
Seniority
Senior
Job Description
Bloomreach
Bloomreach is building the world’s premier agentic platform for personalization. We’re revolutionizing how businesses connect with their customers, building and deploying AI agents to personalize the entire customer journey.

- We're taking autonomous search mainstream, making product discovery more intuitive and conversational for customers, and more profitable for businesses.
- We’re making conversational shopping a reality, connecting every shopper with tailored guidance and product expertise, available on demand, at every touchpoint in their journey.
- We're designing the future of autonomous marketing, taking the work out of workflows, and reclaiming the creative, strategic, and customer-first work marketers were always meant to do.

And we're building all of that on the intelligence of a single AI engine, Loomi AI, so that personalization isn't only autonomous…it's also consistent.

From retail to financial services, hospitality to gaming, businesses use Bloomreach to drive higher growth and lasting loyalty. We power personalization for more than 1,400 global brands, including American Eagle, Sonepar, and Pandora.

Are you looking for a cutting-edge tech stack to work with on a daily basis? We are currently expanding our Infrastructure team and are looking for a new colleague to join as a Senior DevOps / Infrastructure Engineer. The salary starts from €3,900 gross per month based on your experience level. You can work in one of our Central Europe offices (Bratislava / Brno / Prague) or from home on a full-time basis. Are you ready to grow with us?

What tech stack do we have for you?
- Python, Golang
- Kubernetes, Terraform, GitLab
- Google Cloud, GCP Bigtable, GCP BigQuery, gRPC
- MongoDB, Redis, Elasticsearch, InfluxDB, etcd, Kafka
- VictoriaMetrics, Grafana, Sentry

Minimum requirements:

At least 3 years of production experience with:

- Kubernetes: we are looking for an engineer who has not only deployed applications to a cluster, but who also understands what is happening behind the scenes and can operate 24x7 production.
- GCP (preferred)/AWS/Azure: our solution is built on top of the GCP platform. The candidate should be comfortable working with public cloud, understand the risks and benefits associated with running applications in the public cloud, be familiar with infrastructure-as-code principles, and have the ability to make design choices between using cloud-managed solutions versus self-hosted alternatives.
- Python/Go: you should be a solid programmer capable of developing custom tooling.

How to know if you are a good fit:

The qualifications outlined below serve as a guide to determine if your skills and experience align with the requirements of this position:

- Continuous Learning: You have a keen interest in Kubernetes and related technologies, demonstrated by your active engagement in reading and staying updated about them.
- Conference Participation: You have participated in DevOps-related conferences, showcasing your commitment to continuous learning and networking in the field.
- Configuration Proficiency: You have hands-on experience configuring pod/container security context, network policies, roles and role bindings, pod affinity, host path, pod disruption budgets, priority classes, node taints, to name a few.
- Resource Optimization: You have analyzed resource usage of applications hosted on a cluster and implemented or suggested changes to resource requests/limits, Horizontal Pod Autoscalers (HPAs), or Vertical Pod Autoscalers (VPAs).
- Cluster Management: You have a deep understanding of the clusters you manage, including the types of machines used in node pools, the reasons for their selection, the enabled or disabled cluster features, the cluster version, and the node autoscaling setup. You have successfully upgraded Kubernetes cluster versions without causing interruptions to live applications hosted on the cluster.
- Terraform Proficiency: You have written a Terraform module with multiple interconnected resources.
- Monitoring and Alerting: You have experience setting up monitoring systems and configuring alerts. On-duty experience is preferred, along with experience with Grafana and Prometheus.
- DevOps and CI/CD Experience: You have experience with DevOps, Orchestration/Configuration Management, and Continuous Integration technologies such as Terraform, GitLab, Ansible, Docker, etc.
- Team Onboarding and Training: You have experience with onboarding and training new team members, demonstrating your leadership skills and commitment to team growth.

About your team:

The Infrastructure team operates and maintains Bloomreach Engagement core infrastructure built on Google Cloud with security, high availability, costs, and scalability in mind. Our vision is to identify and implement opportunities to achieve a robust, reliable, and efficient infrastructure and development platform. We strongly support DevOps culture: each team is responsible for releasing, operating, and monitoring their own applications. The role of the Infrastructure Team is to provide a strong foundation upon which all teams can build, for example, manage big infrastructure components like Kubernetes, databases, and cloud components in Google Cloud. An important role of the team is also providing support for developers, reviewing design proposals, validating the performance and availability of applications, and sometimes even developing new core application components like logging or authorization.

Tasks and responsibilities:

In the position of DevOps Engineer, you’d be expected to work with other Engineering teams to design sustainable infrastructure, microservice solutions, and an efficient and robust production environment. Additionally, you’ll be working on a variety of tasks and projects, including automating tools and infrastructure to reduce manual work, monitoring applications, and participating in an on-call rotation as required. The ideal candidate will be passionate about learning new things, creative, willing to take the initiative, and able to think outside the box to solve problems strategically.

More things you'll like about Bloomreach:

Culture:

- A great deal of freedom and trust. At Bloomreach we don’t clock in and out, and we have neither corporate rules nor long approval processes. This freedom goes hand in hand with responsibility. We are interested in results from day one.
- We have defined our 5 values and the 10 underlying key behaviors that we strongly believe in. We can only succeed if everyone lives these behaviors day to day. We've embedded them in our processes like recruitment, onboarding, feedback, personal development, performance review and internal communication.
- We believe in flexible working hours to accommodate your working style.
- We work virtual-first with several Bloomreach Hubs available across three continents.
- We organize company events to experience the global spirit of the company and get excited about what's ahead.
- We encourage and support our employees to engage in volunteering activities - every Bloomreacher can take 5 paid days off to volunteer*.
- The Bloomreach Glassdoor page elaborates on our stellar 4.4/5 rating. The Bloomreach Comparably page Culture score is even higher at 4.9/5.

Personal Development:

- We have a People Development Program -- participating in personal development workshops on various topics run by experts from inside the company.
We are continuously developing & updating competency maps for select functions.
- Our resident communication coach Ivo Večeřa is available to help navigate work-related communications & decision-making challenges.*
- Our managers are strongly encouraged to participate in the Leader Development Program to develop in the areas we consider essential for any leader. The program includes regular comprehensive feedback, consultations with a coach and follow-up check-ins.
- Bloomreachers utilize the $1,500 professional education budget on an annual basis to purchase education products (books, courses, certifications, etc.)*

Well-being:

- The Employee Assistance Program -- with counselors -- is available for non-work-related challenges.*
- Subscription to Calm - sleep and meditation app.*
- We organize ‘DisConnect’ days where Bloomreachers globally enjoy one additional day off each quarter, allowing us to unwind together and focus on activities away from the screen with our loved ones.
- We facilitate sports, yoga, and meditation opportunities for each other.
- Extended parental leave up to 26 calendar weeks for Primary Caregivers.*

Compensation:

- Restricted Stock Units or Stock Options are granted depending on a team member’s role, seniority, and location.*
- Everyone gets to participate in the company's success through the company performance bonus.*
- We offer an employee referral bonus of up to $3,000 paid out immediately after the new hire starts.
- We reward & celebrate work anniversaries -- Bloomversaries!*

(*Subject to employment type. Interns are exempt from marked benefits, usually for the first 6 months.)

Excited? Join us and transform the future of commerce experiences! If this position doesn't suit you, but you know someone who might be a great fit, share it - we will be very grateful!

Any unsolicited resumes/candidate profiles submitted through our website or to personal email accounts of employees of Bloomreach are considered property of Bloomreach and are not subject to payment of agency fees.
More DevOps Engineer Jobs
SRE
Zensar
At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better futures. Whether for our clients, our people, or the world around us, this belief powers everything we do. At the heart of our culture is ONE with Client - a set of four core values that reflect who we are and how we work: One Zensar, Nurturing, Empowering, and Client Focus. Part of the $4.8 billion RPG Group, we’re a community of 10,000+ innovators across 30+ global locations, including Milpitas, Seattle, Princeton, Cape Town, London, Zurich, Singapore, and Mexico City. We believe the best work happens when individuality is celebrated, growth is encouraged, and well-being is prioritized. We are an equal employment opportunity (EEO) and affirmative action employer, committed to creating an inclusive workplace. All qualified applicants will be considered without regard to race, creed, color, ancestry, religion, sex, national origin, citizenship, age, sexual orientation, gender identity, disability, marital status, family medical leave status, or protected veteran status.
Cloud & Infrastructure Expertise: Strong knowledge of AWS services (EC2, RDS, S3, IAM), networking, and Infrastructure as Code (Terraform/CloudFormation). Reliability & Automation Skills: Proficiency in CI/CD pipelines, monitoring tools (CloudWatch, Prometheus, Grafana), and incident response automation. Security & Performance Focus: Ability to enforce IAM policies, compliance standards, and optimize workloads for scalability, resilience, and cost efficiency
Head of Security – DevOps
iTalent PLUS
A recruitment agency that aims to simplify the hiring needs of organisations.
• Develop, implement, and maintain the organisation’s information security strategy and cybersecurity framework. • Establish security policies, standards, and governance structures to protect systems, infrastructure, and data assets. • Ensure alignment of security practices with operational objectives and the broader technology roadmap. • Own the reliability and scalability of the organisation’s cloud infrastructure, including container orchestration, CI/CD pipelines, observability, and disaster recovery. • Design and maintain infrastructure-as-code (IaC) across AWS or equivalent cloud platforms, ensuring reproducibility, auditability, and least-privilege access. • Build and optimise CI/CD pipelines to enable fast and secure deployments, including Docker build caching, multi-stage builds, and automated testing gates. • Establish SLOs, SLIs, and error budgets, while leading incident management and on-call practices. • Architect and maintain disaster recovery and business continuity plans, including cross-region failover and backup strategies. • Drive cloud cost optimisation while maintaining high performance and security standards. • Identify, assess, and manage cybersecurity risks across the organisation’s technology environment. • Implement risk mitigation strategies and security controls to protect critical infrastructure and digital assets. • Monitor emerging cyber threats and vulnerabilities that could impact operations or infrastructure. • Oversee monitoring, detection, and response processes for cybersecurity incidents and vulnerabilities. • Coordinate incident response activities, ensuring proper investigation, containment, and remediation. • Support the development and maintenance of incident response plans and procedures. • Oversee security frameworks related to digital assets, wallets, and transaction infrastructure where applicable. • Support safeguards that protect wallet systems, transaction flows, and overall platform integrity. 
• Collaborate with Risk, Fraud, and Product teams to strengthen controls against abuse, account compromise, and system manipulation. • Ensure alignment with relevant regulatory obligations, compliance requirements, and industry standards. • Support internal and external audits, risk assessments, and compliance reviews. • Maintain oversight of data protection, security controls, and governance frameworks. • Promote a strong culture of security awareness through training, guidance, and knowledge-sharing initiatives. • Identify opportunities to enhance the organisation’s cybersecurity posture through improved tools, processes, and practices.
Principal Site Reliability Engineer
Parallel Domain
Synthetic data for computer vision and perception.
About the Role Parallel Domain is looking for a Principal Site Reliability Engineer to own the reliability, scalability, and security of our cloud infrastructure - the backbone that runs simulation workloads for some of the most demanding customers in autonomous vehicle development. This is a hands-on, high-ownership role. You'll be the primary infrastructure owner across our multi-region AWS/EKS platform, working closely with a small platform engineering team, partnering with engineering leads across simulation and ML, and our customer-facing teams. What You'll Do Infrastructure Ownership & Cloud Operations - Own and evolve our AWS-based infrastructure, improving platform performance and availability today, and building toward deployable configurations that support enterprise customer environments tomorrow. - Own EKS cluster operations across production regions: node pool strategy, AMI lifecycle, autoscaling, and Kubernetes workload health. - Support the GitOps deployment pipeline - define, deploy, and manage applications across clusters using infrastructure-as-code. - Manage complex networking: VPC design, cross-region connectivity, DNS, and load balancing. - Lead infrastructure deprecation and migration efforts with minimal disruption. Reliability Engineering & Incident Response - Own SLO measurement infrastructure; enable proactive triage of emerging issues before they impact customers. - Lead incident investigation, root cause analysis and postmortems, driving systemic fixes rather than one-off patches. - Design and improve automated remediation systems to reduce MTTR. Security & Access Management - Review and provide security-conscious feedback on platform architecture decisions. - Own cloud IAM governance - roles, policies, and access boundaries across accounts and services. - Lead compliance-adjacent work including audit-readiness, partner certification requirements, and supporting responses to customer security questionnaires. 
Cross-Functional Collaboration - Partner with application development teams to build an inherently secure platform and drive next-generation deployment architecture. - Partner with customer teams to ensure availability for expected utilization. - Partner with Finance on cloud cost optimization - lifecycle policies, right-sizing, and spend visibility. - Support GPU and batch workloads in collaboration with simulation and ML engineering teams. Platform Tooling & Developer Experience - Improve CI/CD pipelines and automated infrastructure validation. - Support engineering teams with infra-side debugging, log analysis, and environment configuration. What We're Looking For Technical Depth - 5+ years in SRE, DevOps, or infrastructure engineering roles. - Infrastructure-as-code proficiency - Terraform modules, state management, and multi-environment patterns. - Deep AWS experience - EKS, EC2, IAM, S3, Storage Gateway, VPC networking, Transit Gateway, CloudFront, KMS, and IRSA. - Kubernetes expertise - cluster operations, node pools, probes, cordoning, pod scheduling, RBAC, Helm, node autoscaling (Karpenter experience a plus); solid understanding of containerization and AMI lifecycle management. - CI/CD - experience with GitOps workflows and pipeline tooling (ArgoCD, GitHub Actions, Jenkins) - Solid networking fundamentals - CIDR design, security groups, DNS, load balancing, VPN, cross-region connectivity. - Experience with monitoring and observability tooling - Prometheus, Grafana, Elasticsearch. - Comfort with Python and Bash for tooling and automation. - Familiarity working across Linux and Windows environments. Operational familiarity with Windows Server is a meaningful advantage. Communication & Ownership - You communicate clearly across engineering, product, and customer-facing teams, flagging issues with urgency proportional to customer impact. - You advocate for SRE best practices and can effectively operationalize an informed and principled view on security. 
- You take end-to-end ownership of complex, multi-team efforts - from planning through execution and post-change verification. - You know when to push for a clean solution vs. when to accept a pragmatic one, and you communicate that tradeoff clearly. Nice to Have - Experience with Windows-based workloads on EKS. - Experience supporting simulation, ML, or rendering workloads in cloud infrastructure; running GPU workloads on Kubernetes, including NVIDIA and DirectX device plugin configuration. - Experience with AWS Storage Gateway or Transfer Family integrations. - Familiarity with Envoy Gateway or similar. - Experience with container-optimized OS images (e.g., Bottlerocket, Packer). - Experience with cloud cost optimization at scale. Core Tools: Terraform · AWS · Kubernetes · Helm · ArgoCD · Kustomize · Grafana · Prometheus · Elasticsearch · VictoriaLogs · Fluent Bit · GitHub Actions · Jenkins · Docker · Python · Bash Why This Role PD's simulation platform runs at the intersection of high-performance compute, distributed systems, and customer-critical reliability. The infrastructure problems here are genuinely interesting: multi-region GPU scheduling, Windows workloads on Kubernetes, startup latency optimization, and an enterprise product direction that will require rethinking how we deploy and manage the platform entirely. The Principal SRE at PD is not a ticket-taker - it's a high-trust, high-autonomy position where you'll have genuine influence over infrastructure architecture, cross-team process, and customer experience.
Senior Site Reliability Engineer
Centene Corporation
Transforming the health of the communities we serve, one person at a time.
You could be the one who changes everything for our 28 million members by using technology to improve health outcomes around the world. As a diversified, national organization, Centene's technology professionals have access to competitive benefits including a fresh perspective on workplace flexibility. Position Purpose: Helps lead projects that are focused on managing and maintaining optimum platform infrastructure performance, reliability, and security using SRE practices, observability tools, manual and automated procedures, documentation, people and processes, and continuous delivery (CI/CD) tools, processes, and designs. Develops complex services to automate monitoring activities and provide critical information to facilitate response and resolution of performance and availability issues and incidents. Understands and advocates for standardized and scalable software tools to ensure that systems operate without interruption at optimum performance and leads project teams throughout the deployment process. Troubleshoots and analyzes service disruptions to determine the root cause of issues and develop solutions for improved reliability. - Support multiple applications and schedule batch jobs for a large number of transactions weekly - Troubleshoots and resolves more complex problems with systems and services and initiates regular deployment of new versions of the systems and their subcomponents - Leads more complex projects focused on building and maintaining observability/monitoring for the application, monitoring key performance indicators, maintaining alerting, and continuously improving visibility. 
- Helps make decisions around periodic system validation and testing, service monitoring, and standing up new services/tools - Uses knowledge and experience to identify strategies that increase system reliability and performance through on-call rotation and process optimization - Identifies and implements necessary manual and automated procedures for improved collaborative response in real-time - Leads lower-level engineers in stress, security, and performance testing - Resolves issues that come up through support escalation - Keeps documentation and runbooks up to date to effectively deal with new incidents that might arise - Leads post-incident reviews and documents findings for future informed decision making - Reviews proposals to optimize Software Development Life Cycle (SDLC) to boost service reliability and makes decisions around which proposals should move forward. - Communicates complex topics with development teams to investigate and document issues and leads internal team to develop solutions to mitigate them - Performs other duties as assigned - Complies with all policies and standards Education/Experience: A Bachelor's degree in a quantitative or business field (e.g., statistics, mathematics, engineering, computer science) and 4 – 6 years of related experience, or equivalent experience acquired through accomplishments of applicable knowledge, duties, scope and skill reflective of the level of this position. Technical Skills: One or more of the following skills are desired: - Experience with SRE or DevOps - Batch scheduling - Monitoring experience - SQL Pay Range: $87,000.00 - $161,300.00 per year Centene offers a comprehensive benefits package including: competitive pay, health insurance, 401K and stock purchase plans, tuition reimbursement, paid time off plus holidays, and a flexible approach to work with remote, hybrid, field or office work schedules. 
Actual pay will be adjusted based on an individual's skills, experience, education, and other job-related factors permitted by law, including full-time or part-time status. Total compensation may also include additional forms of incentives. Benefits may be subject to program eligibility. Centene is an equal opportunity employer that is committed to diversity, and values the ways in which we are different. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or other characteristic protected by applicable law. Qualified applicants with arrest or conviction records will be considered in accordance with the LA County Ordinance and the California Fair Chance Act



