HiBob is a modern HR technology company focused on transforming the way organizations operate in today’s dynamic workplace. Its platform streamlines core HR processes, enhances e

AI Infrastructure & Reliability Engineer

DevOps Engineer · Full Time · Remote · Mid Level · Team 1,350 · Since 2015

Location

Israel

Posted

35 days ago

Salary

Seniority

Mid Level

Bachelor Degree · 5 yrs exp · English · AWS, GitHub Actions, Kubernetes, Python, Terraform

Job Description

About Us

HiBob helps modern, mid-size businesses transform the way they manage people, giving HR and managers all they need to connect, engage, develop, and retain top talent. Since 2015, we've achieved consecutive triple-digit year-over-year growth, all backed by our amazing team of Bobbers from across the globe, making us the HRIS of choice for more than 5,500 midsize and multinational companies and over 1 million users. Our HR platform is intuitive, data-driven, and built for the way people work today: globally, remotely, and collaboratively.

What this role is really about

You'll join a 3-person platform team within our Business Technology group, owning the internal infrastructure that our AI platform and its users depend on. This isn't a product engineering role, and it isn't ticket work or babysitting pipelines someone else built. You're building and operating the internal foundation that the company runs on.

The work covers the full stack of platform engineering: core cloud infrastructure (AWS, Kubernetes, IaC), CI/CD pipelines, AI-driven infrastructure components, and the SRE and observability practice that keeps it all honest: metrics, alerting, incident response, and reliability standards. As our AI capabilities grow, so does the complexity underneath them, and staying ahead of that is central to the role.

If you treat infrastructure as a product (reusable, automated, observable, and built to last), this is your kind of role.

Job Requirements

- 2-4 years of hands-on DevOps, SRE, or infrastructure engineering in production SaaS environments.
- Strong AWS experience: multi-account architecture, cross-account IAM, serverless and event-driven services (Lambda, SQS, SNS, EventBridge), and EKS cluster management.
- Proven Kubernetes experience in production, including cross-account migrations and stateful workload management.
- Proficiency with Terraform: repository structure design, module architecture, and CI/CD pipeline implementation.
- Hands-on experience building and maintaining GitHub Actions pipelines for end-to-end CI/CD workflows.
- Working Python proficiency for scripting, internal tooling, and workflow automation.
- Practical experience implementing observability stacks from scratch: metrics, logging, distributed tracing, and alerting.
- Experience owning reliability practices: SLOs, incident response, and postmortem culture.

Nice to have

- Hands-on experience operating LLM APIs in production: rate-limit and quota management, cost attribution per team/model, latency monitoring, and resilience patterns (retries, fallbacks, circuit breakers).
- FinOps experience across cloud, AI, and observability spend.
- Experience introducing self-healing or auto-remediation patterns in production.

Job Responsibilities

DevOps & AI-Driven Infrastructure: own CI/CD, deployment processes, and release reliability. Build and operate cloud infrastructure that is automated, intelligent, and continuously self-improving, not just managed.
- Design and build our Terraform repository and IaC pipeline from scratch, with AI-assisted generation, drift detection, and policy enforcement built in.
- Build AI-driven GitHub Actions pipelines: automated code review, risk assessment, and intelligent deployment decisions.
- Manage Kubernetes workloads across AWS accounts: zero downtime, fully automated, nothing left behind.
- Embed AI into the operational layer: proactive drift detection, automated remediation, and intelligent scaling toward a self-healing runtime.

Reliability & SRE: improve uptime, resilience, and incident response.
- Define and enforce SLOs/SLIs, error budgets, and on-call practices.
- Lead incident response, postmortems, and systemic reliability improvements.
- Own AI-specific reliability: model latency SLOs, token quota monitoring, rate limit handling, fallback and retry strategies, and cost-per-request alerting.

Observability & Telemetry: increase visibility, reduce noise, improve troubleshooting.
- Establish and continuously evolve the observability stack: metrics, logs, distributed tracing, and alerting tuned for both application and AI workloads.

AI / LLM Operations: bring AI systems to production and operate them at scale, with a focus on reliability, performance, and trust.
- Own the AI infrastructure layer: rate limits, quota management, latency SLOs, and fallback strategies (retries, circuit breakers).
- Operate LLM APIs in production with resilience and cost attribution per team/model.

FinOps & Cost Optimization: optimize AI, infra, and logging costs at scale.
- Build cost visibility and guardrails across AWS, LLM usage, and observability pipelines.

Benefits

Join our village

HiBob is a village filled with amazing people and we're especially proud of that. It's a place where Bobbers can be themselves. We're about fun, dreams, hopes and ambition, just as much as we are about precision, growth, and top performance. Becoming a Bobber means you'll receive competitive compensation, benefits, and pre-IPO equity alongside all of this:

- Company share options plan
- A flexible hybrid working model
- Work from home allowance to get your home office set up
- Payment for sick leave from the first day
- 2 Social Impact days per year for volunteering
- Annual Headspace subscription and wellness benefits
- Awesome employee referral program: $2,500 for each successful referral, with an additional ambassador programme
- Monthly Wolt allowance
- Transportation allowance
- Dog-friendly
- Temporary remote work from anywhere in the world for up to 2 months (after 6 months of employment)
- Fun company and team social events (locally and virtually with our global teams)
- Bob balance days: 4 additional days within a calendar year
- A company-wide long weekend at the beginning of each quarter

If this sounds like something you've been looking for, we'd love to have you. Come on, join our village!

Benefits

401(K), 401(K) matching, Commuter benefits, Company equity, Company-sponsored outings, Company sponsored family events, Dental insurance, Disability insurance, Volunteer in local community, Family medical leave, Flexible Spending Account (FSA), Generous parental leave, Generous PTO, Company-sponsored happy hours, Health insurance, Highly diverse management team, Open door policy, Life insurance, Mentorship program, Paid volunteer time, Online course subscriptions available, Open office floor plan, Paid holidays, Paid sick days, Performance bonus, Pet friendly, Pet insurance, Promote from within, Lunch and learns, Remote work program, Return-to-work program post parental leave, Free snacks and drinks, OKR operational model, Team workouts, Mandated unconscious bias training, Vision insurance, Wellness programs, Some meals provided, Mental health benefits, Home-office stipend for remote employees, Diversity employee resource groups, Hiring practices that promote diversity, Employee resource groups, Employee-led culture committees, Day off for your birthday, Quarterly engagement surveys, Hybrid work model, In-person revenue kickoff, Employee awards, Diversity recruitment program, Pay transparency, Wellness days, Mother's room, Virtual coaching services, Bereavement leave benefits

More DevOps Engineer Jobs

Senior DevOps Engineer

TekSynap

TekSynap, formerly known as Synaptek, is a privately held, ISO-certified IT company offering solutions and services to meet the business technology needs of local, state, and feder

DevOps Engineer · 35 days ago

Full Time Remote

Role Description

We are seeking a DevOps Engineer (Senior).

- Facilitates the development of new software solutions and the transition of existing solutions from monolithic structures to a microservice structure operating within hardened containers.
- Works with application development teams to refactor or create solutions that leverage the DevSecOps CI/CD pipeline and tools.
- Instructs and guides teams through their solution development.
- Deploys and sustains a microservices factory utilizing COTS and open-source solutions.

Qualifications

- Five or more (5+) years of Agile experience.
- Driving strategy and overseeing architecture of continuous integration, deployment, and monitoring across technologies.
- Demonstrated experience with defining SAFe Agile methodology for large-scale clients.
- Demonstrated experience in leading DevOps methodologies.
- Demonstrated experience with implementing Test Driven Development (TDD) methodologies.
- Demonstrated experience with driving an automated software development lifecycle toolchain.
- Demonstrated experience with deployments in both on-premise and cloud environments.
- Experience serving as the engineer of complex technology implementations in a product-centric environment.
- Experience with DevOps services using an infrastructure-as-a-service provider (e.g., Amazon Web Services, Microsoft Azure, Google Compute Engine, RackSpace/OpenStack).
- Using scripting or basic programming skills to solve problems.
- Experience with configuration management tools (e.g., TFS, Puppet, Chef, Ansible, Salt, LVM).
- Familiarity with containerization technologies (e.g., LXC, Docker, Rocket, OpenShift).
- Preferred: SAFe Agile Certification or an industry-recognized equivalent certification.
- Demonstrated experience supporting government agencies, customers, or contracts within federal environments.
- Certifications: Cloud Service Provider Certification (AWS, Microsoft Azure, or Google Cloud Platform).
- Security+
- Bachelor's Degree
- Clearance: Secret, IT II

Requirements

- Location: Remote with periodic support at Fort Belvoir or other places in the National Capital Region.
- Type of environment: Remote
- Noise level: Low
- Work schedule: Day shift, Monday to Friday. May be requested to work evenings and weekends to meet program and contract needs.
- Amount of travel: less than 10%
- List of approved states: AL, AK, AZ, AR, CT, DE, FL, GA, ID, IN, IO, KS, KY, LA, ME, MI, MS, MO, MT, NE, NV, NH, NM, NC, ND, OH, OK, OR, PA, RI, SC, SD, TN, TX, UT, VA, WV, WI, WY.
- U.S. Citizen
- Secret Clearance

Physical Demands

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

- Regularly required to use hands to handle, feel, touch; reach with hands and arms; talk and hear.
- Regularly required to stand; walk; sit; climb or balance; and stoop, kneel, crouch, or crawl.
- Regularly required to lift up to 10 pounds.
- Frequently required to lift up to 25 pounds; and up to 50 pounds.
- Vision requirements include close vision, distance vision, peripheral vision, depth perception, and ability to adjust focus.

Benefits

- Competitive benefits package including health, dental, vision, 401K, life insurance, short-term and long-term disability plans, vacation time and holidays.

Equal Employment Opportunity

In order to provide equal employment and advancement opportunities to all individuals, employment decisions will be based on merit, qualifications, and abilities. TekSynap does not discriminate against any person because of race, color, creed, religion, sex, sexual orientation, gender identity, protected veteran status, national origin, disability, age, genetic information or any other characteristic protected by law.

United States

Sr Site Reliability Engineer

Dynatrace

Dynatrace is a global application performance management software firm and a former member of Compuware. As an employer, the company is in support of helping its team achieve a hea

DevOps Engineer · 35 days ago

Full Time · Remote · Team 5,200 · Since 2005

Your role at Dynatrace

We are strengthening our Site Reliability Engineering team based in Sydney and looking for an SRE to join our innovative team. Your detailed responsibilities in this new team will be:

- Automate Manual Tasks: Leverage your production expertise to translate manual processes into automated solutions, driving operational efficiency.
- Optimize Capacity Planning: Ensure cost-effective resource utilization while maintaining scalability and performance.
- Product Release Management: Oversee and coordinate product release processes, refining workflows through continuous improvement initiatives.
- Implement Monitoring & Alerting: Design and deploy automated monitoring and alerting systems to boost the efficiency, reliability, and scalability of cloud infrastructure.
- Incident Resolution: Support production stability by promptly investigating and resolving production incidents.
- Monitoring Configuration: Configure and maintain robust monitoring solutions to ensure efficient, scalable, and seamless production operations.
- On-Call Support: Participate in on-call rotations to provide critical support and maintain system stability and uptime.

What will help you succeed

- 3+ years of experience in scripting and/or programming languages such as Go, C, Shell, Python, or Java.
- Proficiency in at least one hyperscaler (AWS, Azure, or GCP).
- Enthusiastic, go-for-it attitude with hands-on experience in Kubernetes (preferred).
- Strong communication, problem-solving, and critical-thinking skills.
- Curiosity and interest in working on highly scalable systems.

Why you will love being a Dynatracer

- Dynatrace is a leader in unified observability and security.
- We provide a culture of excellence with competitive compensation packages designed to recognize and reward performance.
- Our employees work with the largest cloud providers, including AWS, Microsoft, and Google Cloud, and other leading partners worldwide to create strategic alliances.
- The Dynatrace platform uses cutting-edge technologies, including our own Davis hypermodal AI, to help our customers modernize and automate cloud operations, deliver software faster and more securely, and enable flawless digital experiences.
- Over 50% of the Fortune 100 companies are current customers of Dynatrace.

AWS, Azure, C, GCP, Go, Java, Kubernetes, Python, Shell

Australia

DevSecOps Engineer

Team Velocity Marketing

Team Velocity is an automotive retailer providing digital marketing, advertising, and data analytics services to improve client sales and automotive service processes. Through cutt

DevOps Engineer · 35 days ago

Full Time Remote

• Form, lead, and execute Red Team engagements simulating real-world attack scenarios.
• Collaborate with SRE and DevOps teams to validate findings and recommend remediation strategies.
• Manage full attack lifecycle operations: reconnaissance, exploitation, persistence, lateral movement, and exfiltration.
• Integrate security requirements and controls into architecture, design, and coding practices.
• Automate and conduct reviews of code, libraries, and dependencies to identify vulnerabilities.
• Collaborate with engineers to assess potential attack vectors and recommend mitigations.
• Implement static (SAST), dynamic (DAST), and dependency scanning tools into CI/CD pipelines.
• Work with DevOps to secure Kubernetes, containers, secrets management, and cloud environments (AWS/GCP/Azure).

AWS, Azure, Cloud, Docker, Google Cloud Platform, Java, JavaScript, Kubernetes, Python, Terraform

Virginia

Staff DevOps Engineer, Salesforce

BD is a global medical technology company that is advancing the world of health. www.bd.com

DevOps Engineer · 35 days ago

Full Time · Remote · Team 10,001+ · Since 1897 · H1B Sponsor

Title: Staff DevOps Engineer, Salesforce

Location: Stuart, United States

Job Description

We are the makers of possible

BD is one of the largest global medical technology companies in the world. Advancing the world of health™ is our Purpose, and it's no small feat. It takes the imagination and passion of all of us, from design and engineering to the manufacturing and marketing of our billions of MedTech products per year, to look at the impossible and find transformative solutions that turn dreams into possibilities.

We believe that the human element, across our global teams, is what allows us to continually evolve. Join us and discover an environment in which you'll be supported to learn, grow and become your best self. Become a maker of possible with us.

POSITION PURPOSE

Liberator Medical Supply, a subsidiary of BD, is seeking an experienced Staff DevOps Engineer (Salesforce) to define and lead CI/CD strategy, automation, and release management across our Salesforce ecosystem, including Commerce Cloud (B2C) and Data Cloud. This role is critical to enabling reliable, repeatable deployments, improving code quality, and accelerating delivery through standardized engineering practices and strong cross-functional partnership.

You will design, implement, and continuously improve the Salesforce DevOps operating model: tooling, standards, governance, and metrics. You will own deployment pipelines, guide version-control and branching strategies, enable test automation and quality gates, and partner with product and engineering leaders to reduce delivery risk while increasing release velocity.

The ideal candidate is an agile practitioner who combines deep Salesforce platform knowledge with strong DevOps fundamentals. You communicate clearly, influence without authority, and build trusted relationships across marketing, sales, technology, operations, and leadership to deliver the right functionality in the right order while removing delivery obstacles.

PRIMARY DUTIES AND RESPONSIBILITIES

CI/CD Pipeline Management
- Design, build, and maintain robust CI/CD pipelines using Copado
- Orchestrate complex release cycles across multiple Salesforce environments (Dev, QA, UAT, Production)
- Implement automated deployment strategies including validation, testing, and rollback procedures
- Manage Commerce Cloud deployments including storefront code, cartridges, site preferences, and data imports
- Manage version control workflows using Git, including branching strategies and merge conflict resolution
- Configure and optimize deployment automation for metadata, source format (SFDX), and data migrations

DevOps Infrastructure & Automation
- Establish and enforce DevOps best practices, coding standards, and deployment governance
- Implement automated testing frameworks including unit tests, integration tests, and end-to-end testing
- Configure static code analysis tools (PMD) and enforce quality gates
- Develop and maintain deployment scripts, automation tools, and custom DevOps utilities
- Monitor deployment metrics, identify bottlenecks, and continuously improve pipeline efficiency

Release Management & Coordination
- Own release planning sessions and coordinate cross-functional deployment activities
- Manage release calendars, deployment windows, and change management processes
- Conduct pre-deployment validation and post-deployment verification
- Troubleshoot deployment failures and provide rapid resolution
- Document deployment procedures, runbooks, and disaster recovery processes

Environment & Security Management
- Manage sandbox refresh strategies and environment synchronization
- Implement security best practices for credential management, access control, and compliance
- Configure environment-specific variables, system properties, and deployment parameters
- Manage Commerce Cloud site preferences, custom preferences, and environment-specific configurations
- Ensure deployments follow internal security, privacy, and compliance requirements (including HIPAA where applicable) in partnership with Security/Compliance stakeholders
- Maintain backup and recovery procedures for all Salesforce environments

Collaboration & Technical Leadership
- Partner with Salesforce Administrators, Developers, and Product Managers to streamline delivery
- Provide technical guidance on deployment architecture and DevOps tooling
- Train development teams on CI/CD processes, Git workflows, and deployment best practices
- Participate in architecture reviews and provide DevOps perspective on solution design
- Stay current with Salesforce DevOps innovations and recommend process improvements

REPORTING RELATIONSHIP RESPONSIBILITIES
- No supervisory responsibilities

Minimum Requirements

Experience
- 5+ years of hands-on experience with Salesforce DevOps and release management
- 3+ years of experience managing CI/CD pipelines with Copado
- 3+ years of experience with Git version control and branching strategies
- Proven track record managing complex Salesforce implementations across multiple orgs, environments, and products
- Experience with Salesforce DX (SFDX) and source-driven development

Key Competencies Required
- Analytical Thinking: Ability to troubleshoot complex deployment issues and identify root causes
- Attention to Detail: Meticulous approach to configuration management and deployment validation
- Communication: Clear documentation and ability to explain technical concepts to non-technical stakeholders
- Problem-Solving: Proactive identification of risks and implementation of preventive measures
- Collaboration: Strong team player who can work across development, QA, and product teams
- Continuous Learning: Commitment to staying current with Salesforce and DevOps innovations

Technical Skills Required
- Expert-level knowledge of Salesforce metadata architecture and deployment mechanisms
- Strong Copado expertise, including pipeline configuration, quality gates, and automation rules
- Advanced Git skills including merge strategies, conflict resolution, and repository management
- Strong proficiency with Salesforce Commerce Cloud deployment tools and processes
- Experience with automated testing frameworks (Apex tests, AccelQ)
- Familiarity with static code analysis tools (PMD)
- Working knowledge of one or more Salesforce products
- Knowledge of scripting languages (Python, JavaScript) for automation
- Experience with CI/CD concepts

Preferred Qualifications

Certifications
- Salesforce Certified Platform Developer I or II
- Salesforce Certified Administrator
- Copado Certified DevOps Engineer
- Git or DevOps-related certifications

Additional Preferred Qualifications
- Experience in healthcare or medical device industries with HIPAA compliance requirements
- Familiarity with Azure DevOps, Jenkins, or other CI/CD platforms
- Knowledge of MuleSoft integration patterns and deployment strategies
- Background in Agile/Scrum methodologies and sprint-based delivery
- Experience with monitoring and observability tools (Datadog)

At BD, we prioritize on-site collaboration because we believe it fosters creativity, innovation, and effective problem-solving, which are essential in the fast-paced healthcare industry. For most roles, we require a minimum of 4 days of in-office presence per week to maintain our culture of excellence and ensure smooth operations, while also recognizing the importance of flexibility and work-life balance. Remote or field-based positions will have different workplace arrangements which will be indicated in the job posting.

For certain roles at BD, employment is contingent upon the Company's receipt of sufficient proof that you are fully vaccinated against COVID-19. In some locations, testing for COVID-19 may be available and/or required. Consistent with BD's Workplace Accommodations Policy, requests for accommodation will be considered pursuant to applicable law.

Why Join Us?

A career at BD means being part of a team that values your opinions and contributions and that encourages you to bring your authentic self to work. It's also a place where we help each other be great, we do what's right, we hold each other accountable, and learn and improve every day.

To find purpose in the possibilities, we need people who can see the bigger picture, who understand the human story that underpins everything we do. We welcome people with the imagination and drive to help us reinvent the future of health. At BD, you'll discover a culture in which you can learn, grow, and thrive. And find satisfaction in doing your part to make the world a better place.

Florida

Job Closed
