Optimizing business performance through people, data, tech & analytics
Senior Cloud & DevOps Engineer
Location
Uruguay
Posted
52 days ago
Seniority
Senior
Job Description
Blend360
Company Description
Blend is a premier AI services provider, committed to creating meaningful impact for its clients through the power of data science, AI, technology, and people. We help organisations solve complex business challenges by combining deep domain understanding with modern data and AI capabilities. Our teams work across strategy, analytics, engineering, and product delivery to create scalable, high-value solutions that improve decision-making, efficiency, and growth.

Job Description
We are looking for an experienced Senior Cloud & DevOps Engineer to support the build and production readiness of a foundational Azure data platform for a large telecommunications client. This role will focus on provisioning and operating the core Azure infrastructure, including Azure Data Factory, Data Lake Storage, and data warehousing solutions, and on establishing the CI/CD pipelines, environment management, monitoring, and operational controls needed to take the platform through Dev, Test, and Production.

The ideal candidate will have strong expertise in Azure-native architecture, infrastructure-as-code (Terraform), release engineering, observability, and secure platform operations in regulated environments. This person will work closely with Data Engineers, BI Consultants, and Governance leads to ensure the platform is deployable, scalable, secure, and aligned with enterprise and PIPEDA compliance standards.

Responsibilities
- Design and implement Azure cloud infrastructure and deployment patterns for the data platform, including Entra ID design, subscription hierarchy, naming conventions, and tagging standards.
- Build and maintain CI/CD pipelines to support repeatable, controlled releases across Development, Test, and Production environments.
- Provision and configure Azure infrastructure as code (Terraform), including Data Factory, Data Lake, ExpressRoute/VPN, network topology, and firewall rules to connect on-premises source systems.
- Configure Azure DevOps and Databricks or Snowflake Git integration to enforce version-controlled deployments.
- Support deployment of backend services, orchestration components, data services, and front-end applications.
- Enable monitoring, logging, alerting, and telemetry for both platform health and end-user usage feedback loops.
- Define and implement operational controls for reliability, performance, scalability, and incident response.
- Implement and enforce secure access patterns using Entra ID, Azure Key Vault for secrets management, and RBAC, including column-level and row-level security controls required for PIPEDA compliance.
- Ensure the solution aligns with architecture, security, and service transition requirements.
- Support non-functional testing, release readiness, and path-to-production activities.
- Produce comprehensive operational runbooks, platform documentation, and a full IaC handover package enabling the client’s internal IT team to take ownership of platform operations at programme close.
- Support cost management, network performance tuning, and security hardening of the Azure platform; contribute to cost optimisation reporting and assist with backup and disaster recovery planning.

Qualifications
- Strong hands-on experience with CI/CD tooling and release automation.
- Experience with infrastructure-as-code using Terraform or similar tools.
- Hands-on experience deploying and operating cloud-native workloads in Microsoft Azure, including Data Factory, Databricks, Snowflake, Data Lake Storage, and Entra ID.
- Strong understanding of containerisation, serverless and managed compute services, and environment promotion strategies.
- Experience with observability tooling covering logging, monitoring, alerting, and service health.
- Knowledge of security best practices including IAM, RBAC, secrets management, and policy-driven access control.
- Experience supporting production-grade data platforms in enterprise environments, ideally in regulated sectors with compliance requirements such as PIPEDA or equivalent.
- Familiarity with Git-based workflows and collaborative engineering practices.
- Strong troubleshooting, communication, and stakeholder management skills.

Nice to Have
- Experience with specific Azure services including Azure Data Factory (including Self-Hosted Integration Runtime), Azure Databricks (Unity Catalog, Repos, Medallion architecture), Snowflake, Azure Data Lake Storage Gen2, Azure Key Vault, and Azure Monitor.
- Familiarity with Azure DevOps pipelines and Power BI deployment pipelines for dev/test/prod environment promotion of both infrastructure and BI assets.
- Experience with pipeline observability and data quality monitoring in Medallion architectures, including alerting on ingestion failures and SLA-driven orchestration schedules.
- Understanding of Canadian data privacy requirements (PIPEDA) and how they translate into platform controls such as column-level security, PII tagging, RBAC design, and audit logging in Azure and data warehouse environments.
- Experience supporting service transition into managed support models.
- Exposure to QA automation and non-functional testing in cloud-native systems.

What about languages?
Advanced English proficiency required.

How much experience must I have?
5+ years of experience in Cloud Engineering, DevOps, or Platform Engineering roles.

Additional Information
Our benefits:

Learning Opportunities:
- Certifications in AWS (we are AWS Partners), Databricks, and Snowflake.
- Access to AI learning paths to stay up to date with the latest technologies.
- Study plans, courses, and additional certifications tailored to your role.
- Access to Udemy Business, offering thousands of courses to boost your technical and soft skills.
- English lessons to support your professional communication.

👩‍🏫 Mentoring and Development:
- Career development plans and mentorship programs to help shape your path.

🎁 Celebrations & Support:
- Special day rewards to celebrate birthdays, work anniversaries, and other personal milestones.
- Company-provided equipment.

⚖️ Flexible working options to help you strike the right balance.

Other benefits may vary according to your location in LATAM. For detailed information regarding the benefits applicable to your specific location, please consult with one of our recruiters.

So what are the next steps?
Our team is eager to learn about you! Send us your resume or LinkedIn profile below and we’ll explore working together!
More DevOps Engineer Jobs
• Define and drive the DevOps vision using Agile best practices
• Set direction, standards, and best practices for the team
• Lead the design of scalable, secure, and reliable infrastructure and delivery pipelines
• Establish and maintain CI/CD pipelines for multiple applications and services
• Align DevOps initiatives with engineering, product, and business goals
• Ensure high-quality engineering is demonstrated across the team
• Design, deploy, and maintain cloud infrastructure (Azure)
• Mentor engineers and promote knowledge sharing
• Facilitate clear communication between the different departments
• Advocate DevOps culture across the teams, shifting left wherever possible
Site Reliability Operations Analyst II
Centene Corporation
Transforming the health of the communities we serve, one person at a time.
• Collaborate with cross-functional teams to proactively monitor, maintain, and enhance production systems.
• Bridge the gap between software development and IT operations, with an emphasis on incident triage, systems operations and documentation, business user support and communications, toil reduction, work automation, and improving the reliability of services.
• Provide timely and accurate user support and troubleshooting, and manage escalations within enterprise SLAs (Service Level Agreements).
• Research and analyze enrollment/provider data and Utilization Management/Case Management business processes to support application issues while acquiring a more advanced skill set to assist with product design and solutions.
• Perform on-call post-deployment application validations of enhancements and code fixes, and monitor for escalations.
• Act as a key player in application incident management, ensuring prompt detection, triage, and resolution of production incidents.
• Monitor and manage technology business cycle processing within the application environment.
• Establish, document, and refine incident management processes to reduce downtime and service degradation.
• Participate in post-incident reviews, ensuring that lessons learned are incorporated into operational practices and automation efforts.
• Develop, improve, and review runbooks and documentation, and keep them up to date to ensure consistency across SRE teams.
• Document operational processes including workflows, system configurations, troubleshooting guides, and incident reports to improve the team’s ability to respond quickly to incidents and system failures.
• Review and provide feedback and process improvement recommendations on topics like automation, monitoring gaps, toil reduction, and systems resiliency.
• Comply with all policies and standards.
TL;DR
We're hiring a Site Reliability Engineer to own and evolve deepset's cloud and customer infrastructure end to end. You'll work across SaaS, private cloud, and on-prem environments to make our self-hosted platform production-ready, drive CI/CD and GitOps maturity, and reduce complexity at scale. Your work will directly shape how deepset's AI platform is built, deployed, and scaled for our own cloud and for customers running it in their own environments.

Why deepset
At deepset, we’re on a mission to make custom AI solutions accessible to every organization. With Haystack, thousands of developers build advanced LLM applications every day, while our enterprise-ready AI Platform helps companies turn large language models into business value. We’re remote-first, flexible, and built on a culture of trust and ownership. You’ll collaborate with top-tier tech talent, tackle meaningful challenges, and help transform complex AI into solutions that are simple, powerful, and ready for the real world.

What you will do
You won’t just “keep things running” - you’ll help define how our platform is built, deployed, and scaled across cloud and customer environments.
- Build and operate real-world infrastructure: Design, configure, and evolve infrastructure that runs both in our cloud and inside customer environments (SaaS, private cloud, on-prem).
- Make self-hosted production-ready: Help us deliver a production-grade, self-hosted platform that can be deployed on any Kubernetes setup in weeks - not months.
- Drive automation & platform maturity: Improve CI/CD pipelines, GitHub workflows, and GitOps setups so teams can ship faster with confidence.
- Reduce complexity and cost: Continuously simplify systems and optimize infrastructure spend without compromising performance or reliability.
- Shape how we build: Champion best practices in reliability, scalability, and security across the organization, not as rules, but as working systems.

Requirements
- 2-5 years of experience working with large-scale production infrastructure
- Fluent German language skills
- Experience with distributed or service-oriented architectures
- Hands-on expertise with:
  - AWS
  - Kubernetes
  - CI/CD and GitOps (e.g. ArgoCD)
- Working knowledge of Infrastructure as Code (Terraform preferred)
- Solid troubleshooting skills - you can debug across systems, not just within one layer
- A pragmatic mindset: you balance speed, simplicity, and reliability
- Ownership and accountability - you take responsibility for systems end-to-end
- Ability to work independently while staying aligned with the team’s goals

Nice to have
- Familiarity with observability stacks (e.g. Datadog, Prometheus)
- Experience optimizing cloud costs at scale
- Interest or experience in Machine Learning / LLM systems
- Experience improving developer experience and platform tooling using AI agents
- Contributions to SRE practices like postmortems, SLIs/SLOs, and reliability engineering culture

Benefits
- Remote-first setup with flexible hours & tech of your choice
- 30 days vacation + extra days for family sick leave
- Competitive salary & stock options for every team member
- Monthly sports & mental health support allowance with Oliva
- Annual learning & development budget
- Monthly team socials & in-person meetups
- Dog-friendly Berlin HQ



