Digital Signage Solutions you can rely on! Cost-effective solutions that adapt to your business needs.
Senior Cloud Engineer, Cloud Services
Location
Canada
Posted
128 days ago
Salary
$175K - $200K / year
Seniority
Senior
Job Description
Xibo Open Source Digital Signage
• Maintain the Cloud (AWS) environment, support, and operations.
• Assist in architecting, designing, and implementing cloud-based IaaS, PaaS, and SaaS solutions.
• Design, develop, deploy, and configure modular cloud-based systems that are highly available, scalable, and secure.
• Create and update support documentation and standards.
• Develop secure, automated methodologies for deployment activities, configuration management, supporting systems, and business processes.
• Work in an agile environment, planning features and prioritizing changes based on team goals, product usage, and feedback.
• Be accountable for accomplishing tasks both independently and collaboratively.
• Communicate with leaders, stakeholders, and users on process changes and proposals for future improvements.
• Define and document best practices and strategies for the deployment and maintenance of infrastructure and applications.
• Provide guidance, through leadership and mentorship, to Cloud, Development, and Business teams to build their cloud competencies.
• Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues, both proactively and based on report logs.
• Manage cloud environments in accordance with company policy, standards, and best practices.
• Regularly review existing systems and make recommendations for improvements.
• Create CloudFormation templates.
• Manage database infrastructure.
Job Requirements
- AWS Certified Solutions Architect - Professional
- 2+ years experience with DMS
- 5+ years specializing in AWS Cloud Services Operations
- Experience with Cloud Networking, Cloud WAN
- 4+ years experience with AWS Serverless
- 3+ years experience deploying with CI/CD, GitLab, and CloudFormation
- 3+ years scripting in languages such as Bash and Python
- 5+ years experience with developing scripts to automate infrastructure fulfillment tasks
Benefits
- medical (vision care) and dental benefits
- life insurance
- paid disability leave
- defined contribution pension plan with company matching
- vacation time
- tuition reimbursement
- paid sick leave time annually
More Cloud Engineer Jobs
Cloud Engineer
Leidos
Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.
• Build and maintain products focused on application development lifecycle automation across multiple cloud environments, including AWS, Azure, and GCP.
• Apply best practices and best-of-breed services within and across multiple CSPs to provide parity and cohesive integration within the customer’s hybrid cloud enterprise.
• Work with the cloud, security, testing, and application teams to build orchestration pipelines.
• Build, support, and maintain CI/CD pipelines that conduct multiple deployments a day.
• Contribute to a culture of continuous improvement by establishing and participating in feedback loops.
• Meet with application teams on a regular cadence to provide support and identify opportunities for optimization.
• Develop strong relationships and partnerships with the other CMS Hybrid Cloud Product Teams.
• Continuously refresh infrastructure components such as EC2 instances, CloudFormation stacks, Kubernetes clusters, and containers.
• Promote strong quality control practices for all managed services and maintain robust documentation.
• Work independently to resolve highly complex problems using significant application of technical knowledge, conceptualizing, reasoning, and interpretation.
GPU Cloud Platform Engineer
Yotta Labs
Building the Decentralized OS for AI Optimization and Orchestration at Planet Scale
Location: Remote (Global)
Type: Full-time
Company: Yotta Labs
Apply: careers@yottalabs.ai

🧠 About Yotta Labs
Yotta Labs is pioneering the development of a Decentralized Operating System (DeOS) for AI workload orchestration at a planetary scale. Our mission is to democratize access to AI resources by aggregating geo-distributed GPUs, enabling high-performance computing for AI training and inference on a wide spectrum of hardware, from commodity to high-end GPUs. Our platform supports major large language models (LLMs) and offers customizable solutions for new models, facilitating elastic and efficient AI development.

🛠️ Role Overview
We are seeking a GPU Cloud Platform Engineer to join our core infrastructure team and help build the next-generation AI compute cloud. In this role, you will design, deploy, and operate large-scale, multi-cluster GPU infrastructure across data centers and cloud environments. You will be responsible for ensuring high availability, performance, and efficiency of containerized AI workloads (ranging from LLMs to generative models) deployed in Kubernetes-based GPU clusters. If you're passionate about high-performance systems, distributed orchestration, and scaling real-world AI infrastructure, this role offers a unique opportunity to shape the backbone of our AI cloud platform.

🎯 Responsibilities
• Build and operate large-scale, high-performance GPU clusters; ensure stable operation of compute, network, and storage systems; monitor and troubleshoot online issues.
• Conduct performance testing and evaluation of multi-node GPU clusters using standard benchmarking tools to identify and resolve performance bottlenecks.
• Deploy and orchestrate large models (e.g., LLMs, video generation models) across multi-cluster environments using Kubernetes; implement elastic scaling and cross-cluster load balancing to ensure efficient service response under high concurrency for global users.
• Participate in the design, development, and iteration of GPU cluster scheduling and optimization systems.
• Define and lead Kubernetes multi-cluster configuration standards; optimize scheduling strategies (e.g., node affinity, taints/tolerations) to improve GPU resource utilization.
• Build a unified multi-cluster management and monitoring system to support cross-region resource monitoring, traffic scheduling, and fault failover.
• Collect key metrics such as GPU memory usage, QPS, and response latency in real time; configure alert mechanisms.
• Coordinate with IDC providers for planning and deploying large-scale GPU clusters, networks, and storage infrastructure to support internal cloud platforms and external customer needs.

✅ Qualifications
• Designing and deploying AI agents using Google’s Gemini models
• Using GCP’s Vertex AI as the primary ML platform (model hosting, pipelines, endpoints)
• Building with Google’s agent tooling: the Agent Development Kit (ADK) for constructing agent logic, and the Managed Agents API for running/orchestrating them at scale
• Architecting and implementing infrastructure as code (IaC).
• Defining administrative choices for environments that are auto-configured by IaC.
• Focus on agent use cases, including relationships with BigQuery databases, enterprise connectors, and agent orchestration.
• Heavy leverage of Vertex.
• Investigating and tweaking designs to avoid issues with Pfizer’s enterprise infrastructure and GCP limitations.
• Lead the design and architecture of end-to-end integration solutions using Oracle Integration Cloud (OIC), including app-driven orchestrations, scheduled integrations, file-based integrations, event-driven architectures, REST/SOAP APIs, and B2B integrations.
• Define integration standards, reference architectures, governance models, reusable patterns, and best practices to ensure consistency, scalability, and maintainability across the enterprise.
• Collaborate with business stakeholders, solution architects, and technical teams to gather requirements, analyze integration needs, perform gap analysis, and translate business processes into robust technical designs.
• Architect and implement complex integrations, data mappings, transformations (using XSLT, lookups, and canonical models), adapters, and packages in OIC.
• Oversee integration with Oracle SaaS applications, legacy systems (including Oracle E-Business Suite), and external platforms, ensuring data integrity, security, and performance.
• Provide technical leadership on migration strategies from on-premises middleware (e.g., SOA) to OIC, hybrid cloud integrations, and optimization of existing integration landscapes.
• Conduct design reviews, mentor development teams, troubleshoot complex issues, and support testing, deployment, and post-production monitoring.
• Stay current with Oracle Integration Cloud updates, new features (such as enhanced observability, projects, file server capabilities, AI agents, and process improvements in recent releases), and industry trends to recommend proactive enhancements.
• Ensure compliance with security protocols, data governance, and regulatory requirements in all integration designs.




