Job Closed
This listing is no longer active.
Top Hat's dynamic courseware and AI-enhanced features empower educators to give students personalized learning.
Platform Engineer
Location
Canada
Posted
83 days ago
Salary
0
Seniority
Senior
Job Description
Platform Engineer
Top Hat
- Build and operate platform-level shared services and capabilities, such as continuous integration, continuous deployment, infrastructure automation, and monitoring.
- Extend our reusable service template and its associated CI/CD tooling.
- Lead efforts to further mature our cloud infrastructure and platform offering as our business grows.
- Build and extend our production observability, helping teams manage and achieve SLOs.
- Scale continuous deployment practices across the engineering department, equipping teams to ship software reliably and frequently with minimal friction.
- Provide reusable patterns and guidance to enable DevOps practices (end-to-end production ownership within each cross-functional team). This means working with tools and infrastructure in addition to shaping processes and practices.
- Coach product teams on operational ownership and teach blame-free root cause analysis for incidents that impact customers or delivery performance.
- Mentor and support platform team members as they acclimate to the domain. You will also help new and junior developers learn how we work (day-to-day routines, best practices, and patterns) while tactfully providing both positive and constructive feedback.
- Participate in team on-call practices and support rotations.
Job Requirements
- An advocate for DevOps practices and principles. We have the buy-in, and now it’s about continuing to execute the vision.
- Experienced in cloud infrastructure and tooling (AWS, Terraform), with the ability to design and operate services in the cloud.
- Knowledgeable in CI/CD tooling (GitHub Actions), with experience maintaining large multi-stage pipelines.
- Hands-on with infrastructure as code (Terraform), Docker, and observability tooling (Honeycomb).
- Proficient in at least one software development language. While your skills are transferable, familiarity with Python, used for most of our platform work, will give you a strong head start.
- A collaborative team player who can also independently drive mostly self-contained projects when required.
Benefits
- A noble mission that creates meaningful, fulfilling work
- A team that cares deeply for customers and for each other
- Flexible, remote-first work environment
- Professional learning and development for all role levels
- An awesome and welcoming Toronto HQ
- Competitive health benefits that start on day one
- A management team focused on performance, growth, engagement and connection
- Our winning strategy and market potential
- Innovative PTO policy with lots of time and space for self-care
- Passionate customers that believe in us—and what we do
- A chance to work with new tech like generative AI—and see the customer impact
More Platform Engineer Jobs
Power Platform Developer – Power BI Focus
Burwood Group
IT Consulting, Integration, and Managed Services Firm
- Design, develop, and enhance Power BI reports, dashboards, and datasets based on business requirements
- Build and optimize data models using DAX and Power Query (M)
- Implement row-level security (RLS) and ensure appropriate access controls
- Improve report performance, refresh reliability, and usability
- Translate loosely defined requirements into clear, actionable analytics
- Integrate Power BI with SharePoint, Excel, Power Apps, Power Automate, and Dataverse as required
- Support embedded Power BI and cross-workspace reporting scenarios
- Provide recommendations on Power Platform best practices and architecture
- Collaborate with business users, analysts, and technical teams during the contract period
- Provide documentation, knowledge transfer, and handoff at project milestones
- Troubleshoot and resolve Power BI–related issues efficiently
Company Overview
Throughout our worldwide network of experts, clients and communities, we are renowned for our leadership in fire protection engineering – a legacy of responsibility we have proudly upheld since 1939. Today, our expertise extends broadly across closely related security and risk-based fields – from accessibility consulting and risk analysis to process safety, forensic investigations, security risk consulting, emergency management, digital innovation and more. Our engineers and consultants collaborate to solve complex safety and security challenges, ensuring our clients can protect what matters most. For over 80 years, we have helped mitigate risks that threaten lives, property and reputations. Through technology, expertise and industry-leading research, we remain dedicated to our purpose of making our world safe, secure and resilient.
At Jensen Hughes, we believe that creating and sustaining a culture of trust, integrity and professional growth starts with putting our people first. Our employees are our greatest strength, and we value the unique perspectives and talents they bring to our organization. Our wide range of Global Employee Networks connect people from across the organization, supporting career development and providing forums for individuals to share experiences on topics they're passionate about. Together, we are cultivating a connected culture where everyone has the opportunity to learn, grow and succeed together.
Job Overview
We’re hiring a Platform Engineer to build and operate our cloud platform on AWS, with Terraform as the infrastructure control plane. You will design and implement Infrastructure CI/CD and PR-driven infrastructure delivery (GitOps principles) for cloud infrastructure and platform configuration (Git as source of truth, automated checks, controlled applies, strong auditability).
You’ll also own platform-grade observability (Splunk preferred) and help enable secure, production-ready agentic AI capabilities using Amazon Bedrock and Bedrock AgentCore, partnering with application teams to establish reliable patterns and guardrails.
Responsibilities:
AWS platform engineering (multi-account)
- Design, build, and operate secure, reliable AWS foundations across a multi-account AWS environment (AWS Organizations / Control Tower where applicable), including networking, IAM, KMS, secrets, tagging, and shared services
- Establish scalable patterns for compute, storage, and networking; enable repeatable environments across dev/stage/prod
- Improve developer experience through standards, templates, and clear platform documentation
Terraform (deep expertise)
- Own Terraform architecture end-to-end: module strategy, state design, environment separation, provider/version management.
- Build and maintain a production-grade Terraform SDLC:
  - PR-driven workflows with plan previews, approvals, and promotion across environments
  - Controlled apply mechanisms with audit trails and rollback plans
  - Drift detection and safe reconciliation strategy
  - Import/migration/refactor patterns without downtime
- Implement baseline guardrails (tagging, encryption, access controls) as code wherever feasible
Infrastructure CI/CD + PR-driven infrastructure delivery (GitOps principles)
- Implement PR-driven infrastructure delivery using GitOps principles (not Kubernetes-only):
  - Git as the source of truth; PRs as change requests
  - Automated validation/testing/security checks on every change
  - Safe promotion model (dev → stage → prod) with appropriate gates
  - Controlled applies for production (approval gates / break-glass procedures), with full traceability
- Standardize pipelines in the team’s primary CI/CD platform (GitHub Actions) and integrate with existing enterprise tooling where needed
- Establish repo structure, branching strategy, and operational runbooks for the infrastructure delivery workflow
Observability (Splunk preferred)
- Own the Splunk observability operating model: dashboards, alerting standards, SLOs/SLIs, runbooks, and on-call readiness
- Build/operate telemetry pipelines for reliability and cost efficiency (noise reduction, sampling/cardinality strategies, retention and routing).
- Partner with application teams to improve visibility, reduce MTTR, and drive incident learnings into platform improvements
Agentic AI enablement (Amazon Bedrock + AgentCore)
- Partner with engineering teams to enable agentic AI use cases using Amazon Bedrock and AgentCore (tool integration patterns, secure operation, production readiness)
- Help establish foundational patterns for agent deployment and operations (environments, permissions, observability, and evaluation/reliability practices) aligned to enterprise controls
Operational excellence
- Participate in incident response; lead postmortems and drive systemic, preventive fixes
- Measure and improve platform reliability, security posture, and cost efficiency over time
Requirements (must have):
- 8–10 years of experience in Platform Engineering / SRE / DevOps (or equivalent experience delivering platform outcomes)
- AWS expertise, including multi-account patterns (AWS Organizations / Control Tower preferred), networking, IAM/security, and operations
- Terraform expert with proven ownership of org-scale infrastructure-as-code (modules, state, CI controls, large refactors)
- Proven experience designing Infrastructure CI/CD and PR-driven infrastructure delivery (GitOps principles) for Terraform and cloud configuration:
  - PR-based automation with plan previews and security/policy checks
  - Controlled apply processes with approvals and auditability
  - Environment promotion patterns and rollback strategies
- Strong production experience with observability platforms such as Splunk, Datadog, Grafana, or Dynatrace, including building and operating dashboards, alerting standards, and telemetry pipelines (logs/metrics/traces) in production
- Strong Linux and troubleshooting skills; proficiency in automation (Python or Go preferred)
Preferred Qualifications:
- Experience building agentic AI solutions using Amazon Bedrock Agents and/or Amazon Bedrock AgentCore (deployment/operations, tool integration patterns)
- OpenTelemetry at scale (standards, collectors/gateways, sampling, correlation across logs/metrics/traces)
- Policy-as-code experience (Conftest/Sentinel or similar) applied to Terraform and platform guardrails
- Experience building an Internal Developer Platform (IDP) / self-service workflows (golden paths, templates, paved roads).
- Databricks on AWS platform support (workspace/cluster policies, reliability, cost controls; Unity Catalog familiarity a plus)
Please note that the salary range provided is a good faith estimate for the position at the time of posting and not a guarantee of compensation. Final compensation may vary based on factors, including but not limited to, responsibilities of the job, education, experience, knowledge, skills, and abilities, geographic location, internal equity, alignment with market data.
Jensen Hughes offers a competitive total rewards package, which includes a retirement plan, healthcare coverage, and a broad range of other benefits. Incentives and/or benefit packages may vary depending on the position and location.
National Pay Range
$100,000–$125,000 USD
Jensen Hughes is an Equal Opportunity Employer. Qualified candidates will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status.
At Jensen Hughes, we embrace innovation and understand that people are increasingly using artificial intelligence (AI) tools like ChatGPT and other generative platforms to learn, prepare and communicate. We have provided some guidelines regarding the responsible use of AI in the recruitment process. Please click here to review.
The security of your personal data is important to us. Jensen Hughes has implemented reasonable physical, technical, and administrative security standards to protect personal data from loss, misuse, alteration, or destruction. We protect your personal data against unauthorized access, use, or disclosure, using security technologies and procedures, such as encryption and limited access. Only authorized individuals may access your personal data for the purpose for which it was collected, and these individuals receive training about the importance of protecting personal data. Jensen Hughes is committed to compliance with all relevant data privacy laws in all areas where we do business, including, but not limited to, the GDPR and the CCPA. Additionally, our service providers are contractually bound to maintain the confidentiality of personal data and may not use the information for any unauthorized purpose.
*Policy on use of 3rd party recruiting agency for direct placements
Jensen Hughes will occasionally augment a recruiting search through agencies for certain positions when business conditions warrant. Jensen Hughes will not accept resumes, inquiries or proposals from recruiting agencies as an acceptable method to consider a candidate. 3rd party recruiting agencies must sign a standard Jensen Hughes agreement after being evaluated and accepted by a Human Resources or Talent Acquisition manager, or member of the talent acquisition team. Hiring managers and employees of Jensen Hughes are not authorized to accept resumes, engage in fee-based searches through recruiting firms or sign a search agreement. Please note this policy does not apply to “staffing firms” or firms that are involved with hiring temporary staff. Any recruiting agency interested in being considered may contact our recruiting team at jensenhughesrecruiting.com.
- Build and operate cloud platform on AWS using Terraform
- Design and implement Infrastructure CI/CD and PR-driven infrastructure delivery
- Own platform-grade observability
- Enable secure, production-ready agentic AI capabilities
Platform Engineer – Kubernetes, GitOps, AI interest
Cloudiax
Global Business Cloud provider for SAP B1, Cloud Infrastructure, AI Server & more - made in Germany, available worldwide
- You act as the interface between development and cloud administration
- Your responsibility is to design our Kubernetes platform and deployment processes so developers can work efficiently, reproducibly, and with standardized workflows
- You work closely with development, infrastructure, and security teams to provide a reliable platform
- Enhance and operate our Kubernetes platform in on‑premise data centers, ensuring stability, scalability, and performance
- Build and optimize self‑service processes that enable developers to perform efficient, standardized deployments
- Optimize resource usage (CPU, RAM, storage, GPU) and develop fair workload strategies
- Operate and evolve our PostgreSQL platform, including high availability, backup/recovery, and performance tuning
- Automate infrastructure and deployment tasks using Terraform, Ansible, or similar tools
- Build and run modern CI/CD and GitOps processes (e.g., Argo CD, Flux)
- Define platform standards, templates, and best practices to ensure a consistent developer experience
- Integrate and operate authentication and authorization systems (Keycloak) as well as API gateways (Kong)
- Ensure multi‑tenancy, isolation, and compliance with security requirements
- Implement and operate monitoring, logging, and tracing solutions (Prometheus, Grafana, OpenTelemetry)
- Develop intelligent scaling mechanisms (e.g., HPA, custom metrics) beyond classical CPU/RAM metrics
- Support development teams in using the platform and continuously improve the developer experience
- Serve as the interface between development and cloud administration: gather requirements, improve workflows, and provide feedback
- Bring your interest in AI topics and explore automation and orchestration opportunities



