Zocdoc is the beginning of a better healthcare experience for millions of patients every month.
Staff Enterprise and Cloud Engineer
Location
United States
Posted
2 days ago
Salary
$180K - $270K / year
Seniority
Lead
Job Description
Zocdoc
Role Description
Zocdoc’s greatest asset is its people. As a Staff Cloud IAM Engineer on our Corporate Cloud Engineering team within Corporate IT, you’ll make it possible for every Zocdoc’r to work securely and efficiently.
- Own the technical vision and strategy for identity and access management across our corporate stack, with Microsoft Entra ID, enterprise SSO/SCIM, and our SaaS and AI platforms at the center.
- Design scalable identity governance that keeps teams productive while reducing risk.
- Lead cross-functional initiatives that make secure, least-privilege access the default, not an afterthought.
- Play a key role in the reliability and security of our core corporate infrastructure.
- Help ensure our AWS/Azure/GCP environments, on-prem VMware footprint, and foundational services are patched, healthy, and well-run.

Qualifications
- Deeply fluent in Microsoft Entra ID (Identity Governance, Access Packages), SSO/SCIM standards (SAML, OIDC), and custom integrations for a diverse SaaS and AI estate.
- Excited to scale AI platforms like OpenAI and Anthropic through thoughtful RBAC, tiered spend/quota governance, and secure, consumable access patterns.
- Comfortable working the access queue to identify patterns, with a relentless focus on building the automation and self-service tools that retire repetitive manual work.
- A cross-functional partner who models Staff-level behaviors by mentoring engineers, aligning stakeholders, and setting the technical standards that drive adoption across the organization.
- An outcome-driven leader who brings humility, curiosity, and a sense of humor to solving challenging problems in a growing, high-scale environment.

Requirements
- Track record leading identity or enterprise platform initiatives at a multi-thousand-employee organization, with measurable outcomes (toil eliminated, audit findings reduced, time-to-access shortened, or comparable business metrics).
- Demonstrated ability to drive adoption of standards across teams through RFCs, design reviews, and architectural pattern-setting.
- 10+ years in IT/Systems (mid-to-large scale) as a "player-coach" with a proven track record of defining adoption-ready standards and writing the design docs/RFCs that become the organization’s source of truth.
- Deep expertise in Microsoft Entra ID (Conditional Access, PIM, Identity Governance) and the ability to own the entire identity lifecycle, including onboarding/offboarding flows and permission hygiene.
- Extensive experience delivering SSO and SCIM integrations (SAML, OIDC/OAuth) across a massive SaaS estate, with a focus on replacing manual access work with programmatic or self-service provisioning.
- A systems-thinker comfortable being measured by toil eliminated; expert at automating workflows across IdP, HRIS (Workday), and SaaS platforms via APIs to remove repetitive manual tasks.
- Experience governing IAM, spend, and quotas for AI platforms (OpenAI, Anthropic) and fluency in using Generative AI tools (Claude Code, LLMs) to accelerate engineering velocity.
- Experience in audit-sensitive environments (HITRUST/SOC2 evidence collection) and owning the security hygiene of the identity certificate and token lifecycle.
- Familiarity with the broader endpoint and security ecosystem, including Intune, Jamf, Google Workspace, and CrowdStrike, to ensure a cohesive identity posture across all platforms.
- Hands-on experience with AWS infrastructure and networking primitives (VPC, DNS, Load Balancing) to debug connectivity, utilizing AWS CDK, Terraform, Python, or PowerShell for automation.

Benefits
- Remote Base Salary Range: $180,000 - $270,000 USD.