Effectual

Cloud Confidently®

Senior VMware Engineer

Location: United States

Posted: 10 days ago

Salary: $155K - $165K / year

Seniority: Senior

Job Description

Execute VMware Enterprise Migrations

  • Execute end-to-end VMware datacenter migrations using AWS Migration Services (MGN), CloudEndure, and HCX
  • Perform comprehensive VMware workload assessments, dependency mapping, and application discovery
  • Implement migration strategies aligned with the 7Rs framework (Rehost, Replatform, Refactor, Repurchase, Retire, Retain, Relocate)
  • Deploy and operate migration factory patterns for VMware datacenter exits with 200-500 VMs per wave
  • Execute wave-based migration planning with minimal business disruption and tested rollback procedures
  • Migrate complex VMware environments including vMotion transitions, NSX networking conversions, and vSAN/VMFS storage migrations

Build Migration Infrastructure

  • Build AWS landing zones and migration target environments following Well-Architected Framework principles
  • Implement hybrid connectivity solutions (Direct Connect, VPN, Transit Gateway) for migration networks
  • Configure AWS migration tools: MGN replication servers, DMS endpoints, Migration Hub tracking
  • Convert VMware NSX micro-segmentation policies to AWS Security Groups and Network ACLs
  • Migrate VMware vSAN/VMFS/NFS storage to appropriate AWS storage services (S3, EBS, EFS, FSx)
  • Implement disaster recovery and business continuity architectures post-migration
  • Conduct post-migration validation, performance testing, and optimization

Automate Migration Workflows

  • Develop Infrastructure-as-Code using Terraform and CloudFormation to replicate VMware infrastructure patterns in AWS
  • Create Python and PowerShell automation scripts for mass VM migration, validation, and cutover procedures
  • Build CI/CD pipelines for automated testing and deployment of migrated workloads
  • Automate post-migration tasks: tagging, monitoring setup, backup configuration, security hardening
  • Implement migration runbooks and standard operating procedures for repeatable execution

Enable Client Success

  • Provide hands-on technical guidance to client VMware and cloud engineering teams during migration execution
  • Lead migration workshops, cutover planning sessions, and incident response during maintenance windows
  • Troubleshoot complex VMware-to-AWS migration challenges (replication failures, networking issues, application dependencies)
  • Create comprehensive migration documentation: runbooks, architecture diagrams, lessons learned
  • Support post-migration hypercare and stabilization activities

Job Requirements

  • 8-10+ years of IT implementation or cloud engineering experience
  • 5+ years of VMware infrastructure experience with vSphere, vCenter, ESXi, and NSX
  • 5-6+ years developing AWS cloud infrastructure
  • 3+ years hands-on VMware to AWS migration experience using MGN, CloudEndure, or HCX
  • 2+ years of infrastructure as code experience (Terraform preferred)
  • 2+ years of DevOps toolchain experience (e.g., GitHub, GitLab, AWS Code* Suite)
  • AWS experience within the past year
  • Hands-on proficiency with primary AWS services (Compute, Storage, Networking, RDS)
  • Strong track record implementing AWS migration services as VMware workload targets
  • Experience executing VMware to AWS migrations at scale (200-500 VMs per project)
  • Proven ability to execute migration factory patterns with wave planning and rollback procedures
  • Experience with VMware datacenter operations transitioning to AWS cloud operations
  • Deep hands-on experience with AWS migration tools (MGN, CloudEndure, DMS, Migration Hub)
  • Working knowledge of VMware HCX for hybrid migrations and live vMotion to AWS
  • Proficient in developing Infrastructure-as-Code (e.g., Terraform / AWS CloudFormation)
  • Proficient with configuration management tooling (e.g., Ansible, Chef, Puppet)
  • Proficient in Python, PowerShell, and Bash scripting for migration automation
  • Demonstrable knowledge of Agile project methodologies
  • Ability to work with multiple clients in parallel
  • Strong attention to detail in complex migration execution
  • Excellent troubleshooting and problem-solving skills
  • Experience with VMware Site Recovery Manager (SRM) and disaster recovery migrations
  • VMware Tanzu to AWS ECS/EKS containerization migrations
  • AWS Database Migration Service (DMS) for SQL Server, Oracle, PostgreSQL, MySQL migrations
  • Compliance framework experience (PCI-DSS, HIPAA, SOC 2) in migration contexts
  • FinOps practices and AWS cost optimization with demonstrated VMware TCO improvements
  • Multi-cloud experience (AWS, Azure, GCP) and comparative migration approaches
  • SAP on VMware to SAP on AWS migration experience
  • Kubernetes migration experience from VMware to AWS EKS

Benefits

  • Medical, dental, and vision insurance
  • Short-term disability, long-term disability, and life insurance
  • 401(k) with company match
  • Paid time off (PTO): 120 hours, accrued over one year
  • Paid time off for major holidays (14 days per year)
  • These and any other employee benefit offerings are subject to management’s discretion and may change at any time.
