vCluster Labs

vCluster Labs is a venture-backed tech startup headquartered in San Francisco, California, with a distributed, remote-first team spanning eight time zones. Founded following a Seri

AI Infrastructure Engineer

Location

United States

Posted

72 days ago

Salary

$150K - $200K / year

Seniority

Mid Level

Job Description

Role Description

As vCluster’s AI Infrastructure Specialist, you will work directly with customers at the earliest and most critical stage of their journey: from bare metal GPU nodes through to a production-ready deployment. This is not a traditional professional services role; you operate pre-sale as part of a proof of value engagement scoped to reach production. You will be one of the first team members a neocloud or AI Factory engages with at a technical depth, and the playbooks you develop will scale the motion for the next hire and customer.

vCluster is gaining rapid traction with GPU AI Clouds and enterprises building AI Factories: organizations that need to offer Kubernetes as a managed service on bare metal GPU infrastructure, and need to do it fast. This role exists to make that happen.

- Lead Technical Deployments: Drive end-to-end technical deployments for GPU neocloud and AI Factory customers, from initial bare metal configuration to a validated vCluster environment.
- Infrastructure Optimization: Configure and troubleshoot bare metal GPU node infrastructure, including CNI configuration, GPU Operator setup, distributed storage backends, and RDMA/InfiniBand.
- Validation: Deploy and validate Kubernetes and vCluster to provide GPU-powered managed K8s.
- Knowledge Transfer: Work alongside customer teams to build self-sufficiency, ensuring they can operate and grow the platform independently.
- Scaling through Documentation: Document reusable playbooks and deployment architectures so your learnings become the next customer's head start.
- Feedback Loop: Collaborate with Engineering and Product to surface recurring infrastructure challenges, acting as a direct feedback loop from the field into the roadmap.
- Strategic Partnering: Join Sales in the pre-sales process where deep infrastructure work is required to achieve a meaningful proof of value.

Qualifications

- Production K8s Mastery: 5+ years of experience deploying and operating Kubernetes in production, ideally on bare metal or in high-complexity environments.
- GPU Fluency: Practical knowledge of NVIDIA GPU Operators, CUDA tooling, and systems-level configuration for GPU nodes.
- Networking Fundamentals: Deep understanding of CNI plugins, overlay networks, load balancing, and connectivity diagnosis in layered environments.
- Storage Expertise: Experience with persistent volume configuration, CSI drivers, and distributed systems like Ceph, Rook, Weka, or Longhorn.
- Operational Agility: Comfort operating in ambiguous, fast-moving environments where you are often writing the playbook in real time.
- Modern Tech Mindset: You thrive in environments that reject legacy tech and prefer a modern stack where you can solve a variety of problems from pipelines to internal services.

Requirements

Bonus points for:

- Automation Skills: Experience writing automation scripts with Bash, Python, or Go.
- Kubernetes Depth: Relevant certifications such as CKA (Certified Kubernetes Administrator) or experience writing Kubernetes Operators.
- AI/ML Familiarity: Experience with inference serving, GPU scheduling, and the tooling around LLM deployment.
- Documentation: Experience building AI Automation in documentation to contribute to a shared knowledge base.

Benefits

- Competitive Salary: We offer a competitive compensation package, including equity.
- Platinum-Level Insurance: Health, dental, vision, and life insurance, including plans for you and eligible dependents (benefits vary depending on country).
- Flexible Working Schedule: You have a doctor’s appointment or need to head to the supermarket to get groceries at 2pm? We won’t have an issue with that. To us, results matter more than clocking in and out at the same time every day.
- Workplace Flexibility: We’re very flexible about where you work. We know things can change in life and we’re happy to adjust the work environment for you along the way.

More Infrastructure Engineer Jobs

Senior Infrastructure Engineer, IT Mergers and Acquisitions

Salesforce

👋 We're Salesforce, the customer company. CRM + Data + AI + Trust.

Full Time | Hybrid | Team 10,001+ | Since 1999 | H1B Sponsor

Title: Senior Infrastructure Engineer, IT M&A

Location:
- New York - New York
- Illinois - Chicago
- Colorado - Denver
- Georgia - Atlanta
- Washington - Seattle
- Indiana - Indianapolis

Hybrid
Full time

Job Description: To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

Job Category: Enterprise Technology & Infrastructure

Job Details

About Salesforce

Salesforce is the #1 AI CRM, where humans with agents drive customer success together. Here, ambition meets action. Tech meets trust. And innovation isn't a buzzword - it's a way of life. The world of work as we know it is changing and we're looking for Trailblazers who are passionate about bettering business and the world through AI, driving innovation, and keeping Salesforce's core values at the heart of it all. Ready to level-up your career at the company leading workforce transformation in the agentic era? You're in the right place! Agentforce is the future of AI, and you are the future of Salesforce.

As an Infrastructure Engineer within our Business Technology (BT) organization, you will play a critical role in the evolution of our global corporate IT environment. You will be responsible for assessing, migrating, and integrating diverse collaboration and infrastructure technologies during company acquisitions. This is a highly cross-functional role where you will partner with security, engineering, and business teams to provide expert recommendations and execute technical integrations that drive our collective success.

What You'll Do (Responsibilities):

- Collaborative Assessments: Conduct comprehensive audits of an acquisition's infrastructure (identity, compute, storage, network, and endpoints) to identify integration paths.
- Strategic Recommendations: Develop and present high-level solution designs and integration strategies, providing clear insights into scope, complexity, and risk.
- Execution & Migration: Lead the hands-on execution of technology migrations for email, calendar, and file storage, ensuring a smooth transition for new team members.
- Standard Setting: Contribute to the continuous improvement of our integration playbooks and infrastructure standards.
- Stakeholder Partnership: Navigate sensitive organizational changes with empathy, managing expectations and providing clear updates to stakeholders throughout the integration lifecycle.
- Mentorship & Training: Partner with global teams to identify user training needs and ensure new employees are set up for success from day one.

What We're Looking For (Qualifications):

- Degree or equivalent relevant experience required. Experience will be evaluated based on the Values & Behaviors for the role (e.g. extracurricular leadership roles, military experience, volunteer roles, work experience, etc.)
- Broad Technical Foundation: 5+ years of experience in IT operations, design, or architecture within enterprise environments.
- Migration Experience: A proven track record of supporting or leading technology migration projects.
- Adaptability: The ability to learn new technologies and concepts quickly in a fast-paced environment.
- Discretion & Empathy: Experience navigating sensitive situations and a commitment to managing highly confidential information with integrity.
- Project Management: Demonstrated ability to drive projects forward independently while collaborating effectively with global, cross-functional teams.
- Clear Communication: Ability to document technical designs and processes clearly and present recommendations to both technical and non-technical audiences.

Preferred Skills (Bonus points if you have these):

- Experience with Mergers & Acquisitions (M&A) or large-scale organizational integrations.
- Hands-on experience with Cloud platforms (AWS or GCP) and SaaS environments (Google Workspace or O365).
- Familiarity with migration tools (e.g., CloudMigrator) and security best practices.
- Prior experience with or a deep interest in the Salesforce platform and its internal IT processes.

Logistics:

- Travel: Occasional travel (up to 25%) may be required for on-site assessments; all travel is fully company-sponsored.
- Flexibility: We value work-life integration and offer a hybrid work model, with an expectation of 10 days per quarter in the office.

Benefits & Perks

Check out our benefits site which explains our various benefits, including wellbeing reimbursement, generous parental leave, adoption assistance, fertility benefits, and more.

Unleash Your Potential

When you join Salesforce, you'll be limitless in all areas of your life. Our benefits and resources support you to find balance and be your best, and our AI agents accelerate your impact so you can do your best. Together, we'll bring the power of Agentforce to organizations of all sizes and deliver amazing experiences that customers love. Apply today to not only shape the future - but to redefine what's possible - for yourself, for AI, and the world.

Accommodations

If you need a reasonable accommodation during the application or the recruiting process, please submit a request via this Accommodations Request Form.

Please note that Salesforce uses artificial intelligence (AI) tools to help our recruiters assess and evaluate candidates' resumes and qualifications throughout the recruiting process. Humans will always make any candidate selection and hiring decisions. Please see our Candidate Privacy Statement for more information about how we use your personal data and your rights, including with regard to use of AI tools and opt out options.

Posting Statement

Salesforce is an equal opportunity employer and maintains a policy of non-discrimination with all employees and applicants for employment. What does that mean exactly? It means that at Salesforce, we believe in equality for all. And we believe we can lead the path to equality in part by creating a workplace that's inclusive, and free from discrimination. Know your rights: workplace discrimination is illegal. Any employee or potential employee will be assessed on the basis of merit, competence and qualifications - without regard to race, religion, color, national origin, sex, sexual orientation, gender expression or identity, transgender status, age, disability, veteran or marital status, political viewpoint, or other classifications protected by law. This policy applies to current and prospective employees, no matter where they are in their Salesforce employment journey. It also applies to recruiting, hiring, job assignment, compensation, promotion, benefits, training, assessment of job performance, discipline, termination, and everything in between. Recruiting, hiring, and promotion decisions at Salesforce are fair and based on merit. The same goes for compensation, benefits, promotions, transfers, reduction in workforce, recall, training, and education.

In the United States, compensation offered will be determined by factors such as location, job level, job-related knowledge, skills, and experience. Certain roles may be eligible for incentive compensation, equity, and benefits. Salesforce offers a variety of benefits to help you live well including: time off programs, medical, dental, vision, mental health support, paid parental leave, life and disability insurance, 401(k), and an employee stock purchasing program. More details about company benefits can be found at the following link: https://www.salesforcebenefits.com.

At Salesforce, we believe in equitable compensation practices that reflect the dynamic nature of labor markets across various regions. The typical base salary range for this position is $148,500 - $223,900 annually. In select cities within the San Francisco and New York City metropolitan area, the base salary range for this role is $178,900 - $246,000 annually. The range represents base salary only, and does not include company bonus, incentive for sales roles, equity or benefits, as applicable.

New York + 5 more
All locations: New York | Illinois | Colorado | Georgia | Washington | Indiana
$148.5K - $246K / year

Senior Splunk Engineer - Infrastructure Operations

GovCIO

GovCIO is a service-disabled-veteran-owned small business (SDVOSB) that offers technology services to improve business performance for government organizations.

Role Description

GovCIO is currently hiring for Systems Architect (Senior) / Senior Splunk Engineer - Infrastructure Operations to support our Administrative Office of the US Courts NLS project. The NLS currently ingests an average of 18-20TB of logging data daily across 60 indexers distributed in 2 data centers. This position is located within the United States and is fully remote.

- Design, implement, and operate the Splunk Core, Enterprise Security, IT Service Intelligence (i.e., ITSI), Phantom (Security Orchestration, Automation, and Response (SOAR)), Splunk Cloud, Splunk On-Call, and Multi-Site Index Clustering environment.
- Monitor overall Splunk health through the Monitoring Console (DMC) including indexer, search head, and cluster master status.
- Track indexing rates, license usage, queue health, and search concurrency to identify performance or ingestion issues early.
- Monitor CPU, memory, and disk utilization across all Splunk components to ensure optimal resource usage.
- Respond promptly to health alerts, DMC warnings, or anomalies observed on monitoring dashboards.
- Investigate and resolve common user-reported issues such as access problems, failed searches, or non-triggering alerts.
- Troubleshoot data ingestion, parsing, and indexing issues across Universal Forwarders, Heavy Forwarders, and HEC endpoints.
- Investigate missing or duplicate logs, timestamp errors, or sourcetype misassignments and escalate complex parsing issues to Engineering.
- Validate new data source onboardings by confirming sourcetype assignment, timestamp accuracy, and field extraction integrity.
- Support data source owners with forwarder deployment, syslog setup, and connectivity troubleshooting during initial onboarding.
- Maintain data flow visibility from source → forwarder → indexer to confirm data completeness and performance.
- Rotate and update credentials, API keys, or tokens used in data inputs, integrations, alerts, and scheduled searches.
- Manage RBAC user and role mappings, handling access requests, entitlement reviews, and permission troubleshooting.
- Provide end-user assistance with SPL searches, reports, alerts, and dashboards, including query optimization tips.
- Maintain and update knowledge base articles, SOPs, and FAQs for repeatable issues and troubleshooting steps.
- Log and escalate platform or parsing issues to the Engineering team with evidence such as logs, screenshots, and correlation IDs.
- Open and manage Splunk Support cases for platform-level bugs, license problems, or critical system faults.
- Monitor and manage ITSI service health, including KPIs, correlation searches, NEAP policies, and summary index latency.
- Troubleshoot ITSI-related issues such as broken KPIs, delayed episodes, or missing notable events.
- Perform capacity management by monitoring index growth, bucket rotation, and frozen data retention policies.
- Conduct periodic system maintenance tasks, including orphaned object cleanup and knowledge object review.
- Verify and maintain compliance with data governance and retention policies, ensuring secure and auditable configurations.
- Participate in DR testing and validation to ensure Splunk data recovery and HA configurations are functioning as expected.
- Document incidents, RCA findings, and preventive actions for future reference.
- Collaborate closely with the Engineering team for escalations, root-cause investigations, and deployment verifications.

Qualifications

- Bachelor's with 10 years (or commensurate experience) OR
- Masters Degree or higher (in a related discipline) with 7 years experience

Requirements

- Expert skills in Enterprise Security, ITSI, SOAR, and the Splunk product line.
- Able to design, implement, and operate the Splunk Core, Enterprise Security, IT Service Intelligence (i.e., ITSI), Phantom (Security Orchestration, Automation, and Response (SOAR)), Splunk Cloud, Splunk On-Call, and Multi-Site Index Clustering environment.
- Clearance Required: Must be able to obtain and maintain AOUSC Public Trust.

Benefits

- Employee Assistance Program (EAP)
- Corporate Discounts
- Learning & Development platform, to include certification preparation content
- Training, Education and Certification Assistance*
- Referral Bonus Program
- Internal Mobility Program
- Pet Insurance
- Flexible Work Environment

Company Description

GovCIO is a team of transformers--people who are passionate about transforming government IT. Every day, we make a positive impact by delivering innovative IT services and solutions that improve how government agencies operate and serve our citizens.

We are an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, disability, or status as a protected veteran. EOE, including disability/vets.

United States
$105K - $145K / year
Job Closed

Customer Engineer – Infrastructure – Azure Virtual Desktop / W365 - Bilingual (German and English)

CNX

We're Concentrix. The intelligent transformation partner. Solution-focused. Tech-powered. Intelligence-fueled. The global technology and services leader that powers the world’s best brands, today and into the future.

Full Time | Remote | Team 10,001

Job Title: Customer Engineer – Infrastructure – Azure Virtual Desktop / W365 - Bilingual (German and English)

Job Description Summary

We're Concentrix. The intelligent transformation partner. Solution-focused. Tech-powered. Intelligence-fueled. The global technology and services leader that powers the world’s best brands, today and into the future. We’re solution-focused, tech-powered, intelligence-fueled. With unique data and insights, deep industry expertise, and advanced technology solutions, we’re the intelligent transformation partner that powers a world that works, helping companies become refreshingly simple to work, interact, and transact with. We shape new game-changing careers in over 70 countries, attracting the best talent.

The Concentrix Technical Products and Services team is the driving force behind Concentrix’s transformation, data, and technology services. We integrate world-class digital engineering, creativity, and a deep understanding of human behavior to find and unlock value through tech-powered and intelligence-fueled experiences. We combine human-centered design, powerful data, and strong tech to accelerate transformation at scale. You will be surrounded by the best in the world providing market leading technology and insights to modernize and simplify the customer experience. Within our professional services team, you will deliver strategic consulting, design, advisory services, market research, and contact center analytics that deliver insights to improve outcomes and value for our clients, hence achieving our vision.

Our game-changers around the world have devoted their careers to ensuring every relationship is exceptional. And we’re proud to be recognized with awards such as "World's Best Workplaces," “Best Companies for Career Growth,” and “Best Company Culture,” year after year. Join us and be part of this journey towards greater opportunities and brighter futures.

We are looking for a Customer Engineer – Infrastructure – Azure Virtual Desktop / W365.

Note: This position requires fluency in both English and German.

Job Description: The AVD / W365 Customer Engineer will work directly with customers as a consultant and technical advisor across the following areas:

Architectural Design & Strategy

- Design for Resilience: Lead architectural design sessions to build scalable, secure, and resilient virtual desktop solutions with a strong focus on BCDR strategies for mission-critical environments.
- Modernization: Guide customers from legacy on-premises VDI (Citrix/VMware) to cloud-native solutions like AVD and Windows 365.
- Trusted Advisor: Act as the primary technical point of contact for customer IT executives and architects, bridging the gap between business goals and technical implementation.

Technical Implementation & Engineering

- Image & Profile Management: Design and implement automated image creation solutions (e.g., Azure Image Builder) and robust profile management strategies using FSLogix containers.
- Endpoint Management: Drive the integration of Microsoft Intune for managing physical and virtual endpoints.
- Application Strategy: Advise on application delivery and packaging, specifically modern formats like MSIX and App Attach to decouple applications from base images.
- Automation: Utilize PowerShell, Azure CLI, ARM, or Bicep to automate deployment, scaling, and monitoring tasks, reducing manual operational overhead.

Operational Excellence & Troubleshooting

- Deep Dive Troubleshooting: Apply a methodical, analytical approach to resolve complex performance issues (latency, login times, resource contention) in large-scale environments.
- Monitoring: Implement Azure Monitor and Log Analytics to provide proactive insights into host pool health and user experience.

Ideal candidate experience:

- Minimum of 5 years working as a depth expert and technology owner or consultant for AVD / W365.
- Minimum of 10-15 years of experience working with Windows Client Environments, ideally also Azure environments.

Required Hard Skills

- Core Virtualization: Deep, hands-on expertise in Azure Virtual Desktop and/or Windows 365. Strong background in Hyper-V and RDS.
- Identity & Security: Solid understanding of Azure Entra ID, Hybrid Identity, Conditional Access, and RBAC models.
- Infrastructure: Proficiency in Azure Infrastructure (Networking, Storage, Compute).
- Automation: Confident in PowerShell scripting for automation and system management.
- OS Proficiency: Deep knowledge of Windows 10/11.

Professional Experience

- Public Sector Focus: Passion for and willingness to work with public sector customers, understanding their unique compliance and security requirements.
- Experience: Degree in Computer Science, IT, or equivalent practical experience. Long-term experience with large enterprise customers and complex IT landscapes.
- Languages: Excellent command of German and English (spoken and written) is mandatory for this role.
- Mobility: Valid driver’s license and willingness to travel frequently to customer sites across Germany.

Preferred (Nice to Have)

- Certifications: Microsoft Certified: Azure Virtual Desktop Specialty (AZ-140) is highly preferred. Other relevant certs: Azure Administrator (AZ-104) or Azure Solutions Architect (AZ-305).
- Legacy Knowledge: Experience with Citrix DaaS or VMware Horizon is helpful for migration conversations but not strictly required.
- Network Security: Understanding of hub-and-spoke topology, ExpressRoute, and firewall configuration for VDI.

Location: DEU Gera Work-at-Home

Germany
Job Closed
Full Time | Remote | Team 11-50

- Manage operations related to Azure cloud and on-premise IT infrastructures (server, storage, security, cloud services, server virtualization, and business continuity & disaster recovery), and support database operations.
- Manage requests from business divisions on provision of IT services and ensure seamless implementation & delivery, and manage and drive APM engagement and implementation.
- Manage change initiatives to ensure availability, performance and reliability of the systems impacted by change, and manage backup and recovery of cloud infrastructure.
- Manage security of cloud infrastructure, including but not limited to WAF & SSL certificate implementation & operations.
- Implement policies and procedures to ensure a stable and secure infrastructure.
- Ensure that the company's cloud and local infrastructures run seamlessly, perform within agreed targets, and provide a secure platform for the company's business operations.
- Perform any other related cloud infrastructure duties as assigned by the line manager.

Requirements

- BSc in Computer Science, Information Technology, Electrical & Electronic Engineering, or a related discipline from a reputable higher institution.
- 2+ years experience in a similar role within the Fintech, Banking/Financial Services, Service Integrator, CSP, or other Tech Enterprise or Service provider, with Azure Architect Expert certification, Azure Admin certification, or other related certifications.
- Proficient with Active Directory & ADFS, Linux/Windows OS & VMware, containerization, networking & WAF fundamentals, and any scripting language (e.g., Python, Bash).
- Expertise in infrastructure automation (e.g., Ansible, Terraform, ARM, CI/CD) and in the delivery and management of application & infrastructure performance management tools.
- Expertise in project management, IT service management & team management.

Benefits

Qore provides the rare opportunity to make history in the financial space for Africa by Africans, while working with the smartest, brightest & coolest minds in Africa. Our people & culture team continuously thinks of innovative ways to improve employee experience, and some of the other benefits of working with Qore include:

- Very Competitive & Rewarding Pay
- Flexible work option (i.e., Remote)
- Paid Lunch for onsite work
- Lifelong Learning

Niger + 1 more
All locations: Niger | Nigeria