Staff HPC Engineer
Location
California
Posted
1 day ago
Salary
$214K - $268K / year
Seniority
Senior
Job Description
Staff HPC Engineer
Biohub
Title: Staff HPC Engineer
Location: San Francisco, CA (Hybrid)

Biohub is the first large-scale initiative bringing frontier AI models, massive compute, and frontier experimental capabilities under one roof. We're building a general-purpose system to accelerate scientific discovery, integrating frontier AI models, biological foundation models, and lab capabilities, with the ultimate goal of curing disease. Our technology powers scientists around the world, translating AI capabilities into tools that accelerate research everywhere.

The Team

The HPC Engineering team is part of the AI Compute Platform organization at Biohub, a non-profit research lab committed to open science and open-source AI. We own the design, operation, and reliability of hybrid GPU AI clusters that power frontier AI biology research: protein language models, genomic foundation models, and scientific reasoning systems built to be shared. Our infrastructure supports day-to-day AI researcher workflows. The team works at the intersection of AI tooling, distributed systems, HPC, and frontier AI, debugging deep AI infrastructure problems and building AI systems critical to the entire AI organization.

The Opportunity

We seek a Staff HPC Engineer to help lead the evolution of our advanced computing infrastructure into a next-generation hybrid HPC and AI platform. This role will help shape strategy, architecture, and operations for high-performance computing resources (including cutting-edge GPUs, large-scale storage, and high-speed networks) while enabling transformative science through AI and machine learning at scale. You will design, implement, and optimize a unified HPC-AI ecosystem blending on-prem Slurm-managed clusters, cloud GPU resources, and containerized environments. This hybrid environment will power everything from traditional HPC workloads to large AI training jobs, generative model development, real-time inference, and data-intensive pipelines.
The successful candidate will be a thought leader in HPC infrastructure, capable of partnering with scientists, computational biologists, and software engineers to translate complex research needs into high-impact computing solutions. You will also foster adoption of emerging AI tools and ensure our systems can scale to meet the demands of next-generation biomedical research.

What You'll Do

HPC Engineering
- Build and support a hybrid HPC-AI environment with large-scale on-prem compute/storage and elastic cloud GPU clusters (CoreWeave, AWS, GCP).
- Architect and optimize environments for large-scale AI training and tuning, and low-latency scientific workloads.
- Integrate MLOps and model deployment pipelines into HPC infrastructure, ensuring reproducibility and efficiency.
- Implement advanced resource scheduling and orchestration (Slurm, Kubernetes, SUNK) optimized for mixed HPC and AI workflows.

Operational Excellence
- Support researchers with job optimization, GPU utilization best practices, and performance tuning for AI and HPC applications.
- Evaluate, deploy, and maintain AI/ML software stacks (e.g., PyTorch, TensorFlow, Hugging Face, RAPIDS) and HPC toolchains.
- Ensure robust data ingest, analysis, and management capabilities for AI and HPC workloads, including integration with parallel file systems and object storage.

Collaboration & Enablement
- Work with diverse science teams to translate research requirements into hardware/software solutions, from experimental design through publication.
- Promote best practices for AI model training, validation, and deployment in shared computing environments.
- Foster a culture of shared learning by running internal workshops on HPC-AI tooling (e.g., VS Code remote dev, containerization, MLOps workflows).

What You'll Bring

Essential
- Bachelor’s or advanced degree in Computer Science, AI/ML, Data Science, Systems Engineering, or related field.
- 10+ years building and managing HPC infrastructure, with significant experience integrating AI/ML workloads.
- Proven track record architecting environments for large-scale GPU AI training and inference in hybrid on-prem/cloud environments.
- Deep expertise with HPC scheduling (Slurm), container orchestration (Kubernetes), and cloud GPU services.
- Strong hands-on experience with AI frameworks (PyTorch, TensorFlow, JAX) and distributed training strategies (Horovod, DeepSpeed, Ray).
- Knowledge of MLOps best practices, including CI/CD for ML, model registry, experiment tracking, and performance monitoring.
- Exceptional ability to collaborate with multidisciplinary teams and communicate complex technical concepts clearly.
- Demonstrated leadership in guiding infrastructure teams, influencing organizational strategy, and fostering adoption of new technologies.

Technical
- Advanced Linux systems administration, HPC networking (InfiniBand, Ethernet), and storage systems administration (VAST, Lustre, Weka, and ZFS).
- Cloud platform expertise (CoreWeave, AWS, GCP) including GPU provisioning, storage, and networking for AI workloads.
- Proficiency in automation tools (Terraform, Ansible, Puppet), containerization (Docker, Singularity), and orchestration frameworks.
- Strong experience debugging and troubleshooting hardware across the stack (network, GPU, compute and storage systems).
- Strong scripting/programming skills (Python, Bash) and familiarity with version control (Git).
- Experience integrating AI LLMs, AI coding assistants, and custom model development into HPC workflows.

Compensation

The San Francisco, CA base pay range for a new hire in this role is $214,000–$268,000 for a Staff HPC Engineer and $241,000–$300,000 for a Senior Staff HPC Engineer. New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time.
Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process. This position may be eligible to participate in our discretionary annual performance bonus program. Bonus eligibility and targets are determined in accordance with our total rewards philosophy and may vary by role.

Better Together

As we grow, we’re excited to strengthen in-person connections and cultivate a collaborative, team-oriented environment. This role is a hybrid position requiring you to be onsite for at least 60% of the working month, approximately 3 days a week, with specific in-office days determined by the team’s manager. The exact schedule will be at the hiring manager's discretion and communicated during the interview process.

Benefits for the Whole You

We’re thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible.
- A generous employer match on employee 401(k) contributions to support planning for the future.
- Paid time off to volunteer at an organization of your choice.
- Funding for select family-forming benefits.
- Relocation support for employees who need assistance moving.

If you’re interested in a role but your previous experience doesn’t perfectly align with each qualification in the job description, we still encourage you to apply as you may be the perfect fit for this or another role.
More Engineer Jobs
[Fab15 HVM] - Dry Etch Process Equipment Engineer
Micron Technology
Micron Technology specializes in memory and semiconductor technology, such as computer memory and image sensors. Since opening, Micron Technology has had a successful history and i
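Our vision is to transform how the world uses information to enrich life for all. Micron Technology is a world leader in innovating memory and storage solutions that accelerate the transformation of information into intelligence, inspiring the world to learn, communicate and advance faster than ever.

Job Description

As a member of the dry etch process group in the manufacturing engineering department for leading-edge DRAM semiconductor products, you will drive manufacturing by working across volume production technology improvement, production engineering, and production control.

Job Details
- Complete safety training and carry out your work in accordance with Micron's safety policies.
- Serve as a Process & Equipment engineer within the dry etch group of the manufacturing engineering department.
- Work from multiple angles to establish higher-productivity, higher-quality volume production technology, advancing both day-to-day operations and improvements.
- Drive improvements in yield, control systems, and productivity.
- Support the installation and modification of new leading-edge equipment to further improve production planning.
- When newly developed products are introduced, identify issues with the product and the production technology, and work with related departments to bring them into volume production.
- Within the dry etch group, communicate with equipment engineers, other process engineers, and shift technicians, and lead improvement activities.
- Take part in everything from day-to-day issues to long-term projects, and report to your manager, team members, and stakeholders.
- Participate in technical exchanges with Micron's overseas manufacturing sites, moving from information sharing toward standardization.

Requirements (not all need to be met)
- Interested in leading-edge technology and able to approach work with a positive attitude.
- Able to organize and resolve issues.
- Able to consult with related departments and support projects.
- Able to communicate smoothly with people inside and outside the company.
- Able to use software such as Microsoft Office (Excel, PowerPoint, Word).
- Capable of self-management (work and health).
- University-graduate-level or higher knowledge of fundamentals such as physical chemistry, mechanical engineering, and electrical engineering is preferred.

GJS: E1 and above; Fab Engineer 1 to Fab Engineer 4

About Micron Technology, Inc.

We are an industry leader in innovative memory and storage solutions transforming how the world uses information to enrich life for all. With a relentless focus on our customers, technology leadership, and manufacturing and operational excellence, Micron delivers a rich portfolio of high-performance DRAM, NAND, and NOR memory and storage products through our Micron® and Crucial® brands. Every day, the innovations that our people create fuel the data economy, enabling advances in artificial intelligence and 5G applications that unleash opportunities - from the data center to the intelligent edge and across the client and mobile user experience.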
To learn more, please visit micron.com/careers.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.

To request assistance with the application process and/or for reasonable accommodations, please contact hrsupport_japan@micron.com.

Micron prohibits the use of child labor and complies with all applicable laws, rules, regulations, and other international and industry labor standards. Micron does not charge candidates any recruitment fees or unlawfully collect any other payment from candidates as consideration for their employment with Micron.

AI alert: Candidates are encouraged to use AI tools to enhance their resume and/or application materials. However, all information provided must be accurate and reflect the candidate's true skills and experiences. Misuse of AI to fabricate or misrepresent qualifications will result in immediate disqualification.

Fraud alert: Micron advises job seekers to be cautious of unsolicited job offers and to verify the authenticity of any communication claiming to be from Micron by checking the official Micron careers website.
Photo Lithography Process Engineer
Micron Technology
Micron Technology specializes in memory and semiconductor technology, such as computer memory and image sensors. Since opening, Micron Technology has had a successful history and i
Our vision is to transform how the world uses information to enrich life for all. Micron Technology is a world leader in innovating memory and storage solutions that accelerate the transformation of information into intelligence, inspiring the world to learn, communicate and advance faster than ever.

Job Description

As a process engineer in ATE (Alpha Technology Engineering) Photo Lithography, this role is responsible for managing assigned photo processes, optimizing process conditions, troubleshooting issues, and supporting improvement activities related to yield, rework, and cycle time. In addition, for next-generation technology nodes, this position is responsible for executing technology transfer from the Technology Development (TD) phase to the high-volume manufacturing (HVM) phase accurately and smoothly within defined schedules. To accomplish these goals, it is essential to build strong working relationships and promote effective collaboration not only with TD and manufacturing teams, but also with equipment, procurement, and facility teams, as well as with other fabs across the organization.

Responsibilities and Tasks
- Manage key process parameters for the assigned manufacturing process, including critical dimension (CD), overlay, defects, rework rate, and cycle time.
- Support or lead various continuous improvement activities, including yield improvement, process capability (Cpk), cost reduction, and cycle time improvement.
- Support the deployment and technology transfer of next-generation technology nodes to high-volume manufacturing and/or other fabs.

Required Skills and Qualifications
- Strong problem-solving skills based on data analysis.
- Ability to communicate effectively with internal and external stakeholders and execute work in collaboration with cross-functional teams.
- Ability to understand technical topics and to explain, present, train, and contribute to technical development activities.
- Fundamental knowledge of programming languages and databases.
- Basic understanding of digital skills and tools, such as scripting, RPA, and Tableau.
- Strong time management skills with the ability to handle multiple tasks simultaneously.
- Proficiency in Microsoft Office applications (Excel, Word, PowerPoint).
- Basic English communication skills (daily conversational level or above).
Senior Vulnerability Engineer
Apex Systems
Apex Systems, an IT staffing and workforce solutions firm, provides recruiting and staffing services to large and small companies alike. Founded in 1995 by thre
Title: NXTG Senior Vulnerability Engineer
Employee Type: Contract
Location: Home, MD, US
Pay Range: $60 - $80 per hour
Job#: 3035084

Job Description: NXTG Senior Vulnerability Engineer
Location: Maryland (Teleworker)
Employment Type: Contract

Role Overview

We are seeking a Senior Vulnerability Engineer to support enterprise vulnerability management, exposure management, compliance auditing, and web application scanning operations. This role is responsible for engineering and optimizing vulnerability management capabilities using Tenable One, Nessus, and Tenable Web App Scanning across hybrid cloud and on-premises environments within a highly regulated federal setting. The ideal candidate will possess hands-on experience with authenticated and non-authenticated web application scanning, cloud-native asset visibility, and enterprise-scale vulnerability operations.

Key Responsibilities
- Engineer, maintain, and optimize enterprise vulnerability and exposure management platforms using Tenable One, Nessus, and Tenable WAS.
- Configure and support authenticated and non-authenticated web application scanning, including Selenium-based authentication workflows and SSO integrations.
- Perform credentialed vulnerability and compliance scanning across Linux, Windows, databases, cloud infrastructure, web applications, and network appliances.
- Support continuous attack surface visibility, asset discovery, exposure prioritization, and scalable scan operations across hybrid cloud environments.
- Troubleshoot complex operational issues involving TLS/SSL negotiation, authentication failures, load balancers, and distributed scanning infrastructure.
- Deploy and maintain compliance audit configurations aligned to IRS Safeguards / SCSEM, CIS Benchmarks, NIST SP 800-53, DISA STIG, and FedRAMP requirements.
- Integrate Tenable platforms with enterprise technologies including CyberArk, Splunk, ServiceNow, and AWS APIs.
- Support remediation validation, compliance reporting, audit readiness activities, and operational dashboard development.

Required Qualifications

Education: Bachelor’s degree in Cybersecurity, Information Technology, Computer Science, Engineering, or a related field. Equivalent experience may be considered.

Experience: 10+ years of experience supporting enterprise vulnerability management, exposure management, cybersecurity engineering, or security operations programs.

Technical Skills: Hands-on experience with Tenable One, Nessus, Tenable WAS, and AWS cloud environments. Experience supporting authenticated and non-authenticated web application scanning. Strong understanding of vulnerability management, exposure management, and cloud-native security concepts.

Preferred Qualifications
- Familiarity with IRS Safeguards / SCSEM, CIS Benchmarks, NIST guidance, DISA STIG, and FedRAMP compliance frameworks.
- Experience supporting enterprise integrations, automation workflows, and operational reporting capabilities.
- Strong troubleshooting, analytical, and problem-solving skills across infrastructure, cloud, and application environments.
- Project management, workflow, innovation and process improvement, and consulting skills.

Compensation & Benefits

The pay rate for this position is between $60.00 and $80.00 per hour. Please note that there will be only one bill rate regardless of the number of hours worked in a day or work week.

We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law.
We will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. Apex uses a virtual recruiter as part of the application process.

If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department at employeeservices@apexsystems.com or 844-463-6178.

Everforth Apex is a world-class IT services company that serves thousands of clients across the globe. When you join Everforth Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing® in Talent Satisfaction in the United States and Great Place to Work® in the United Kingdom and Mexico.

Everforth Apex Benefits Overview: Everforth Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401K program which allows you to contribute typically within 30 days of starting, with a company match after 12 months of tenure. Everforth Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program and other discounts.
In terms of professional development, Everforth Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and certification discounts and other perks to associations that include CompTIA and IIBA. Everforth Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach. You can access a full list of our benefits, programs, support teams and resources within our ‘Welcome Packet’ as well, which an Everforth Apex team member can provide.
- Perform as an engineer on sediment remediation and waterfront projects nationwide through site investigation, feasibility study, design, and construction phases of projects
- Manage, lead, and prepare field sampling plans, remedial investigation reports, feasibility studies, pre-design investigations, and remedial designs
- Evaluate contamination, develop conceptual site models, evaluate remediation technologies, prepare calculations and cost estimates, reports, and design drawings
- Lead preparation of client-ready documents and designs, provide senior-level review for technical content and quality
- Effectively manage and lead multidisciplinary teams, including project scopes, schedules, budgets, and communications with internal teams, clients, and regulatory agencies
- Provide mentoring and training to junior engineers and scientists
- Act as a Subject Matter Expert (SME) on sediment investigation or remediation projects
- Support business development opportunities, including client/regulatory meetings and proposal development

