uRun

We build the stage, not the show. We're an infrastructure company, a developer-tools company, and a production partner for model labs, and focus is a deliberate choice we've made and hold to. Day-to-day, that means a small team, a high bar, and real ownership. You won't wait for permission or inherit a backlog of someone else's decisions. In a founding security role, the function is what you make it. It also means ambiguity: priorities shift, not everything is documented. You'll often be the person who decides what "secure enough, for now" means.

Founding Engineer - ML Performance

Location

United States

Posted

7 days ago

Salary

$250K - $395K / year

Seniority

Mid Level

Job Description

Role Description

Performance is uRun's core differentiator. We're not chasing incremental gains; we're building infrastructure that runs 10–100x faster than the status quo. As our ML Performance Engineer, you will be the person who makes that true. This is a founding technical hire.

You will:

- Write custom CUDA kernels, pushing GPU utilization to its limits.
- Own inference latency end-to-end across the stack.
- Work directly with the founding team on the hardest performance problems in production AI infrastructure.
- Have your fingerprints on everything we ship.

What you'll actually be doing day-to-day:

- Write custom CUDA kernels that unlock performance headroom unavailable through off-the-shelf frameworks.
- Optimize model inference end-to-end, targeting sub-50ms latency across our inference platform.
- Drive 10x performance improvements across the stack: memory bandwidth, kernel fusion, operator scheduling, and beyond.
- Implement zero-copy distributed memory optimizations across multi-GPU and multi-node environments.
- Own GPU utilization and memory management, squeezing every available FLOP out of the hardware we run.
- Profile, benchmark, and instrument the full inference pipeline to find and eliminate bottlenecks systematically.
- Set the performance engineering bar for the team: define what fast looks like and build the tooling to measure it.

Qualifications

- Deep, hands-on CUDA expertise: you have written custom kernels in production, not just called into cuBLAS.
- Strong background in model inference and post-training optimization at scale.
- Fluency in GPU memory hierarchy, warp scheduling, kernel fusion, and hardware-aware algorithm design.
- Experience profiling and benchmarking complex inference pipelines: you know where the time goes and how to get it back.
- Able to operate at the frontier with minimal guidance: you identify the problem, design the approach, and ship the fix.

Requirements

- Public work in GPU optimization or inference efficiency: open source contributions, a published paper, or a side project that shows your depth (vLLM, Flash-Attention, TensorRT-LLM, PyTorch, or equivalent).
- Experience with hardware-aware optimization frameworks: CuTe, Triton, TileLang, or similar.
- Familiarity with distributed memory and communication primitives: NCCL, InfiniBand, NVLink, RoCE.
- Contributions to or deep familiarity with PyTorch Distributed, Ray core, or similar systems.
- Experience optimizing for video generation or other high-throughput, latency-sensitive generative workloads.
- Prior work at an inference-focused company or research lab pushing the boundary of what GPU hardware can do.

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company.
- Health, dental, and vision: full coverage.
- 401(k): company-supported retirement savings.
- FSA/HSA: flexible spending accounts for healthcare costs.
- Paid time off: we trust you to manage your time.
- Top-tier tooling: access to the best AI tools available, including Claude, Codex, Kimi, and whatever else helps you move faster.
- MacBook Pro and AirPods: the hardware you need, on us.

More Engineer Jobs

Role Description

- Define and execute the test strategy for critical journeys and integrations.
- Implement and maintain a UI E2E test suite using Playwright with TypeScript.
- Automate REST API tests and validate authentication, contracts, pagination, and versioning.
- Implement contract tests between Frontend, BFF, and APIs.
- Integrate automated tests into CI/CD pipelines.
- Support the definition of acceptance criteria, DoR/DoD, and test traceability.
- Work together with the Frontend, Backend, Architecture, and UX/UI teams.
- Perform root cause analysis, defect triage, and continuous improvement of tests and pipelines.
- Ensure auditable evidence and reports of test execution.
- Document standards and support the evolution of test automation.

Qualifications

- At least 5 years of experience in QA for complex web applications.
- Command of Playwright for UI E2E automation with TypeScript.
- Experience with REST API testing.
- Knowledge of OAuth2/OIDC authentication and schema validation.
- Experience with contract testing (Pact or equivalent).
- Hands-on experience with Git and CI/CD pipelines.
- Knowledge of Docker for running and troubleshooting tests.
- Experience with cross-browser execution and CI/headless testing.
- Knowledge of observability and test evidence generation.
- Experience with Jira and test traceability tools.
- Good communication in Portuguese and reading/writing in English.
- Senior profile with autonomy and a collaborative approach.

Requirements

- Experience with SAP Commerce Cloud (Hybris) and Spartacus/Composable Storefront.
- Experience in complex B2B projects.
- Experience with SAP ECC/S/4HANA integrations.
- Knowledge of accessibility testing and web quality.
- Experience with performance testing using k6 or JMeter.
- Knowledge of API Gateway (Kong) and IdP (Keycloak).
- Experience with GraphQL.
- Experience with tools such as Allure, Zephyr, Xray, or TestRail.

Benefits

- 100% remote opportunities 👨🏻‍💻
- Home office allowance 💻
- Periodic feedback 💬
- Referral program 🏅
- Psychological support 🙋🏻‍♂️
- Workplace exercise sessions 🏋️
- Knowledge academy 🧠
- Partnership with an English school 🔤
- Monthly transparency meetings 🔃
- Online happy hour 🍻
- Welcome kit 🎁

Brazil

Telecom Engineer II - Wireless Core

GCI Communication Corp

At GCI, we foster an environment where the unique perspectives of our employees, customers, and fellow Alaskans are celebrated. We add value to our community by nurturing and empowering each member of our workforce, ensuring equal opportunities for every Trailblazer. GCI is an equal opportunity employer. Qualified applicants are considered for employment without regard to race, color, religion, national origin, age, sex, sexual orientation, gender identity, marital status, mental or physical disability, veteran status, or any other status or classification protected under applicable state or federal law.

Engineer | 7 days ago
Full Time | Remote | Team 1,001-5,000

Role Description

GCI's Telecom Engineer II will apply engineering principles across Technology Planning & Engineering to design, implement, optimize, and support telecommunications network architectures that meet industry standards and business needs. The role is responsible for delivering scalable, reliable, and secure solutions, supporting project execution, maintaining accurate network documentation, and monitoring and optimizing network performance while resolving complex deployment and operational issues.

Qualifications

- A combination of relevant work experience and/or education sufficient to perform the duties of the job may substitute to meet the total years required on a year-for-year basis.
- High School diploma or equivalent.
- Bachelor’s degree in Electrical Engineering, Computer Science, Computer Engineering, Telecommunications, or relevant field.
- Minimum of four (4) years of progressive engineering experience in information technology, development, and managing moderate to complex technical projects within telecom environments, or related background.
- Experience within the telecommunications industry (preferred).
- Relevant telecom industry or job specific certifications (preferred).

Requirements

- Solid understanding of 5G Core architecture (PCC/PCG, AMF, SMF, UPF, UDM, AUSF, PCF, NRF, NSSF, NEF).
- Strong working knowledge of 4G EPC (MME, SGW, PGW, HSS) and experience supporting migration strategies to 5G core environments.
- Working proficiency of 3G Core architecture (MSC, MediaGateway, SGSN, and GGSN) and familiarity with circuit switch and packet switch topologies.
- Experience supporting IMS-based services including VoLTE, VoWiFi, and VoNR, with exposure to service integration and optimization.
- Functional knowledge of ancillary telecom systems such as SMSC/MMSC, Prepaid platforms (OCS, IVR, CAMEL), Voicemail/VVM, and RTT/TTY, with experience supporting integrations, monitoring service performance, and troubleshooting service impacts.
- Familiarity with network slicing and cloud-native or virtualized core deployments.
- Proficiency with core protocols including Diameter, IP, SIP, GTP-C/U, PFCP, SCTP, HTTP/2, and TLS, with the ability to troubleshoot protocol-level issues.
- Ability to contribute to end-to-end core network designs and implementations.

Benefits

- Some travel to remote sites throughout Alaska and to lower 48 States may be required.
- Work is primarily sedentary, requiring daily routine computer usage.
- Ability to work shifts as assigned, work in standard office/home office setting, and operate standard office equipment.
- Must work well in a team environment and be able to work with a diverse group of people and customers.
- Virtual workers must comply with remote work policies and agreements.

United States

Sr. ASIC EDA Workflow Engineer

Tensordyne

Tensordyne is a system solution company that specializes in the design of industry-leading, high-performance, low-power AI inference systems. Our mission is to enable multimodal Generative AI inference acceleration at scale by providing safe, sustainable, high-performance AI-driven solutions for many markets. We work at the leading edge of research and product improvements for AI inference solutions that will make AI even more advantageous for compelling new applications. We are a well-funded, fast-paced startup with headquarters in Sunnyvale, CA, and Munich, Germany, and many talented team members work remotely. We prioritize the well-being of our employees and their families, value their contributions, and offer tailored benefits.

Engineer | 7 days ago
Full Time | Remote | Team 51-200

Role Description

In this hands-on, technology leadership role, you will lead EDA tool flow management and associated engineering workflow development for Tensordyne's multimodal generative AI inference acceleration products. As a valued senior member of our ASIC team, you will:

- Guide and assist colleagues to improve and invent EDA workflows within a fast-paced, agile HPC development environment.
- Drive Tensordyne’s optimization, implementation, and exploration of new EDA tools and technologies for the full ASIC chip design process.
- Continuously innovate and improve scalable, reliable, high-performance systems and tools for the next generation of Tensordyne products.
- Work closely with ASIC team members engaged in the design and verification of Tensordyne products to understand and improve their workflows and EDA needs.

Qualifications

- Experience leading the development and support for compilation, build automation, testing, packaging, and installation project generators (CMake, GNU make, Ninja).
- Hands-on ASIC engineering experience, including knowledge of VLSI/SoC chip design and verification workflows, with ASIC EDA tool suites from Synopsys and/or Cadence.
- Knowledge of Linux system administration and familiarity with cloud-based DevOps, with experience in supporting EDA tools.
- Programming and debugging skills with key languages to automate tasks and improve efficiency using scripts.
- Prior work experience supporting ASIC engineers with EDA workflows, including installation of new tool versions, FlexLM license management, and debugging/fixing issues with EDA vendors.
- Excellent analytical, written, and verbal interpersonal skills, with the ability to collaborate productively within a global engineering team.
- Bachelor’s or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field.

Requirements

- Experience with CI/CD and modern Git Branching workflows.

Benefits

- Comprehensive benefits.
- Competitive compensation.
- Flexible spending options.
- Recognition programs.

Company Description

Tensordyne is an AI system solution company that builds very high-performance, low-power generative AI inference systems. Our mission is to enable multimodal Generative AI inference acceleration at scale, with safe, sustainable, high-performance systems for our hyperscaler and neocloud data center customers. We are a well-funded, fast-paced startup with headquarters in Sunnyvale, CA, and Munich, Germany, and many talented team members working remotely across North America and Europe.

All locations: Northern America | Europe

Context Engineer

CapIntel

We're an investment sales platform for wealth enterprises and professionals. Sign up for free and grow your practice!

Engineer | 7 days ago
Full Time | Remote | Team 51-200 | Since 2019 | H1B No Sponsor

• Design and implement LLM-powered features into our core application via model APIs (e.g. Anthropic, OpenAI, Cohere), with a focus on reliability and production-readiness
• Architect and maintain retrieval-augmented generation (RAG) pipelines, connecting language models to internal knowledge bases, databases, and live data sources
• Manage context window strategy, determining what information enters the model, when, in what format, and at what level of compression to optimise for accuracy, cost, and latency
• Design and implement agentic workflows enabling the platform to handle multi-step, autonomous tasks
• Build guardrail and output validation layers that constrain model behaviour and ensure AI features act within well-defined, compliant boundaries
• Develop reusable agent primitives, prompt templates, and workflow components that other engineers can build on independently
• Build evaluation frameworks to measure context effectiveness, output quality, and agent reliability in production
• Monitor deployed AI systems for failure patterns and implement mitigation strategies, feeding learnings back into continuous improvement cycles
• Collaborate with Product, Product Engineering, Implementation, and Data teams to translate business requirements and proofs of concept into production AI system specifications
• Act as an internal practitioner and resource, helping upskill the broader engineering team on context engineering principles and agentic best practices

Canada
$120K - $140K / year