uRun

Technical Writer

Role Description

Infrastructure is only as powerful as the developers who can use it. As uRun's first Technical Writer, you will own the entire documentation surface:

- API references
- Integration guides
- SDK docs
- Conceptual explainers
- The developer experience that ties it all together

This is a founding hire. You will build the documentation foundation from scratch, work directly with engineers to get inside the product, and set the standard for how uRun communicates with the developers who build on top of us. The quality of your work is a direct part of our product.

What you'll actually be doing day-to-day:

- Own the full documentation lifecycle end-to-end: plan, write, review, publish, and iterate across API references, quickstarts, integration guides, and conceptual docs
- Embed with engineering to understand the product deeply, translating complex distributed systems and inference infrastructure concepts into clear, accurate developer documentation
- Build and maintain the docs infrastructure: tooling, structure, versioning, and publishing pipeline
- Write and maintain SDK documentation and code samples across multiple languages, ensuring they are accurate, tested, and developer-friendly
- Create onboarding and integration content that helps developers go from zero to production on uRun as fast as possible
- Partner with the GTM and engineering teams to produce technical content that supports developer adoption: tutorials, explainers, changelog entries, and release notes
- Establish and maintain a documentation style guide and content standards that scale as the team grows

Qualifications

- 5+ years of technical writing experience, with a portfolio that demonstrates clear, accurate documentation for developer or infrastructure products
- Strong command of API documentation: you have written reference docs, OpenAPI specs, or equivalent and understand what developers actually need from them
- Comfortable reading and writing code: you can work through a code sample, spot an error, and write accurate examples in at least one language (Python preferred)
- Experience owning docs infrastructure: static site generators, docs-as-code workflows, version control, and CI/CD for documentation
- Strong editorial instincts: you write with precision, cut ruthlessly, and know the difference between explaining and obscuring
- Able to operate independently as the first and only technical writer, setting standards without a template to follow
- Comfortable working closely with engineers and translating highly technical concepts for a developer audience

Requirements

- Experience documenting infrastructure, cloud, or developer platform products (compute, networking, storage, or equivalent)
- Familiarity with ML infrastructure concepts: model serving, inference APIs, GPU compute, or distributed systems
- Experience with docs tooling such as Mintlify, Readme, Docusaurus, or similar
- A background that includes developer relations, developer advocacy, or direct experience as a software engineer
- Prior experience as a founding or sole technical writer at an early-stage company

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company
- Health, dental, and vision: full coverage
- 401(k): company-supported retirement savings
- FSA/HSA: flexible spending accounts for healthcare costs
- Paid time off: we trust you to manage your time
- Top-tier tooling: access to the best AI tools available (Claude, Codex, Kimi, and whatever else helps you move faster)
- MacBook Pro and AirPods: the hardware you need, on us

United States

$130K - $190K / year

Job Closed

Founding Engineer - Site Reliability

We build the stage, not the show. We're an infrastructure company, a developer-tools company, and a production partner for model labs, and focus is a deliberate choice we've made and hold to. Day-to-day, that means a small team, a high bar, and real ownership. You won't wait for permission or inherit a backlog of someone else's decisions. In a founding security role, the function is what you make it. It also means ambiguity: priorities shift, not everything is documented. You'll often be the person who decides what "secure enough, for now" means.

DevOps Engineer · 69 days ago


Role Description

Reliability at uRun isn't a feature; it's the product. When model labs and production teams build on top of our inference platform, they are trusting us with their uptime, their latency, and their users. As our Site Reliability Engineer, you will own that trust end-to-end.

This is a founding SRE hire. You will define the reliability culture from scratch:

- The observability stack
- The incident response playbooks
- The SLOs
- The on-call process

You will work directly with infrastructure and platform engineers to close the gap between what we ship and what stays up.

What you'll actually be doing day-to-day:

- Define and own SLOs and error budgets across uRun's inference platform and supporting infrastructure
- Build and maintain the observability stack end-to-end: metrics, logging, tracing, and alerting across a distributed GPU compute environment
- Lead incident response: detection, triage, resolution, and blameless postmortems that drive lasting fixes
- Partner with ML infrastructure engineers to embed reliability into the deployment pipeline from day one
- Design and maintain runbooks, on-call rotations, and escalation paths as the team scales
- Drive capacity planning and traffic management across heterogeneous compute to protect latency and availability under load
- Identify and eliminate toil through automation, building systems that scale without scaling the team proportionally

Qualifications

- 7+ years in site reliability, production engineering, or infrastructure engineering in a high-availability, low-latency environment
- Deep experience owning SLOs, error budgets, and on-call processes in production at scale
- Strong observability background: you have built or owned monitoring stacks (Prometheus, Grafana, Datadog, or equivalent) and know what good alerting looks like
- Proven incident response experience: you have led real incidents under pressure and written postmortems that actually changed behaviour
- Hands-on with Kubernetes and cloud infrastructure (AWS preferred): you can debug a failing pod and a misconfigured VPC in the same afternoon
- Strong software engineering fundamentals: you write automation, not just runbooks
- Comfortable operating as the first and only SRE, setting standards without a template to follow

Requirements

- Experience supporting GPU compute or ML inference infrastructure in production
- Familiarity with stateful workloads, long-running sessions, or streaming inference systems
- Exposure to multi-tenant platforms where isolation, noisy neighbour problems, and billing-aware scheduling matter
- Prior founding or sole SRE experience at an early-stage company

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company
- Health, dental, and vision: full coverage
- 401(k): company-supported retirement savings
- FSA/HSA: flexible spending accounts for healthcare costs
- Paid time off: we trust you to manage your time
- Top-tier tooling: access to the best AI tools available (Claude, Codex, Kimi, and whatever else helps you move faster)
- MacBook Pro and AirPods: the hardware you need, on us

United States

$185K - $285K / year

Founding Engineer - ML Performance


Engineer · 69 days ago


Role Description

Performance is uRun's core differentiator. We're not chasing incremental gains; we're building infrastructure that runs 10–100x faster than the status quo. As our ML Performance Engineer, you will be the person who makes that true.

This is a founding technical hire. You will:

- Write custom CUDA kernels, pushing GPU utilization to its limits.
- Own inference latency end-to-end across the stack.
- Work directly with the founding team on the hardest performance problems in production AI infrastructure.
- Have your fingerprints on everything we ship.

What you'll actually be doing day-to-day:

- Write custom CUDA kernels that unlock performance headroom unavailable through off-the-shelf frameworks.
- Optimize model inference end-to-end, targeting sub-50ms latency across our inference platform.
- Drive 10x performance improvements across the stack: memory bandwidth, kernel fusion, operator scheduling, and beyond.
- Implement zero-copy distributed memory optimizations across multi-GPU and multi-node environments.
- Own GPU utilization and memory management, squeezing every available FLOP out of the hardware we run.
- Profile, benchmark, and instrument the full inference pipeline to find and eliminate bottlenecks systematically.
- Set the performance engineering bar for the team: define what fast looks like and build the tooling to measure it.

Qualifications

- Deep, hands-on CUDA expertise: you have written custom kernels in production, not just called into cuBLAS.
- Strong background in model inference and post-training optimization at scale.
- Fluency in GPU memory hierarchy, warp scheduling, kernel fusion, and hardware-aware algorithm design.
- Experience profiling and benchmarking complex inference pipelines: you know where the time goes and how to get it back.
- Able to operate at the frontier with minimal guidance: you identify the problem, design the approach, and ship the fix.

Requirements

- Public work in GPU optimization or inference efficiency: open source contributions, a published paper, or a side project that shows your depth (vLLM, Flash-Attention, TensorRT-LLM, PyTorch, or equivalent).
- Experience with hardware-aware optimization frameworks: CuTe, Triton, TileLang, or similar.
- Familiarity with distributed memory and communication primitives: NCCL, InfiniBand, NVLink, RoCE.
- Contributions to or deep familiarity with PyTorch Distributed, Ray core, or similar systems.
- Experience optimizing for video generation or other high-throughput, latency-sensitive generative workloads.
- Prior work at an inference-focused company or research lab pushing the boundary of what GPU hardware can do.

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company.
- Health, dental, and vision: full coverage.
- 401(k): company-supported retirement savings.
- FSA/HSA: flexible spending accounts for healthcare costs.
- Paid time off: we trust you to manage your time.
- Top-tier tooling: access to the best AI tools available (Claude, Codex, Kimi, and whatever else helps you move faster).
- MacBook Pro and AirPods: the hardware you need, on us.

AI/ML AI Performance Optimization LLM PyTorch Ray

United States

$250K - $395K / year

Founding Engineer - Platform


Platform Engineer · 70 days ago


Role Description

You'll design and own the scalable, low-latency infrastructure that powers uRun's real-time inference runtime, the platform that makes live, interactive, multi-user AI workloads possible.

- This is not classic ops or cloud management; you'll be deep in the AI runtime itself.
- Latency, frame rate, and interactive quality of service are first-class platform properties.
- The workloads are GPU-constrained, memory-bound, and bursty.
- You will often write platform features, custom controllers, and scaling logic.
- You will report directly to our founder, Keegan McCallum, and set the technical direction for the engineering organization.

Qualifications

- 7+ years as an engineer, with a proven track record architecting and owning large-scale production systems.
- Deep Kubernetes expertise, including GPU-heavy clusters (NVIDIA tooling, autoscaling on GPU nodes) and service-mesh patterns.
- Strong cloud and infrastructure-as-code experience: AWS, GCP, or Azure; Terraform, Pulumi, or equivalent; networking and security (VPC, IAM, API-gateway-style routing).
- SRE-style thinking and observability depth: Prometheus/Grafana, OpenTelemetry, distributed tracing, SLOs, incident response, and post-mortems.
- Proficiency in at least one of Python, Go, or TypeScript/Node.js for platform tooling, automation, and glue code.
- Experience with streaming or real-time systems: WebRTC, low-latency video pipelines, or comparable latency-sensitive workloads.
- A track record of mentoring engineers and influencing cross-functional teams.

Requirements

- Hands-on experience with GPU-constrained, memory-bound, or bursty workloads.
- Experience writing custom Kubernetes controllers, scaling logic, or other platform features in-house.
- Early-stage startup experience: owning ambiguous problems end-to-end and setting technical direction with limited scaffolding.

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company.
- Health, dental, and vision: full coverage.
- 401(k): company-supported retirement savings.
- FSA/HSA: flexible spending accounts for healthcare costs.
- Paid time off: we trust you to manage your time.
- Top-tier tooling: access to the best AI tools available (Claude, Codex, Kimi, and whatever else helps you move faster).
- MacBook Pro and AirPods: the hardware you need, on us.

Company Description

We build the stage, not the show. We're an infrastructure company, a developer-tools company, and a production partner for model labs, and focus is a deliberate choice we've made and hold to.

- Day-to-day, that means a small team, a high bar, and real ownership.
- You won't wait for permission or inherit a backlog of someone else's decisions.
- In a founding security role, the function is what you make it.
- It also means ambiguity: priorities shift, not everything is documented.
- You'll often be the person who decides what "secure enough, for now" means.

AI Kubernetes AWS GCP Azure Terraform Pulumi Amazon IAM Observability/Monitoring Prometheus Grafana OpenTelemetry Python TypeScript Node.js WebRTC

United States

$250K - $350K / year

Founding Engineer - Software


Backend Engineer · 70 days ago

AI Observability/Monitoring Python TypeScript Node.js Docker/Containers WebRTC Kubernetes

Role Description

You'll build the services, APIs, and core application systems that power uRun's runtime: the software layer that turns our real-time inference platform into something product and applied AI teams can actually build on.

- This is not a conventional CRUD backend role.
- The work centers on low-latency, high-throughput systems: real-time interaction, evolving session state, and request handling that stays reliable under heavy compute and concurrency.
- You'll work closely with product, infrastructure, and applied AI teams, in an early-stage environment where the architecture is still being set.

What you'll actually be doing day-to-day:

- Build and maintain backend services, APIs, and internal platform components that power uRun's real-time inference runtime.
- Design systems for real-time interaction, evolving session state, and scalable request handling across production environments.
- Partner with infrastructure and platform engineers to keep services observable, reliable, and efficient under heavy compute and concurrency.
- Translate experimental AI capabilities into robust, user-facing software, working closely with product and applied AI teams.
- Shape architecture decisions on data flow, service boundaries, performance optimisation, and fault tolerance for interactive systems.
- Raise engineering quality through testing, monitoring, code review, documentation, and sound operational practice.

Qualifications

- 7+ years building and shipping backend software in production.
- Proficiency in one or more backend languages: Python, Go, or TypeScript/Node.js.
- Experience designing APIs, service-oriented systems, and distributed application components.
- Solid understanding of cloud infrastructure, containers, and modern deployment workflows.
- Ability to reason about performance, concurrency, reliability, and debugging in complex systems.
- Experience with real-time, interactive, streaming, or latency-sensitive systems; this is central to the role, not a bonus.

Requirements

- WebRTC or WebSockets for real-time communication.
- AI infrastructure, inference-adjacent systems, media pipelines, or event-driven architectures.
- Kubernetes, observability tooling, and hands-on production operations.
- Early-stage startup experience: owning problems end-to-end and moving quickly with limited scaffolding.

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company.
- Health, dental, and vision: full coverage.
- 401(k): company-supported retirement savings.
- FSA/HSA: flexible spending accounts for healthcare costs.
- Paid time off: we trust you to manage your time.
- Top-tier tooling: access to the best AI tools available (Claude, Codex, Kimi, and whatever else helps you move faster).
- MacBook Pro and AirPods: the hardware you need, on us.


United States

$200K - $350K / year

Founding ML Infrastructure Engineer


Infrastructure Engineer · 71 days ago

AI AI/ML Observability/Monitoring Distributed Systems Kubernetes LLM AWS GCP Azure

Role Description

We are building the next generation of AI inference infrastructure. As our ML Infrastructure and Platform Engineer, you will own the architecture and scaling of our GPU compute platform from the ground up. This is a founding technical hire with end-to-end ownership across the full infrastructure stack, from bare metal to model serving. You will work directly with the founding team and define how we build.

What you'll actually be doing day-to-day:

- Design and scale our GPU compute platform to support 1,000+ GPU clusters, ensuring high availability and low-latency inference across the fleet.
- Build and maintain the infrastructure layer for our compute marketplace, including multi-tenant scheduling, isolation, and billing-aware resource allocation.
- Own production reliability for ML systems end-to-end: observability, incident response, and SLA achievement across model serving and infrastructure.
- Architect feature stores and model registry systems that support rapid iteration and reproducibility at scale.
- Design an experiment tracking infrastructure capable of handling thousands of concurrent runs with full auditability.
- Build resource orchestration and scheduling systems that optimise for throughput, cost, and latency across heterogeneous hardware.
- Set engineering standards for infrastructure reliability, capacity planning, and operational excellence as an early technical leader.

Qualifications

- Proven experience designing and operating large-scale distributed infrastructure at 1,000+ nodes or equivalent complexity, in any domain.
- Deep expertise in distributed systems, cluster orchestration (Kubernetes, Slurm, or custom schedulers), and large-scale resource scheduling.
- Strong production reliability instincts: observability, incident response, capacity planning, and SLA ownership across complex systems.
- Experience building infrastructure that other engineers build on top of, not just operating it.
- Ability to operate as a technical lead: set direction, make tradeoffs under uncertainty, and raise the bar for the team around you.
- Startup orientation. You are energised by ambiguity, move fast, and build for scale from day one.

Requirements

- Exposure to ML infrastructure concepts: GPU networking (NCCL, InfiniBand, RoCE), model serving frameworks (vLLM, SGLang, TensorRT-LLM), or hardware-aware performance tuning (CuTe, Triton, TileLang).
- Experience with multi-cloud GPU procurement and capacity management across AWS, GCP, Azure, and bare metal providers.
- Familiarity with inference marketplace architectures, dynamic routing, or spot/preemptible workload management.
- Prior experience at a Series A or earlier stage company scaling from early infrastructure to production.

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company. The band above is our target; for an exceptional candidate we'll go higher. Equity is real: you're early, and the grant reflects that.
- Health, dental, and vision: full coverage.
- 401(k): company-supported retirement savings.
- FSA/HSA: flexible spending accounts for healthcare costs.
- Paid time off: we trust you to manage your time.
- Top-tier tooling: access to the best AI tools available (Claude, Codex, Kimi, and whatever else helps you move faster).
- MacBook Pro and AirPods: the hardware you need, on us.


United States

$200K - $350K / year

Founding Security Engineer / Head of Security



Security Engineer · 72 days ago

Full Time Remote Lead

Role Description

You'll be uRun's first dedicated security hire. This is a founding role: you'll own security end-to-end as a hands-on engineer, and as the company and team grow, you'll have the opportunity to build out and lead the function.

The problem you're here to solve: build a security foundation worthy of the infrastructure we run. That means:

- Hardening a distributed AWS and Kubernetes stack running stateful inference at scale
- Standing up the compliance program that unlocks enterprise deals
- Embedding security into engineering without becoming a blocker

You'll join as we move from stealth to scale, begin enterprise partnerships, and approach our Series A, the point where this work has the most leverage.

Qualifications

- 6+ years in security engineering, including time as a founding or sole security hire, or otherwise owning security with minimal support
- Proven track record delivering SOC 2 end-to-end as program owner, not just as a contributor
- Deep AWS experience: IAM, KMS, GuardDuty, CloudTrail, EKS, and Kubernetes security
- Hands-on with compliance automation tooling: Vanta, Drata, or equivalent
- Comfortable embedding security into CI/CD: SAST, DAST, secrets scanning, dependency management
- Strong incident response background: you've handled real incidents and built playbooks from scratch
- A clear communicator who can represent security to technical and non-technical stakeholders, including customers
- Able to work PST hours and thrive in a fast-moving, ambiguous environment

Requirements

- Own SOC 2 Type II end-to-end: scoping, control design, evidence collection, and audit
- Drive ISO 27001 and additional frameworks as we scale into enterprise partnerships
- Stand up and manage compliance automation tooling (Vanta, Drata, or equivalent)
- Respond to vendor security questionnaires and represent uRun's security posture on customer calls
- Build and maintain security policies, procedures, and documentation
- Harden our AWS environment: IAM, KMS, secrets management, GuardDuty, CloudTrail, VPC
- Secure our Kubernetes and EKS stack: container security, RBAC, network policies, runtime controls
- Embed security into CI/CD pipelines: SAST, dependency scanning, secrets scanning
- Build detection and response capabilities: alerting, playbooks, and incident response processes
- Drive vulnerability management end-to-end, from detection through remediation and reporting
- Work directly with engineering to resolve security blockers and unblock partnership deals
- Manage external auditor relationships and coordinate security reviews
- Report on security posture and risk to leadership

Benefits

- Competitive salary and meaningful equity in an early-stage AI infrastructure company
- Health, dental, and vision: full coverage
- 401(k): company-supported retirement savings
- FSA/HSA: flexible spending accounts for healthcare costs
- Paid time off: we trust you to manage your time
- Top-tier tooling: access to the best AI tools available (Claude, Codex, Kimi, and whatever else helps you move faster)
- MacBook Pro and AirPods: the hardware you need, on us

How we work (and what that feels like day-to-day)

We build the stage, not the show. We're an infrastructure company, a developer-tools company, and a production partner for model labs, and focus is a deliberate choice we've made and hold to. Day-to-day, that means a small team, a high bar, and real ownership. You won't wait for permission or inherit a backlog of someone else's decisions; in a founding security role, the function is what you make it. It also means ambiguity: priorities shift, not everything is documented, and you'll often be the person who decides what "secure enough, for now" means. That suits some people and not others, and we'd rather you know that before you apply.

AWS Kubernetes Amazon IAM Amazon EKS CI/CD AI

PST (UTC-8)

$200K - $250K / year