Join our Engineering Talent community

Software Engineer | Full Time | Remote | Mid Level | Team 51-200

Location

United States

Posted

78 days ago

Seniority

Mid Level

Job Description

Replicated

We’re looking for curious and conscientious builders, innovators, and collaborators. At Replicated, we believe true impact comes from caring deeply about your work, thinking in terms of long-term relationships, and approaching every challenge with curiosity and a willingness to revisit first principles. If you're someone who thrives on continuous learning, seeks purpose in solving real (and sometimes complex) problems, and cares about building sustainable, thoughtful solutions, you’ll find your home here.

Replicated is a Commercial Software Distribution Platform that tackles some of the hardest challenges in modern enterprise software. We help vendors deliver Kubernetes applications into highly controlled customer environments, from VPCs and on-prem data centers to fully air-gapped networks. Our platform includes Kubernetes-native installers, automated release pipelines, container registries and proxies for restricted networks, license management, telemetry, and integrated support tooling. Engineers at Replicated work on distributed systems, networking, and developer experience, ensuring that even the most complex applications can be deployed and operated reliably in security-sensitive environments.

We work with fast-growing enterprise software vendors such as KNIME, Puppet, Smartbear, BigID, and Swimlane, helping them bring their products into some of the world’s most security-conscious enterprises.

We are fully remote and plan to stay that way! We do have a small office in Austin, TX where our local teammates enjoy spending time working together. We're open to any state in the US. In addition, for some roles, we're open to candidates in Canada, the UK, Australia, and New Zealand (we will specify on postings for these).

Join our Engineering Talent Community

If you're interested in joining the Replicated team but don't see an opportunity that you'd like to apply to, please fill out a quick application below to join our Talent Community and stay connected with us to learn about future opportunities that align with your background!

What you bring:
- A passion and curiosity for using AI to elevate your work and impact. You have used AI to build complex projects. You understand where to use AI and when a human is the better choice. At the same time, you operate as the ultimate arbiter of what "good" looks like.
- You have a strong bias for action and the ability to proactively collaborate with others.
- Exceptional technical and non-technical communication and interpersonal skills.
- Strong problem-solving skills and the ability to think critically and act quickly under pressure.
- A customer-centric mindset and a genuine desire to help others succeed.
- Experience working remotely with teams across various time zones.
- A willingness to travel and meet face to face. While we are a remote company, there is an office in Austin, Texas where we get together roughly twice a year.

Technical Skills:
- Professional development experience with Go and associated tooling.
- Professional experience with Kubernetes and cloud native projects.
- You have developed software for external users, ideally for a highly technical audience.
- Experience developing and shipping software in a cloud-native, customer-facing infrastructure.
- Experience taking on complex challenges and breaking them down into iterative deliverables.

Note: Our engineering team does include some on-call coverage.

Preferred Remote Location:
- United States

Your Growth Journey at Replicated

In your first 30 Days:
- Immerse Yourself: Dedicate yourself to learning about Replicated: the company, the global CRE team, our products, and our customers (vendors).
- Hands-on Training: Complete comprehensive hands-on training with the Replicated platform, working through a structured onboarding checklist.
- Team Connections: Meet with team members across Replicated.
- Onboarding Improvement: Each of our roles has a structured onboarding process. You will work with your Manager and others to work through your onboarding plan.

In your first 60 days:
- Product Knowledge Expansion: Deepen your understanding of how Replicated's products are developed, how different services interact, and how they are used in customer-managed environments.
- Documentation Review: Review existing support documentation and training materials, identifying areas for updates or improvements.

In your first 90 days:
- Continued Learning: Continue to invest in your personal and professional growth, leveraging Replicated's resources (like the curiosity budget) to expand your skills in Kubernetes, Linux, and other relevant technologies. Begin exploring opportunities to develop your Go coding skills.

At Replicated, we value our teammates as individuals who are stronger together. We offer a robust pay and benefits package that rewards employees for their contributions to our success, supports their well-being, and helps all of us create a great remote work environment. For team members outside of the US, our salary ranges are at localized rates for the countries we support. This is dependent on several factors, including level, qualifications, and experience. We also offer stock options, as well as a unique home office allowance & a professional development budget. An overview is on our careers page here: https://www.replicated.com/careers/

We invest in our team and love candidates who are eager to learn and grow. We have a fantastic team of highly collaborative individuals who enjoy learning, growing, and mentoring others.

OUR CORE VALUES

Care Deeply: Care deeply about the work that you do. Because of that you are constantly learning and willing to go out on a limb, challenge assumptions, go back to first principles, etc.

Longterm: Treat every interaction as part of a 30 year relationship; you’ll see everyone down the road again as customers, partners, coworkers, etc.

Curious: We're always learning and we approach everyone and every problem with curiosity. When needed, we challenge assumptions and go back to first principles.

BENEFITS

We offer strong benefits to help you stay healthy and productive. For the US, our benefits are listed below:
- Health/Dental/Vision
- Life/AD&D
- LTD/STD
- FSA
- 401K
- Stock options
- Partner perk programs
- Generous time off; we expect you to take a minimum of 3 weeks per year
- Laptop + accessories you need to get set up
- Generous home office set up allowance or co-working space allowance, up to $10,000 per year!
- Curiosity Budget to help you keep learning and growing!

Replicated is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We encourage applicants of all backgrounds and we work to make sure that all team members have an equal opportunity to succeed.

Please note at this time we are unable to provide sponsorship to individuals in the United States. We do not accept unsolicited assistance from any headhunters, recruitment firms or any other third party for any of our job openings. Any unsolicited resumes sent from anyone other than the candidate, in any format, to any person at Replicated, will be considered Replicated property. Replicated will NOT pay a fee for any placement resulting from the receipt of an unsolicited resume.

More Software Engineer Jobs

Full Time | Remote | Team 10,001+ | Since 2017 | H1B Sponsor

Through our dedicated associates, Conduent delivers mission-critical services and solutions on behalf of Fortune 100 companies and over 500 governments - creating exceptional outcomes for our clients and the millions of people who count on them. You have an opportunity to personally thrive, make a difference and be part of a culture where individuality is noticed and valued every day.

Application Development & Support Engineer

About the Role

Conduent is seeking an Application Development & Support Engineer to work with a highly energetic and dynamic team supporting medium to large transaction processing systems for Conduent Services customers. This position requires a detail-oriented individual who will be responsible for providing software support for all customer-escalated issues, with ongoing research, debugging and subsequent release and deployment of product updates.

While moderate-to-extensive knowledge of each is required, we are also looking for individuals who excel in one or more of the following six categories:
- Linux (command line) – Able to use advanced Linux commands for troubleshooting, OS configurations and software deployments.
- Network Troubleshooting – Strong knowledge of networking concepts. Ability to investigate and research network issues using standard Linux networking tools.
- SQL/DBMS – Understands and is able to write advanced queries to a database for troubleshooting.
- Programming/Scripting – Understands fundamental programming/scripting concepts and is capable of creating programs and scripts to solve complex problems (typically in Perl, Bash, JavaScript).
- File Transfer Tools – Strong knowledge of any of the file movement tools (e.g. Linoma/GoAnywhere, Connect Direct, etc.) and file transfer protocols.
- Kubernetes – Experience managing pods, deploying pods, and proficiency with kubectl.

Responsibilities
- Responsible for resolving problems and incidents as they occur, working with Level 1 and 3 support personnel to ensure proper steps are taken to resolve those problems and incidents, and documenting new processes and procedures for use by other analysts working on similar applications.
- Will have to provide periodic On Call 24x7 support (once out of every 2 months you are on-call off-hours and weekends).
- Must have the ability to work in a multi-system and multi-platform environment.
- Responsible for coordination with operations center staff to ensure monitoring of production systems.
- Responsible for deploying application upgrades in a Linux environment.
- Responsible for reading and writing Bash and Perl scripts for automated tasks.
- Responsible for tracing errors in Java using source and stack trace data and debugging Bash and Perl scripts.
- Responsible for following procedures for processing of batch files, troubleshooting and reporting errors.
- Responsible for documentation of processes and procedures relating to deployments, troubleshooting and environment maintenance for our applications.
- All other duties as assigned.

Requirements
- Excellent working knowledge of Linux and network technologies.
- Moderate skills in TCP/IP, FTP/SFTP, and networking concepts, and an understanding of firewalls and proxies, required.
- Knowledge of advanced Unix/Linux/Windows commands.
- Knowledge of Apache, Tomcat and Glassfish preferred.
- Experience writing SQL for relational databases (Oracle, Sybase, MS SQL Server, etc.)
- Prior experience in a Level 2 or Technical Helpdesk role preferred.
- Prior experience in a PCI environment or the Payment Industry preferred.
- Ability to utilize computer operating systems utilities.
- Strong communication, customer service, organizational skills, and strong troubleshooting skills are a must.
- Excellent oral and written communication skills, ability to work in a team environment, and good customer interaction skills are desired.
- ITIL Service Management knowledge desirable.
- Experience supporting and/or migrating applications to a cloud environment (Azure / AWS) is a plus.

Education:
- Bachelor’s degree in a related technical field or equivalent work experience. Each year of relevant work experience may be substituted for a year of college education, up to two years.

Flexible Working

At Conduent, we want you to be yourself. We recognize that everyone is different and that how people want to work and deliver at their best is different for everyone too. In this role, you can expect the following working conditions:
- Remote work: Enjoy the convenience of working from home and maximize your time by unplugging at the end of your workday.

Working For You

Perks and rewards designed for you:
- Career Growth Opportunities: We help you thrive, so together, we can grow. We provide opportunities to advance your career with a vast portfolio of businesses and a global footprint.
- Great Work Environment: We are proud of our award-winning culture and the recognition we’ve received for our diversity efforts.

Join Us

At Conduent, we are one team, one mission. We understand that our success is directly related to the success of our associates. We strive to create a culture where you can:
- Bring your authentic self to work
- Grow and thrive, both personally and professionally
- Make a difference with our clients, in our communities, and with the millions of people we support

When you join Conduent, you are engaged in creating the future - both our company’s and your own. With more than 60,000 associates across 24 countries, we will provide you the opportunity to grow with a team of people who will challenge and inspire you to be the best!

Pay Transparency

Laws in some locations require disclosure of compensation and/or benefits-related information.
For this position, actual salaries will vary and may be above or below the range based on various factors including but not limited to location, experience, and performance. In addition to base pay, this position, based on business need, may be eligible for a bonus or incentive. In addition, Conduent provides a variety of benefits to employees including health insurance coverage, voluntary dental and vision programs, life and disability insurance, a retirement savings plan, paid holidays, and paid time off (PTO) or vacation and/or sick time. The estimated salary range for this role is $80,080 - $104,000.

Conduent is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, creed, religion, ancestry, national origin, age, gender identity, gender expression, sex/gender, marital status, sexual orientation, physical or mental disability, medical condition, use of a guide dog or service animal, military/veteran status, citizenship status, basis of genetic information, or any other group protected by law.

For US applicants: People with disabilities who need a reasonable accommodation to apply for or compete for employment with Conduent may request such accommodation(s) by submitting their request through this form that must be downloaded: click here to access or download the form. Complete the form and then email it as an attachment to FTADAAA@conduent.com. You may also click here to access Conduent's ADAAA Accommodation Policy.

United States
$80.1K - $104K / year
Sr Software Engineer (.Net/Java/Angular)

NBCUniversal

Here you can create the extraordinary. Join us.

Full Time | Remote | Team 10,001+ | Since 2004 | H1B Sponsor

Company Description

NBCUniversal is one of the world's leading media and entertainment companies. We create world-class content, which we distribute across our portfolio of film, television, and streaming, and bring to life through our global theme park destinations, consumer products, and experiences. We own and operate leading entertainment and news brands, including NBC, NBC News, NBC Sports, Telemundo, NBC Local Stations, Bravo, and Peacock, our premium ad-supported streaming service. We produce and distribute premier filmed entertainment and programming through our powerhouse film and television studios, including Universal Pictures, DreamWorks Animation, and Focus Features, and the four global television studios under the Universal Studio Group banner, and operate industry-leading theme parks and experiences around the world through Universal Destinations & Experiences, including Universal Orlando Resort, home to Universal Epic Universe, and Universal Studios Hollywood. NBCUniversal is a subsidiary of Comcast Corporation. Visit www.nbcuniversal.com for more information.

Our impact is rooted in improving the communities where our employees, customers, and audiences live and work. We have a rich tradition of giving back and ensuring our employees have the opportunity to serve their communities. We champion an inclusive culture and strive to attract and develop a talented workforce to create and deliver a wide range of content reflecting our world.

Job Description

NBCUniversal is seeking a Senior Software Engineer who can deliver modern, reusable solutions across both back-end and front-end stacks. You’ll leverage strong design skills and hands-on coding expertise in technologies like .NET, Angular, and Java to create scalable systems. In this role, you’ll collaborate closely with engineering leadership, influence technical direction, and help build foundational components that power multiple Ad Tech applications.

Responsibilities
- Design and develop scalable, maintainable web applications using .NET and Angular, ensuring modern and reusable front-end and back-end solutions.
- Build and enhance backend services and integrations leveraging Java (preferred), Node.js, or Python to support enterprise-grade applications.
- Participate in technical design and architecture discussions, contributing to decisions that shape long-term system scalability and performance.
- Drive code quality, security, and performance by implementing automated testing, CI/CD pipelines, and adhering to compliance standards (including PCI).
- Collaborate with business stakeholders to translate requirements into robust technical solutions.
- Work closely with engineering leadership to influence technical direction and contribute to building foundational components for multiple Ad Tech applications.

Qualifications
- Bachelor's degree or higher, or a combination of relevant education, experience, and/or training in Computer Science or a related field.
- 5+ years of experience in software development, including full-stack web development.
- Proficiency in .NET (C#) and Angular.
- Experience in at least one of the following: Java (preferred), Node.js, or Python.
- Strong knowledge of RESTful APIs, microservices, and relational databases.
- Solid understanding of software engineering principles and Agile methodologies.

Desired Characteristics:
- Experience with cloud infrastructure (Azure, AWS, or GCP).
- Experience with DevOps tools and automated testing.
- Experience with Payment Card Industry (PCI) data and compliance.

Additional Requirements:
- Fully Remote: This position has been designated as fully remote, meaning that the position is expected to contribute from a non-NBCUniversal worksite, most commonly an employee’s residence.

This position is eligible for company sponsored benefits, including medical, dental and vision insurance, 401(k), paid leave, tuition reimbursement, and a variety of other discounts and perks. Learn more about the benefits offered by NBCUniversal by visiting the Benefits page of the Careers website.

Salary range: $110,000 - $160,000

We are accepting applications for this position on an ongoing basis.

Additional Information

As part of our selection process, external candidates may be required to attend an in-person interview with an NBCUniversal employee at one of our locations prior to a hiring decision. NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law.

If you are a qualified individual with a disability or a disabled veteran, you have the right to request a reasonable accommodation if you are unable or limited in your ability to use or access nbcunicareers.com as a result of your disability. You can request reasonable accommodations by emailing [email protected].

For LA County and City Residents Only: NBCUniversal will consider for employment qualified applicants with criminal histories, or arrest or conviction records, in a manner consistent with relevant legal requirements, including the City of Los Angeles' Fair Chance Initiative For Hiring Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act, where applicable.

- Business Segment: Operations & Technology
- Compensation: USD 110,000 - USD 160,000 yearly

United States
$110K - $160K / year
Job Closed
Full Time | Remote | Team 11-50

About Us

Co-founded in 2023 by Joe Laws and Grant Verstandig, Trase Systems is AI, Uncomplicated. Trase empowers enterprise leaders to harness the full potential of AI without the associated complexity and risks. We are an end-to-end solution for deploying, managing, and optimizing AI in the enterprise. Our platform specializes in bridging the “last mile” of AI adoption, unlocking AI's full potential while driving efficiency and significant cost savings. Trase is at the forefront of AI Agent innovation, topping the Hugging Face GAIA Leaderboard for Generalized AI Assistants, ahead of industry giants such as Google, Meta, Microsoft, and OpenAI. We are leveraging our cutting-edge technologies to develop mission-critical agentic applications in complex industries such as Healthcare, Oil & Gas, and National Security.

About The Role

As Principal Software Engineer, you’ll own the core execution model and platform architecture of Trase OS - the shared platform (“agentic operating system”) that powers all Trase deployments in regulated environments. You’ll define the abstractions and APIs that connect workflows, agents, tools, and product surfaces, and ensure the correctness, scalability, and extensibility of the system. This is a company-critical role: you are responsible for how the system behaves under real-world conditions, including failure, scale, and security constraints. Your work sets the technical direction for the platform and acts as a force multiplier across all engineering teams. Clean abstractions and correctness-under-failure are critical because we operate long-lived agents in healthcare/defense environments where auditability and reliability are non-negotiable.

Why This Role Is Needed

Trase OS is an orchestration-heavy system coordinating long-lived workflows, agents, and tools across multiple services and environments.
As the platform evolves, the primary risks shift from implementation to system design quality:
- Poor abstractions create tight coupling across services
- Workflow execution becomes difficult to reason about under failure
- Platform capabilities fragment instead of becoming reusable primitives
- Scaling introduces complexity instead of leverage

This role exists to:
- Define clean, durable abstractions for the platform execution model
- Ensure correctness and determinism in workflow execution
- Translate evolving product requirements into coherent platform architecture
- Enable teams to build on Trase OS without introducing systemic complexity

What Makes This Role Hard
- You are designing systems where failure is the norm, not the exception, and correctness must be preserved across retries, restarts, and partial execution
- You must balance clean abstractions with real-world constraints (performance, security, multi-tenant environments)
- Decisions made here become foundational primitives used across all products and teams
- The system must remain understandable and auditable, even as complexity and scale increase

Responsibilities
- Architect & lead the core execution model (state machine, lifecycle, resource model, failure semantics)
- Design platform APIs/SDKs connecting workflows, agents, tools, and product surfaces; drive versioning & compatibility
- Guarantee correctness via idempotency, deterministic replays, compensating actions, and data integrity
- Engineer reliability at scale: concurrency controls, rate limits, backpressure, sharding/partitioning, and workload isolation
- Build security & governance into the core: RBAC/ABAC, policy enforcement, fine-grained audit & lineage
- Deliver observability: distributed tracing, structured logs, metrics, and evaluation hooks; build an “explainable trail” of agent actions
- Own quality: design reviews, test strategy (unit, property, chaos), performance baselines, SLOs, incident response, and postmortems
- Mentor & unblock senior engineers; partner with Product, Security, and Customer teams to translate requirements into durable primitives
- Make pragmatic choices on storage, queueing, and compute; create paved roads that accelerate all other teams
- Define system boundaries and reduce cross-service coupling through clear architectural patterns
- Drive platform-wide standards for correctness, reliability, and API design across teams
- Balance short-term delivery with long-term architectural integrity, ensuring the platform evolves without accumulating systemic risk

Principal-level Technical Leadership
- Define and drive the long-term technical architecture of Trase OS across teams and domains
- Influence company-wide technical direction for platform and product systems
- Lead cross-team initiatives that shape how workflows, agents, and platform primitives are built and evolve
- Partner with leadership to align technical architecture with product and business strategy
- Mentor senior and staff engineers and raise the bar for system design and architectural thinking

Requirements
- 12-15+ years of experience building distributed/platform systems, including significant experience defining architecture across teams or domains
- 10+ years owning mission-critical runtimes or workflow/orchestration systems
- Deep expertise with durable execution (e.g., state machines, event sourcing, saga/compensation, idempotency, exactly-once/at-least-once semantics)
- Proven track record with security & governance in production systems (auth, RBAC, audit, policy)
- Hands-on with observability (Grafana or equivalent), including trace correlation across async boundaries
- Strong systems design across storage, queues, schedulers, and evented architectures; performance tuning under load
- Excellence in a modern language (e.g., Go, Rust, Java, or TypeScript) and cloud-native stacks (containers, CI/CD, IaC)
- Comfortable operating in regulated or high-assurance environments; bias toward correctness, clarity, and documentation
- Proven ability to influence technical direction across an organization and drive adoption of architectural standards
- Ability to incorporate advanced LLM capabilities into system design and platform architecture decisions where appropriate

Nice to Have
- Prior work on workflow engines (Temporal/Cadence/AWS Step Functions, Argo, Airflow) or serverless runtimes
- Experience with policy engines (OPA), secrets/KMS, or data-handling controls (PII/PHI)
- ML/LLM evaluation frameworks, tool/plugin architectures, or embedding model governance into execution
- Government or healthcare experience (HIPAA, audit readiness) and multi-tenant isolation

Salary Range: $240,000-290,000. This represents the typical salary range for this position based on experience, skills, and other factors.

Our Trase Benefits (for full-time roles only):
- Career track opportunity with potential for rapid advancement with strong performance as the firm grows
- 100% employer paid, comprehensive health care including medical, dental, and vision for you and your family.
- Paid maternity and paternity leave for 14 weeks at employees' normal pay.
- Unlimited PTO, with management approval.
- Opportunities for professional development and continued learning.
- Optional 401K, FSA, and equity incentives available.
- Mental health benefits are available through Tara Mind.

We’re an Equal Opportunity Employer: You’ll receive consideration for employment without regard to race, sex, color, religion, sexual orientation, gender identity, national origin, protected veteran status, or on the basis of disability.

Applicant Data Disclosure

By submitting an application, you acknowledge that Red Cell Partners, LLC ("Red Cell") uses third-party service providers to facilitate its recruitment and hiring processes. These providers include applicant tracking systems, candidate verification platforms, and fraud detection tools (collectively, "Hiring Platforms").

Your application materials, including your résumé, cover letter, work samples, responses to application questions, and any other information you submit, may be transmitted to and processed by these Hiring Platforms for the following purposes:
- Managing and administering your application throughout the hiring process;
- Verifying the accuracy and authenticity of application materials, including by cross-referencing information you provide against publicly available sources and proprietary databases;
- Identifying indicators of potentially fraudulent, fabricated, or materially misleading application content, including but not limited to discrepancies between submitted materials and publicly available professional profiles, geographic anomalies, and fabricated work histories.

Applications that are flagged through this process as containing indicators of fraud or material misrepresentation may be declined from further consideration. If you have questions about the status of your application or the evaluation process, please contact talent@redcellpartners.com.

Red Cell requires its Hiring Platform providers to process your information solely for the purposes described above and in accordance with applicable law. Your information will be retained only for as long as necessary to fulfill these purposes and any applicable legal obligations, after which it will be deleted in accordance with Red Cell's data retention policies. For more information about how your data is used, please refer to our Privacy Policy and Applicant Privacy Notice.

United States
$240K - $290K / year

Principal Software Engineer (Platform Architecture & Execution Model)

Red Cell Partners

Red Cell Partners, founded in 2020, is a dynamic and rapidly growing firm specializing in launching and scaling innovative companies across various industries.

About Us Red Cell Partners is an incubation firm building and investing in rapidly scalable technology-led companies that are bringing revolutionary advancements to market in three distinct practice areas: healthcare, cyber, and national security. United by a shared sense of duty and deep belief in the power of innovation, Red Cell is developing powerful tools and solutions to address our Nation’s most pressing problems. About Trase Co-founded in 2023 by Joe Laws and Grant Verstandig, Trase Systems is AI, Uncomplicated. Trase empowers enterprise leaders to harness the full potential of AI without the associated complexity and risks. We are an end-to-end solution for deploying, managing, and optimizing AI in the enterprise. Our platform specializes in bridging the “last mile” of AI adoption, unlocking AI's full potential while driving efficiency and significant cost savings. Trase is at the forefront of AI Agent innovation, topping the Hugging Face GAIA Leaderboard for Generalized AI Assistants, ahead of industry giants such as Google, Meta, Microsoft, and OpenAI. We are leveraging our cutting-edge technologies to develop mission-critical agentic applications in complex industries such as Healthcare, Oil & Gas, and National Security. About The Role As Principal Software Engineer, you’ll own the core execution model and platform architecture of Trase OS - the shared platform (“agentic operating system”) that powers all Trase deployments in regulated environments. You’ll define the abstractions and APIs that connect workflows, agents, tools, and product surfaces, and ensure the correctness, scalability, and extensibility of the system. This is a company-critical role: you are responsible for how the system behaves under real-world conditions, including failure, scale, and security constraints. Your work sets the technical direction for the platform and acts as a force multiplier across all engineering teams. 
Clean abstractions and correctness-under-failure are critical because we operate long-lived agents in healthcare/defense environments where auditability and reliability are non-negotiable.

Why This Role Is Needed

Trase OS is an orchestration-heavy system coordinating long-lived workflows, agents, and tools across multiple services and environments. As the platform evolves, the primary risks shift from implementation to system design quality:

- Poor abstractions create tight coupling across services
- Workflow execution becomes difficult to reason about under failure
- Platform capabilities fragment instead of becoming reusable primitives
- Scaling introduces complexity instead of leverage

This role exists to:

- Define clean, durable abstractions for the platform execution model
- Ensure correctness and determinism in workflow execution
- Translate evolving product requirements into coherent platform architecture
- Enable teams to build on Trase OS without introducing systemic complexity

What Makes This Role Hard

- You are designing systems where failure is the norm, not the exception, and correctness must be preserved across retries, restarts, and partial execution
- You must balance clean abstractions with real-world constraints (performance, security, multi-tenant environments)
- Decisions made here become foundational primitives used across all products and teams
- The system must remain understandable and auditable, even as complexity and scale increase

Responsibilities

- Architect & lead the core execution model (state machine, lifecycle, resource model, failure semantics)
- Design platform APIs/SDKs connecting workflows, agents, tools, and product surfaces; drive versioning & compatibility
- Guarantee correctness via idempotency, deterministic replays, compensating actions, and data integrity
- Engineer reliability at scale: concurrency controls, rate limits, backpressure, sharding/partitioning, and workload isolation
- Build security & governance into the core: RBAC/ABAC, policy enforcement, fine-grained audit & lineage
- Deliver observability: distributed tracing, structured logs, metrics, and evaluation hooks; build an “explainable trail” of agent actions
- Own quality: design reviews, test strategy (unit, property, chaos), performance baselines, SLOs, incident response, and postmortems
- Mentor & unblock senior engineers; partner with Product, Security, and Customer teams to translate requirements into durable primitives
- Make pragmatic choices on storage, queueing, and compute; create paved roads that accelerate all other teams
- Define system boundaries and reduce cross-service coupling through clear architectural patterns
- Drive platform-wide standards for correctness, reliability, and API design across teams
- Balance short-term delivery with long-term architectural integrity, ensuring the platform evolves without accumulating systemic risk

Principal-level Technical Leadership

- Define and drive the long-term technical architecture of Trase OS across teams and domains
- Influence company-wide technical direction for platform and product systems
- Lead cross-team initiatives that shape how workflows, agents, and platform primitives are built and evolve
- Partner with leadership to align technical architecture with product and business strategy
- Mentor senior and staff engineers and raise the bar for system design and architectural thinking

Requirements

- 12-15+ years of experience building distributed/platform systems, including significant experience defining architecture across teams or domains
- 10+ years owning mission-critical runtimes or workflow/orchestration systems
- Deep expertise with durable execution (e.g., state machines, event sourcing, saga/compensation, idempotency, exactly-once and at-least-once semantics)
- Proven track record with security & governance in production systems (auth, RBAC, audit, policy)
- Hands-on with observability (Grafana or equivalent), including trace correlation across async boundaries
- Strong systems design across storage, queues, schedulers, and evented architectures; performance tuning under load
- Excellence in a modern language (e.g., Go, Rust, Java, or TypeScript) and cloud-native stacks (containers, CI/CD, IaC)
- Comfortable operating in regulated or high-assurance environments; bias toward correctness, clarity, and documentation
- Proven ability to influence technical direction across an organization and drive adoption of architectural standards
- Ability to incorporate advanced LLM capabilities into system design and platform architecture decisions where appropriate

Nice to Have

- Prior work on workflow engines (Temporal/Cadence/AWS Step Functions, Argo, Airflow) or serverless runtimes
- Experience with policy engines (OPA), secrets/KMS, or data-handling controls (PII/PHI)
- ML/LLM evaluation frameworks, tool/plugin architectures, or embedding model governance into execution
- Government or healthcare experience (HIPAA, audit readiness) and multi-tenant isolation

Salary Range: $240,000-$290,000. This represents the typical salary range for this position based on experience, skills, and other factors.

Our Red Cell Partners Benefits: For full-time roles

- Career track opportunity with potential for rapid advancement with strong performance as the firm grows
- 100% employer-paid, comprehensive health care including medical, dental, and vision for you and your family
- Paid maternity and paternity leave for 14 weeks at employees' normal pay
- Unlimited PTO, with management approval
- Opportunities for professional development and continued learning
- Optional 401K, FSA, and equity incentives available
- Mental health benefits are available through Tara Mind
- Cost-effective GLP-1 solutions available through Crux
We’re an Equal Opportunity Employer: You’ll receive consideration for employment without regard to race, sex, color, religion, sexual orientation, gender identity, national origin, protected veteran status, or on the basis of disability.

Applicant Data Disclosure

By submitting an application, you acknowledge that Red Cell Partners, LLC ("Red Cell") uses third-party service providers to facilitate its recruitment and hiring processes. These providers include applicant tracking systems, candidate verification platforms, and fraud detection tools (collectively, "Hiring Platforms"). Your application materials, including your résumé, cover letter, work samples, responses to application questions, and any other information you submit, may be transmitted to and processed by these Hiring Platforms for the following purposes:

- Managing and administering your application throughout the hiring process;
- Verifying the accuracy and authenticity of application materials, including by cross-referencing information you provide against publicly available sources and proprietary databases;
- Identifying indicators of potentially fraudulent, fabricated, or materially misleading application content, including but not limited to discrepancies between submitted materials and publicly available professional profiles, geographic anomalies, and fabricated work histories.

Applications that are flagged through this process as containing indicators of fraud or material misrepresentation may be declined from further consideration. If you have questions about the status of your application or the evaluation process, please contact talent@redcellpartners.com.

Red Cell requires its Hiring Platform providers to process your information solely for the purposes described above and in accordance with applicable law. Your information will be retained only for as long as necessary to fulfill these purposes and any applicable legal obligations, after which it will be deleted in accordance with Red Cell's data retention policies. For more information about how your data is used, please refer to our Privacy Policy and Applicant Privacy Notice.

All locations: Washington | Virginia
$240K - $290K / year