Principal Software Engineer (Platform Architecture & Execution Model)
Location
United States
Posted
74 days ago
Salary
$240K - $290K / year
Seniority
Lead
Job Description
Trase Systems
About Us
Co-founded in 2023 by Joe Laws and Grant Verstandig, Trase Systems is AI, Uncomplicated. Trase empowers enterprise leaders to harness the full potential of AI without the associated complexity and risks. We are an end-to-end solution for deploying, managing, and optimizing AI in the enterprise. Our platform specializes in bridging the “last mile” of AI adoption, unlocking AI's full potential while driving efficiency and significant cost savings. Trase is at the forefront of AI Agent innovation, topping the Hugging Face GAIA Leaderboard for Generalized AI Assistants, ahead of industry giants such as Google, Meta, Microsoft, and OpenAI. We are leveraging our cutting-edge technologies to develop mission-critical agentic applications in complex industries such as Healthcare, Oil & Gas, and National Security.

About The Role
As Principal Software Engineer, you’ll own the core execution model and platform architecture of Trase OS - the shared platform (“agentic operating system”) that powers all Trase deployments in regulated environments. You’ll define the abstractions and APIs that connect workflows, agents, tools, and product surfaces, and ensure the correctness, scalability, and extensibility of the system.

This is a company-critical role: you are responsible for how the system behaves under real-world conditions, including failure, scale, and security constraints. Your work sets the technical direction for the platform and acts as a force multiplier across all engineering teams. Clean abstractions and correctness-under-failure are critical because we operate long-lived agents in healthcare/defense environments where auditability and reliability are non-negotiable.

Why This Role Is Needed
Trase OS is an orchestration-heavy system coordinating long-lived workflows, agents, and tools across multiple services and environments.
As the platform evolves, the primary risks shift from implementation to system design quality:
- Poor abstractions create tight coupling across services
- Workflow execution becomes difficult to reason about under failure
- Platform capabilities fragment instead of becoming reusable primitives
- Scaling introduces complexity instead of leverage

This role exists to:
- Define clean, durable abstractions for the platform execution model
- Ensure correctness and determinism in workflow execution
- Translate evolving product requirements into coherent platform architecture
- Enable teams to build on Trase OS without introducing systemic complexity

What Makes This Role Hard
- You are designing systems where failure is the norm, not the exception, and correctness must be preserved across retries, restarts, and partial execution
- You must balance clean abstractions with real-world constraints (performance, security, multi-tenant environments)
- Decisions made here become foundational primitives used across all products and teams
- The system must remain understandable and auditable, even as complexity and scale increase

Responsibilities
- Architect & lead the core execution model (state machine, lifecycle, resource model, failure semantics)
- Design platform APIs/SDKs connecting workflows, agents, tools, and product surfaces; drive versioning & compatibility
- Guarantee correctness via idempotency, deterministic replays, compensating actions, and data integrity
- Engineer reliability at scale: concurrency controls, rate limits, backpressure, sharding/partitioning, and workload isolation
- Build security & governance into the core: RBAC/ABAC, policy enforcement, fine-grained audit & lineage
- Deliver observability: distributed tracing, structured logs, metrics, and evaluation hooks; build an “explainable trail” of agent actions
- Own quality: design reviews, test strategy (unit, property, chaos), performance baselines, SLOs, incident response, and postmortems
- Mentor & unblock senior engineers; partner with Product, Security, and Customer teams to translate requirements into durable primitives
- Make pragmatic choices on storage, queueing, and compute; create paved roads that accelerate all other teams
- Define system boundaries and reduce cross-service coupling through clear architectural patterns
- Drive platform-wide standards for correctness, reliability, and API design across teams
- Balance short-term delivery with long-term architectural integrity, ensuring the platform evolves without accumulating systemic risk

Principal-level Technical Leadership
- Define and drive the long-term technical architecture of Trase OS across teams and domains
- Influence company-wide technical direction for platform and product systems
- Lead cross-team initiatives that shape how workflows, agents, and platform primitives are built and evolve
- Partner with leadership to align technical architecture with product and business strategy
- Mentor senior and staff engineers and raise the bar for system design and architectural thinking

Requirements
- 12-15+ years of experience building distributed/platform systems, including significant experience defining architecture across teams or domains
- 10+ years owning mission-critical runtimes or workflow/orchestration systems
- Deep expertise with durable execution (e.g., state machines, event sourcing, saga/compensation, idempotency, exactly-once/at-least-once semantics)
- Proven track record with security & governance in production systems (auth, RBAC, audit, policy)
- Hands-on with observability (Grafana or equivalent), including trace correlation across async boundaries
- Strong systems design across storage, queues, schedulers, and evented architectures; performance tuning under load
- Excellence in a modern language (e.g., Go, Rust, Java, or TypeScript) and cloud-native stacks (containers, CI/CD, IaC)
- Comfortable operating in regulated or high-assurance environments; bias toward correctness, clarity, and documentation
- Proven ability to influence technical direction across an organization and drive adoption of architectural standards
- Ability to incorporate advanced LLM capabilities into system design and platform architecture decisions where appropriate

Nice to Have
- Prior work on workflow engines (Temporal/Cadence/AWS Step Functions, Argo, Airflow) or serverless runtimes
- Experience with policy engines (OPA), secrets/KMS, or data-handling controls (PII/PHI)
- ML/LLM evaluation frameworks, tool/plugin architectures, or embedding model governance into execution
- Government or healthcare experience (HIPAA, audit readiness) and multi-tenant isolation

Salary Range: $240,000-$290,000. This represents the typical salary range for this position based on experience, skills, and other factors.

Our Trase Benefits: For full-time roles only
- Career track opportunity with potential for rapid advancement with strong performance as the firm grows.
- 100% employer-paid, comprehensive health care including medical, dental, and vision for you and your family.
- Paid maternity and paternity leave for 14 weeks at the employee's normal pay.
- Unlimited PTO, with management approval.
- Opportunities for professional development and continued learning.
- Optional 401K, FSA, and equity incentives available.
- Mental health benefits are available through Tara Mind.

We’re an Equal Opportunity Employer: You’ll receive consideration for employment without regard to race, sex, color, religion, sexual orientation, gender identity, national origin, protected veteran status, or on the basis of disability.

Applicant Data Disclosure
By submitting an application, you acknowledge that Red Cell Partners, LLC ("Red Cell") uses third-party service providers to facilitate its recruitment and hiring processes. These providers include applicant tracking systems, candidate verification platforms, and fraud detection tools (collectively, "Hiring Platforms").
Your application materials, including your résumé, cover letter, work samples, responses to application questions, and any other information you submit, may be transmitted to and processed by these Hiring Platforms for the following purposes:
- Managing and administering your application throughout the hiring process;
- Verifying the accuracy and authenticity of application materials, including by cross-referencing information you provide against publicly available sources and proprietary databases;
- Identifying indicators of potentially fraudulent, fabricated, or materially misleading application content, including but not limited to discrepancies between submitted materials and publicly available professional profiles, geographic anomalies, and fabricated work histories.

Applications that are flagged through this process as containing indicators of fraud or material misrepresentation may be declined from further consideration. If you have questions about the status of your application or the evaluation process, please contact talent@redcellpartners.com.

Red Cell requires its Hiring Platform providers to process your information solely for the purposes described above and in accordance with applicable law. Your information will be retained only for as long as necessary to fulfill these purposes and any applicable legal obligations, after which it will be deleted in accordance with Red Cell's data retention policies. For more information about how your data is used, please refer to our Privacy Policy and Applicant Privacy Notice.
