TMSfirst

#1 AI DIGITAL Logistics TMS - maximum customer experience artificial intelligence freight matching-global visibility

Senior/Principal Performance Engineer

Engineer, Full Time, Remote, Senior, Team 201-500, Since 2014, H1B No Sponsor

Location

United States

Posted

29 days ago

Salary

0

Seniority

Senior

Postgraduate Degree, 15 yrs exp, English, Linux

Job Description

  • Design, develop, and maintain comprehensive benchmarking frameworks spanning OS, kernel, and application layers.
  • Profile workloads across CPU, memory, I/O, network, and accelerator (GPU/NPU) subsystems to identify bottlenecks and optimization opportunities.
  • Establish and own performance baselines across CIQ's product and solutions portfolio.
  • Leverage AI-assisted tooling and agentic workflows to accelerate profiling, analysis, and root cause identification.
  • Build and maintain automated performance regression-detection pipelines integrated into CI/CD workflows using Fuzzball.
  • Identify, triage, and resolve regressions across user space, kernel space, and application layers with urgency and rigor.
  • Collaborate across engineering teams to root-cause regressions introduced by upstream kernel changes, compiler updates, or library modifications.
  • Drive proactive performance improvements - not just reactive fixes - to keep CIQ solutions ahead of the competition across every layer of the stack.
  • Own core operating system performance: kernel subsystem tuning (scheduler, memory management, I/O, networking), system call overhead reduction, and user space library and runtime optimizations.
  • Identify and implement kernel-level enhancements, including patches, configuration changes, and upstream contributions that yield measurable performance gains for CIQ's customer workloads.
  • Optimize for AI inference and training workloads, including LLM serving, model parallelism, and accelerator utilization.
  • Tune performance for HPC workloads, including modeling, simulation, and tightly coupled parallel applications (MPI, OpenMP, etc.).
  • Optimize general computing and service workloads - web services, databases, messaging systems, and other production software that runs on CIQ's OS platform.
  • Work at all levels of the stack: compiler flags, kernel parameters, scheduler tuning, NUMA topology, memory allocation, and application-level algorithmic improvements.
  • Champion an AI-first engineering philosophy - use AI tools, agents, and automation to accelerate your own productivity and the quality of performance insights.
  • Identify and prioritize optimization opportunities that directly impact AI training throughput and inference latency/cost.
  • Stay current on state-of-the-art techniques in ML system performance, including quantization, batching strategies, kernel fusion, and hardware-software co-design.
  • Develop deep expertise in CIQ's Fuzzball platform - its architecture, scheduling, and workload execution model.
  • Integrate performance benchmarks, regression tests, and user-facing workloads into Fuzzball-based pipelines.
  • Contribute to the performance characterization of Fuzzball itself, ensuring the platform adds minimal overhead and scales efficiently.
  • Develop broad familiarity with the full CIQ product portfolio - including Rocky Linux and RLC (and its variants), Fuzzball, Apptainer (formerly Singularity), and Warewulf - understanding how performance considerations span and interconnect across each.
  • Collaborate deeply with the engineering teams behind each product line to surface, prioritize, and deliver performance improvements that benefit customers across the entire CIQ ecosystem.
  • Partner with product and customer success teams to translate real-world performance pain points into engineering priorities and measurable outcomes.
  • Document and communicate findings clearly - from low-level profiling data to executive-level summaries.
  • Contribute to technical publications, conference presentations, and thought leadership that reinforces CIQ's reputation for performance excellence.

Job Requirements

  • A deep, principled understanding of operating system internals - Linux kernel scheduler, memory subsystem, I/O stack, and networking.
  • Proven experience identifying and resolving performance regressions across kernel and user space in production environments.
  • Hands-on expertise with profiling and tracing tools: perf, eBPF/bpftrace, Flamegraphs, VTune, Nsight, strace, ftrace, and similar.
  • Strong background in AI/ML workload performance - including inference optimization (TensorRT, ONNX, vLLM, or similar), training efficiency, and GPU/accelerator utilization.
  • Experience with HPC workloads: MPI, OpenMP, parallel filesystems, RDMA/InfiniBand, and job schedulers (Slurm, PBS, etc.).
  • Familiarity with modern AI-first development workflows and comfort using LLM-based tools to accelerate engineering work.
  • Experience building automated performance testing and regression detection pipelines in CI/CD environments.
  • Excellent analytical skills - able to form hypotheses, design experiments, and draw actionable conclusions from complex data.
  • Strong written and verbal communication skills; able to present findings to both deeply technical audiences and business stakeholders.
  • A collaborative, humble, and always-learning mindset - combined with the confidence to champion performance as a first-class engineering concern.

Benefits

  • Medical, dental, and vision insurance.
  • Flexible paid time off.
  • Employee stock options.
  • Remote work; no travel required for most positions.

More Engineer Jobs

Sr. Controls Engineer

Terabase Energy

A solar technology company whose mission is to reduce the cost and increase the scalability of large-scale solar.

Engineer, 29 days ago
Full Time, Remote, Team 51-200, Since 2019, H1B Sponsor

Role Description

The Sr. Controls Engineer – OT SCADA Projects leads the design, configuration, commissioning, and support of plant control systems for utility-scale solar, storage, and hybrid renewable energy projects. This engineer works with minimal oversight, applies expert knowledge of grid functionality and Utility/ISO standards, mentors junior engineers, and contributes to product standards. Approximately 80% of this role is project execution, while up to 20% is continuous improvement and product development.

Responsibilities

Project Execution & Technical Delivery (~80%)

  • Lead end-to-end controls design for utility-scale solar, BESS, and hybrid projects – from initiation through commissioning and closeout.
  • Program and commission SEL controllers using AcSELerator; develop control logic in Codesys using IEC 61131-3 structured text.
  • Identify project-specific deviations from standard product scope during contracting, forecast and scope project-specific development work.
  • Implement and validate closed-loop Active Power and Reactive Power, AVR and PFR Algorithms.
  • Maintain version control for all code artifacts according to established version control procedure.
  • Troubleshoot complex SCADA and controls issues using Wireshark, breakpoints, cross-reference, and watch list tools.
  • Produce project deliverables: System Architecture Diagrams, Control Narratives, Logic Diagrams, commissioning documents, and operator manuals.

Product & Process Improvement (~20%)

  • Contribute to new feature development and bug fixes in collaboration with the Product Engineering team.
  • Review documentation prepared by junior controls engineers (technical and non-technical).
  • Lead the Continuous Improvement and Lessons Learned program, feed field insights back into product standards and templates.
  • Coach junior engineers through FAT preparation and customer-facing presentation.

Stakeholder Communication & Collaboration

  • Serve as primary controls technical contact for EPCs, asset owners, and grid operators through project execution, FAT, and commissioning.
  • Lead FATs as formal presentations; communicate to non-technical audiences with supporting materials prepared in advance.
  • Flag technical risks and schedule pressures to management with context and proposed solutions.
  • Maintain Jira tickets daily with thorough detail; enforce Jira best practices with junior engineers.

Expectations & Success Indicators

  • Deliver high-quality work independently across multiple concurrent projects with ownership and urgency.
  • Leverage standardized platforms and tools; avoid project-specific one-off engineering approaches.
  • Ensure 100% adherence to Terabase quality processes; enforce standards with junior engineers.
  • Mentor junior controls engineers through technical guidance, code review, and FAT coaching.
  • Project deliverables completed on time, within scope, and meeting Utility/ISO regulatory standards.
  • Jira, version control, and documentation consistently maintained without follow-up from management.
  • Recognized internally and externally as the go-to technical authority on Terabase SCADA and OT controls.
  • Travel up to 10% for on-site commissioning, FAT, and customer engagements.

Qualifications

  • Bachelor’s degree in Engineering, Computer Science, Technology, or related field.
  • 3-5+ years of IEC 61131-3/PLC programming experience, preferably in Codesys or AcSELerator environment.
  • 3+ years of utility-scale power plant controls experience (solar, BESS, or hybrid).

Requirements

  • Expert IEC 61131-3 programming; primary tooling is AcSELerator (SEL controllers) and Codesys, including Diagram Builder, traces, breakpoints, and watch lists.
  • Extensive knowledge of industrial protocols: Modbus-TCP, DNP3, OPC-UA, etc.
  • Expert knowledge of grid functionality: PFR, AVR, Reactive Power, Voltage Regulation, and Capacitor Banks – including the underlying grid rationale, not just controller behavior.
  • Proficiency with Utility/ISO testing and interconnection requirements (ERCOT, PJM, BPA, IEEE 2800, NERC, etc.).
  • Experience with Power Plant Controller (PPC) design, configuration, and commissioning.
  • Familiarity with PSCAD, PSSE, and/or TSAT modeling processes for utility-scale sites.

Benefits

  • Generous time off and holiday policy.
  • Remote flexibility.
  • Flexible time off.
  • Comprehensive benefits package.
  • Career progression.
  • 401k match.
  • Stock options.
  • Home office set up allowance.
  • And much more!

United States
$160K - $190K / year
Full Time, Remote, Team 5,001-10,000, Since 1995, H1B No Sponsor

  • Analyze and migrate pipelines and notebooks (Spark/Databricks).
  • Refactor or rewrite processes for SQL / Dataform and Dataflow.
  • Build transformations in BigQuery + Dataform.
  • Build the Silver (Trusted) and Gold layers.
  • Ensure data quality, deduplication, and standardization.
  • Implement ingestion with Dataflow (Apache Beam) for events (Kafka/Event Hubs) and Datastream (CDC).
  • Work with data persistence in the Raw layer using Iceberg tables managed by BigLake.
  • Provision resources with Terraform (IaC).
  • Manage pipelines with CI/CD (GitHub Actions).
  • Follow the Ingestion Factory model and per-domain repositories.
  • Implement tests in Dataform.
  • Ensure cataloging and lineage (Dataplex) and secure sharing (Analytics Hub).

Brazil

Senior Manufacturing Engineer – Mass Production Infrastructure

NVIDIA

NVIDIA is widely considered one of the world's most desirable employers in technology. We have some of the world's most forward-thinking and passionate people working for us. If you're creative and autonomous, we want to hear from you!

Engineer, 29 days ago
Full Time, Remote, Team 10,001+, Since 1993, H1B Sponsor

  • End-to-end management of P-Rel production infrastructures supply chain, from NPI Infrastructure review to Mass Production delivery and full implementation at production sites
  • Coordinating ongoing production capacity management and support, risk assessment, infrastructure establishment, maintenance activities, budgeting, and process improvements
  • Serving as the primary contact for multiple production sites to ensure full infrastructure support for NVIDIA Mass Production

Vietnam

Senior Scientist / Engineer – Advanced Oncology, Medical Devices, Diagnostics

Intertwine Associates

Operational efficiency and trusted teams across science, tech, and government.

Engineer, 29 days ago
Full Time, Remote, Team 1-10, Since 2025, H1B No Sponsor

  • Serve as a senior scientific and technical advisor to the Program Manager, supporting the development and execution of new and existing ARPA‑H R&D programs
  • Conduct rigorous evaluations of the current scientific and technical landscape across oncology, diagnostics, and medical device development
  • Identify gaps, risks, and opportunities within research portfolios aligned with ARPA‑H mission objectives
  • Monitor and assess performer progress against defined technical milestones and quantitative performance metrics
  • Critically review complex scientific, engineering, and technical data using strong logical and analytical reasoning
  • Read, synthesize, and distill large volumes of information into concise, executive‑ready briefings for technical and non‑technical audiences
  • Support program execution through documentation, internal reporting, and coordination across multidisciplinary teams
  • Perform programmatic support tasks including data collection, analysis, and preparation of internal reports and briefings
  • Operate effectively both independently and as part of an integrated team supporting ongoing ARPA‑H efforts
  • Travel within CONUS (~10–35%) to support program reviews, meetings, and stakeholder engagements

South Carolina
$90K - $220K / year