Job Closed
This listing is no longer active.
Model Serving Engineer
Location
United States
Posted
7 days ago
Seniority
Mid Level
Job Description
Bright Vision Technologies
Role Description
We are seeking a Model Serving Engineer to design, build, and operate high-performance, highly reliable inference platforms for serving large machine learning models in production. The role focuses on the systems engineering side of AI deployment, including:
- Request routing
- Batching
- Caching
- Autoscaling
- GPU utilization
- End-to-end observability across diverse model workloads
The ideal candidate brings strong distributed systems and performance engineering expertise, has shipped serving systems at scale, and understands the trade-offs between latency, throughput, cost, and quality in ML serving.

Qualifications
- Bachelor’s or Master’s degree in Computer Science or a related field
- Six or more years of experience in distributed systems, infrastructure, or ML platform engineering
- Strong proficiency in Python and a systems language such as Go, Rust, or C++
- Deep experience operating high-throughput, low-latency services in production
- Hands-on experience with LLM or large model inference frameworks such as vLLM or TensorRT-LLM
- Strong understanding of GPU architecture, memory hierarchies, and accelerator utilization
- Familiarity with Kubernetes, autoscaling, and modern cloud platforms
- Experience with observability stacks including metrics, tracing, and structured logging
- Solid grounding in performance engineering and capacity planning
- Strong communication and incident response skills

Requirements
- Design and operate model serving platforms supporting diverse workloads including LLMs, vision models, and recommendation systems
- Optimize inference performance using continuous batching, paged attention, speculative decoding, and request multiplexing
- Implement multi-tenant routing, rate limiting, and quality-of-service policies across model endpoints
- Build autoscaling and capacity management systems that balance latency, throughput, and cost
- Tune GPU utilization, memory management, and KV cache strategies for LLM serving workloads
- Integrate model serving with API gateways, identity systems, and observability platforms
- Implement caching, prompt deduplication, and response reuse strategies where appropriate
- Drive end-to-end observability including latency histograms, queue dynamics, GPU utilization, and error tracking
- Develop deployment workflows including canary releases, shadow testing, and automated rollback
- Operate incident response for high-availability AI services and drive durable reliability improvements
- Collaborate with ML and product teams to support new model releases and capability rollouts
- Implement security controls including request signing, content filtering, and abuse detection at the serving layer
- Document operational procedures, performance characteristics, and tuning guidance for internal teams
- Stay current with AI serving research and translate advances into production capabilities

Benefits
- Competitive base salary commensurate with experience
- Full-time, direct W2 employment with Bright Vision Technologies
- Long-term, multi-year engagement aligned to the Bright Vision SOW delivery roadmap

How to Apply
For immediate consideration, please send your resume to [email protected] or contact us at (908) 698-4899.

We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company.
More Engineer Jobs
Underground Transmission Line Project Engineer
WSP
WSP USA is the U.S. operating company of WSP, one of the world's leading engineering and professional services firms. It is dedicated to serving local communities and designs lasting solutions in the buildings, transportation, energy, water, and environment markets. It has more than 15,000 employees in over 300 offices across the U.S.
Role Description
WSP is currently initiating a search for an Underground Transmission Line Project Engineer for our St. Louis, MO or San Diego, CA office locations; however, the selected candidate may work from any mutually acceptable location in the United States. Be involved in projects with our Transmission Line Engineering Team and be a part of a growing organization that meets our clients' objectives and solves their challenges. This opportunity provides technical assistance and guidance for multi-site/phase due diligence, investigation, remediation, impact assessment, permitting, design, development, and construction of utility, industrial, and commercial scale projects in the public and private sector.
- Routinely assist with the research, design, concept development, and construction of transmission and distribution substations, power distribution, power regulation, renewable energy, as well as protection and control systems.
- Substantiate reports and documentation regarding material, installation, component, and construction specifications.
- Ensure that responsibilities are delivered and adhered to with a level of quality that meets or exceeds acceptable industry standards for design, safety, and functionality.

The successful candidate will be involved in all aspects of underground transmission line design and execution, including the following:
- Development of engineering work plans, budgets, and proposals
- Conceptual design and feasibility studies
- Preparation of construction cost estimates
- Cable system design
- Detailed design and engineering including construction specifications and drawings
- Supporting projects in construction
- Both internal and external (client) interactions
- Participation in industry-related events (e.g., conferences)
- Exercise responsible and ethical decision-making regarding company funds, resources, and conduct, and adhere to WSP’s Code of Conduct and related policies and procedures.
- Perform additional responsibilities as required by business needs.

Qualifications
- Bachelor’s Degree in Engineering, or closely related discipline (or equivalent experience).
- 5 to 7 years of relevant post-education experience as an electrical engineer providing design deliverables for large capital projects.
- Engineer in Training Certification.
- Experience in the engineering of underground high voltage transmission lines including route design, knowledge of survey and geotechnical requirements, cable system calculations, detailed design, development of construction bid packages, support during construction, and successful completion of underground transmission line projects.
- The candidate must have a strong command of the English language with good written and oral communication skills to communicate effectively with internal team members and external client personnel.
- Experience using the Microsoft Office software suite, CYMCAP, and Pull Planner.
- Proven track record of upholding workplace safety and ability to abide by WSP’s health, safety and drug/alcohol and harassment policies.
- Proficient knowledge of engineering principles, practices, process, design/build, and the application to permitting and project work-related issues.
- Experience with infrastructure planning, design, and construction management, including active involvement in a variety of rehabilitation and new design projects.
- Working knowledge of processes and concepts for reducing and eliminating the use or generation of hazardous substances/greenhouse gases.
- Working knowledge of relevant civil construction laws, codes, regulations, compliance practices, and record-keeping requirements.
- Ability to make technical computations and calculations involving the application of engineering principles, understanding plans and specifications, and making factual comparisons to the appropriate regulations.
- Project management experience with small to mid-level projects including tracking hours and expenses for project work.
- Ability to plan and conduct inspections and investigations on various aspects of the construction and design of facilities or structures, applying applicable regulations and policies.
- Effective interpersonal and communication skills when interacting with others, expressing ideas effectively and professionally to an engineering and non-engineering audience.
- Highly capable self-leadership with attention to detail, multi-tasking, and prioritization of responsibilities in a dynamic work environment.
- Ability to work independently and provide guidance and leadership to junior team or project members, with strict adherence to QA/QC.
- Proficiency with technical writing, office automation, discipline-specific design software (e.g., MicroStation, AutoCAD, Civil 3D), technology, math principles, predictive models, spreadsheets, and tools.
- Developed critical thinking and problem-solving skills required to apply technical knowledge to reach conclusions from testing results, data collation, and statistical analysis, and to arrive at the most effective, economical, and logical solution.
- Ability to assertively direct others in the field, such as subcontractors, to consistently complete tasks safely and efficiently.
- Ability to work schedules conducive to project-specific requirements that may extend beyond the typical workweek.
- Occasional travel may be required depending on project-specific requirements.

Preferred Qualifications
- Master’s Degree in Engineering
- Essential Professional Licensure/Certification
- Experience managing small to mid-size projects
- 40-Hour OSHA Health & Safety Training (HAZWOPER) (29 CFR 1910.120) preferred
- Basic First Aid and Adult CPR training desired

Benefits
- WSP provides a comprehensive suite of benefits focused on providing health and financial stability throughout the employee’s career.
- Benefits include coverage related to medical, dental, vision, disability, and life.
- Retirement savings.
- Paid sick leave.
- Paid vacation (or other personal time).
- Paid parental leave.
- Paid time off for purposes of bereavement, voting, and/or attendance at naturalization proceedings.

Compensation
Expected Salary (all locations): $100,000-$125,000
WSP USA is providing the compensation range that the company in good faith believes it might pay and offer for this position, based on the successful applicant’s education, experience, knowledge, skills, and abilities, in addition to internal equity and specific geographic location. WSP USA reserves the right to ultimately pay more or less than the posted range and offer additional benefits and other compensation, depending on circumstances not related to an applicant’s sex or other status protected by local, state, and/or federal law.
• Meet with customers to understand their technical requirements
• Build integrations and connectors
• Run rapid field testing in customer environments
• Document customer needs and relay to Engineering
• Work alongside senior FDEs
Full Stack Engineer
UnitedHealth Group
UnitedHealth Group is a healthcare and well-being company that’s dedicated to improving the health outcomes of millions around the world. We are comprised of
Role Description
The Full Stack Engineer will join the Consumer Engineering team, which drives digital experiences across UHC and Optum RX. This role focuses on building solutions that serve both internal call center agents and external members/patients, supporting critical functions such as e-commerce checkout, prescription management, home delivery, and specialty services. The engineer will play a pivotal role in shaping and advancing this essential platform. You will enjoy the flexibility to telecommute* from anywhere within the U.S. as you take on some tough challenges.

Primary Responsibilities
- Design, develop, and maintain full-stack applications using ReactJS, Java-based microservices, and GraphQL.
- Architect and implement solutions leveraging Azure cloud services and Kubernetes (preferably Azure Kubernetes Service, AKS).
- Build and optimize CI/CD pipelines using GitHub Actions.
- Develop and manage containerized applications with Docker.
- Ensure robust automation testing for UI and APIs.
- Collaborate with cross-functional teams to deliver high-quality software in an agile environment.
- Monitor and troubleshoot applications using observability tools such as Azure Monitoring, App Insights, Splunk, and Dynatrace.
- Maintain code quality and version control using GitHub.
- Provide technical guidance and mentorship to team members.
- Drive problem-solving and resolve complex technical issues efficiently.
- Leverage enterprise-approved AI tools to streamline workflows, automate tasks, and drive continuous improvement.

Qualifications
- Bachelor's degree.
- 3+ years of proven experience as a Full Stack Engineer with strong proficiency in ReactJS (or equivalent frameworks) and microservices.
- 3+ years of experience with, and a solid understanding of, CI/CD pipelines and tools, especially GitHub Actions.
- 3+ years of hands-on experience or strong background in automation testing for UI and APIs.
- 3+ years of proficient experience in GitHub for version control and team collaboration.

Requirements
- 3+ years of deep expertise in Azure cloud services and Kubernetes, preferably with Azure Kubernetes Service (AKS).
*All Telecommuters will be required to adhere to UnitedHealth Group’s Telecommuter Policy.

Benefits
- Comprehensive benefits package.
- Incentive and recognition programs.
- Equity stock purchase.
- 401k contribution (all benefits are subject to eligibility requirements).
- Salary range from $72,800 to $130,000 annually based on full-time employment.
Senior AI Engineering Lead
Encora Digital
Encora, a leader in digital engineering, drives innovation by crafting cutting-edge, cloud-first, data-first, and AI-first solutions that redefine industries. Since its inception i
Role Description
- Lead and guide engineering teams in the adoption of AI tools and practices across the software development lifecycle, including planning, coding, testing, documentation, and delivery.
- Coach and mentor engineering teams on effective, responsible, and scalable AI adoption to improve productivity, quality, and engineering consistency.
- Act as a technical and people leader, facilitating collaboration, managing change, and addressing adoption challenges across teams.
- Drive continuous improvement initiatives by identifying workflow gaps, supporting enablement strategies, and ensuring sustainable AI integration into engineering practices.

Qualifications
- Bachelor’s degree in Computer Science, Engineering, Mechatronics, Information Technology, or equivalent practical experience.
- 10+ years of professional experience in software engineering or related engineering disciplines.
- Strong engineering background with credibility working alongside senior engineering teams.
- Proven experience leading or managing engineering teams as a Tech Lead, Engineering Manager, or similar leadership role.
- Experience mentoring, coaching, or enabling engineers beyond hands-on software delivery.
- Familiarity with AI tools and AI-assisted workflows within software engineering environments.
- Strong understanding of software development lifecycle (SDLC) processes and engineering best practices.
- Demonstrated experience managing organizational change, team dynamics, and technology adoption initiatives.
- Strong communication, facilitation, stakeholder management, and collaboration skills.
- Experience working in Agile environments with a focus on engineering productivity and continuous improvement.

Requirements
- Experience implementing AI-powered workflows across engineering organizations.
- Familiarity with tools such as GitHub Copilot, Claude Code, Cursor, or similar AI development platforms.
- Exposure to change management, organizational enablement, or transformation programs.
- Experience working across multidisciplinary engineering environments.

Company Description
At Coforge, we hire professionals based solely on their skills and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.
