Job Closed
This listing is no longer active.
Bringing real world currency to the blockchain.
Senior AI Inference Engineer
Location
Brazil
Posted
86 days ago
Salary
0
Seniority
Senior
Job Description
Senior AI Inference Engineer
Tether.to
- You will own the inference backbone behind QVAC's local AI stack: the C++ systems layer that makes models run fast, reliably, and predictably on real user hardware.
- The role is centered on engineering quality at runtime level, including startup behavior, memory pressure, throughput/latency balance, and long-session stability.
- You will define and evolve the core abstractions that inference features depend on, so new capabilities can be added without sacrificing performance or maintainability.
- This is a role for someone who enjoys low-level problem solving, clear technical ownership, and building infrastructure that other teams trust in production.
- Your work directly enables private, on-device AI experiences and helps set the technical foundation for QVAC's next generation of peer-to-peer AI products.
- Work on deploying machine learning models to edge devices using the frameworks: llama.cpp, ggml, onnx
- Collaborate closely with researchers to assist in coding, training and transitioning models from research to production environments
- Integrate AI features into existing products, enriching them with the latest advancements in machine learning
Job Requirements
- Excellent programming skills in C++; experience with JavaScript is a bonus
- Strong experience with the llama.cpp and ggml inference engines, which facilitate the deployment of models to specific GPU architectures
- Good understanding of deep learning concepts and model architectures
- Experience with transformers, LLMs, Diffusion models
- Demonstrated ability to rapidly assimilate new technologies and techniques
- A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D
- Bonus points if:
  - You have experience with JavaScript/TypeScript
  - You understand the difficulties, nuances, and importance of P2P technology
  - You have experience with any of Vulkan, Metal, and OpenCL
  - You have productionized models
Benefits
Recruitment scams have become increasingly common. To protect yourself, please keep the following in mind when applying for roles:
- Apply only through our official channels. We do not use third-party platforms or agencies for recruitment unless clearly stated. All open roles are listed on our official careers page: https://tether.recruitee.com/
- Verify the recruiter’s identity. All our recruiters have verified LinkedIn profiles. If you’re unsure, you can confirm their identity by checking their profile or contacting us through our website.
- Be cautious of unusual communication methods. We do not conduct interviews over WhatsApp, Telegram, or SMS. All communication is done through official company emails and platforms.
- Double-check email addresses. All communication from us will come from email addresses ending in @tether.to or @tether.io.
- We will never request payment or financial details. If someone asks for personal financial information or payment at any point during the hiring process, it is a scam. Please report it immediately.
- When in doubt, feel free to reach out through our official website.
