Machine Learning Engineer Remote Jobs in Florida (US)
This page tracks remote machine learning engineer openings that are location-eligible for Florida.
Open jobs
2,536
Hiring companies this week
8
Salary sample
$139,200 - $175,000
Jobs added last hour
0
2536 Jobs
1364 Companies
GitLab, founded in 2011 and based in San Francisco, California, maintains a distributed team of professionals that work remotely across multiple continents. GitLab advocates for pr
• Diagnose business problems before building solutions
• Own AI initiatives end-to-end, from stakeholder discovery and technical design through implementation, deployment, and iteration
• Design, develop, and ship AI-powered solutions quickly
• Improve organizational flow by building solutions that reduce bottlenecks, shorten lead times, and increase throughput
• Integrate AI capabilities into existing systems and workflows using APIs, orchestration tools, and modern AI platforms
• Be Customer Zero: leverage and showcase GitLab's AI offerings wherever possible
• Partner closely with stakeholders across functions to understand the real constraints
• Define and track success through business metrics, flow metrics, and feedback loops that make performance visible and actionable
• Contribute to technical direction by evaluating tools, documenting patterns, and creating reusable foundations
• Define end-to-end architecture for AI/ML, Gen AI, Agentic AI, and MCP systems, including data pipelines, model training/inference, and MLOps
• Serve as a strategic advisor and consultant to clients, leading solution design discussions
• Architect scalable solutions using cloud-native AI tools (Azure ML, AWS SageMaker, or GCP Vertex AI)
• Lead the integration of Generative AI into components and features, bringing LLMs into enterprise applications through APIs
• Design retrieval-augmented generation (RAG) systems with vector databases
• Guide teams on MLOps frameworks for CI/CD, model versioning, monitoring, and automated retraining
• Evaluate emerging technologies and trends in the AI, ML, Gen AI, and Agentic AI space
• Mentor technical teams and guide solution architects and AI/ML engineers
• Ensure ethical and responsible AI practices
Role Description
Throughput. Latency. KV cache utilization. Move those three numbers in the right direction, and two things happen: customers get faster, cheaper inference, and our margins improve. That's the entire thesis of this role. Every kernel you tune, every quantization scheme you ship, every scheduler tweak you land shows up directly in a customer's p99 and on our P&L.
This is a high-impact seat. It is also a high-autonomy seat as you'll be given the room to lead the technical direction of inference optimization at Kimchi, not execute someone else's roadmap.
The problem: running LLMs in production is a moving target. The "right" model and serving configuration for a workload depend on:
- Traffic shape
- Sequence-length distribution
- Batch dynamics
- GPU SKU
- Memory bandwidth
- Quantization tolerance
- A dozen other variables that shift week to week
Most teams pick a model once, over-provision GPUs, and absorb the cost. Kimchi is the system that makes that decision automatically - continuously matching workloads to the most cost-efficient, best-performing LLM and serving configuration on a customer's infrastructure. We're building the optimization layer between the model and the hardware, and we need engineers who understand both sides deeply.
Qualifications
- 5+ years building real ML systems, with a portfolio that shows depth in inference or training infrastructure (not just model training notebooks).
- Strong Python - production services, not scripts.
- Hands-on experience with at least one of vLLM, SGLang, or TensorRT-LLM, and a working mental model of why an inference engine performs the way it does on a given GPU.
- Fluency with quantization tradeoffs - you've measured quality regressions, not just compression ratios.
- Comfort with distributed systems: collective communication, sharding strategies, and the practical failure modes of multi-GPU and multi-node setups.
- A bias toward measurement. You instrument before you optimize, and you can tell the difference between a real win and a benchmark artifact.
- Self-direction. This role comes with a wide mandate; you should be excited by that, not unsettled by it.
Requirements
- Push throughput. Continuous batching, speculative decoding, chunked prefill, kernel-level tuning across vLLM, SGLang, and TensorRT-LLM. Find the ceiling on each GPU SKU, then raise it.
- Cut latency. Attack TTFT and TPOT separately. Profile, identify the actual bottleneck (compute, memory bandwidth, scheduling, networking), and fix it - not the bottleneck you assumed.
- Get more out of the KV cache. Paged attention, prefix caching, eviction policies, cache reuse across requests, quantized KV. This is where a lot of the unrealized throughput lives, and it's an area you'll own.
- Quantize without regressing quality. INT8, INT4, FP8 across weights, activations, and KV. Empirical work: measure quality on real workloads, not just perplexity benchmarks.
- Shrink cold starts and memory footprint. Faster init, smarter weight loading, tighter memory accounting - the difference between a model that scales and one that doesn't.
- Scale across nodes. Distributed inference topologies, network-aware placement, checkpointing strategies that don't bottleneck on storage or interconnect.
- Set the technical direction. Decide what we benchmark, what we adopt, and what we build ourselves. Bring the team along with strong writeups and reproducible experiments.
Benefits
- Competitive salary (depending on the level of experience).
- Enjoy a flexible, remote-first global environment.
- Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
- Equity options.
- Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
- Spend 10% of your work time on personal projects or self-improvement.
- Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
- Annual hackathon to spark new ideas and strengthen team bonds.
- Team-building budget and company events to connect with your colleagues.
- Equipment budget to ensure you have everything you need.
- Extra days off to help maintain a healthy work-life balance.
Hiring process
- Screening call with Recruiter
- Hiring Manager interview
- Technical interview (system design)
- Live coding
- Culture Check interview with an executive
As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr. Please note that Cast AI does not provide any form of visa sponsorship/work permit.
Role Description
A well-funded seed-stage startup building the next generation of autonomous trading technology. You are building the intelligence layer on top of a purpose-built execution system for AI agents operating with real capital around the clock.
What You'll Own
- Learning System & RL Loop (~70%):
  - Design and implement the pipeline that connects live trade outcomes back to strategy improvement: signal quality, position sizing, timing, risk parameters.
  - Build the evaluation framework that separates genuine predictive signal from noise across agents, market conditions, and configurations.
  - Automate the strategy generation and testing cycle: the system should explore new configurations, validate them against real fleet data, and surface deployment candidates.
  - Detect regime shifts in market conditions and adapt fleet behavior accordingly.
  - Decompose every trade into its component drivers (signal quality, execution efficiency, exit timing) and wire those attributions back into strategy design.
  - Manage fleet-level coordination: concentration risk, capital allocation, and the exploration vs. exploitation balance.
  - Build the telemetry and data capture layer that makes all of the above possible.
- Model & Inference Infrastructure (~30%):
  - Own the build-vs-buy decision on model hosting: evaluate proxied external APIs versus fine-tuned models on owned infrastructure and execute the chosen path.
  - Determine whether domain-specific training on trading data meaningfully outperforms prompted general-purpose models, then build the pipeline to act on that answer.
  - Optimize inference for the specific demands of a large autonomous agent fleet: concurrent agents, structured outputs, cost efficiency at scale.
  - Build the agent telemetry layer capturing every decision, signal score, and evaluation across the fleet.
Qualifications
- A production closed-loop system: model outputs drove real-world actions, outcomes were measured, and that feedback automatically improved the next decision.
- Practical RL or online learning experience: you understand the challenges of learning from real-world feedback rather than static datasets.
- Full-stack ML ownership: you build the pipeline, deploy the model, and own the outcome; Python primary, comfortable with Go or TypeScript in production services.
- High-stakes sequential decision-making domain experience: finance preferred but not required; robotics, autonomous vehicles, game AI, ad bidding, and supply chain all transfer.
Nice to Have
- LLM fine-tuning and open-source model serving in production (vLLM, TGI, PEFT/LoRA).
- Multi-agent system design.
- Financial ML: signal generation, execution optimization, portfolio construction.
- Onchain or DeFi experience.
Interview Process
- Fast: target first call to offer within two weeks.
- Intro call with founders (60 min): fit, motivation, your closed-loop experience.
- Technical deep-dive (60 min): open-ended system design, no right answer, evaluating how you think.
- Paid trial project (1 week, part-time) if needed: real problem, compensated.
The CES Family of Companies is a collection of strong brands and businesses providing food equipment, supplies, and service.
• Design and implement AI/GenAI features across applications and SDLC workflows
• Build AI solutions using platforms like Azure AI Foundry and LLM APIs
• Develop agent-based workflows using frameworks such as LangChain, LangGraph, or Semantic Kernel
• Implement RAG-based solutions and prompt engineering strategies
• Leverage GitHub Copilot for AI-assisted development and productivity
• Build full-stack applications using Python, JavaScript/TypeScript, and/or C# (.NET)
• Develop APIs, backend services, and integrate with databases (SQL/NoSQL)
• Ensure application quality through testing, monitoring, and observability
• Collaborate with cross-functional teams (product, UX, data science)
• Troubleshoot and optimize AI models, pipelines, and integrations
The world’s largest social discovery company, uniting 70+ brands with 500M+ users
• Conducting experiments with LLMs: Explore and analyze the effectiveness of different architectures and techniques (SFT, RLHF, Adapters, etc.) to enhance the capabilities of AI models.
• Developing and implementing evaluation methodologies: Implement and maintain robust frameworks to assess the performance, accuracy, and user satisfaction of AI bots, including offline and online metrics.
• Optimizing models for inference: Improve the efficiency and speed of AI models during inference to ensure they meet the performance and scalability requirements for production environments.
• Collaborating with cross-functional teams: Work closely with data scientists, software engineers, and product managers to integrate AI solutions into the overall product pipeline.
Personal data collected during the recruitment process will be processed in accordance with the Privacy Notice of Aplaz, S.A. de C.V. (“Aplazo”), available at our Privacy and Policy Notice. Aplazo does not discriminate on the basis of race, religion, skin color, sex, gender, age, ethnic or national origin, marital status, disability, social or economic status, sexual preferences, or any other condition or characteristic. Selection is based solely on the qualifications and merits of the candidates.
Role Description
We are looking for an AI Engineer to help build and scale a company-wide AI platform that will power customer-facing assistants, internal tools and future AI-driven products. This role is not a data science position. You will focus on systems, platforms, reliability and integration, ensuring AI capabilities are safe, scalable, cost-effective and reusable across teams.
- Designing and building the core AI platform (gateways, orchestration, retrieval, tooling)
- Integrating and operating LLMs via APIs and self-hosted solutions
- Creating reusable infrastructure that enables product teams to ship AI features quickly
- Implementing safety, compliance and observability for production AI systems
- Partnering closely with Product, Backend, Data Science and Security teams to define AI capabilities
Qualifications
- 4+ years of experience in backend, platform or infrastructure engineering
- Strong experience building production APIs and distributed systems
- Experience with cloud platforms (AWS, GCP or Azure)
- Solid understanding of system design, scalability and reliability
- AI & ML literacy (not DS-heavy)
- Practical experience integrating AI/ML models via APIs
- Understanding of how LLMs work at a systems level (prompts, tokens, latency, cost)
- Familiarity with concepts like:
  - Retrieval-augmented generation (RAG)
  - Model limitations and hallucinations
  - AI safety and guardrails
- Ability to reason about AI behavior, risks and trade-offs
Requirements
- Strong programming skills in one or more backend languages (e.g., Python, Java, Go, Node.js)
- Experience with data stores (SQL, NoSQL, Redis, search engines)
- Familiarity with event-driven or async architectures
- Experience with observability tools (logs, metrics, tracing)
Nice to have
- Experience building internal platforms or developer tooling
- Exposure to fintech, payments or regulated environments
- Experience with vector databases or search systems
- Prior work on chatbots, assistants or workflow automation
- Experience designing systems used by multiple teams
What success looks like
- Product teams can ship AI features without rebuilding infrastructure
- AI systems are reliable, safe and cost-controlled in production
- New AI use cases plug into a shared platform with minimal effort
- Leadership has visibility into AI performance, quality and impact
Benefits
- Build foundational AI infrastructure used across the company
- Work at the intersection of AI, platforms and real business impact
- Shape how AI is safely deployed in a fintech/BNPL environment
- High ownership, high visibility and long-term technical impact
Jerry.ai is America’s first and only super app to radically simplify car ownership. We are redefining how people manage owning a car, one of their most expensive and time-consuming assets. Backed by artificial intelligence and machine learning, Jerry.ai simplifies and automates owning and maintaining a car while providing personalized services for all car owners' needs. We spend every day innovating and improving our AI-powered app to provide the best possible experience for our customers. We are the #1 rated and most downloaded app in our category with a 4.7-star rating in the App Store. We have more than 5 million customers, and we’re just getting started. Jerry.ai was founded in 2017 by serial entrepreneurs and has raised more than $240 million in financing. Join our team and work with passionate, curious and egoless people who love solving real-world problems. Help us build a revolutionary product that’s disrupting a massive market.
Role Description
We are building the first super app to manage car ownership, an industry where the experience is stuck in the 90s. Lead a strategic area leveraging LLMs, agents, and internal APIs to automate a fragmented, $2T market. Work with partners at OpenAI to integrate Jerry.ai services directly into the ChatGPT App and pioneer new model capabilities.
At Jerry.ai, we are moving past fragmented, time-consuming processes to create a seamless, automated platform. You will sit at the intersection of product, engineering, and applied AI as you build and scale a sophisticated system that already automates >70% of inbound sales and service requests (over 50k chats per month). Your role will include shaping how modern LLM systems, human-in-the-loop feedback, and computer-use agents redefine an entire industry.
How You Will Make an Impact:
- Lead end-to-end development for our AI platform and customer experiences, from initial roadmap to rollout.
- Partner with engineers to design prompt strategies, evaluation frameworks, and guardrails, balancing latency, cost, and accuracy.
- Serve as the technical translator between engineering and the broader organization, establishing AI best practices and platform standards.
- Drive systematic improvement in answer quality, customer satisfaction, and automation rates through rigorous experimentation.
- Work with partners at OpenAI to evaluate and deploy the next generation of voice models and workflow automation.
Qualifications
- 3+ years of experience in management consulting or technical product management at a fast-paced startup.
- Proven interest in modern LLM systems.
- Ability to navigate technical tradeoffs and lead by example when implementing AI platform standards.
- A track record of taking complex, strategic ideas and turning them into scalable, production-grade products.
Requirements
- Intrinsically Motivated Technologist: You live and breathe AI. You read the release notes when a new model drops and you’ve already built your own custom workflows.
- Systems Thinker: You are comfortable diving into technical conversations about API design and system architecture, translating complex concepts for any audience.
- Optimistic Problem-Solver: You are a "how can we" thinker who seeks constant improvement and thrives on owning high-impact metrics.
- Data-Driven with Conviction: You are familiar with SQL and comfortable diving into the data to answer your own questions and validate hypotheses.
Benefits
- Comprehensive benefits package including health, dental, and vision coverage.
- Paid time off and paid parental leave.
- 401(K) plan with employer matching.
- Wellness benefits.
- Equity opportunities may also be part of your total rewards package.
Role Description
We are building the first super app to manage car ownership, an industry where the experience is stuck in the 90s. Lead a strategic area leveraging LLMs, agents, and internal APIs to automate a fragmented, $2T market. Work with partners at OpenAI to integrate Jerry.ai services directly into the ChatGPT App and pioneer new model capabilities.
At Jerry.ai, we are moving past fragmented, time-consuming processes to create a seamless, automated platform. You will sit at the intersection of product, engineering, and applied AI as you build and scale a sophisticated system that already automates >70% of inbound sales and service requests (over 50k chats per month). Your role will include shaping how modern LLM systems, human-in-the-loop feedback, and computer-use agents redefine an entire industry.
How You Will Make an Impact:
- Lead end-to-end development for our AI platform and customer experiences, from initial roadmap to rollout.
- Partner with engineers to design prompt strategies, evaluation frameworks, and guardrails, balancing latency, cost, and accuracy.
- Serve as the technical translator between engineering and the broader organization, establishing AI best practices and platform standards.
- Drive systematic improvement in answer quality, customer satisfaction, and automation rates through rigorous experimentation.
- Work with partners at OpenAI to evaluate and deploy the next generation of voice models and workflow automation.
Qualifications
- 4+ years of experience in management consulting or technical product management at a fast-paced startup.
- Proven interest in modern LLM systems.
- Ability to navigate technical tradeoffs and lead by example when implementing AI platform standards.
- A track record of taking complex, strategic ideas and turning them into scalable, production-grade products.
Requirements
- Intrinsically Motivated Technologist: You live and breathe AI. You read the release notes when a new model drops and you’ve already built your own custom workflows.
- Systems Thinker: You are comfortable diving into technical conversations about API design and system architecture, translating complex concepts for any audience.
- Optimistic Problem-Solver: You are a "how can we" thinker who seeks constant improvement and thrives on owning high-impact metrics.
- Data-Driven with Conviction: You are familiar with SQL and comfortable diving into the data to answer your own questions and validate hypotheses.
Benefits
- Comprehensive benefits package including health, dental, and vision coverage.
- Paid time off and paid parental leave.
- 401(K) plan with employer matching.
- Wellness benefits.
- Equity opportunities may also be part of your total rewards package.
Company Description
Jerry.ai is America’s first and only super app to radically simplify car ownership. We are redefining how people manage owning a car, one of their most expensive and time-consuming assets.
- Backed by artificial intelligence and machine learning, Jerry.ai simplifies and automates owning and maintaining a car while providing personalized services for all car owners' needs.
- We are the #1 rated and most downloaded app in our category with a 4.7-star rating in the App Store.
- We have more than 5 million customers, and we’re just getting started.
- Jerry.ai was founded in 2017 by serial entrepreneurs and has raised more than $240 million in financing.
Join our team and work with passionate, curious and egoless people who love solving real-world problems. Help us build a revolutionary product that’s disrupting a massive market.
Python, JavaScript, TypeScript, Azure, PyTorch, Scikit-Learn