
Techire Ai

Remote Jobs

2 open roles. Latest: May 4, 2026, 5:38 PM UTC


Role Description

Want to own the data infrastructure behind some of the most naturalistic voice models in production?

You'll be joining a well-funded speech AI startup that has just closed its Series A, with strong enterprise traction and revenue that more than doubled last quarter. They're building ultra-realistic voice technology that handles natural laughter, breathing, seamless language switching, and accurate pronunciation across languages and accents. Their models are powering hundreds of millions of conversations monthly.

Before training a single model, they built their own corpus: full-duplex, studio-quality conversational speech annotated by PhD linguists. As their MLE, you'll own the pipelines that turn that raw material into clean, training-ready data.

- Own end-to-end data pipelines from raw audio ingestion through to versioned, training-ready datasets
- Build quality systems that catch annotation errors and alignment issues before they reach a training run
- Maintain the training infrastructure that keeps GPUs fed: dataloaders, streaming datasets, multi-modal batching
- Build and iterate on tooling across speech representations including neural codecs, semantic tokens and mel features
- Handle full- and half-duplex pipeline work including two-channel alignment and overlap handling

Qualifications

- Strong engineering fundamentals with experience building ML data pipelines at scale
- Hands-on experience with speech or audio data
- Solid understanding of speech representations and the tradeoffs between them
- Experience with multi-channel audio data including diarisation and alignment

Requirements

- Experience with multilingual data pipelines (Nice to have)
- Large-scale training infrastructure experience: FSDP, DeepSpeed, Ray (Nice to have)
- Annotation tooling and human-in-the-loop systems (Nice to have)

Benefits

- Remote-friendly
- Competitive base plus stock

United States

Role Description

Ready to own the data pipeline powering the voice of the next generation of AI characters?

You'll be joining a well-funded startup building AI character technology, where speech is a core part of the product experience. Think super natural conversations, handling interruptions, personality shifts and more!

You'll own the datasets that power their speech systems, from raw, messy audio through to clean, versioned training corpora that directly drive TTS and ASR model performance.

- Own the full data lifecycle: defining specs, auditing and curating large-scale audio and text corpora
- Build automated quality metrics and dashboards across SNR, VAD, WER, speaker verification and safety, validated against listening tests
- Train and deploy lightweight classifiers for noise detection, diarisation, language ID, and content moderation

Qualifications

- Deep experience working with speech and audio data at scale (1M+ hours)
- Strong ML engineering skills in Python and PyTorch, including training and fine-tuning models like Whisper or Wav2Vec
- Practical knowledge of audio processing: torchaudio, librosa, spectrograms, DSP basics
- A solid understanding of audio quality metrics: MOS, WER, PESQ/STOI, SNR, speaker verification

Requirements

- Experience with Spark/Beam, Airflow, SQL or similar data engineering tools
- Open-source contributions or publications in speech or audio ML
- Background in denoising and enhancement, and how it affects downstream model quality

Benefits

- Remote, with a preference for European or overlapping timezones
- Competitive compensation and equity

All locations: Europe | EMEA