Speech Recognition Integration
Implement ASR in web applications — audio capture, streaming transcription, file processing, and error handling for all 50+ supported languages.
- LocaleNLP API key (get one from the Developer Hub)
- Basic JavaScript / TypeScript knowledge
- Browser with getUserMedia support (Chrome, Firefox, Safari 14+)
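The browser requirement above can be checked at runtime before any microphone UI is shown. The following is a minimal sketch; `supportsMicrophoneCapture` and `NavigatorLike` are hypothetical names introduced for illustration and are not part of the LocaleNLP SDK. Browsers expose `navigator.mediaDevices` only in secure contexts (HTTPS or localhost), so the check also returns false on plain HTTP pages.

```typescript
// Minimal structural type so the check works on any navigator-like object
// (hypothetical helper type, not part of the SDK).
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: unknown };
}

// Returns true when the object exposes mediaDevices.getUserMedia as a function.
function supportsMicrophoneCapture(nav: NavigatorLike | undefined): boolean {
  return typeof nav?.mediaDevices?.getUserMedia === "function";
}

// In a browser, pass the global object: supportsMicrophoneCapture(navigator)
```

When the check fails, the application can skip the speech path and present text input instead.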
Install the SDK
Install the LocaleNLP JavaScript SDK. It handles auth, audio codec negotiation, and streaming for you.
# npm
npm install @localenlp/sdk

# yarn
yarn add @localenlp/sdk

Python, Go, and Rust SDKs are also available; see the API Reference.
Initialize the client
Create a client instance with your API key. Keys are scoped per project and rotatable from the dashboard.
import { LocaleNLP } from "@localenlp/sdk";

const client = new LocaleNLP({
  apiKey: process.env.LOCALENLP_KEY,
});
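Because `process.env.LOCALENLP_KEY` is `undefined` when the variable is not set, it can help to fail fast before constructing the client. The sketch below assumes a Node.js-style environment object; `requireEnv` is a hypothetical helper, not an SDK function.

```typescript
// Hypothetical helper: returns the variable's value or throws a clear error.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage (Node.js): const apiKey = requireEnv("LOCALENLP_KEY", process.env);
```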
Streaming transcription
Capture microphone audio and stream it to the ASR endpoint. Partial transcripts arrive while the user is speaking; a final transcript arrives once silence is detected.
const stream = await navigator.mediaDevices
  .getUserMedia({ audio: true });

const session = await client.asr.stream({
  lang: "sw-KE", // Swahili (Kenya)
  onPartial: (t) => console.log("partial:", t),
  onFinal: (t) => console.log("final:", t),
});

session.pipe(stream);
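In a real UI the two callbacks usually feed a single display string rather than the console. The sketch below shows one way to combine them. It assumes that each callback receives plain text and that every partial replaces the previous one, neither of which is confirmed here. `TranscriptBuffer` is a hypothetical class, not part of the SDK.

```typescript
// Keeps committed (final) segments separate from the in-progress partial.
class TranscriptBuffer {
  private finals: string[] = [];
  private partial = "";

  // Call from onPartial: replaces the in-progress text.
  setPartial(text: string): void {
    this.partial = text;
  }

  // Call from onFinal: commits the segment and clears the partial.
  commitFinal(text: string): void {
    const trimmed = text.trim();
    if (trimmed.length > 0) this.finals.push(trimmed);
    this.partial = "";
  }

  // Text to display: all finals followed by the current partial, if any.
  render(): string {
    const current = this.partial.trim();
    const parts = current ? [...this.finals, current] : this.finals;
    return parts.join(" ");
  }
}
```

Wiring it up would look like `onPartial: (t) => buffer.setPartial(t)` and `onFinal: (t) => buffer.commitFinal(t)`, followed by a re-render of `buffer.render()`.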
File transcription
Use file transcription for batch processing of audio files. Supported formats are WAV, MP3, OGG, and FLAC. Files of up to 4 hours are supported; larger files are chunked automatically.
const result = await client.asr.transcribeFile({
  file: audioFile, // File | Blob | Buffer
  lang: "yo-NG", // Yoruba (Nigeria)
  diarization: true,
});

console.log(result.transcript);
console.log(result.speakers); // if diarization: true
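A cheap client-side check against the supported formats listed above can reject obviously wrong uploads before any network call. This is only a sketch: it looks at the file name extension and does not inspect the file contents, and `hasSupportedExtension` is a hypothetical helper rather than an SDK function.

```typescript
// Extensions for the formats named in this section: WAV, MP3, OGG, FLAC.
const SUPPORTED_EXTENSIONS = new Set(["wav", "mp3", "ogg", "flac"]);

// Returns true when the file name ends in one of the supported extensions.
function hasSupportedExtension(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  // No dot, a leading dot only, or a trailing dot: no usable extension.
  if (dot <= 0 || dot === fileName.length - 1) return false;
  return SUPPORTED_EXTENSIONS.has(fileName.slice(dot + 1).toLowerCase());
}
```

For a browser `File` object, the name is available as `audioFile.name`.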
Error handling
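All SDK errors include a machine-readable code. Handle them gracefully, falling back to text input when appropriate.
try {
  const result = await client.asr.transcribeFile({ file, lang: "sw-KE" });
} catch (err) {
  if (err.code === "AUDIO_QUALITY") {
    // Audio was unusable: let the user type instead.
    showTextFallback();
  } else if (err.code === "RATE_LIMIT_EXCEEDED") {
    // Too many requests: try again after a delay.
    retryWithBackoff();
  } else {
    // Unknown codes should not be swallowed silently.
    throw err;
  }
}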
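The `retryWithBackoff()` call above is left to the application. One possible shape for such a helper is sketched below, using exponential backoff with a cap. The signature, option names, and `backoffDelayMs` are all assumptions made for illustration; none of them come from the SDK, and this version takes the task to retry as an argument.

```typescript
interface RetryOptions {
  retries: number; // maximum number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry; doubles each time
  maxDelayMs: number; // upper bound on any single delay
  shouldRetry: (err: unknown) => boolean;
}

// Delay for a given retry attempt (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs the task, retrying on errors accepted by shouldRetry until retries run out.
async function retryWithBackoff<T>(task: () => Promise<T>, opts: RetryOptions): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= opts.retries || !opts.shouldRetry(err)) throw err;
      await sleep(backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs));
    }
  }
}
```

A caller would pass a function that wraps the transcription request, with `shouldRetry` returning true only when the error's code is "RATE_LIMIT_EXCEEDED", so that other failures surface immediately.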
Supported Language Codes
sw-KE    Swahili (Kenya)
sw-TZ    Swahili (Tanzania)
am-ET    Amharic (Ethiopia)
rw-RW    Kinyarwanda (Rwanda)
yo-NG    Yoruba (Nigeria)
ha-NG    Hausa (Nigeria)
ig-NG    Igbo (Nigeria)
wo-SN    Wolof (Senegal)

Ready to go deeper?
Explore the full ASR endpoint reference for advanced parameters — speaker diarization, dialect hints, streaming vs. batch mode selection.