Overview

SkyScribe supports transcription in 99 languages and translation between them. It is powered by OpenAI Whisper, which was trained on 680,000 hours of diverse audio data.

Supported Languages

SkyScribe supports transcription in the following languages, listed with their language codes:
  • Afrikaans (af)
  • Albanian (sq)
  • Amharic (am)
  • Arabic (ar)
  • Armenian (hy)
  • Assamese (as)
  • Azerbaijani (az)
  • Bashkir (ba)
  • Basque (eu)
  • Belarusian (be)
  • Bengali (bn)
  • Bosnian (bs)
  • Breton (br)
  • Bulgarian (bg)
  • Burmese / Myanmar (my)
  • Catalan (ca)
  • Chinese (zh)
  • Cantonese (yue)
  • Croatian (hr)
  • Czech (cs)
  • Danish (da)
  • Dutch (nl)
  • English (en)
  • Estonian (et)
  • Faroese (fo)
  • Finnish (fi)
  • French (fr)
  • Galician (gl)
  • Georgian (ka)
  • German (de)
  • Greek (el)
  • Gujarati (gu)
  • Haitian Creole (ht)
  • Hausa (ha)
  • Hawaiian (haw)
  • Hebrew (he)
  • Hindi (hi)
  • Hungarian (hu)
  • Icelandic (is)
  • Indonesian (id)
  • Italian (it)
  • Japanese (ja)
  • Javanese (jw)
  • Kannada (kn)
  • Kazakh (kk)
  • Khmer (km)
  • Korean (ko)
  • Kurdish (ku)
  • Kyrgyz (ky)
  • Lao (lo)
  • Latin (la)
  • Latvian (lv)
  • Lingala (ln)
  • Lithuanian (lt)
  • Luxembourgish (lb)
  • Macedonian (mk)
  • Malagasy (mg)
  • Malay (ms)
  • Malayalam (ml)
  • Maltese (mt)
  • Maori (mi)
  • Marathi (mr)
  • Mongolian (mn)
  • Nepali (ne)
  • Norwegian (no)
  • Norwegian Nynorsk (nn)
  • Occitan (oc)
  • Pashto (ps)
  • Persian (fa)
  • Polish (pl)
  • Portuguese (pt)
  • Punjabi (pa)
  • Romanian (ro)
  • Russian (ru)
  • Sanskrit (sa)
  • Serbian (sr)
  • Shona (sn)
  • Sindhi (sd)
  • Sinhala (si)
  • Slovak (sk)
  • Slovenian (sl)
  • Somali (so)
  • Spanish (es)
  • Sundanese (su)
  • Swahili (sw)
  • Swedish (sv)
  • Tagalog (tl)
  • Tajik (tg)
  • Tamil (ta)
  • Tatar (tt)
  • Telugu (te)
  • Thai (th)
  • Turkish (tr)
  • Turkmen (tk)
  • Ukrainian (uk)
  • Urdu (ur)
  • Uzbek (uz)
  • Vietnamese (vi)
  • Welsh (cy)
  • Yiddish (yi)
  • Yoruba (yo)

Performance varies with audio quality, accent, and technical vocabulary. We recommend testing with audio that is representative of your use case to confirm that accuracy meets your needs.

Translation Support

SkyScribe can translate audio between the 99 supported languages.

How it works:
  1. Transcription - Your audio is first transcribed in its original language
  2. Translation - The transcript is then translated using the model’s built-in translation capabilities
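
The two stages above can be sketched as a small pipeline. This is an illustrative sketch only: `translate_audio`, `transcribe`, and `translate` are hypothetical names, not SkyScribe's actual API, and the stub functions exist only so the example runs without any service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    source_language: str
    target_language: str
    transcript: str   # stage 1 output: text in the original language
    translation: str  # stage 2 output: text in the target language


def translate_audio(
    audio_path: str,
    source_language: str,
    target_language: str,
    transcribe: Callable[[str, str], str],
    translate: Callable[[str, str, str], str],
) -> TranslationResult:
    """Transcribe the audio first, then translate the resulting transcript."""
    transcript = transcribe(audio_path, source_language)
    if source_language == target_language:
        # Nothing to translate: the transcript is already in the target language.
        translation = transcript
    else:
        translation = translate(transcript, source_language, target_language)
    return TranslationResult(source_language, target_language, transcript, translation)


# Hypothetical stand-ins for the real transcription and translation stages.
def fake_transcribe(audio_path: str, language: str) -> str:
    return "hola mundo"


def fake_translate(text: str, source: str, target: str) -> str:
    return {"hola mundo": "hello world"}.get(text, text)


result = translate_audio("clip.wav", "es", "en", fake_transcribe, fake_translate)
print(result.transcript)
print(result.translation)
```

Keeping the original-language transcript alongside the translation makes it easier to review the output of each stage separately.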

Language Detection

SkyScribe can automatically detect the language spoken in your audio, or you can manually specify the language for better accuracy.

Auto-Detect

Auto-detect is convenient when you’re unsure of the language or are processing multiple audio files in different languages.

Benefits:
  • Saves time - no need to manually specify the language
  • Fast and automatic - language is identified as transcription begins
  • Works well with clear, single-language audio

Manual Selection

Specifying the language manually provides better accuracy when you know what language is spoken in your audio.

Choose manual selection when:
  • You know the language - Better transcription quality when you specify the exact language
  • Mixed-language audio - Specify the primary language when audio contains multiple languages
  • Specialized dialects - Better accuracy for specific regional variants
  • Consistency - Ensure the same language processing across multiple files
  • Challenging audio - More reliable when background noise may affect auto-detection
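
The choice between auto-detect and manual selection can be sketched as a small helper that builds request options. The option names `detect_language` and `language` are assumptions made for illustration, not documented SkyScribe parameters, and `SUPPORTED_CODES` is only a small subset of the codes listed under Supported Languages.

```python
from typing import Optional

# Illustrative subset of the language codes listed under Supported Languages.
SUPPORTED_CODES = {"en", "es", "fr", "de", "pt", "zh", "yue", "haw"}


def build_language_option(language: Optional[str] = None) -> dict:
    """Return hypothetical request options for language handling.

    With no language given, fall back to auto-detect. Otherwise normalize
    the code and reject anything outside the supported set.
    """
    if language is None:
        return {"detect_language": True}
    code = language.strip().lower()
    if code not in SUPPORTED_CODES:
        raise ValueError(f"Unsupported language code: {language!r}")
    return {"detect_language": False, "language": code}


print(build_language_option())       # auto-detect
print(build_language_option("ES"))   # manual selection, normalized to "es"
```

Validating the code before sending a request surfaces typos early, and passing the same code for every file in a batch gives the consistency described above.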

Accent and Dialect Variations

Our AI model shows improved robustness to diverse accents compared to many existing systems, but performance can vary.

Strengths:
  • Handles major regional accents well (e.g., American, British, Australian English)
  • Better performance with international accents than most specialized models
  • Trained on diverse speaker demographics
Limitations:
Performance is uneven across accents, dialects, and demographic groups, and word error rates may be higher for some speakers.
Factors that can affect accuracy:
  • Regional dialects and non-standard pronunciations
  • Speaker demographics (age, gender, ethnicity)
  • Code-switching between languages
  • Heavy accents or unique speech patterns
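
Word error rate (WER) is the usual way to quantify these differences: the number of word substitutions, deletions, and insertions needed to turn the transcript into a human-checked reference, divided by the number of words in the reference. The sketch below is a minimal, generic implementation for testing your own audio. It is not part of SkyScribe, and the lowercase, whitespace-split normalization is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words processed
    # so far (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("hello world", "hello word"))  # 0.5
```

Computing WER separately for each accent or speaker group in a small test set shows whether accuracy is acceptable for the speakers you actually serve.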

Best Practices for Multilingual Transcription

To get optimal results when transcribing in different languages:
  1. Use high-quality audio - Clear speech improves accuracy across all languages
  2. Specify the language manually if auto-detect struggles with your dialect
  3. Review transcripts for critical applications to ensure accuracy
  4. Report issues to help us improve support for your language
