Build a speech transcription demo with OpenAI’s Whisper model in-browser using Transformers.js

Transformers.js

Transformers.js is a JavaScript implementation of transformers, the Python library developed by Hugging Face. It uses ONNX Runtime to run pre-trained AI models in JavaScript. In short, ONNX is an open model format that lets developers move neural network models between frameworks, and ONNX Runtime executes those models on a wide range of hardware.

Common tasks supported by Transformers.js:

  • Natural Language Processing: text classification, named entity recognition, question answering, language modelling, summarization, translation, multiple choice, and text generation.
  • Computer Vision: image classification, object detection, segmentation, and depth estimation.
  • Audio: automatic speech recognition, audio classification, and text-to-speech.
  • Multimodal: embeddings, zero-shot audio classification, zero-shot image classification, and zero-shot object detection.
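
To make the automatic speech recognition task from the list above concrete, here is a minimal sketch of how a transcription call might look. It rests on several assumptions that are not stated in this post: the npm package name `@xenova/transformers`, the model id `Xenova/whisper-tiny.en`, and the helper name `transcribe`, which is purely illustrative. The pipeline loader is passed in as a parameter so the helper can be exercised without downloading a model.

```javascript
// Sketch only: the package name and model id below are assumptions,
// not taken from this post. Adjust them to your setup.
const ASSUMED_PACKAGE = '@xenova/transformers';
const ASSUMED_MODEL = 'Xenova/whisper-tiny.en';

// Default loader: lazily imports the library and builds a pipeline.
// The import only runs when this function is actually called.
async function defaultLoadPipeline(task, model) {
  const { pipeline } = await import(ASSUMED_PACKAGE);
  return pipeline(task, model);
}

// Hypothetical helper: transcribes `audio` (a URL string or a
// Float32Array of 16 kHz mono samples) and returns the text.
async function transcribe(audio, loadPipeline = defaultLoadPipeline) {
  // 'automatic-speech-recognition' is the task identifier for ASR.
  const transcriber = await loadPipeline(
    'automatic-speech-recognition',
    ASSUMED_MODEL
  );
  // For a single input, the pipeline resolves to an object with a `text` field.
  const output = await transcriber(audio);
  return output.text;
}
```

In a page or script, `await transcribe('path/to/audio.wav')` would then resolve to the transcribed text, with the model downloaded and cached on first use. The other tasks in the list follow the same pattern with a different task identifier and model.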