diff --git a/stt_voice_messages/README.md b/stt_voice_messages/README.md
index dcb205e..e91ba42 100644
--- a/stt_voice_messages/README.md
+++ b/stt_voice_messages/README.md
@@ -1,28 +1,33 @@
-# Requirements
+# About
 
-## STT Models
+This plugin allows you in conjuction with a _general-purpose speech recognition model_ to transcribe your voice messages to text.
 
-### openai-whisper https://github.com/openai/whisper
+In order to make use of this plugin, you need to have at least one of the following models installed:
 
-#### Installation
-`pip install -U openai-whisper` will install
+#### OpenAI Whisper
+- Website: https://github.com/openai/whisper
+- Installable by: `pip install -U openai-whisper`
 
-```
-mpmath, urllib3, tqdm, sympy, regex, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12,
-nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12,
-nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx,
-MarkupSafe, llvmlite, fsspec, filelock, charset-normalizer, certifi, triton,
-requests, nvidia-cusparse-cu12, nvidia-cudnn-cu12, numba, jinja2, tiktoken,
-nvidia-cusolver-cu12, torch, openai-whisper
-```
+#### Faster Whisper
+- Website: https://github.com/SYSTRAN/faster-whisper
+- Installable by: `pip install -U faster_whisper`
 
-#### Models
-| Multi Langual Model | Download Size | VRAM Requirement | Relative Speed |
-|---------------------|---------------| ---------------- |----------------|
-| Tiny | 70 MB | ~1 GB | ~32x |
-| Base | 140 MB | ~1 GB | ~16x |
-| Small | 460 MB | ~2 GB | ~6x |
-| Medium | 1.4 GB | ~5 GB | ~2x |
-| Large | 2.9 GB | ~10 GB | ~1x |
+# Hint
+
+_**The plugin is very much POC at this stage!**_
+
+Currently a chosen model will be on first downloaded in the background, during which
+Gajim's UI may not respond.
+
+Typical model sizes are in case of OpenAI Whisper:
+
+| Multi Langual Model | Download Size |
+|---------------------|---------------|
+| Tiny | 70 MB |
+| Base | 140 MB |
+| Small | 460 MB |
+| Medium | 1.4 GB |
+| Large | 2.9 GB |
+
 
 
diff --git a/stt_voice_messages/gtk/config_dialog.py b/stt_voice_messages/gtk/config_dialog.py
index e9c7188..f29d2eb 100644
--- a/stt_voice_messages/gtk/config_dialog.py
+++ b/stt_voice_messages/gtk/config_dialog.py
@@ -207,7 +207,7 @@ class STTVoiceMessagesConfigDialog(Gtk.ApplicationWindow):
         self.config = config
         self.plugin = self.config.plugin
         self._add_prefs(prefs)
-        
+
         self.show_all()
 
     def _add_prefs(self, prefs: list[tuple[str, type[PreferenceBox]]]):