# About
This plugin lets you transcribe your voice messages to text in conjunction with a _general-purpose speech recognition model_.

To use this plugin, you need to have at least one of the following models installed:
#### OpenAI Whisper
- Website: https://github.com/openai/whisper
- Install with: `pip install -U openai-whisper`
#### Faster Whisper
- Website: https://github.com/SYSTRAN/faster-whisper
- Install with: `pip install -U faster-whisper`

Additionally, you have to check out the following Gajim branch:
https://dev.gajim.org/mesonium/gajim/-/tree/stt_voice_messages
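As a quick sanity check before enabling the plugin, the sketch below reports which of the two backends listed above can be imported in the current Python environment. It is not part of the plugin: the names `BACKENDS` and `available_backends` are made up for illustration. The module names `whisper` and `faster_whisper` are the ones the two pip packages provide.

```python
from importlib.util import find_spec

# pip package name -> importable module name
BACKENDS = {
    "openai-whisper": "whisper",
    "faster-whisper": "faster_whisper",
}


def available_backends(finder=find_spec):
    """Return the pip names of all backends whose module can be found.

    `finder` is injectable so the check can be exercised without
    actually installing either package.
    """
    return [pkg for pkg, module in BACKENDS.items() if finder(module) is not None]


if __name__ == "__main__":
    found = available_backends()
    if found:
        print("Usable backends:", ", ".join(found))
    else:
        print("No backend found; install one of:", ", ".join(BACKENDS))
```

Note that this only checks that a module with the expected name exists; it does not verify that a model can actually be loaded.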
# Hint
_**The plugin is very much a proof of concept (POC) at this stage!**_

Currently, the chosen model is downloaded in the background on first use, during which
Gajim's UI may not respond.
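A freeze like this is what one would expect if blocking work ends up on the UI thread; that is an assumption, as the cause is not stated here. One generic way to avoid it is to hand the blocking call to a worker thread, sketched below with the standard library only. `download_model` and `load_in_background` are hypothetical stand-ins, not functions of the plugin, of Whisper, or of Gajim.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def download_model(name):
    """Hypothetical stand-in for a slow, blocking model download."""
    time.sleep(0.2)  # simulate network latency
    return f"{name}-model"


def load_in_background(name, on_done, executor):
    """Start the download on a worker thread and return immediately.

    `on_done` is called with the finished model. It normally runs on the
    worker thread, so a GUI application would still have to pass the
    result back to its main loop before touching any widgets.
    """
    future = executor.submit(download_model, name)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


if __name__ == "__main__":
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = load_in_background("tiny", results.append, pool)
        # The caller stays free to do other work while the download runs.
        while not future.done():
            print("main thread still responsive...")
            time.sleep(0.05)
    print("Loaded:", results[0])
```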
For OpenAI Whisper, typical model sizes are:

| Multilingual Model  | Download Size |
|---------------------|---------------|
| Tiny                | 70 MB         |
| Base                | 140 MB        |
| Small               | 460 MB        |
| Medium              | 1.4 GB        |
| Large               | 2.9 GB        |
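To illustrate how the sizes above might guide a choice, here is a small sketch that picks the largest model fitting a given download budget. The figures are copied from the table; the names `MODEL_SIZES_MB` and `largest_model_within` are invented for this example and are not plugin settings.

```python
# Approximate download sizes in MB, copied from the table above
# (1.4 GB and 2.9 GB are written as 1400 and 2900).
MODEL_SIZES_MB = {
    "tiny": 70,
    "base": 140,
    "small": 460,
    "medium": 1400,
    "large": 2900,
}


def largest_model_within(budget_mb):
    """Return the biggest model whose download fits the budget, or None."""
    fitting = [name for name, size in MODEL_SIZES_MB.items() if size <= budget_mb]
    return max(fitting, key=MODEL_SIZES_MB.get) if fitting else None


if __name__ == "__main__":
    for budget in (50, 500, 3000):
        print(budget, "MB ->", largest_model_within(budget))
```

With a 500 MB budget this selects `small`; with less than 70 MB no model fits and the function returns `None`.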
# TODO
- [x] Offer multiple models
- [ ] Add various model settings
- [ ] Model receiving
  - [ ] Indicate model download state
  - [ ] Allow changing the model download location
  - [ ] Allow using local models
- [ ] Database handling
  - [ ] Store transcribed messages in a DB
  - [ ] Option to delete the DB
- [ ] Update UI
  - [ ] Make it prettier
  - [ ] Show progress bar
  - [ ] Highlight words on playback