Add support for Useful Sensors Moonshine model. #1808

njeffrie · 2024-10-26T02:14:15Z

For context on the Moonshine model, please see the Useful Sensors Moonshine repo.

Adds the following:

c++ moonshine model
pybind for python moonshine model
moonshine model spec
support for multi-dimensional layernorm on CPU.
support for broadcasting layernorm weights for multi-dimensional layernorm on CPU.

For now the moonshine converter (safetensor -> ctranslate2 binary) will live in the moonshine repo. Planning to add a transformers converter once Moonshine is part of the transformers library.
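To make the two layernorm items above concrete, here is a minimal NumPy sketch of normalizing over more than one axis while broadcasting smaller weights across the normalized block. It is not the C++ code from this PR: the function name, the (batch, channels, time) layout, and the per-channel weight shape are assumptions chosen for illustration, in line with the "GroupNorm-style weights" wording used in the commit messages.

```python
import numpy as np

def layer_norm_multi_axis(x, gamma, beta, axes, eps=1e-5):
    """Normalize x over `axes`, then scale and shift with broadcast weights.

    Hypothetical helper for illustration only; not part of CTranslate2.
    """
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma and beta may be smaller than x; NumPy broadcasting expands them.
    return x_hat * gamma + beta

# Assumed layout: (batch, channels, time), with one weight per channel.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 8)).astype(np.float32)
gamma = np.ones((4, 1), dtype=np.float32)   # shape (channels, 1) broadcasts over time
beta = np.zeros((4, 1), dtype=np.float32)

# Statistics are computed jointly over channels and time for each batch item.
y = layer_norm_multi_axis(x, gamma, beta, axes=(1, 2))
print(y.shape)
```

With unit gamma and zero beta, each batch item in `y` should have approximately zero mean and unit variance over its channel and time axes.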

BBC-Esq · 2024-10-26T10:01:26Z

I checked out your repo but didn't see anywhere to actually download the Moonshine models. How is CTranslate2 supposed to evaluate whether to incorporate this pull request if the models can't be tested?

njeffrie · 2024-10-26T20:18:06Z

Thanks for taking a look - I've uploaded CTranslate2 models for moonshine base and tiny to UsefulSensors huggingface hub. In case it's helpful for testing, the following is a minimal python script to transcribe a wav file with CTranslate2 moonshine base (assuming the model was downloaded to ./ctranslate2/base):

from ctranslate2.models import Moonshine
from ctranslate2 import StorageView
import torchaudio
import tokenizers

# Load the tokenizer and the converted model from the downloaded directory.
tokenizer = tokenizers.Tokenizer.from_file("ctranslate2/base/tokenizer.json")
model = Moonshine('ctranslate2/base', device='cpu')

# Load the audio and resample to 16 kHz if needed.
audio, sr = torchaudio.load('foo.wav')
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)
audio_sv = StorageView.from_array(audio.numpy())

# One prompt per audio item; [[1]] is a single prompt holding token id 1.
result = model.generate(audio_sv, [[1]], beam_size=5)[0]
tokens = result.sequences_ids[0]
text = tokenizer.decode(tokens).strip()
print(text)
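One caveat with the script above: `torchaudio.load` returns a tensor shaped (channels, samples), so a stereo file would reach the model as two rows. The sketch below shows one way to downmix to a single row first. The helper name is hypothetical, and the idea that the leading dimension is treated as the batch is an assumption inferred from the script, not something confirmed in this thread.

```python
import numpy as np

def to_mono_batch(audio):
    """Return a (1, samples) float32 array from (channels, samples) or (samples,) input.

    Hypothetical helper for illustration; not part of CTranslate2 or Moonshine.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        # Already mono: just add the leading dimension.
        return audio[np.newaxis, :]
    # Average the channels so stereo input is not mistaken for a batch of two.
    return audio.mean(axis=0, keepdims=True)

# Made-up one-second stereo clip at 16 kHz: one channel of ones, one of zeros.
stereo = np.stack([np.ones(16000), np.zeros(16000)])
mono = to_mono_batch(stereo)
print(mono.shape)
```

In the script this would sit between the resampling step and `StorageView.from_array`, for example `to_mono_batch(audio.numpy())`.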

BBC-Esq · 2024-10-26T20:25:59Z

Thanks for the info. Unfortunately I'm not knowledgeable enough to use .h5 files, though I think what you linked to is what I was asking about...

Also, unfortunately, I have no decision-making power regarding CTranslate2 either, so... But if you can't get a response from the CTranslate2 people relatively quickly, I would recommend reaching out to @MahmoudAshraf97. Although he's not officially with "Systran," he's also interested in all things CTranslate2/TTS, is pretty good about responding, and has a good rapport with them.

As I said, I'm just one annoying fan of this technology so...Good luck!

njeffrie · 2024-10-26T20:28:46Z

Just uploaded CT2 models for moonshine tiny and base: https://huggingface.co/UsefulSensors/moonshine/tree/main/ctranslate2

njeffrie · 2024-10-30T18:29:48Z

@minhthuc2502 perhaps you could take a look or assign somebody to review? Landing this in CTranslate2 is currently blocking us from releasing a faster-whisper style model as part of usefulsensors/moonshine.

Thanks!

minhthuc2502 · 2024-11-25T10:31:09Z

Could you add CUDA support by implementing it in layer_norm_gpu.cu? Additionally, I noticed there isn't a converter to transform the original model into CTranslate2's format, apart from the added spec.

And try to fix the pipeline please.

Thank you.

njeffrie · 2024-11-26T21:47:13Z

Thanks for taking a look. I'll add CUDA support for the layernorm changes, add our safetensors -> CTranslate2 converter, and look into what's going on with the presubmit pipeline.

Additionally, I've added a fix to support batching to address this issue.

BBC-Esq · 2024-12-04T22:45:12Z

> Thanks for taking a look. I'll add CUDA support for the layernorm changes, add our safetensors -> CTranslate2 converter, and look into what's going on with the presubmit pipeline.
>
> Additionally, I've added a fix to support batching to address this issue.

Can you please post when it's ready to review? I'm actually kind of curious to test out these models, but I'll hold off until the changes are close to final. Thanks!

njeffrie · 2024-12-05T06:58:57Z

Should be ready to go @minhthuc2502, @BBC-Esq.

keveman mentioned this pull request Oct 27, 2024

How to achieve live transcription usefulsensors/moonshine#19

Closed

keveman mentioned this pull request Nov 1, 2024

Tensorflow lite (or TFLM) for On-Device usefulsensors/moonshine#44

Open

njeffrie mentioned this pull request Nov 25, 2024

How to get inference for batch size > 1 usefulsensors/moonshine#55

Open

njeffrie added 4 commits November 26, 2024 13:51

Create ctranslate2 Moonshine implementation.

39a8ab5

Adds the following:
- c++ moonshine model
- pybind for python moonshine model
- moonshine model spec
- safetensor moonshine model converter
- support for GroupNorm-style weights for LayerNorm
- support for multi-axis cuda layernorm

Add special case to avoid quantizing conv in Moonshine

25ef63f

- Add a define to prevent quantizing the first conv layers in the Moonshine preprocessor
- Add options to enable rotary positional embeddings in the Transformer Encoder spec.

Fix bug in layernorm loop ordering.

9a19c6e

Maintain batch dimension in preprocessor.

9506b57

Fixes bug when batch size > 1.

njeffrie force-pushed the master branch from 1cfabe0 to 9506b57 Compare November 26, 2024 21:52

njeffrie added 3 commits December 4, 2024 12:51

Add multi-axis layernorm cuda implementation

6c97db4

Add moonshine converter.

48398f6

Converts safetensor model def + tokenizer_config.json to ctranslate2 model spec for Moonshine.
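As background for the converter commit above, the sketch below lists the tensor names, dtypes, and shapes stored in a safetensors file using only the standard library. That is roughly the information a converter needs before mapping weights onto a model spec. It is not the converter added in this PR: the function names and the demo tensor name are hypothetical. It relies only on the published safetensors layout (an 8-byte little-endian header length, a JSON header, then raw tensor data).

```python
import json
import os
import struct
import tempfile

def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} without loading any tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}

def write_demo_file(path):
    """Write a tiny hand-built safetensors file holding one 2x2 float32 tensor."""
    data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
    header = {
        # "demo.weight" is a made-up tensor name for illustration.
        "demo.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, len(data)]}
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(data)

demo_path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
write_demo_file(demo_path)
for name, (dtype, shape) in read_safetensors_header(demo_path).items():
    print(name, dtype, shape)
```

A real converter would more likely use the `safetensors` package to load the tensors themselves. Parsing the header by hand is only a dependency-free way to inspect a checkpoint.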

Update text in moonshine guide.

ef096c0

njeffrie force-pushed the master branch from 9793a94 to ef096c0 Compare December 4, 2024 22:33

Format moonshine converter.

4d0e991

njeffrie added 3 commits December 4, 2024 14:47

Fix import ordering.

b6812f4

Add safetensors to requirements files.

687147c

Specify torch version of safetensors

488e362
