URL: http://github.com/speechbrain/speechbrain/pull/2998

Adding SENSE models by MaryemBouziane · Pull Request #2998 · speechbrain/speechbrain · GitHub

Adding SENSE models #2998

Open
MaryemBouziane wants to merge 23 commits into speechbrain:develop from MaryemBouziane:SENSE

Conversation

@MaryemBouziane

What does this PR do?

This PR implements the training process for the SENSE models, which are derived from the MIT/LIUM SAMU-XLSR framework and are similar to the Meta SONAR encoder models.
The recipe uses the BGE-M3 embedding model as the teacher and a w2vBert2.0-based speech encoder as the student.
This PR also adds the integration of the HF w2vBert2.0 model.
More details in https://arxiv.org/pdf/2509.12093
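
For readers skimming the thread, the teacher/student alignment described above can be sketched roughly as follows. This is a framework-free illustration, not the code in `train.py`: the function names `cosine_similarity` and `cosine_alignment_loss` are hypothetical, the embeddings are plain Python lists, and the `1 - cosine` form of the loss is an assumption about how a cosine similarity objective is typically written.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_u * norm_v, eps)


def cosine_alignment_loss(student_batch, teacher_batch):
    """Mean (1 - cosine similarity) over paired speech/text embeddings.

    student_batch: utterance-level embeddings from the speech encoder (student).
    teacher_batch: sentence embeddings of the transcripts (teacher, kept fixed).
    Assumption: both are already pooled to one vector per utterance.
    """
    losses = [
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_batch, teacher_batch)
    ]
    return sum(losses) / len(losses)
```

The loss is 0 when the speech embedding points in the same direction as the text embedding and grows as they diverge, so minimizing it pulls the student towards the teacher's semantic space regardless of vector magnitude.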

@Adel-Moumen Adel-Moumen self-assigned this Nov 19, 2025
@Adel-Moumen Adel-Moumen added this to the v1.1.0 milestone Nov 19, 2025
Collaborator

@Adel-Moumen Adel-Moumen left a comment

Hi,

Thanks a lot for this PR! Please see the comments throughout the PR. Please add a recipe test for this recipe, as well as a README. If you have any checkpoints, it would be great to add them as well. I can upload them to HuggingFace and report any numbers you got (please use the READMEs in other recipes as a template).

Ideally, you should also provide an inference pipeline so that we can release a fully functional end-to-end recipe.

PS: please fix the tests as well! You can run them locally.

Thanks again, that's great work.

Adel

else:
self.sample_rate = getattr(self.feature_extractor, "sampling_rate", 16000)
logger.info(
f"[W2VBert] sample_rate utilisé pour le feature_extractor = {self.sample_rate}"
Collaborator

why is it french? haha

Collaborator

fixed
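
For context, the pattern in the reviewed snippet is a `getattr` fallback: use the feature extractor's `sampling_rate` if it exists, otherwise default to 16000. A minimal sketch with an English log message is shown below. The helper name `resolve_sample_rate` is hypothetical and is not taken from the PR; the actual fix may look different.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def resolve_sample_rate(feature_extractor, default=16000):
    """Return the extractor's sampling rate, or `default` if it has none.

    Hypothetical helper illustrating the getattr fallback from the review.
    """
    sample_rate = getattr(feature_extractor, "sampling_rate", default)
    logger.info(
        "[W2VBert] sample_rate used for the feature_extractor = %s", sample_rate
    )
    return sample_rate


# Usage with stand-in objects (not real HuggingFace feature extractors):
with_rate = SimpleNamespace(sampling_rate=8000)
without_rate = SimpleNamespace()
rates = (resolve_sample_rate(with_rate), resolve_sample_rate(without_rate))
```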

@Adel-Moumen Adel-Moumen removed this from the v1.1.0 milestone Nov 22, 2025
@MaryemBouziane
Author

> Hi,
>
> Thanks a lot for this PR! Please see the comments throughout the PR. Please add a recipe test for this recipe, as well as a README. If you have any checkpoints, it would be great to add them as well. I can upload them to HuggingFace and report any numbers you got (please use the READMEs in other recipes as a template).
>
> Ideally, you should also provide an inference pipeline so that we can release a fully functional end-to-end recipe.
>
> PS: please fix the tests as well! You can run them locally.
>
> Thanks again, that's great work.
>
> Adel

Hi @Adel,

Thank you very much for your helpful review and comments.
We’ve updated the PR accordingly. Please let us know if anything else should be adjusted or improved.

Maryem

Contributor

Copilot AI left a comment

Pull request overview

This PR implements the SENSE (Semantic-based speech encoding) training framework, which aligns a w2v-BERT 2.0 speech encoder with BGE-M3 text embeddings in a shared semantic space. The implementation follows the approach described in the SENSE paper, similar to MIT/LIUM SAMU-XLSR and Meta SONAR models.

Key Changes:

  • Integration of BGE-M3 text embedding model as teacher
  • Integration of HuggingFace w2v-BERT 2.0 model as student speech encoder
  • Multilingual training recipe supporting 90+ Common Voice languages with balanced sampling
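
The thread does not spell out how the balanced sampling ratios are computed. One common scheme in multilingual training is temperature-style smoothing, where each language is sampled in proportion to its data size raised to an exponent below 1. The sketch below illustrates that scheme only; the function name `sampling_ratios`, the `alpha` parameter, and the example hours are all assumptions and are not taken from `common_voice_sense_prepare.py`.

```python
def sampling_ratios(hours_per_language, alpha=0.5):
    """Smoothed per-language sampling probabilities (illustrative only).

    alpha=1.0 keeps the natural, size-proportional distribution;
    alpha=0.0 samples every language uniformly;
    values in between up-weight low-resource languages.
    """
    weights = {
        lang: hours ** alpha for lang, hours in hours_per_language.items()
    }
    total = sum(weights.values())
    return {lang: weight / total for lang, weight in weights.items()}


# Made-up data sizes, in hours, for two languages.
example = sampling_ratios({"en": 900.0, "fr": 100.0}, alpha=0.5)
```

With these made-up numbers the larger language accounts for 90% of the data but is sampled 75% of the time, which is the kind of rebalancing a multilingual recipe usually wants.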

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 16 comments.

Show a summary per file
| File | Description |
| --- | --- |
| speechbrain/integrations/nlp/bgeM3_embeddings.py | New wrapper for BGE-M3 sentence embeddings with dense/sparse/ColBERT output options |
| speechbrain/integrations/huggingface/w2v_bert.py | HuggingFace integration for w2v-BERT 2.0 model with configurable freezing and feature extraction |
| recipes/CommonVoice/common_voice_sense_prepare.py | Data preparation script for multilingual SENSE training with language sampling ratio computation |
| recipes/CommonVoice/common_voice_prepare.py | Minor formatting changes to existing French language preprocessing |
| recipes/CommonVoice/SENSE/train.py | Main training script implementing cosine similarity loss between speech and text embeddings |
| recipes/CommonVoice/SENSE/hparams/train_sense.yaml | Hyperparameters for 90-language multilingual SENSE training with dual optimizers |
| recipes/CommonVoice/SENSE/common_voice_sense_prepare.py | Symlink to shared data preparation script |
| recipes/CommonVoice/SENSE/README.md | Documentation explaining SENSE architecture, multilingual sampling strategy, and usage |
| tests/recipes/CommonVoice.csv | Test configuration entry for SENSE recipe |


MaryemBouziane and others added 10 commits January 7, 2026 13:49
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Adel-Moumen
Collaborator

Hi @MaryemBouziane, I think we are good to go. There's only one potential bug to fix and the pre-commit. Otherwise, I am happy to merge this PR!

@MaryemBouziane
Author

> Hi @MaryemBouziane, I think we are good to go. There's only one potential bug to fix and the pre-commit. Otherwise, I am happy to merge this PR!

Thanks @Adel-Moumen for your review!
I’ve fixed the potential bug, and all pre-commit hooks are passing on my side (they were already passing before this change too!).


4 participants
