spec2vec is a novel similarity measure for comparing mass spectrometry data, which learns peak representations using Word2Vec.

1173 commits | Last update: September 17, 2020

What spec2vec can do for you

  • Learns abstract representations of mass spectra from large mass spectral data sets (unsupervised learning).
  • Computes mass spectra similarities that show a high correlation with actual molecular similarity.

Spec2vec is a novel spectral similarity score inspired by Word2Vec, a natural language processing algorithm. Where Word2Vec learns relationships between words in sentences, spec2vec learns relationships between mass fragments and neutral losses in MS/MS spectra. The spectral similarity score is based on spectral embeddings learned from the relationships between fragments within a large set of spectral data.
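
The idea can be illustrated with a minimal sketch in plain Python. It is not the spec2vec package's API: the helper names (`spectrum_to_document`, `embed`, `cosine`), the `peak@`/`loss@` word format, the intensity weighting, and the hand-made two-dimensional word vectors are all assumptions made for illustration. In practice the word vectors would come from a Word2Vec model trained on many spectra; that training step is omitted here.

```python
import math


def spectrum_to_document(peaks, precursor_mz, decimals=1):
    """Turn (m/z, intensity) peaks into weighted 'words'.

    Each peak yields a fragment word and, if the fragment is lighter than
    the precursor, a neutral-loss word (precursor m/z minus fragment m/z).
    """
    words = []
    for mz, intensity in peaks:
        words.append((f"peak@{mz:.{decimals}f}", intensity))
        loss = precursor_mz - mz
        if loss > 0:
            words.append((f"loss@{loss:.{decimals}f}", intensity))
    return words


def embed(document, word_vectors):
    """Intensity-weighted sum of word vectors; unknown words are skipped."""
    dim = len(next(iter(word_vectors.values())))
    vector = [0.0] * dim
    for word, weight in document:
        if word in word_vectors:
            for i, value in enumerate(word_vectors[word]):
                vector[i] += weight * value
    return vector


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical placeholder vectors standing in for a trained Word2Vec model.
toy_vectors = {
    "peak@100.0": [1.0, 0.0],
    "peak@150.0": [0.0, 1.0],
    "loss@50.0": [0.2, 0.8],
    "loss@100.0": [0.9, 0.1],
}

doc_a = spectrum_to_document([(100.0, 1.0), (150.0, 0.5)], precursor_mz=200.0)
doc_b = spectrum_to_document([(100.0, 0.8)], precursor_mz=200.0)
similarity = cosine(embed(doc_a, toy_vectors), embed(doc_b, toy_vectors))
print(f"similarity: {similarity:.3f}")
```

Summing weighted word vectors is only one plausible way to go from words to a spectrum embedding; the sketch uses it because it keeps the link between fragments, neutral losses, and the final similarity score easy to follow.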

  • Machine learning
  • Text analysis & natural language processing
Programming Language
  • Python
License
  • Apache-2.0

Participating organizations


  • Florian Huber
    Netherlands eScience Center
  • Justin J. J. van der Hooft
    Wageningen University & Research
  • Jurriaan H. Spaaks
    Netherlands eScience Center
  • Faruk Diblen
    Netherlands eScience Center
  • Stefan Verhoeven
    Netherlands eScience Center
  • Cunliang Geng
    Netherlands eScience Center
  • Christiaan Meijer
    Netherlands eScience Center
  • Simon Rogers
    University of Glasgow
  • Hanno Spreeuw
    Netherlands eScience Center
  • Adam Belloum
    Netherlands eScience Center
Contact person
Florian Huber
Netherlands eScience Center
