Deep learning based similarity measure of mass spectrometry data.

360 commits | Last update: April 22, 2021

Cite this software

Choose a version:
[[ releases.length > 0 ? releases[selectedIndex].doi : conceptDOI ]]
Copy to clipboard
Choose a reference manager file format:
Download file

What MS2DeepScore can do for you

  • Predict chemical similarity based on MS/MS mass spectrums

A classical way to compare MS/MS mass spectra is to quantify their peak overlap, often done by using variations of cosine similarity scores. Those measures tend to work well for nearly equal spectra, i.e. cases of very high peak overlap. We recently introduced Spec2Vec an unsupervised machine learning approach for computing spectrum similarities based on learned relationships between peaks across large training datasets [ref]. Spec2Vec based similarity scores were observed to correlate more strongly than classical cosine-like scores with actual structural similarities between the underlying compounds. Additional core advantages are its fast computation, which allows to compare query spectra against very large libraries, and the fact that -as an unsupervised method- it can be trained on non-annotated data. However, the downside of an unsupervised approach is that it does not make use of the large fraction of labels that we have for the training data. The used training data (MS/MS spectra from GNPS) contains smiles/InChI annotations hence does allow to create molecular fingerprints for quantifying the structural similarities.

Read more
No tags available
Programming Language
  • Python
  • Apache-2.0
Source code


  • Florian Huber
  • Sven van der Burg
Contact person
Florian Huber

Information for page maintainers

OAI-PMH metadata:
citation metadata: