doc2vec-based assisted close reading with support for abstract concept-based search and context-based search

546 commits | Last update: May 11, 2022

What evidence can do for you

  • Provides AI/machine-learning support for close-reading-based research
  • Intuitive, example-based search across large corpora
  • Browser-based user interface
  • Concept-based search using abstract doc2vec representations
  • Context-based search using word-frequency/TF-IDF representations
  • Automated processing of user-supplied corpora
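
To illustrate what context-based search over word-frequency/TF-IDF representations involves, here is a minimal sketch in plain Python. It is not taken from the evidence code base: the function names (`tfidf_vectors`, `cosine`, `most_similar`), the smoothed IDF formula, and the toy corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenised document."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append({
            # Smoothed IDF (an assumption; other weighting schemes exist).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_index, vectors, top_n=3):
    """Indices of the documents most similar to the one at query_index."""
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:top_n]]


# Hypothetical toy corpus, already split into tokens.
corpus = [
    "the archive holds letters about the harvest".split(),
    "letters about the harvest and the weather".split(),
    "a treatise on naval warfare".split(),
]
vectors = tfidf_vectors(corpus)
ranking = most_similar(0, vectors, top_n=2)
```

Here the second document ranks above the third for the first document, because it shares surface vocabulary with it. That is the sense in which this mode is context-based rather than concept-based: it matches on the words used, not on a learned abstract representation.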

Machine-supported research in humanities

Research in the humanities has benefited from the digitization of text corpora and the development of computer-based text analysis tools. However, the interfaces that current systems offer are incompatible with scholarly close reading of texts, a proven method that is key in many research scenarios pursuing complex research questions.

In practice, it is often restrictive and difficult, if not impossible, to formulate adequate selection criteria within a keyword-based search, which is the standard entry point to digitized text collections. This holds in particular for more complex or abstract concepts.

Querying by example - close reading with tailored suggestions

evidence provides an alternative, intuitive entry point into collections by leveraging the doc2vec framework. Using doc2vec, evidence learns abstract representations of the theme and content of the elements of the user's corpus. Instead of trying to translate the scientific query into keywords, the user compiles a set of relevant elements as starting points, i.e. examples of the concept of interest, and queries the corpus based on these examples. Specifically, evidence retrieves elements with similar abstract representations and presents them to the user, using the user's feedback to refine its retrieval. This concept-based query mode is complemented by additional context-based retrieval using the more-like-this function provided by Elasticsearch. Together, these modes let a user combine the power of a close-reading approach with that of a large digitized corpus: evidence selects elements from the entire corpus that are likely to be of interest, but leaves the decision as to what evidence is useful up to the user.
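
The query-by-example loop described above can be sketched as follows. This is a simplified illustration, not the actual evidence implementation: the page does not specify how feedback is incorporated, so the sketch swaps in a Rocchio-style update (averaging accepted examples and subtracting a fraction of the rejected ones). The three-dimensional vectors stand in for real doc2vec embeddings, and all names (`suggest`, `penalty`, the element ids) are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest(doc_vectors, positive_ids, negative_ids=(), top_n=3, penalty=0.5):
    """Rank elements the user has not yet judged by similarity to the examples.

    Accepted examples define the query as their centroid; rejected examples
    push the query away from their own centroid (Rocchio-style, an assumption).
    """
    query = centroid([doc_vectors[i] for i in positive_ids])
    if negative_ids:
        rejected = centroid([doc_vectors[i] for i in negative_ids])
        query = [q - penalty * r for q, r in zip(query, rejected)]
    seen = set(positive_ids) | set(negative_ids)
    scored = [
        (cosine(query, vec), doc_id)
        for doc_id, vec in doc_vectors.items()
        if doc_id not in seen
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in scored[:top_n]]


# Hypothetical embeddings; real doc2vec vectors have many more dimensions.
doc_vectors = {
    "a": [0.9, 0.1, 0.0],
    "b": [0.8, 0.2, 0.1],
    "c": [0.7, 0.0, 0.6],
    "d": [0.0, 0.1, 0.9],
    "e": [0.85, 0.15, 0.0],
}

# Round 1: the user starts from a single example element.
first_round = suggest(doc_vectors, positive_ids=["a"], top_n=2)

# Round 2: the user accepted "e" and rejected "c"; the query is refined.
second_round = suggest(
    doc_vectors, positive_ids=["a", "e"], negative_ids=["c"], top_n=2
)
```

In each round the user only judges the suggested elements. The ranking narrows the corpus down to likely candidates, while the decision about what counts as relevant stays with the reader.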

Programming Language
  • Go
  • TypeScript
  • Python
License
  • GPL-3.0
Source code

Participating organizations

Contributors

  • Meiert Grootes
    Netherlands eScience Center
  • Willem van Hage
    Netherlands eScience Center
  • Lars Buitinck
    KNAW Humanities Cluster
  • Hayco de Jong
    KNAW Humanities Cluster
  • Bas Leenknegt
    KNAW Humanities Cluster
  • Faruk Diblen
    Netherlands eScience Center
  • Christiaan Meijer
    Netherlands eScience Center
  • Jurriaan H. Spaaks
    Netherlands eScience Center
  • Stefan Verhoeven
    Netherlands eScience Center
Contact person
Meiert Grootes
Netherlands eScience Center