This demonstration shows how contextualized document vectors can be used to retrieve information from a large healthcare dataset, such as the 11,000+ documents on COVID-19 released by Semantic Scholar (2020). The model is trained on Wikipedia data; see our WWW 2020 paper and GitHub repository for more details on the implementation.


How to use

    Search: Enter the name of a disease and, optionally, a specific aspect in the query field. The system retrieves the top 25 passages from the dataset that answer your query.

    Highlight: Browse through the dataset and analyze how relevant each sentence in a document is to your query. The shade of blue indicates the sentence's relevance score.
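
The two features above can be illustrated with a minimal sketch. It is not the demo's implementation: it assumes that the query and each passage are already encoded as vectors, that relevance is measured by cosine similarity, and that a higher score maps to a more opaque blue. The function names (cosine, top_k, blue_shade) and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, passage_vecs, k=25):
    """Return (index, score) pairs for the k passages most similar to the query."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def blue_shade(score, lo=0.0, hi=1.0):
    """Map a relevance score to a CSS rgba() blue (assumed: higher score, more opaque)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    alpha = min(1.0, max(0.0, (score - lo) / (hi - lo)))
    return f"rgba(0, 0, 255, {alpha:.2f})"


# Toy 3-dimensional vectors standing in for real model output (assumption).
query = [1.0, 0.0, 0.0]
passages = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

ranking = top_k(query, passages, k=2)
for index, score in ranking:
    print(index, round(score, 3), blue_shade(score))
```

In this sketch the same similarity score drives both the ranking and the highlighting; a real system would likely use an approximate nearest-neighbor index rather than scoring every passage.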

Dataset used in this demo: 11.1K articles from PMC Open Access, licensed under CC BY-NC-SA.

Access a different dataset:


  1. Wikipedia (Encyclopedia articles about diseases)
  2. CORD-19 (COVID-19 Open Research Dataset)