BioASQ-QA: A manually curated corpus for biomedical question answering


ABSTRACT The BioASQ question answering (QA) benchmark dataset contains questions in English, along with golden standard (reference) answers and related material. The dataset has been


designed to reflect real information needs of biomedical experts and is therefore more realistic and challenging than most existing datasets. Furthermore, unlike most previous QA benchmarks


that contain only exact answers, the BioASQ-QA dataset also includes ideal answers (in effect summaries), which are particularly useful for research on multi-document summarization. The


dataset combines structured and unstructured data. The materials linked with each question comprise documents and snippets, which are useful for Information Retrieval and Passage Retrieval


experiments, as well as concepts that are useful in concept-to-text Natural Language Generation. Researchers working on paraphrasing and textual entailment can also measure the degree to


which their methods improve the performance of biomedical QA systems. Last but not least, the dataset is continuously extended, as the BioASQ challenge is running and new data are generated.




BACKGROUND & SUMMARY More than 2 articles are published in biomedical journals every minute, leading to MEDLINE/PubMed1 currently comprising more than 32


million articles, while the number and size of non-textual biomedical data sources also increase rapidly. As an example, since the outbreak of the COVID-19 pandemic, there has been an


explosion of new scientific literature about the disease and the virus that causes it, with about 10,000 new COVID-19 related articles added each month2. This wealth of new knowledge plays a


central role in the progress achieved in biomedicine and its impact on public health, but it is also overwhelming for the biomedical expert. Ensuring that this knowledge is used for the


benefit of the patients in a timely manner is a demanding task. BioASQ3 (Biomedical Semantic Indexing and Question Answering) pushes research towards highly precise biomedical information


access systems through a series of evaluation campaigns, in which systems from teams around the world compete. BioASQ campaigns have run annually since 2012, providing data, open-source software


and a stable evaluation environment for the participating systems. In the last ten years that the challenge has been running, around 100 different universities and companies, from all


continents, have participated in BioASQ, providing a competitive but also synergistic ecosystem. The fact that the participants of the BioASQ challenges are all working on the same benchmark


data significantly facilitates the exchange and fusion of ideas and eventually accelerates progress in the field. The ultimate goal is to lead biomedical information access systems to the


maturity and reliability required by biomedical researchers. BioASQ comprises two main tasks. In Task A systems are asked to automatically assign Medical Subject Headings (MeSH)4 terms to


biomedical articles, thus assisting the indexing of biomedical literature. Task B focuses on obtaining precise and comprehensible answers to biomedical research questions. The systems that


participate in Task B are given English questions that are written by biomedical experts and reflect real-life information needs. For each question, the systems are required to return


relevant articles, snippets of the articles, concepts from designated ontologies, RDF triples from Linked Life Data5, an ‘exact’ answer (e.g., a disease or symptom), and a paragraph-sized


summary answer. Hence, this task combines traditional information retrieval with question answering from text and structured data, as well as multi-document text summarization. One of the


main tangible outcomes of BioASQ is its benchmark datasets. The BioASQ-QA dataset, generated for Task B, contains questions in English, along with golden standard (reference) answers


and supporting material. The BioASQ data are more realistic and challenging than most existing datasets for biomedical expert question answering6,7. In order to achieve this, BioASQ employs


a team of trained experts, who annually provide a set of around 500 questions from their specialized fields of expertise. Figure 1 depicts the lifecycle of the BioASQ dataset creation, which


is presented in detail in the following sections. Using this process, a set of 4721 questions and answers has been generated so far, constituting a unique resource for the development of


QA systems. METHODS THE BIOASQ INFRASTRUCTURE AND ECOSYSTEM Figure 2 summarises the main components of the BioASQ infrastructure, as well as key stakeholders in the related ecosystem. The


BioASQ infrastructure includes tools for annotating data, tools for assessing the results of participating systems, benchmark repositories, evaluation services, etc. The infrastructure


allows challenge participants to access training and test data, submit their results and be informed about the performance of their systems, in comparison to other systems. The BioASQ


infrastructure is also used by the experts during the creation of the benchmark datasets and helps improve the quality of the data. In the following subsections, the different components of


the BioASQ ecosystem are described. EXPERT TEAM As the goal of BioASQ is to reflect real information needs of biomedical experts, their involvement was necessary in the creation of the


dataset. The biomedical expert team of BioASQ was first established in 2012, but has changed through the years. Several experts were considered at that time, from a variety of institutions


across Europe. The final selection of the experts was based on the need to cover the broad biomedical scientific field, representing, as much as possible, medicine, biosciences and


bioinformatics. The members of the biomedical team hold positions in universities, hospitals or research institutes in Europe. Their primary research interests include: cardiovascular


endocrinology, psychiatry, psychophysiology, pharmacology, drug repositioning, cardiac remodeling, cardiovascular pharmacology, computational genomics, pharmacogenomics, comparative


genomics, molecular evolution, proteomics, mass spectrometry, protein evolution, clinical information retrieval from electronic health records, and clinical practice guidelines. In total, 21


experts have contributed to the creation of the dataset, 7 of whom have been involved most actively. The main job of the biomedical expert team is the creation of the QA benchmark dataset,


using an annotation tool provided by BioASQ. With the use of the tool, the experts can enter their questions and retrieve relevant documents and snippets from MEDLINE. Additionally, the


biomedical expert team assesses the responses of the participating systems. In addition to scoring the systems’ answers, during this process the experts have the opportunity to enrich and


modify the gold material that they have provided, thus improving the quality of the benchmark dataset. Regular physical and virtual meetings are organised with the experts. Partly, these


meetings aim to train the new members of the team and to inform the existing ones about changes that have been made. In particular, the goals of the training sessions are as follows: *


Familiarization with the annotation and assessment tools used during the formulation and assessment of biomedical questions respectively. This step also involves familiarization of the


experts with the specific types of questions used in the challenge, i.e., factoid, yes/no, list and summary questions. At the same time, the experts provide feedback and help shape the


BioASQ tools. * Familiarization with the resources used in BioASQ, both MEDLINE and various structured sources. The aim is to help the experts understand the data provided by these sources,


in response to different questions they may formulate. * Resolution of issues that come up during the question composition and assessment tasks. This is a continuous process that extends


beyond the training sessions. Continuous support is provided to the experts, while the experts can also interact with each other and provide feedback on the data being created. DATA


SELECTION The QA benchmark is based primarily on documents indexed for _MEDLINE_. In addition, a wide range of biomedical concepts are drawn from ontologies and linked data that describe


different facets of the domain. The selected resources follow the commonly used _drug-target-disease triangle_, which defines the prime information axes for medical investigations. The main


principle is shown in Figure 3. This “_knowledge-triangle_” supports the conceptual linking of biomedical knowledge databases and related resources. Based on this, systems can address


questions by linking them to relevant ontology concepts. In this context, the following resources have been selected for BioASQ. DRUGS: JOCHEM8, the Joint Chemical


Dictionary, is a dictionary for the identification of small molecules and drugs in text, combining information from UMLS, MeSH, ChEBI, DrugBank, KEGG, HMDB, and ChemIDplus. Given the variety


and size of the different resources combined in it, Jochem is currently one of the largest biomedical resources for drugs and chemicals. TARGETS: GENE ONTOLOGY (GO)9,10 is currently the


most successful case of ontology use in bioinformatics and provides a controlled vocabulary to describe functional aspects of gene products. The ontology covers three domains: cellular


component, molecular function, and biological process. UNIVERSAL PROTEIN RESOURCE (UniProt11) provides a comprehensive, high-quality and freely accessible resource of protein sequence and


functional information. Its protein knowledge base consists of two sections: SwissProt, which is manually annotated and reviewed, and contains more than 500 thousand sequences, and TrEMBL,


which is automatically annotated and is not reviewed, and contains a few million sequences. In BioASQ the SwissProt component of UniProt is used. DISEASES: DISEASE ONTOLOGY (DO)12 contains


data associating genes with human diseases, using established disease codes and terminologies. Approximately 8,000 inherited, developmental and acquired human diseases are included in the


resource. The DO semantically integrates disease and medical vocabularies through extensive cross-mapping and integration of MeSH, ICD, NCI’s thesaurus, SNOMED CT and OMIM disease-specific


terms and identifiers. DOCUMENT SOURCES: The main source of biomedical literature is NLM’s MEDLINE, which is accessible through PubMed and PubMed Central. PubMed indexes over 34 million


citations, while PubMed Central (PMC) provides free access to approximately 8.5 million full-text biomedical and life-science articles. THE MEDICAL SUBJECT HEADINGS HIERARCHY (MeSH) is a


hierarchy of terms maintained by the US National Library of Medicine (NLM) and its purpose is to provide headings (terms), which can be used to index scientific publications in the life


sciences, e.g., journal articles, books, and articles in conference proceedings. The indexed publications may be searched through popular search engines, such as PubMed, using the MeSH


headings to semantically filter the results. This retrieval methodology seems to be beneficial in some cases, especially when precision of the retrieved results is important13. The primary


MeSH terms (called _descriptors_) are organized into 16 trees and number approximately 30,200. MeSH is the main resource used by PubMed to index the biomedical scientific bibliography in


MEDLINE. LINKED DATA: During the first few years of BioASQ, the Linked Life Data platform was used to identify subject-verb-object triples related to questions. Linked Life Data is a data


warehouse that syndicates large volumes of heterogeneous biomedical knowledge in a common data model. It contains more than 10 billion statements. The statements are extracted from 25


biomedical resources, such as PubMed, UMLS, DrugBank, Diseasome, and Gene Ontology. This resource has been abandoned in recent editions of BioASQ, due to issues with the triple selection


process. QUESTION FORMULATION The members of the biomedical expert team formulate English questions, reflecting real-life information needs encountered during their work (e.g., in diagnostic


research). Figure 4 provides an overview of the most frequent topics covered in the questions generated so far by the experts. Each question is independent of all other questions and is


associated with an answer and other supportive information, as explained below. In addition to the training sessions mentioned above, guidelines are provided to the BioASQ experts to help


them create the questions, reference answers, and other supportive information14. The guidelines cover the number and types of questions to be created by the experts, the information sources


the experts should consider and how to use them, the types and sizes of the answers, additional supportive information the experts should provide, etc. The experts use the BioASQ annotation


tool for this process, which is accessible through a Web interface. The annotation tool provides the necessary functionality to create questions and select relevant information. The


annotation tool is designed to be easy to use, adopting a simple five-step paradigm: authenticate, search, select, annotate and store. The authentication ensures that each question created


by a certain expert can be attributed to that expert. The annotation process comprises the following steps: STEP 1: QUESTION FORMULATION The experts formulate an English stand-alone


question, reflecting their information needs. Questions may belong to one of the following four categories: YES/NO QUESTIONS: These are questions that, strictly speaking, require either a


“yes” or a “no” as an answer, though of course in practice a longer answer providing additional information is useful. For example, “_Do CpG islands colocalise with transcription start


sites?_” is a yes/no question. FACTOID QUESTIONS: These are questions that require a particular entity (e.g., a disease, drug, or gene) as an answer, though again a longer answer is useful.


For example, “_Which virus is best known as the cause of infectious mononucleosis?_” is a factoid question. LIST QUESTIONS: These are questions that require a _list_ of entities (e.g., a


list of genes) as an answer; again, in practice additional supportive information is desirable. For example, “_Which are the Raf kinase inhibitors?_” is a list question. SUMMARY QUESTIONS:


These are questions that do not belong in any of the previous categories and can only be answered by producing a short text summarizing the most prominent relevant information. For example,


“_How does dabigatran therapy affect aPTT in patients with atrial fibrillation?_” is a summary question. When formulating summary questions, the experts aim at questions that they can


answer in a satisfactory manner with a one-paragraph summary, intended to be read by other experts of the same field. In all four categories, the experts aim at questions for which a limited


number of articles (min. 10, max. 60) are retrieved through PubMed queries. Questions that are controversial or that have no clear answer in the literature are avoided. Moreover, all


questions are related to the biomedical domain. For example, in the case of the following two questions: _Q_1: _Which are the differences between Hidden Markov Models (HMMs) and Artificial


Neural Networks (ANNs)_? _Q_2: _Which are the uses of Hidden Markov Models (HMMs) in gene prediction?_ Although HMMs and ANNs are used in the biomedical domain, _Q_1 is not suitable for the


needs of BioASQ, since there is no direct indication that it is related to the biomedical domain. On the other hand, _Q_2 links to “gene prediction” and is appropriate. STEP 2: RELEVANT


CONCEPTS A set of terms that are relevant to each question is selected. The set of relevant terms may include terms that are already mentioned in the question, but it may also include


synonyms of the question terms, closely related broader and narrower terms etc. For the question “_Do CpG islands colocalise with transcription start sites?_”, the set of relevant terms


would most probably include the question terms “_CpG Island_” and “_transcription start site_”, but possibly also other terms, like the synonym “_Transcription Initiation Site_”. STEP 3:


INFORMATION RETRIEVAL Using the selected terms, the BioASQ annotation tool allows the experts to issue queries and retrieve relevant articles through PubMed. More than one query may be


associated with each question and each query can be enriched with the advanced search tags of PubMed. The search window (Figure 5) allows selecting information that is necessary to answer


the question. One of the main strengths of the annotation tool is that it implements interfaces to data sources of different types, i.e., unstructured, semi-structured or structured.


Given that we cannot expect domain experts to be familiar with Semantic Web standards, such as RDF, the annotation tool also implements an innovative natural language generation method that


converts RDF into natural language. The iterative improvement of the annotation tool has led to a framework that is widely accepted by the BioASQ biomedical expert team. Interestingly, a


study of the queries used by different experts to answer the same questions made clear that indeed “many roads lead to Rome”, i.e. different experts will use different queries for the same


question. Returning to the example question “_Do CpG islands colocalise with transcription start sites?_” a query may be “_CpG Island_” _AND_ “_transcription start site_”. Some of the


articles retrieved by this query are shown in Table 1.
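For illustration only, the sketch below shows how a query like the one above can be run programmatically against PubMed, using NCBI's public E-utilities service rather than the BioASQ annotation tool itself; the query string, the result-size limit (chosen to match the max. 60 articles mentioned above) and the helper function are assumptions made for this example.

```python
# Illustrative only: run a Step-3-style Boolean query against PubMed via NCBI E-utilities.
# This is not the BioASQ annotation tool; it merely mimics the kind of query an expert issues.
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search(query: str, max_results: int = 60) -> list[str]:
    """Return PubMed IDs for a Boolean query such as the CpG-island example."""
    params = urllib.parse.urlencode({
        "db": "pubmed",
        "term": query,          # advanced field tags (e.g. [Title/Abstract]) can be added here
        "retmax": max_results,
        "retmode": "json",
    })
    with urllib.request.urlopen(f"{EUTILS}?{params}") as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]

if __name__ == "__main__":
    pmids = pubmed_search('"CpG island" AND "transcription start site"')
    print(f"{len(pmids)} articles retrieved, e.g. PMIDs: {pmids[:5]}")
```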


STEP 4: SELECTION OF ARTICLES Based on the results of Step 3, the experts select a set of articles that are sufficient for answering the question. Using the annotation tool, they choose, among the retrieved articles, those that contain information relevant to forming an answer. STEP 5: TEXT SNIPPET EXTRACTION Using


the articles selected in Step 4, the experts mark _every_ relevant text snippet (piece of text). Snippets can be easily extracted using the annotation tool


(Figure 6) and may answer the question either fully or partially. A text snippet should contain one or more entire and consecutive sentences. If there are multiple snippets that provide the


same (or almost the same) information (in the same or in different articles), _all_ of them are selected. Examples of relevant snippets are shown in Table 2. STEP 6: QUERY REVISION If the


expert judges that the articles and snippets gathered during steps 2 to 5 are insufficient for answering the question, the process can be repeated. The articles that the expert has already


selected can be saved before performing a new search, along with the snippets the expert has already extracted. The query can be revised several times, until the expert feels that the


gathered information is sufficient to answer the question. In the end, if the expert judges that the question still cannot be answered adequately, it is discarded. STEP 7: EXACT


ANSWER In steps 2 to 6, the expert identifies relevant material for answering the question. Given this material, the next step is to formulate the actual answer. For a yes/no question, the


exact answer is simply “yes” or “no”. For a factoid question, the exact answer is the name of the entity (e.g., gene, disease) sought by the question; if the entity has several synonyms, the


expert provides, to the extent possible, all of its synonyms. For a list question, the exact answer is a list containing the entities sought by the question; if a member of the list has


several synonyms, the expert provides again as many of the synonyms as possible. For a summary question, the exact answer is left blank. The exact answers of yes/no, factoid, and list


questions should be based on the information in the text snippets that the expert has selected, rather than on personal experience.
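The shape of the exact answer therefore differs per question type. The following minimal sketch illustrates the four cases described above; the field names ("type", "body", "exact_answer") follow the publicly released BioASQ JSON files, Table 5 remains the authoritative reference, and the concrete answers and synonyms shown are illustrative only.

```python
# Illustrative representations of exact answers per question type (Step 7).
yes_no_question = {
    "type": "yesno",
    "body": "Do CpG islands colocalise with transcription start sites?",
    "exact_answer": "yes",                       # simply "yes" or "no"
}
factoid_question = {
    "type": "factoid",
    "body": "Which virus is best known as the cause of infectious mononucleosis?",
    # the entity sought by the question, with as many synonyms as possible
    "exact_answer": ["Epstein-Barr virus", "EBV", "Human herpesvirus 4"],
}
list_question = {
    "type": "list",
    "body": "Which are the Raf kinase inhibitors?",
    # one inner list per entity, again with synonyms where available
    "exact_answer": [["sorafenib", "BAY 43-9006"], ["vemurafenib", "PLX4032"]],
}
summary_question = {
    "type": "summary",
    "body": "How does dabigatran therapy affect aPTT in patients with atrial fibrillation?",
    # summary questions have no exact answer; only an ideal answer is provided (Step 8)
}

for q in (yes_no_question, factoid_question, list_question, summary_question):
    print(q["type"], "->", q.get("exact_answer", "(ideal answer only)"))
```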


STEP 8: IDEAL ANSWER At this final step, the expert formulates what we call an _ideal answer_ for the question. The ideal answer should be a one-paragraph text that answers the question in a manner that the expert finds satisfactory. The


ideal answer should be written in English, and it should be intended to be read by other experts of the same field. For the example question “_Do CpG islands colocalise with transcription


start sites?_”, an ideal answer might be the one shown in Table 3. Again, the ideal answer should be based on the information of the text snippets that the expert has selected, rather than


personal experience. The experts, however, are allowed to (and should) rephrase or shorten the snippets, reorder or combine them, etc., in order to make the ideal answer more concise and easier to


read. Notice that in the example above, the ideal answer provides additional information supporting the exact answer. If the expert feels that the exact answer of a yes/no, factoid, or list


question is sufficient and no additional information needs to be reported, the ideal answer can be the same as the exact answer. For summary questions, an ideal answer must always be


provided. Figure 7 presents the distribution of questions created in each year of the challenge. Over the years, there has been an increase in the number of factoid questions and a decrease in the


number of list questions. A possible reason is that it is more difficult to find the material (i.e., articles and snippets) sufficient for answering a list question than a factoid


question. Table 4 presents the different versions of the BioASQ-QA dataset, including the number of questions and the average number of documents and snippets. Each version of the training


dataset enriches the previous version with the new questions created in the respective year. During the years that BioASQ has been running, three significant changes have taken place, in


response to feedback obtained by the experts and the challenge participants: * Since BioASQ 3 (2015), the focus of the experts is only on relevant articles and their contents. In other


words, the experts do not provide relevant concepts or statements, as this was found cumbersome and led to questionable results. Nevertheless, concepts are included in the gold dataset, as


they are added by the systems and assessed by the experts during the assessment phase. * Since BioASQ 4 (2016), the experts are asked to provide only a sufficient set of articles that allow the answer to be found with


confidence. This is in contrast to earlier years, when the experts were asked to identify all relevant articles, something that proved to be unrealistic.


Again, if the participating systems retrieve additional relevant documents, not identified by the experts in the annotation phase, these are added to the gold dataset during the assessment phase.


* In early versions of the challenge, we considered using full-text articles from PubMed Central (PMC). Given the small percentage of the overall literature that appears in PMC, since


BioASQ 4 (2016) we decided to restrict the challenge to article abstracts only. ASSESSMENT Following each round of the challenge, the answers of the participating systems are collected and


assessed. Exact answers can be assessed automatically against the golden answers provided by the experts during the annotation phase. However, the ‘ideal’ answers are assessed manually by


the experts. In fact, each expert assesses the answers to the questions they have created, in terms of _information recall_ (does the ‘ideal’ answer report all the necessary


information?), _information precision_ (does the answer contain only relevant information?), _information repetition_ (does the ‘ideal’ answer avoid repeating the same information multiple


times? e.g., when sentences of the ‘ideal’ answer that have been extracted from different articles convey the same information), and _readability_ (is the ‘ideal’ answer easily readable and


fluent?). A 1 to 5 scale is used for all four criteria (1 for ‘very poor’, 5 for ‘excellent’).
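The official evaluation measures are defined in the BioASQ overview papers16,17,18; purely as an illustration of how an exact answer can be checked automatically against the golden one, while ideal answers are scored manually on the four criteria above, consider the following sketch. The normalization and matching rules are assumptions made for this example, not the official evaluation code.

```python
# Illustrative automatic comparison of exact answers against the golden ones,
# plus a record of the manual 1-5 scores given to an ideal answer.

def normalize(text: str) -> str:
    """Lower-case and collapse whitespace before comparing (an assumed rule)."""
    return " ".join(text.strip().lower().split())

def yesno_correct(system_answer: str, golden_answer: str) -> bool:
    return normalize(system_answer) == normalize(golden_answer)

def factoid_correct(system_answer: str, golden_synonyms: list[str]) -> bool:
    # A factoid answer counts as correct if it matches any golden synonym.
    return normalize(system_answer) in {normalize(s) for s in golden_synonyms}

print(yesno_correct("Yes", "yes"))                             # True
print(factoid_correct("EBV", ["Epstein-Barr virus", "EBV"]))   # True

# Ideal answers are instead scored manually on the four criteria described above.
manual_assessment = {"recall": 5, "precision": 4, "repetition": 5, "readability": 4}
print(sum(manual_assessment.values()) / len(manual_assessment))  # mean manual score
```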


The assessment tool is designed to be a companion to the annotation tool and is implemented by reusing most of its functionality. The tool can also be used to perform an inter-annotator agreement study. In that case, domain experts are provided with answers generated by other


(anonymous) domain experts and are asked to evaluate them. The design of the interface is such that users can see the answers/annotations only for the questions that they are asked to


review (Figure 8). Moreover, the interface can adapt to different question types, by showing different answering fields for each of them. Finally, all information sources that were used to


answer the question can also be reviewed. By these means, domain experts can perform an informed assessment. The assessment tool plays a key role in the creation of the benchmark and the


quality assurance of the results generated by the experts during the BioASQ challenges. Moreover, the assessment tool allows the experts to improve their own gold answers and associated


material, based on the answers provided by the systems. In particular, the experts revise the documents and snippets returned by the systems, and enrich the gold answers with material


identified by the systems. This leads to an improvement of the benchmark datasets that are made publicly available. DATA RECORDS The dataset is available at Zenodo15 and follows the _JSON_ format.


Specifically, it contains an array of questions, where each question (represented as an object in the _JSON_ format) is constructed as shown in Table 5.
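As a minimal usage sketch, the snippet below loads such a JSON file and inspects one question object. The file name is hypothetical, and the field names follow the publicly released BioASQ files; Table 5 is the authoritative description of each field.

```python
# Minimal sketch for loading a BioASQ-QA training file downloaded from Zenodo.
import json
from collections import Counter

with open("training11b.json", encoding="utf-8") as f:   # hypothetical file name
    data = json.load(f)

questions = data["questions"]                            # array of question objects
print(f"{len(questions)} questions")
print(Counter(q["type"] for q in questions))              # yesno / factoid / list / summary

sample = questions[0]
print(sample["body"])                  # the question itself
print(len(sample["documents"]))        # golden relevant articles (PubMed URLs)
print(len(sample["snippets"]))         # golden relevant snippets (text + source article)
print(sample.get("ideal_answer"))      # paragraph-sized reference answer
print(sample.get("exact_answer"))      # absent for summary questions
```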


TECHNICAL VALIDATION IMPROVING THE STATE-OF-THE-ART PERFORMANCE The participation of strong research teams in the BioASQ challenge has helped to measure objectively the state-of-the-art performance in biomedical question


answering16,17,18. During BioASQ this performance has improved (Figures 10, 11, 12). It is particularly encouraging that the BioASQ biomedical expert team have assessed the ideal answers


provided by the participating systems as being of very good quality. The average manual scores are above 80% (above 4 out of 5 in Fig. 12). Still, there is much room for improvement and the


future challenges of BioASQ, as well as the benchmark datasets that it provides, will hopefully push further in that direction. Based on the evaluation of the participating systems by


the experts, one very interesting result is that humans seem satisfied with “imperfect” system responses. In other words, they are satisfied if the systems provide the information and answer


needed, even if it is not perfectly formed. INTER-ANNOTATOR AGREEMENT An inter-annotator agreement evaluation has been conducted in order to assess the consistency of the dataset. To


this end, during the first six years, a small subset of the created questions was given to other experts, in order to compare the answers formulated by different experts. The pairs of experts answer


the exact same questions. As each of them uses their own queries, they get a different list of possibly relevant documents. This leads to the selection of different documents and


snippets to answer the questions, which leads to a low mean F1 score (<50%) (Figure 9a). Nevertheless, in the formulated ideal answers, there is high agreement between the experts (Figure 9b). In other words, they reach the same or very similar answers, but by following different paths.
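As an illustration of the comparison behind this F1 score, the sketch below computes the overlap between the document sets selected by two experts for the same question; the helper function and the PubMed IDs are assumptions for demonstration, not the official evaluation code.

```python
# Sketch of the document-overlap comparison behind the inter-annotator F1:
# one expert's selected articles are treated as the reference, the other's as the prediction.

def overlap_f1(docs_a: set[str], docs_b: set[str]) -> float:
    """F1 of the overlap between two experts' selected PubMed IDs."""
    if not docs_a or not docs_b:
        return 0.0
    common = len(docs_a & docs_b)
    if common == 0:
        return 0.0
    precision = common / len(docs_b)
    recall = common / len(docs_a)
    return 2 * precision * recall / (precision + recall)

# Hypothetical PMIDs: different queries lead to only partially overlapping selections.
expert_1 = {"10713087", "18950698", "21295040"}
expert_2 = {"18950698", "21295040", "23032602", "19698106"}
print(f"F1 = {overlap_f1(expert_1, expert_2):.2f}")
```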


Another important point is that the BioASQ challenge, as well as the environment in which it


takes place, evolves. One consequence of this is the changes that we had to make in the data generation process in response to feedback from the experts and the participants. Additionally,


the evolution of vocabularies and databases causes complications. For example, each year’s data are annotated with the current version of the MeSH hierarchy, which is updated annually. In


addition, only the articles available in the year of annotation are used for formulating the answers, while articles that appear later may also be relevant. These are issues


that we need to handle and adapt to, in order to keep the challenge realistic and useful and the dataset relevant. The BioASQ challenge will continue to run in the coming years, and the dataset


will be further enriched with new interesting questions and answers. USAGE NOTES Up-to-date guidelines and usage examples pertaining to the dataset can be found at:


http://participants-area.bioasq.org/ CODE AVAILABILITY BioASQ has created a lively ecosystem, supported by tools and systems that facilitate the creation of the benchmarks. All software is


provided with open-source licenses (https://github.com/BioASQ). In addition, the data produced are open to the public15. REFERENCES
* National Library of Medicine. _MEDLINE PubMed production statistics._ https://www.nlm.nih.gov/bsd/pmresources.html (2022).
* Chen, Q., Allot, A. & Lu, Z. LitCovid: an open database of COVID-19 literature. _Nucleic Acids Research_ 49, D1534–D1540, https://doi.org/10.1093/nar/gkaa952 (2020).
* Nentidis, A., Krithara, A. & Paliouras, G. _BioASQ website._ www.BioASQ.org (2022).
* National Library of Medicine. _The Medical Subject Headings (MeSH) thesaurus._ https://www.nlm.nih.gov/mesh/meshhome.html (2022).
* Linked Life Data (LLD). http://linkedlifedata.com/ (2012).
* Wasim, M., Mahmood, D. W. & Khan, D. U. G. A survey of datasets for biomedical question answering systems. _International Journal of Advanced Computer Science and Applications_ 8, https://doi.org/10.14569/IJACSA.2017.080767 (2017).
* Jin, Q. _et al_. Biomedical question answering: A survey of approaches and challenges. _ACM Comput. Surv._ 55, https://doi.org/10.1145/3490238 (2022).
* Hettne, K. M. _et al_. A dictionary to identify small molecules and drugs in free text. _Bioinformatics_ 25, 2983–2991, https://doi.org/10.1093/bioinformatics/btp535 (2009).
* Ashburner, M. _et al_. Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium. _Nat Genet_ 25, 25–29, https://doi.org/10.1038/75556 (2000).
* The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. _Nucleic Acids Research_ 47, D330–D338, https://doi.org/10.1093/nar/gky1055 (2018).
* The UniProt Consortium. UniProt: the Universal Protein Knowledgebase. _Nucleic Acids Research_ 51, D523–D531, https://doi.org/10.1093/nar/gkac1052 (2022).
* Schriml, L. M. _et al_. Human Disease Ontology 2018 update: classification, content and workflow expansion. _Nucleic Acids Research_ 47, D955–D962, https://doi.org/10.1093/nar/gky1032 (2018).
* Doms, A. & Schroeder, M. GoPubMed: exploring PubMed with the Gene Ontology. _Nucleic Acids Research_ 33, W783–W786, https://doi.org/10.1093/nar/gki470 (2005).
* Malakasiotis, P., Androutsopoulos, I., Almirantis, Y., Polychronopoulos, D. & Pavlopoulos, I. _Tutorials and guidelines 2._ http://www.bioasq.org/sites/default/files/PublicDocuments/BioASQ_D3.7-TutorialsGuidelines2ndVersion_final_0.pdf (2013).
* Krithara, A., Nentidis, A., Bougiatiotis, K. & Paliouras, G. BioASQ-QA: A manually curated corpus for biomedical question answering. _Zenodo_ https://doi.org/10.5281/zenodo.7655130 (2023).
* Nentidis, A. _et al_. Overview of BioASQ 2020: The eighth BioASQ challenge on large-scale biomedical semantic indexing and question answering. In _11th International Conference of the CLEF Association_, vol. 12260 of _Lecture Notes in Computer Science_, 194–214, https://doi.org/10.1007/978-3-030-58219-7_16 (2020).
* Nentidis, A. _et al_. Overview of BioASQ 2021: The ninth BioASQ challenge on large-scale biomedical semantic indexing and question answering. In _12th International Conference of the CLEF Association_, vol. 12880 of _Lecture Notes in Computer Science_, 239–263, https://doi.org/10.1007/978-3-030-85251-1_18 (2021).
* Nentidis, A. _et al_. Overview of BioASQ 2022: The tenth BioASQ challenge on large-scale biomedical semantic indexing and question answering. In _13th International Conference of the CLEF Association_, vol. 13390 of _Lecture Notes in Computer Science_, 337–361, https://doi.org/10.1007/978-3-031-13643-6_22 (2022).


ACKNOWLEDGEMENTS Google was a proud sponsor of the BioASQ Challenge in 2020, 2021, and 2022. BioASQ was also sponsored by Atypon Systems inc. and VISEO. BioASQ is


grateful to the biomedical experts, who have created and manually curated the dataset, as well as to the participants during all these years. Also, BioASQ is grateful to NIH/NLM, which has


supported the project with two conference grants (grants 5R13LM012214-02 and 5R13LM012214-03). For the first two years, BioASQ received funding from the European Commission’s Seventh


Framework Programme (FP7/2007-2013, ICT-2011.4.4(d), Intelligent Information Management, Targeted Competition Framework) under grant agreement n. 318652. Last but not least, BioASQ is grateful


to all past collaborators, namely the University of Houston (US), Transinsight GmbH (DE), Universite Joseph Fourier (FR), University Leipzig (DE), Universite Pierre et Marie Curie Paris 6


(FR), Athens University of Economics and Business – Research Centre (GR). AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Institute of Informatics and Telecommunications, National Center for


Scientific Research “Demokritos”, Athens, Greece Anastasia Krithara, Anastasios Nentidis, Konstantinos Bougiatiotis & Georgios Paliouras * School of Informatics, Aristotle University of


Thessaloniki, Thessaloniki, Greece Anastasios Nentidis AUTHORS * Anastasia Krithara * Anastasios Nentidis * Konstantinos Bougiatiotis * Georgios Paliouras CONTRIBUTIONS G.P. and A.K. originated the BioASQ challenge and dataset


creation. All authors participated in the collection of the data, the development of the annotation and the evaluation tools, and validated the data. A.K. and A.N. drafted the manuscript.


All authors reviewed the manuscript. CORRESPONDING AUTHOR Correspondence to Anastasia Krithara. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL


INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. RIGHTS AND PERMISSIONS OPEN ACCESS This


article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as


you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party


material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s


Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. ABOUT THIS ARTICLE CITE THIS ARTICLE Krithara, A., Nentidis, A., Bougiatiotis, K.


_et al._ BioASQ-QA: A manually curated corpus for Biomedical Question Answering. _Sci Data_ 10, 170 (2023). https://doi.org/10.1038/s41597-023-02068-4 * Received: 19


December 2022 * Accepted: 13 March 2023 * Published: 27 March 2023 * DOI: https://doi.org/10.1038/s41597-023-02068-4

