BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale
Table 3
Statistics of the datasets used in the intrinsic evaluation tasks.
There are six datasets in total. #groups: the number of groups in a dataset. Each group has a related set and an unrelated set of genes, based on drug-gene interactions provided by CTD or gene sets provided by MSIGDB. #distinct concepts: the total number of distinct genes in a dataset. Avg #concepts per group: the average number of genes per group; note that one gene may belong to multiple groups. #pairs: the total number of pairs in a dataset. Avg #pairs per group: the average number of pairs per group.
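To make the column definitions concrete, the following is a minimal sketch of how the five statistics could be computed for one dataset. The data layout is an assumption made for illustration, not the format used by the authors: each group is a dictionary with hypothetical keys `related` and `unrelated` (sets of gene identifiers) and `pairs` (a list of gene pairs supplied with the group). The sketch also assumes that the concepts of a group are the union of its related and unrelated sets, which the caption does not state explicitly.

```python
from statistics import mean


def dataset_statistics(groups):
    """Compute the Table 3 style statistics for one dataset.

    `groups` is a list of dicts with the hypothetical keys 'related' and
    'unrelated' (sets of gene IDs) and 'pairs' (list of 2-tuples).
    """
    # Assumption: a group's concepts are its related and unrelated genes combined.
    genes_per_group = [g["related"] | g["unrelated"] for g in groups]
    # A gene that appears in several groups is counted once here.
    distinct_genes = set().union(*genes_per_group)
    pairs_per_group = [len(g["pairs"]) for g in groups]
    return {
        "groups": len(groups),
        "distinct_concepts": len(distinct_genes),
        "avg_concepts_per_group": mean(len(s) for s in genes_per_group) if groups else 0,
        "pairs": sum(pairs_per_group),
        "avg_pairs_per_group": mean(pairs_per_group) if groups else 0,
    }


# Toy data, invented for illustration only; G2 belongs to both groups.
toy_groups = [
    {"related": {"G1", "G2"}, "unrelated": {"G3"},
     "pairs": [("G1", "G2"), ("G1", "G3")]},
    {"related": {"G2", "G4"}, "unrelated": {"G5", "G6"},
     "pairs": [("G2", "G4"), ("G2", "G5"), ("G4", "G6")]},
]
stats = dataset_statistics(toy_groups)
print(stats)
```

In the toy data the per-group gene counts sum to 7 while there are only 6 distinct genes, because G2 is shared. This is why #distinct concepts can be smaller than #groups multiplied by Avg #concepts per group.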