The Anatomy of a Vulnerability Database: A Systematic Mapping Study

Software vulnerabilities pose major risks, including the loss and manipulation of private data. The software engineering research community has been contributing to the body of knowledge by proposing several empirical studies on vulnerabilities and automated techniques to detect and remove them from source code. The reliability and generalizability of the findings heavily depend on the quality of the information mineable from publicly available datasets of vulnerabilities, as well as on the availability and suitability of those databases. In this paper, we seek to understand the anatomy of the currently available vulnerability databases through a systematic mapping study in which we analyze (1) which vulnerability databases are most commonly adopted; (2) what the goals for adoption are; (3) which other sources of information are adopted; (4) which methods and techniques are employed; and (5) which tools are proposed. An improved understanding of these aspects might not only allow researchers to take informed decisions on the databases to consider when doing research, but also help practitioners establish reliable sources of information to inform their security policies and standards.


Introduction
Software security has always been considered a crucial non-functional requirement to meet when developing software [1]. With the rise of novel technologies and devices, e.g., Internet-of-Things (IoT) devices empowered by artificial intelligence approaches, the need for secure software is becoming even more important to avoid malicious access to the data and information handled by software systems [2]. When it comes to software engineering, security refers to the design and implementation of programs that are resilient to external attacks [3], as well as to the verification and validation mechanisms that might be put in place to manage it [4,5,6].
Software vulnerabilities are among the major threats to security [7]. These are weaknesses, introduced by programmers during software maintenance and evolution, that may be exploited by external actors to cause loss or harm [8,9].
The software engineering research community has been targeting the problem of vulnerabilities from multiple perspectives: understanding their life cycle [7,10], studying their impact on code quality and reliability [11,12,13], and defining several automated approaches and tools to support their detection [14,15,16,17].
A significant amount of research in the area, both in terms of empirical studies and of approaches defined, relied on the elaboration of data coming from publicly available vulnerability databases. The mining of vulnerability repositories indeed represents a widely adopted research approach that is useful to feed machine learning, deep learning, static and dynamic analysis, and other techniques used for detecting vulnerabilities [18,19]. As such, the quality of the recommendations provided by empirical studies in the literature and of the detection results provided by automated approaches heavily depends on the quality of the information available in those repositories.
Our work stems from this consideration and aims at providing a comprehensive view of the sources of information typically used to study vulnerabilities and build automated approaches for software vulnerability detection. We address our goal with a systematic mapping analysis of the literature [20]. Through this process, we identify and classify the existing literature on vulnerability databases in an effort to provide insights into their anatomy. We specifically investigate five aspects: (1) what are the most common security-specific public databases of security vulnerabilities employed by the research community; (2) what are the goals for employing vulnerability databases; (3) what are the other sources of information adopted to facilitate such goals; (4) what are the methods and techniques adopted; and (5) which tools are proposed by adopting, or for investigating, vulnerability databases. Given the importance of software security, understanding the research domain of vulnerability databases through these questions contributes to this critical area. The results of our work can indeed inform researchers about the existing vulnerability databases and their characteristics, so that they can take informed decisions on the databases to consider when designing future approaches for vulnerability discovery. At the same time, an improved understanding of how vulnerability reports are created, stored, and managed may be useful for practitioners interested in enhancing their security policies and standards.

Structure of the paper. Section 2 introduces background information about vulnerability databases. Section 3 reports on the research method employed to conduct the systematic mapping study. In Section 4 we analyze the results addressing the five goals of the study. Section 5 presents the main discussion points and the implications coming from our analysis. The possible threats to validity of the study are discussed in Section 6. Section 7 discusses the related work. Finally, Section 8 concludes the paper and outlines our future research agenda.

Background Information
A vulnerability database collects, maintains, and disseminates information about discovered security vulnerabilities. The National Vulnerability Database (NVD) [21] is one of the most influential vulnerability databases. It was created based on the list of Common Vulnerabilities and Exposures (CVE) [22] entries. Using CVEs and CVE identifiers ensures that unique vulnerabilities are discussed and that information about the same vulnerability is shared by different parties. Many studies employed the NVD reports and the CVE entries to construct datasets for data-driven vulnerability detection and prediction. For example, Gkortzis et al. [23] searched the NVD reports to create VulinOSS, a dataset that reports vulnerabilities of 8,694 open-source components, to analyze diverse software characteristics and their relation to security vulnerabilities. Nikitopoulos et al. [24] analyzed the GitHub commits referenced by NVD and CVE entries to curate a labeled dataset, CrossVul, containing 27,476 vulnerable source code files and their respective patched counterparts retrieved from Git commits. The dataset can be used to train models to detect security patch commits. Similarly, to investigate an automated approach to identifying fixes for new vulnerability disclosures in SAP development units, Ponta et al. [25] manually collected data from both the NVD and project-specific web resources to curate a dataset of vulnerabilities mapped to their fixes.
Additionally, vulnerabilities are reported by other security advisory sources, such as SecurityFocus, IBM's X-Force, etc. The key aspects of vulnerabilities in these different security databases are described differently and are complementary [26,27]. To meet the different needs in software security management, studies have attempted to create a hybrid vulnerability database [28] by analyzing the CVE, NVD, and X-Force databases, or have proposed an ontology [29] to construct a hybrid security repository that incorporates the information security concepts from databases and their relations.
While many attempts have been made to curate datasets for investigating the security aspects of software components, a systematic study of the research publications on the use of software vulnerabilities from different data sources remains under-explored. Specifically, there is a lack of comprehensive understanding of the motivations for using vulnerability datasets, the sources of information on security vulnerabilities, the methods and tools used to adopt the databases, etc. To better understand these aspects, we conduct a systematic mapping study of the research on vulnerability databases.

Research Method
The goal of the systematic mapping study is to summarize the state of the art on the use of public vulnerability databases, with the purpose of deriving limitations and open challenges that researchers might need to address to further support practitioners and researchers in devising methodologies and tools to deal with software vulnerabilities. In the context of our research, we elaborated a number of research questions that target the problem from different perspectives. The metrics for answering each question are the sorted lists of categorized items summarized from the systematically selected articles.
These are listed in the following:
• RQ1. What are the most common security-specific public databases of security vulnerabilities employed by the research community?
• RQ2. What are the goals for employing vulnerability databases by research communities?
• RQ3. What are the other sources of information adopted to facilitate such goals?
• RQ4. What are the methods and techniques adopted?
• RQ5. Which tools are proposed by adopting or for investigating vulnerability databases?
Our systematic mapping study adheres to the commonly adopted guidelines provided by Petersen et al. [20]. In addition, we also followed the guidelines by Wohlin [30], which concern the adoption of the so-called "snowballing", i.e., the analysis of the references of primary studies with the aim of extracting additional relevant sources to use when summarizing the current knowledge on a given subject. When reporting the research method adopted, we followed the recently defined ACM/SIGSOFT Empirical Standards. Given the nature of our study and the currently available standards, we followed the "General Standard" and "Systematic Reviews" definitions and guidelines.

Defining the Search Process
The main challenge of any systematic mapping study concerns the definition of a search string that can lead to the retrieval of a complete set of studies to analyze [31]. Our search strategy comprised a number of steps, namely the identification of the search terms, the specification of the resources to be searched, the definition of the search process, and finally the definition of the article selection criteria.

Search String. We first used the research questions to identify the major terms to consider. As such, we started with the terms "software vulnerabilit*" and "software vulnerabilit* database". Secondly, for these terms, we found commonly used alternative spellings and/or synonyms. This step led to the inclusion of terms like "security vulnerabilit*" and "security weakness*" for the original term "software vulnerabilit*", but also "dataset*" and "repositor*" as synonyms of "database". To check for consistency and completeness, we verified the presence of the keywords in any relevant paper that was initially scanned: the third step consisted of verifying the presence of any additional term that we had not originally included. In our case, this step did not return any terms. We then proceeded with the usage of boolean operators to relate the various terms identified so far: we used the OR operator to concatenate alternative spellings and synonyms, while the AND operator was used to concatenate the major terms. The final outcome was the following search string:

"(security OR vulnerabilit* OR weakness*)" AND "(database* OR dataset* OR repositor*)"

Resources to be searched. After establishing a search string, we defined the set of literature databases to search. We first considered Scopus, which is the most extensive literature database to date. For double-checking the results achieved from Scopus, we also considered IEEE Xplore, the ACM Digital Library, ScienceDirect, and the citation database Web of Science, which index articles published by a number of publishers. The selection of these databases was mainly driven by the popularity and potential level of completeness that they ensure. As a matter of fact, the selected databases are widely recognized as the most representative of research in software engineering [31], besides being used by many systematic literature and mapping studies [32,33,20,34].
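To illustrate how the final search string operates, the following sketch (our own example, not part of the study's tooling) encodes the boolean query as a predicate over a title or abstract, treating the trailing "*" as a prefix wildcard:

```python
import re

def matches_search_string(text: str) -> bool:
    """Check whether a text satisfies the boolean search string:
    (security OR vulnerabilit* OR weakness*) AND (database* OR dataset* OR repositor*).
    The trailing '*' is a wildcard, so we match on word prefixes."""
    text = text.lower()
    first_group = ("security", "vulnerabilit", "weakness")
    second_group = ("database", "dataset", "repositor")
    has_first = any(re.search(r"\b" + term, text) for term in first_group)
    has_second = any(re.search(r"\b" + term, text) for term in second_group)
    return has_first and has_second

# A relevant title matches, an unrelated one does not.
print(matches_search_string("A dataset of security vulnerabilities in open-source projects"))  # True
print(matches_search_string("A survey of code review practices"))  # False
```

Note that the OR groups are independent: a paper mentioning only "vulnerability repositories" still matches, since each group needs just one hit.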
It is worth pointing out that we consciously excluded Google Scholar from the set of databases: it indexes not only sources that have already been published, but also unpublished research (e.g., preprints currently available on open-access archives like arXiv). To avoid the analysis of articles that are not yet peer-reviewed, we decided not to rely on Google Scholar.
Inclusion and Exclusion Criteria. As for the inclusion criteria, these were mainly connected to the usefulness of an article for addressing our research objectives.
As described in Table 1, we included papers based on five criteria that map our research questions.
To be useful for addressing our research questions, the retrieved articles were scanned for consistency and adequacy. The full list of inclusion and exclusion criteria is available in Table 1. As shown, we first filtered out papers that were not written in English, that were duplicated, or that did not discuss topics connected to our research questions. In addition, we also excluded papers that are not peer-reviewed and short papers that only present preliminary ideas. It is also worth remarking that, in cases where we recognized that an article represented an extension of a previously published paper, e.g., journal papers extending conference publications, we only kept the extension, hence filtering out the previous, preliminary version. In addition, we also screened out studies that do not employ any vulnerability database in their main contribution; therefore, studies regarding vulnerability detection using static analysis alone are not included herein.

Applying the Search Process
After defining the key elements of our mapping study, we proceeded with the application of the search string on the selected databases. We did not put any time restriction on the search process, in an effort to collect as many articles as possible and, therefore, be as complete as possible in our reporting. It is inevitable that a certain number of potential primary studies are not included at first; however, the snowballing step compensates for this.
The search results are reported in Table 2, which shows how many papers were identified when querying each of the considered databases after the exclusion step. The initial candidate set was composed of 1,736 papers, which was reduced to 1,140 after removing duplicates. Afterward, the first two authors of this paper assumed the role of inspectors. They first conducted a pilot investigation to verify the validity of the exclusion and inclusion criteria: they independently analyzed an initial set of 50 articles, randomly selected from the candidate set. After the pilot, the inspectors met and discussed the results: this procedure did not lead to modifications of the exclusion and inclusion criteria, possibly indicating their completeness and suitability for our study.
Once the inspectors had completed the pilot, they proceeded with the application of the exclusion criteria to the set of retrieved articles. This was still done by the inspectors independently. The analysis was first done based on title and abstract. In cases where the inspectors were doubtful, they proceeded with a full read of the papers. After the independent analysis, the two inspectors compared their results in order to reach full consensus on the articles that should be removed from the analysis. In case of disagreement, the inspectors first read the entire article and then opened a discussion. If this was not enough, the other authors of the paper were involved in the decision-making process.
This scanning process led to the exclusion of 1,096 articles. The remaining 109 were further considered for inclusion. The inspectors proceeded with the full-text reading, still independently analyzing whether an article should be included or not. In so doing, they applied the inclusion criteria. After the independent analysis and a joint discussion, 42 papers were accepted for the mapping study. Each of these articles was then subject to another round of review, which was performed with the aim of applying a backward and forward snowballing process. With the forward snowballing, the inspectors looked at the articles that cited each of the papers. To accomplish this task, the inspectors relied on Google Scholar, which allowed them to easily search for this information. As for the backward snowballing, the inspectors looked at the references of each paper in order to verify whether some relevant piece of research was missing. The backward snowballing process was repeated until no new papers were identified, i.e., the inspectors did not limit the search to the references of the accepted articles, but also went through the references of the cited articles, performing additional snowballing steps. The number of iterations was two. Due to the initial exclusion step, the snowballing resulted in a considerable number of additional papers. Overall, the snowballing led to the identification of 27 extra papers. Hence, the total number of papers rose to 69.
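The iterate-until-no-new-papers nature of backward snowballing can be sketched as a fixed-point loop over a citation graph. This is our own illustration with an invented toy graph; in the actual study every candidate was, of course, screened against the inclusion/exclusion criteria before being added:

```python
def backward_snowball(seed_papers, references):
    """Iteratively follow references until no new papers appear.
    `references` maps a paper id to the ids it cites; `seed_papers`
    is the initially accepted set. Relevance screening is abstracted
    away in this sketch."""
    accepted = set(seed_papers)
    frontier = set(seed_papers)
    while frontier:
        # References of the current frontier that are not yet accepted.
        new_papers = {ref for p in frontier for ref in references.get(p, ())} - accepted
        accepted |= new_papers
        frontier = new_papers  # next round examines only the newly found papers
    return accepted

# Toy citation graph: A cites B and C; C cites D.
refs = {"A": ["B", "C"], "C": ["D"], "D": []}
found = backward_snowball(["A"], refs)
print(sorted(found))  # ['A', 'B', 'C', 'D']
```

The loop terminates because the accepted set only grows and the pool of papers is finite, which is exactly why repeating the snowballing steps eventually identifies no new papers.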

Data Extraction
From the 69 primary studies (PSs) selected based on the inclusion and exclusion criteria, we extracted the relevant data and mapped it to the answering of each RQ. The extraction process was driven by the open coding research method, namely an analytic process by which we could assign concepts (codes) to the observed data. In our specific case, we assigned a category to each paper based on the objective of our research questions. For instance, we assigned a code reporting the main goal of a paper with respect to the use of vulnerability databases for RQ2, while we tagged a paper based on the methods/techniques employed in the context of RQ4. The process was iterative and led by the first two authors of the paper, who acted as the inspectors.
Specifically, the following steps were performed:
• In the initial stage, the inspectors independently analyzed a subset of 10 articles and assigned codes with the aim of addressing RQ2, RQ3, and RQ4. The inspectors were free to assign as many codes as they found relevant. Afterward, they scheduled an in-person meeting to discuss the codes assigned so far, in an effort to find an agreement. The meeting lasted 1.5 hours. In the first place, the inspectors analyzed each of the ten papers and explained how they came up with the codes: this was done to increase the awareness of each inspector with respect to the choices made by the other. Secondly, they discussed their choices and agreed on the codes. Finally, they computed the inter-rater agreement through Cohen's kappa coefficient, which measured 0.38, showing a low level of agreement.
• On the basis of the discussion held during the first meeting, the inspectors reworked the codes assigned. Then, they took another 20 papers into account and proceeded with a new classification exercise. In this stage, the inspectors mainly attempted to reuse the codes that had previously emerged, yet they were allowed to define new codes whenever needed. At the completion of the task, the two inspectors scheduled a second meeting.
• The inspectors reworked the codes of the previously coded papers. Afterward, they started the analysis of the remaining papers. Also in this case, the inspectors were allowed to define new codes, if needed.
Once the task was completed, the inspectors planned a final in-person meeting to assess their work, which lasted around 2 hours. Two key insights emerged from this meeting. First, no new codes were identified during the last coding exercise. As such, we reached the so-called theoretical saturation, namely the phase in which the analysis does not produce new insights and all concepts are developed. Second, the agreement between the inspectors scored 0.64, which may be interpreted as good. This further corroborates the completion of the data analysis. As a result, all papers were classified according to the concepts of interest.
• As a final step, we proceeded with an additional validation of the codes assigned by the inspectors. In particular, the last three authors of the paper went through the papers and codes in an effort to identify possible inconsistencies and/or erroneous interpretations made during the first steps. This validation did not lead to further modifications. Hence, we could consider the data analysis completed.
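The inter-rater agreement values reported above (0.38 after the first round, 0.64 at the end) follow the standard Cohen's kappa definition: observed agreement corrected for the agreement expected by chance. A minimal sketch for two raters assigning categorical codes, using invented toy data rather than the study's actual codes:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the chance agreement implied by each rater's label frequencies."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical coding of six papers by two inspectors.
a = ["analysis", "merging", "analysis", "creation", "analysis", "merging"]
b = ["analysis", "merging", "creation", "creation", "analysis", "analysis"]
print(round(cohens_kappa(a, b), 2))  # 0.48
```

Here the raters agree on 4 of 6 papers (p_o = 0.67), but chance alone would yield p_e = 13/36, so kappa drops to about 0.48, in the same "fair-to-moderate" band as the 0.38 reported for the first coding round.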

Replicability
In order to allow replication and extension of our work by other researchers, we prepared a replication package for this study with the complete results obtained.

Analysis of the Results
Via the previously described process, we selected 69 papers. Among these selected papers (SPs), 56 are conference papers, 12 are journal articles, and one is a book chapter (shown in Figure 2). Meanwhile, when dividing the selected papers by publication year, we can observe the trend of the research on vulnerability databases. As shown in Figure 3, the number of publications per year is stable from 2009 to 2019, and then increases in 2020 and 2021, which is likely due to the increased application of data-driven approaches. For 2022, we only collected the papers published before October 2022.

RQ1. What are the most common security-specific public databases of security vulnerabilities employed by the research community?
To answer RQ1, we identify the vulnerability databases employed in the selected papers and investigate which are the commonly adopted ones. In this work, we consider as vulnerability databases the public platforms providing a record of existing vulnerability information, regardless of the format used to store such information. For example, CVE (a list of publicly disclosed cybersecurity vulnerabilities), NVD (a vulnerability database synchronized with CVE), and VMware Security Advisories (a list of security vulnerabilities in VMware products) are all considered vulnerability databases; the complete list is reported in Table 4.
65 of the 69 selected papers adopted either NVD or CVE out of the 26 vulnerability databases reported in Table 4, with 19 of them adopting both. The NVD includes various related databases that facilitate the automation of vulnerability management, security measurement, and compliance. It encompasses various information, such as security checklist references (i.e., the CVE dataset), security-related software weaknesses (i.e., CWE), impact metrics (i.e., CVSS), and so on. It is common for studies to extract information from all the datasets mentioned above when adopting NVD. For example, [SP14] studies the distribution of CVSS metrics in NVD. [SP20] studies the life cycle of a large number of vulnerabilities from NVD; in particular, the authors investigate the evolution of CVSS vector metrics and the general trend of CVSS scores for short-listed vendors. For such a case, similarly, we consider that the study employs both NVD and CVSS.
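An NVD entry ties together the pieces of information listed above: the CVE identifier, the CWE weakness class, and the CVSS impact metrics. The sketch below pulls those fields out of a hand-written, simplified record; the identifier and field names are hypothetical and do not follow the exact NVD JSON schema:

```python
import json

# Hypothetical record mimicking (in simplified form) the cross-referenced
# fields an NVD entry carries. Not the real NVD JSON schema.
record = json.loads("""
{
  "cve_id": "CVE-2099-0001",
  "cwe": "CWE-79",
  "cvss": {"baseScore": 6.1, "attackVector": "NETWORK"},
  "description": "Example cross-site scripting flaw in a web component."
}
""")

def summarize(entry):
    """Extract the three cross-referenced fields: CVE id, CWE class, CVSS base score."""
    return entry["cve_id"], entry["cwe"], entry["cvss"]["baseScore"]

print(summarize(record))  # ('CVE-2099-0001', 'CWE-79', 6.1)
```

This cross-referencing is what lets studies that "adopt NVD" simultaneously draw on CVE, CWE, and CVSS data from a single record.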
As shown in Table 4, NVD and CVE are the most commonly adopted vulnerability databases. Besides, OSVDB and X-Force have each been adopted in 12 studies, where the majority of those studies also adopt either NVD or CVE. [SP5] is the only study among them that adopts only OSVDB, without using NVD or CVE. The authors investigate vulnerability life cycle events by comparing the vulnerability disclosure date and the exploit creation date. For this purpose, they emphasize that OSVDB can provide the patch date and exploit date information.
Besides the existing public vulnerability databases displayed in Figure 4, some studies propose custom databases. For example, [SP36] creates a custom vulnerability database, Secbench, by mining GitHub projects for different vulnerability patterns. [SP1] proposes the design of a new proof-of-concept vulnerability database allowing effective data integration from multiple sources. [SP18] constructed another custom database, EKITS, from the vulnerabilities used in exploit kits from the black market. [SP48] created a custom database, Big-Vul, containing only C/C++ code vulnerabilities from open-source GitHub projects. These custom databases are not used in the other selected papers.
To summarize and compare the popular vulnerability databases, Tan et al. listed a set of 28 security vulnerability databases, which can be grouped by their publishers (government or enterprise) [35]. By comparing the results, we find that only 10 of the vulnerability databases identified herein are mentioned by [35]. Meanwhile, many vulnerability databases mentioned in [35], especially the ones described in Chinese, are not often adopted in academia.

RQ2. What are the goals for employing vulnerability datasets by research communities?
To answer RQ2, we investigate the goals of the selected papers when employing their selected vulnerability databases. Among the 69 selected papers, we identify the following eight main goals:
• Analysis - The contribution of the paper is to provide analytical results showing latent insights about one or multiple existing vulnerability databases.
• Merging - The contribution of the paper is to merge multiple existing vulnerability databases.
• Creation - The contribution of the paper is the creation of new vulnerability databases by collecting security vulnerability information from other sources.
• Application - The contribution of the paper is to provide solutions to existing research gaps or industrial issues by adopting one or multiple vulnerability databases.
• Classification - The contribution of the paper is to provide vulnerability categorizations or categorization approaches using one or multiple vulnerability databases.
• Enhancement - The contribution of the paper is to improve the quality of existing vulnerability databases by adding information obtained from other sources.
• Comparison - The contribution of the paper is to provide a comparison between two or more existing vulnerability databases.
• Detection - The contribution of the paper is to provide approaches to detect vulnerabilities in software applications by adopting existing vulnerability databases.

The vulnerability databases referenced by the selected papers are described below (descriptions are taken from the respective sources):
• The Underground - "one of the world's most popular and comprehensive computer security web sites." [SP25]
• Android Security Bulletins - The available Android Security Bulletins, which provide fixes for possible issues affecting devices running Android. [SP39]
• CERT - The CERT/CC Vulnerability Notes Database is run by the CERT Division of Carnegie Mellon University.
• CVE - A list of records, each containing an identification number, a description, and at least one public reference, for publicly known cybersecurity vulnerabilities.
• Microsoft Security Response Center - Part of the defender community and on the front line of security response evolution; for over twenty years its mission has been to protect customers and Microsoft from current and emerging threats related to security and privacy.
• NVD - The U.S. government repository of standards-based vulnerability management data; it includes databases of security checklist references, security-related software flaws, misconfigurations, product names, and impact metrics.
• OSVDB - The Open Sourced Vulnerability Database (OSVDB) was an independent and open-sourced vulnerability database. The goal of the project was to provide accurate, detailed, current, and unbiased technical information on security vulnerabilities.
• SARD - The Software Assurance Reference Dataset (SARD) provides users, researchers, and software security assurance tool developers with a set of known security flaws. [SP41], [SP46], [SP54]
• Secunia - Secunia Research criticality ratings and Common Vulnerability Scoring System (CVSS) metrics are issued following distinct analyses, including product context and related security best practices, to allow for a greatly improved means of prioritizing by criticality.
• SecurityFocus - The SecurityFocus Vulnerability Database provides security professionals with the most up-to-date information on vulnerabilities for all platforms and services. [SP2], [SP3], [SP13], [SP30]
• Snyk.io - Snyk is a developer security platform. Integrating directly into development tools, workflows, and automation pipelines, Snyk makes it easy for teams to find, prioritize, and fix security vulnerabilities in code, dependencies, containers, and infrastructure as code. [SP43]
• VMware Security Advisories - Formerly VMware Tanzu Reports and Pivotal Vulnerability Report; now a document of remediations for security vulnerabilities reported in VMware products.
• VulDB - "Number one vulnerability database hosting and explaining vulnerabilities since 1970." [SP33]
• Vupen - VUPEN Security offers defensive and offensive cyber security intelligence and advanced in-house vulnerability research.
• X-Force - IBM X-Force Exchange is a threat intelligence sharing platform enabling research on security threats, aggregation of intelligence, and collaboration with peers.

Figure 5 summarizes the distribution of the different goals identified within the selected papers.
From Figure 5 we can observe that the main goal when using vulnerability databases (more than 46% of the works) is to provide analytical insights. The second most notable goal (detected in ∼30% of the works) is to merge multiple existing databases. All the other goals can be classified as marginal, as none of them exceeds 16% of the distribution.
Table 5 presents the categorized contributions of the selected papers mapped to the summarized goals.
As we can observe from both Figure 5 and Table 5, 30 of the 69 selected papers contribute to analyzing the vulnerability databases and their related information. Among them, eight studies focus on investigating the connection between vulnerabilities and other relevant information. For example, [SP21] studies the correlation between the changes in issue density and those in vulnerability disclosure; other studies investigate the correlation between vulnerabilities and software repository information, e.g., pre-release bugs, issues, commit messages, and metrics. Meanwhile, seven studies investigate the life cycle of vulnerabilities, while five study the trends in vulnerabilities and their metrics.
[SP17] looks into the case of Firefox and the evolution of its source code, investigating the phenomenon of "after-life vulnerabilities".
[SP5] quantitatively analyzes the vulnerability life cycle and the related patch disclosure behaviors. [SP43] studies the impact of vulnerabilities on npm packages and their dependencies, regarding the effectiveness of vulnerability discovery and fixing, as well as the related effects. Eleven selected papers focus on the application of vulnerability databases for different purposes. Among them, five papers propose methods that use the vulnerability data to predict attributes of vulnerabilities. For example, [SP26] uses machine learning methods on NVD vulnerability data to predict the time to the next vulnerability.
[SP38], [SP55], [SP62], and [SP64] also propose approaches using machine learning or deep learning to predict vulnerability severity, vulnerability relatedness, security entity relationships, and vulnerability types, respectively. The other four papers conduct research on extracting security information from vulnerability databases. [SP34] proposes a semantic web approach to extract an ontological representation of vulnerability databases as well as the traceability links toward software repositories. [SP63] proposes a deep learning method to extract key aspects of vulnerability information from unstructured vulnerability descriptions.
Furthermore, six papers propose approaches for vulnerability classification. Among them, two propose methods that use classification models to predict potential vulnerability attributes. [SP8] proposes the use of trained linear support vector machine (SVM) classifiers to predict whether and how soon a vulnerability is likely to be exploited. [SP29] also uses an SVM classification method to detect suspicious commits. Meanwhile, four papers propose classifications based on different vulnerability information. [SP7] proposes a text clustering method on the vulnerability descriptions from NVD, where 45 main clusters are detected as the main taxonomies. [SP10] proposes a classification framework using SVM on the diverse taxonomic features of multiple vulnerability databases, which also reveals that the majority of the security risks are harbored by a small set of services. [SP27] proposes an automatic categorization framework for vulnerabilities using text mining. [SP53] uses topic modeling to classify existing vulnerability topics with respect to the OWASP top 10 vulnerabilities.
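The classification pipelines described above share a common shape: vectorize the vulnerability descriptions, then assign categories by similarity to labeled examples. The dependency-free toy sketch below illustrates that shape with bag-of-words features and a nearest-centroid rule instead of an actual SVM; the training data is invented:

```python
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words vector for a vulnerability description."""
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def train_centroids(labeled_descriptions):
    """Sum the word counts per class to obtain one centroid per category."""
    centroids = {}
    for text, label in labeled_descriptions:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids

def classify(text, centroids):
    vec = vectorize(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Invented mini training set with two vulnerability categories.
training = [
    ("sql injection allows attacker to read database", "injection"),
    ("crafted sql query bypasses input validation", "injection"),
    ("buffer overflow in parser leads to code execution", "memory"),
    ("heap overflow when handling long strings", "memory"),
]
centroids = train_centroids(training)
print(classify("stack overflow caused by unchecked buffer length", centroids))  # memory
```

A real pipeline would swap the bag-of-words/centroid pair for TF-IDF features and an SVM or deep model, but the train-then-assign structure stays the same.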
21 selected papers provide approaches to merge multiple vulnerability databases for various purposes. Seven papers propose approaches that merge the databases as is. For example, [SP3] proposes an approach toward unifying vulnerability information for attack graph construction; [SP4] proposes an ontological approach for vulnerability management to connect vulnerabilities in NVD with additional information, e.g., inference rules, knowledge representation, etc. Meanwhile, eight papers propose methods to merge existing vulnerability databases, and potentially other sources of information, toward creating new ones. For example, [SP1] proposes a new vulnerability database, Movtraq, by integrating general vulnerability information with additional environmental-requirements information and vulnerability impact from CERT, Bugtraq, etc. [SP12] proposes an alliance model, named IVDA, which aims to integrate security databases managed by security organizations from different countries. This study also proposes the international vulnerability description (IVD) to identify vulnerabilities and avoid redundancy. Furthermore, two papers propose approaches toward new vulnerability database creation by extracting information. [SP36] proposes a new database of "real" security vulnerabilities, SECBENCH, by mining millions of commits from GitHub repositories for 16 different vulnerability patterns. Herein, the authors refer to "real" vulnerabilities in contrast to the artificial hand-seeded faults used in empirical studies due to the challenges of vulnerability extraction or reproduction.

RQ 3. What are the other sources of information adopted to facilitate such goals?

For RQ3, we investigate what other sources of information are adopted by the selected papers to facilitate their studies toward the above-mentioned goals. These information sources are categorized as follows.
• Vulnerability databases - The public platforms providing a record of existing vulnerability information (e.g., NVD, CVE, SecurityFocus, etc.)
• Project data - The set of information regarding any software projects and products (e.g., GitHub projects, Derby, Chromium, etc.)
• Identifier - The pre-defined indicator sets that facilitate fast and accurate correlation of configuration data across multiple information sources and tools (e.g., CCE, CPE, etc.)
• Doc and Articles - The collections of documents and articles that contain security- and vulnerability-related information (e.g., Microsoft Knowledge Database, Cybersecurity news, etc.)
• Bug report databases - The collections of bug reports from bug tracking systems or testing tools for specific software projects or from software collaboration platforms (e.g., Bugzilla, LaunchPad, etc.)
• Others - The other sources that provide additional information.
Within the 69 selected papers, 38 employed other information sources besides the vulnerability databases listed in Table 4. Therein, four main types of information have been identified, with the number of selected papers adopting each type of source shown in Table 6.
Therein, 15 selected papers adopted specific software project data or software project databases as additional information sources. Amongst them, 11 papers utilized software repository information from GitHub. For example, [SP29] uses commit data, specifically the vulnerability-contributing commits, from GitHub projects, together with the CVE database, conducting a large-scale mapping between them. [SP35] uses the source code of the Android operating system from GitHub to investigate a comprehensive list of issues leading to Android vulnerabilities. [SP36] also uses the source code of 248 GitHub projects, as well as the commit messages, to investigate the different patterns of security vulnerabilities and attacks. Besides GitHub data, other software project data sources are adopted, including Maven project data.

Five papers use data from bug report systems, e.g., Bugzilla and Bugtraq, supporting their study with vulnerability databases. For example, both Bugzilla and Bugtraq data are used in [SP11] as part of the database comparison. [SP19] uses bug report data from Bugzilla together with NVD data to investigate the impact of different vulnerability definitions on the "goodness-of-fit" of vulnerability discovery models. [SP33] uses Bugzilla data together with the vulnerability information of five projects to investigate the relation between software metrics and existing vulnerabilities. [SP55] uses Bugzilla data as part of its training data, together with data from different issue trackers (e.g., Jira tickets, GitHub issues), toward the prediction of vulnerability relatedness. [SP66] proposes a high-coverage approach that collects known security patches by tracking multiple data sources, including issue trackers like Bugzilla, GitHub projects, and information from Stack Overflow.
Meanwhile, the Common Platform Enumeration (CPE) Dictionary, a configuration identifier for vulnerabilities, is also commonly adopted. For example, [SP4] uses CPE as one of the critical information sources for the proposed ontology for vulnerability management. [SP25] and [SP26] also use CPE for the integration of vulnerability-related information for the purposes of database merging and vulnerability prediction, respectively.
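CPE entries follow a fixed, colon-separated format, so correlating vulnerability records by platform can start from a simple parser. The sketch below handles plain CPE 2.3 formatted strings only (escaped characters such as "\:" are not handled), and the example entry is illustrative:

```python
# Minimal sketch: parsing a CPE 2.3 formatted string into its named
# components so that vulnerability records can be matched by platform.
# Escaped characters (e.g., "\:") are not handled in this simplified version.
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]

def parse_cpe(cpe: str) -> dict:
    prefix, version, *components = cpe.split(":")
    if prefix != "cpe" or version != "2.3":
        raise ValueError(f"not a CPE 2.3 string: {cpe!r}")
    return dict(zip(CPE_FIELDS, components))

entry = parse_cpe("cpe:2.3:a:apache:http_server:2.4.49:*:*:*:*:*:*:*")
print(entry["vendor"], entry["product"], entry["version"])
```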

CVSS, an open framework for communicating the characteristics and severity of software vulnerabilities, is adopted by [SP2]-[SP4], [SP8], [SP14]-[SP16], [SP20], [SP22], [SP23], [SP26], [SP30], [SP38], [SP50], and [SP60]; OVAL, a community-developed language for determining vulnerability and configuration issues, is also used.

Security-related documents and articles are also used to support the studies on vulnerabilities. For example, [SP36] aims to design a database for security testing with vulnerabilities mined from GitHub, where the OWASP Top 10 2017 is used for the identification of a considerable number of trending security patterns. [SP53] also adopts the OWASP Top 10 risks as the vulnerability types for the proposed topic modeling and classification of the CVE database.

ThreatPost and Microsoft Security Bulletin are also used as additional information sources supporting vulnerability database integration ([SP30]) and database comparison ([SP11]).
Furthermore, there are other types of vulnerability-related information sources commonly adopted by a number of the selected papers. Therein, CVSS is the most adopted, utilized by 15 selected papers, where it quantifies the evaluation of vulnerability impact. For example, five studies, [SP2], [SP14], [SP15], [SP16], and [SP50], investigate the trends and distribution of vulnerabilities in databases in terms of CVSS; seven studies, [SP3], [SP4], [SP20], [SP22], [SP23], [SP30], and [SP50], propose approaches to merge vulnerability databases that also take CVSS into account as one of their key information resources; CVSS is also used in two studies toward vulnerability-related predictions, i.e., [SP26] and [SP38]. In addition, OVAL, Jira issues, CAPEC, CRE, ERI, SCAP, code gadgets, emails from OSS project mailing lists, and user-contributed attacks and vulnerabilities are also used as information sources in 10 studies reported in Table 6.
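For reference, the CVSS base score that these studies build on is computed from published metric weights. The following is a minimal sketch of the CVSS v3.1 base score restricted to the common case of unchanged scope (S:U), with weights and the rounding rule taken from the public specification:

```python
# Hedged sketch of the CVSS v3.1 base-score computation for unchanged
# scope (S:U); metric weights follow the public CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}   # Attack Vector
AC = {"L": 0.77, "H": 0.44}                          # Attack Complexity
PR = {"N": 0.85, "L": 0.62, "H": 0.27}               # Privileges Required (S:U)
UI = {"N": 0.85, "R": 0.62}                          # User Interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}               # C/I/A impact

def roundup(x: float) -> float:
    """Smallest number with one decimal place >= x, per the spec."""
    i = round(x * 100000)
    return i / 100000 if i % 10000 == 0 else (i // 10000 + 1) / 10

def base_score(av, ac, pr, ui, c, i, a):
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    return 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))

# e.g., the vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
print(base_score("N", "L", "N", "N", "H", "H", "H"))  # -> 9.8
```

Changed scope (S:C) uses different PR weights and a different impact formula, which this sketch deliberately omits.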

RQ 4. What are the methods and techniques adopted?
For RQ4, we investigate and summarize the methods and techniques applied by the selected papers in terms of how they utilize the vulnerability databases and other information sources towards the goals mentioned above.The methods and techniques are categorized as follows.
• Info Integration - The approach is a combination of traditional methods to match relevant information manually (e.g., reading vulnerability reports and matching them to source code), using pre-defined identifiers (e.g., using the CVE ID to match vulnerabilities from different vendors), or with source code (e.g., using basic string manipulation and comparison methods to match users' reports to NVD vulnerabilities).
• Statistics -Statistical methods are applied to the vulnerability-related information sources to gain further insights.
• Machine Learning -Using machine learning algorithms (e.g., Naive Bayes, Logistic Regression, Decision Tree, etc.) to support the analysis of vulnerability database information.
• Deep Learning -Using deep learning algorithms (e.g., CNN, RNN, etc.) to support the analysis of vulnerability database information.
• Data Collecting -Collecting vulnerability data from public databases or other sources via web scraping or other crawling techniques to support the analysis of vulnerability database information.
• Text Analysis - Analyzing the collected textual data manually or using NLP techniques to extract insights from vulnerability-related information.

As shown in Figure 6, within the 69 selected papers, nearly half (33 papers) adopt conventional information integration methods to analyze, merge, or utilize vulnerability databases. Therein, for 19 papers, such a method serves the purpose of database merging. Four of the papers propose the construction of common security-based databases using identifiers. For example, [SP6], [SP20], and [SP22] propose the integration of information from multiple existing vulnerability databases using the CVE ID as an identifier. [SP44] proposes a similar integration of databases but also notes that, besides the CVE ID, project-specific identifiers can be used. Two of the papers propose a common schema of relations between security information to integrate vulnerability databases and other sources. Both [SP25] and [SP37] propose ontological approaches to integrate security information, including both dynamic and static content, via a common schema of relations. Meanwhile, four of the papers ([SP1], [SP3], [SP4], and [SP52]) extract information from texts, such as the descriptions of the vulnerabilities, to support the integration. Commit histories are also used as the connector between vulnerabilities and project commits by [SP48] and [SP58]. [SP12] proposes to use systematic policies and language to archive international vulnerability databases using the Vulnerability Citation Index (VCI) as a unified identifier to avoid redundancy. There are also four papers that do not specify by what identifiers the databases and/or other information are merged.
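Identifier-based integration of this kind can be illustrated with a small sketch that joins records from two made-up sources on the CVE IDs extracted from free text:

```python
import re

# Illustrative sketch of identifier-based integration in the spirit of
# [SP6]/[SP20]/[SP22]: records from two invented sources are joined on
# the CVE ID extracted from free text.
CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}")

nvd_records = {
    "CVE-2021-44228": {"severity": "CRITICAL"},
    "CVE-2014-0160": {"severity": "HIGH"},
}
advisories = [
    "Vendor advisory: remote code execution, see CVE-2021-44228",
    "Heartbleed disclosure (CVE-2014-0160) affects OpenSSL 1.0.1",
]

merged = {}
for text in advisories:
    for cve_id in CVE_RE.findall(text):
        if cve_id in nvd_records:
            # Attach the advisory text to the matching NVD record.
            merged[cve_id] = {**nvd_records[cve_id], "advisory": text}

print(sorted(merged))  # the CVE IDs found in both sources
```

Real pipelines add normalization, deduplication, and conflict resolution on top of this join, but the CVE ID remains the pivot.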
Furthermore, 26 of the papers combine information from vulnerability databases with possibly other sources for database integration or for analyzing and exploring in-depth insights therein. For example, [SP51] builds a prototype system that accesses and integrates information such as exploit scripts, software configurations, proofs of vulnerability, and vulnerability descriptions. The system can be used to store and automatically process vulnerability information for a cloud-native application vulnerability database. [SP56] integrates the information of vulnerabilities and that of vulnerability-resolution commits to analyze the typical security issue types and the security issue reaction times. [SP18] constructs a new database by integrating the standard vulnerabilities from NVD with the ones currently used in exploit kits from the black market. [SP30] and [SP23] propose high-abstraction-level system designs that indicate the integration of multiple vulnerability databases and other sources of information.
In the results, nearly one third (21) of the selected papers adopt statistical methods. For example, [SP2] uses statistical distributions, e.g., Exponential, Pareto, and Weibull, to analyze the exploit availability of vulnerabilities. Similarly, [SP5] uses probability distributions to characterize the vulnerability life cycle and exploit creation events, while [SP11] analyzes the distribution of vulnerability severity ranking levels on NVD and the trend of these severity ranking levels by year. [SP28] adopts the Cohen's d statistic [36] to examine the amount of overlap between neutral and vulnerable files with respect to the number of bugs. [SP49] uses the Wilcoxon rank-sum test to analyze the statistical significance amongst classifiers. Furthermore, correlation analysis is also applied; for example, [SP21] analyzes the correlation between issue density and annual vulnerabilities.
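As a concrete illustration of one such statistic, Cohen's d compares two group means in units of their pooled standard deviation. The sketch below uses invented bug counts, not the data of [SP28]:

```python
import math
import statistics

# Sketch of the Cohen's d statistic used in [SP28] to quantify the
# overlap between two groups (toy numbers, not the paper's data).
def cohens_d(group_a, group_b):
    na, nb = len(group_a), len(group_b)
    # Pooled standard deviation across both groups.
    pooled_sd = math.sqrt(
        ((na - 1) * statistics.variance(group_a)
         + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

neutral_bugs = [1, 2, 3, 2, 1]       # hypothetical bug counts, neutral files
vulnerable_bugs = [4, 6, 5, 7, 5]    # hypothetical bug counts, vulnerable files
print(round(cohens_d(neutral_bugs, vulnerable_bugs), 2))  # -> -3.6
```

By common rule of thumb, |d| near 0.2 is a small effect, 0.5 medium, and 0.8 or above large; a large |d| indicates little overlap between the groups.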
Meanwhile, AI-based approaches, including machine learning (ML), deep learning (DL), and natural language processing (NLP), are also commonly applied to the large volume of vulnerability data. Therein, several studies propose ML-based methods for vulnerability classification. For example, [SP64] adopts Linear Support Vector, Naive Bayes, and Random Forest classifiers to classify and predict vulnerability types. [SP29] also uses linear Support Vector Machines (SVM) to classify commit data toward detecting the commits that contribute to vulnerabilities. [SP27] also uses the SVM algorithm with a radial kernel function to train on and classify the vulnerability data in the target database. [SP41] proposes using deep learning algorithms, e.g., CNN or RNN, on vectorized representations of regularized code data toward vulnerability analysis. [SP62] also uses a CNN-based graph attention network toward the prediction of security entity relationships. Furthermore, NLP-based methods are also commonly applied. For example, [SP45] uses a Supervised Topical Evolution Model (STEM) on a large volume of vulnerability-describing reports from NVD to analyze the evolving trends of the vulnerabilities. [SP53] also utilizes topic modeling methods to classify the entries of the CVE database.
It should be noted that some of the papers do not apply AI-based approaches but conduct crawling of the vulnerability data. For such situations, the method is marked as "data collecting". For example, [SP36] extracts the indications of a vulnerability fix or patch committed by the developers from GitHub projects. [SP42] proposes a text mining approach to predict invalid vulnerability reports. To identify invalid CVEs, the authors extract text features using term frequency. However, no NLP tools or techniques are mentioned for such tasks. To clarify the difference, we categorize this method as "text analysis" instead of "NLP".

RQ 5. Which tools are proposed for adopting or investigating vulnerability databases?

For RQ5, we summarize what tools are proposed to support the research on vulnerability databases. Toward answering this research question, we investigate the tools proposed by the selected studies that utilize vulnerability databases, and possibly the other information sources identified in Table 6. Within the 69 selected papers, three of them propose tools that adopt vulnerability databases.
The proposed tools include:

• µVulDeePecker ([SP54]) is a deep learning-based multi-class vulnerability detection tool upgraded from the original VulDeePecker [37]. The original VulDeePecker uses a Bidirectional Long Short-Term Memory (BLSTM) neural network to detect software vulnerabilities. As a binary classifier, it can only report whether the target source code is vulnerable or not, without detecting the vulnerability types. The µVulDeePecker tool adopts the novel concept of code attention and uses a novel feature-fusion-oriented design of three BLSTM networks toward multi-class vulnerability detection with vulnerability types identified. For this tool, 181,641 code gadgets, labeled either vulnerable or non-vulnerable, are obtained from SARD and NVD covering 40 vulnerability types, which are used as training and testing datasets.

• VCCFinder ([SP29]) is a code analysis tool that flags suspicious commits by using an SVM-based detection model. A large-scale database mapping the CVE database to collected vulnerability-contributing commits (VCCs) is built for the evaluation.

• EVMAT ([SP69]) is a dashboard solution for monitoring enterprise vulnerability levels for proper enterprise risk management. It can automatically gather system characteristics based on OVAL and further evaluate the software vulnerabilities installed in a computer resource based on the data retrieved from NVD. The tool can also provide a quantified evaluation of the vulnerability score of an enterprise.
The results show that, compared to the number of selected papers, the number of proposed tools is limited. Therein, though public vulnerability databases are adopted, many are not integrated into tools directly. For example, in [SP54], SARD and NVD are used to create training and testing datasets. These datasets are not integrated as tool components.

Discussion and Implications
The results of the research questions elaborated within the scope of our systematic mapping study allow us to provide a number of implications for researchers and practitioners, which we discuss in the following.

On vulnerability databases and their adoption.
By summarizing the previous studies on vulnerability databases, we could observe that NVD and CVE are, by far, the vulnerability databases most commonly used by researchers. At the same time, we discovered that several countries have their own national security vulnerability databases, e.g., FrSIRT from France, JVN from Japan, or CNNVD from China. So far, the main purpose of those databases is to serve as a collection and standardized reference for known vulnerabilities. Despite the availability of these alternative databases, we pointed out that researchers have rarely adopted them for research purposes. Indeed, only the work by Zheng et al. [SP12] proposed a framework aiming at exploiting vulnerability databases from different countries in an effort to provide a common framework.
Perhaps more importantly, the research effort conducted to merge the pieces of information available in these vulnerability databases is still limited. According to our mapping study (see Table 6), most of the studies that attempted to merge data from different sources only focused on GitHub data, CPE, and CVSS, therefore neglecting the potential contributions brought by the alternative databases.
When considering this perspective, multiple challenges and open questions arise. In the first place, we call for more research aiming at assessing the ecological validity of the findings achieved by researchers so far. Indeed, the generalizability of the conclusions drawn in the literature might potentially be threatened by the specific characteristics of NVD and CVE; therefore, additional insights might be identified when considering alternative databases. In the second place, the research efforts conducted to merge vulnerability data might be extended to a more comprehensive set of vulnerability databases, further contributing to the generation of more robust benchmarks and a more generalizable, deeper analysis and understanding of the connections between software vulnerabilities in the wild. Last but not least, only a few attempts have been made to enhance existing vulnerability databases, despite the availability of other sources of information that can be used to complement them. In this sense, the findings of our mapping study may serve as a basis for novel datasets, benchmarks, and empirical investigations into vulnerabilities.
Implication #1. Our findings represent a call for researchers working in the areas of software security, software vulnerability analytics, and empirical software engineering, who are called to revisit previous findings on more comprehensive sets of vulnerability databases and provide more information about the ecological validity of the findings reported in the literature.

Implication #2. The variety of vulnerability databases identified in our work represents an opportunity for researchers to build novel, unified data sources and benchmarks, which might be exploited to better understand the nature of, and the relations between, software vulnerabilities. The additional sources of information available on software vulnerabilities might serve as a further instrument to enhance existing databases. Moreover, we foresee the opportunity for companies to develop new integrated data sources and provide direct API access to developers who need to continuously check for possible vulnerabilities in the software components they are using.
On the limitations of vulnerability databases. As part of our systematic mapping, we could discover that only a limited set of studies employed vulnerability databases in the context of vulnerability detection (e.g., [SP54]), database comparison (e.g., [SP13]), and database quality improvement (e.g., [SP24]). Instead, these studies preferred uncovering vulnerability data by mining other data sources [38]. This clearly indicates that the current structure and information provided by the existing vulnerability databases are limited, preventing their wider adoption in research. Very likely, most of the limitations are due to the limited amount of metadata provided to researchers. For instance, let us consider the case of vulnerability detection approaches. Most of these approaches aim at detecting vulnerabilities at the line, function, or file level. However, most vulnerability databases do not provide fine-grained pieces of information on the location of vulnerabilities that can actually be exploited by vulnerability detection approaches. As such, the practical usefulness of vulnerability databases is threatened, and our study promotes further considerations on the way vulnerability databases can be helpful for software security, from both the researcher's and the practitioner's perspective. For instance, we may envision a stronger, collaborative effort involving researchers, practitioners, and government agencies with the aim of revisiting the way the vulnerabilities stored in public databases are collected and made available: in this respect, novel data collection and reporting guidelines might be devised to make the contributors of those databases aware of the need to curate vulnerabilities with additional metadata. At the same time, the limited amount of literature targeting data quality leaves few insights into the information needed by researchers. We therefore call for further research efforts on the matter, as works aiming at integrating information
within those databases might be a valuable contribution to the field and enable additional research on software vulnerabilities-as also pointed out in the previous discussion and implication point.
Implication #3. Current vulnerability databases lack fine-grained information and metadata, hence threatening their suitability for research targeting vulnerability detection. More research on data quality and information needs should therefore be conducted in an effort to establish new ways through which vulnerability databases can support the research and practice on software security. We recommend that practitioners, and in particular the maintainers of vulnerability databases, complement the information by introducing fine-grained metadata and considering data quality aspects.
On the actionability of vulnerability databases.
One of the most surprising outcomes of our systematic mapping study concerns how vulnerability databases have been used by researchers. Most studies employed databases to conduct empirical analyses, while only two tools were proposed within the 64 selected papers. Two other tool-proposing papers were identified during the "full-reading" step but were rejected as a consequence of the inclusion and exclusion criteria. Grieco et al. [39] proposed VDiscover, a tool using state-of-the-art machine learning techniques to detect and predict vulnerabilities in test cases. The proposed tool utilizes a customized dataset built from the test case data of the Debian bug tracker; however, the study does not utilize any public vulnerability databases. Liu et al. [40] proposed CBTracer, which monitors software runtime executions by capturing their real-time I/O traffic and continuously builds a security database covering vulnerability discovery and exploit generation. The data sources used for CBTracer include exploit challenges, e.g., Capture The Flag (CTF) challenges and the Cyber Grand Challenge (CGC), with no vulnerability databases adopted. Our findings, along with the further evidence provided by the papers by Grieco et al. [39] and Liu et al. [40], further highlight the limited suitability of existing vulnerability databases for research purposes other than empirical analysis. This statement is also supported by the fact that these tools are neither industrialized nor properly maintained, meaning that the tools devised on the basis of existing vulnerability databases seem to build on insufficient or incomplete pieces of information. In this sense, our findings suggest that more research should focus on how to make vulnerability databases actionable for industrially relevant research, or suitable for practitioners interested in exploiting vulnerability data for analytical instruments.
Implication #4. Current vulnerability databases seem not to provide enough support for building tools and analytical instruments. More research should target the actionability of the current databases and possibly inform researchers on how to complement the available pieces of information with industrially relevant insights.

Threats to Validity
In this section, we follow the three categories of threats to validity in software engineering secondary studies proposed by Ampatzoglou et al. [41]. Compared to the four-category guideline by Wohlin et al. [42], this categorization is more suitable for secondary studies in the software engineering domain. Herein, we discuss the Study Selection Validity, Data Validity, and Research Validity of our study and the potential mitigations to their impact.
Study Selection Validity: In this study, the search strategy, review protocol, and data extraction process were entirely based on established systematic mapping guidelines [20,43,44]. By doing so, we reduced the threats to the initial search and study filtering processes of the secondary study planning phase. In particular, the search string was formulated to include keywords identified from the research questions and diversified using synonyms. Although the automated search covers most publications, we admit that potential issues and limitations may arise during the search process, such as a) limitations of the search string and b) searching only on article titles, which may miss some relevant studies. To mitigate the search limitations and extend the coverage of studies, we used snowballing as a complementary method. We reviewed all the references listed in the selected studies and evaluated all the papers that reference the selected ones, which resulted in 26 additional relevant publications. The inclusion and exclusion criteria were defined and piloted to assist the study selection. The criteria are in line with the goal and research questions of the paper and follow the guidelines recommended by Petersen et al. [20]. Two authors conducted the study selection independently. The other authors were involved in the discussion to resolve disagreements. Furthermore, the study selection was conducted in December 2021. At the time of preparing the submission of this study report in 2022, we re-executed the queries to search for relevant studies published since our last queries, and the new queries resulted in 3 additional studies which were also included in the analysis. This reduces the potential threat of an incomplete report.
Data Validity: Regarding the data extraction process, a similar procedure was conducted where the first two authors carried out an iterative and analytic process driven by the open coding method to identify the classification schema. The last three authors further reviewed and validated the codes assigned to all the selected studies. For example, the three sets of categories for RQ 2, RQ 3, and RQ 4 extracted via the open coding method largely reduced the bias in the classification schema and the mapping of data. For the data analysis process, thanks to the pre-defined categories, the extracted results can be easily summarized and displayed in the form of bar charts and tables. On the other hand, publication bias is also a potential threat to data validity, where methods, techniques, and usage goals from companies are not sufficiently included due to confidentiality policies, which is hard to mitigate. Such a perspective can be further investigated via industrial surveys in future studies.
Research Validity: The study can be replicated by meticulously following the replication documentation and the steps. The search strings and details of the systematic mapping study process are all described in Appendix B, with which other scholars can easily replicate the study. Before the start of this study, multiple discussion sessions were organized by all the authors to determine the research method. As the decision to adopt a systematic mapping study was agreed upon by all authors, this mitigates the threat of research method bias. After the selection of the research method, all the authors also determined the research questions together via several iterations.

Related Work
Software security vulnerabilities are a constant threat to the software industry.The exploitation of vulnerabilities can lead to unauthorized breaches and cause significant financial losses and reputational damage to both software companies and customers.
As an early form of quality assurance in software development, software vulnerability prediction is a data-driven process that leverages historical software vulnerability knowledge to classify vulnerable code modules. McKinnel et al. [15] performed a systematic literature review to investigate the use of artificial intelligence and machine learning techniques for software vulnerability assessment and their performance. The authors selected 31 relevant studies and identified scalability and the need for real-time identification of exploitable vulnerabilities as the research challenges and opportunities. The authors' findings indicate increasing attempts to leverage AI in vulnerability assessment. Similar findings are further reported in another systematic literature review by Eberendu et al. [45] investigating approaches to software vulnerability detection. The authors selected 55 studies published between 2015 and 2021. The results showed that, besides static and dynamic analysis, machine learning and deep learning approaches were mostly used to detect software vulnerabilities. Although there are studies on tools and applications for software vulnerability detection and prediction, an investigation of the use of data sources for such tools and approaches is lacking.
A recent study by Croft et al. [46] reports challenges and solutions for data preparation for vulnerability prediction. Based on the 61 selected studies, the authors identified 16 data challenges, most of which were related to data generalizability, accessibility, label collection effort, etc. The results show that the complexity of real-world vulnerabilities and the difficulty in preparing vulnerability datasets form the major barrier to the adoption of vulnerability prediction in industry. Similar findings have been reported in other studies on vulnerability assessment [47,48,15,49]. The studies show a clear need for creating high-quality datasets that provide data provenance and comprehensive information for better sharing and governance of vulnerability data [46].
There are also other studies investigating public vulnerability databases via systematic mapping studies or systematic literature reviews. Alqahtani conducted a survey of 99 relevant software engineering research articles toward investigating the use of vulnerability databases in the software engineering domain [50]. The study focuses on the security topics covered in software engineering studies as well as in different software engineering activities. Alqahtani's research reports on the commonly adopted vulnerability databases but does not investigate the methods, the related information sources, or the tools. Lin et al. also conducted a survey reviewing the literature on building high-quality vulnerability datasets [51]. The study aims to investigate how data mining and data processing techniques are adopted to generate vulnerability datasets that facilitate vulnerability discovery. However, Lin et al.'s study is neither a systematic mapping study nor a systematic literature review.
We compare our study to the previously mentioned systematic mapping study on vulnerability databases, i.e., [50], regarding the overlap of research questions, covered periods, number of selected studies, etc. Details of the comparison are presented in Table 7.
Alqahtani's study focuses on the vulnerability databases used in the software engineering domain, whereas our study does not exclude other domains. Only five selected papers are shared between the two studies (i.e., [SP2], [SP5], [SP9], [SP28], [SP20]). This difference is caused by the difference in study focus: Alqahtani's study includes papers that use vulnerability databases at the level of individual vulnerabilities. For instance, the study [52] on a hybrid taint analysis tool that completely decouples program execution and taint analysis is included by Alqahtani. That study selects 10 individual vulnerabilities from the CVE database as tests to evaluate the accuracy of its offline symbolic taint analysis for software attack detection, while it neither uses nor applies methods or techniques on the CVE database as a whole [52]. Therefore, similar studies are excluded from our study. Furthermore, our study covers more recent work, from 2017 to 2022, compared to Alqahtani's paper. Alqahtani's paper contributes specifically to the identification of security topics and their changes over time. Comparatively, our paper focuses on investigating the use of vulnerability databases as a whole, as well as the methods, techniques, external information sources, and tools adopted or created for this purpose.
Regarding results coverage, our study and Alqahtani's paper have four commonly used vulnerability databases in common, i.e., NVD, CVE, OSVDB, and SecurityFocus. OWASP and CWE are also reported as common vulnerability databases in Alqahtani's paper. Our paper instead considers OWASP an external information source, because the OWASP Top 10 is a standard awareness document for developers and web application security. Meanwhile, CWE, as a community-developed list of software and hardware weakness types, is only used together with NVD or CVE. Compared to Alqahtani's paper, we also find 22 other vulnerability databases used in our selected papers, which are listed in Table 7.

Conclusion
Vulnerability databases play a crucial role in collecting, maintaining, and disseminating information about discovered software vulnerabilities, thereby contributing to software security. With advances in computing methods and the growth of the available data, understanding how vulnerability databases are used has become essential. We conducted a systematic mapping study on the academic literature published before October 2022 to examine the existing body of knowledge on the adopted vulnerability databases. Based on the 69 selected papers, we investigate which vulnerability databases are commonly utilized, what other sources of information are used, what the goals of using them are, what methods are adopted, and what tools are proposed. The summarized results show that NVD and CVE are the most commonly adopted databases in vulnerability-related studies, with various other sources of information also used, e.g., software project data, bug reports, and security documents and articles. Meanwhile, besides general information integration methods, data-driven methods are commonly adopted when studying vulnerability data. The goals of the studies mainly focus on the analysis, classification, comparison, enhancement, and merging of the vulnerability databases themselves, while vulnerability detection is seldom studied using vulnerability databases.

3. The inspectors reworked the codes. Afterwards, they started the analysis of the remaining papers. Also in this case, the inspectors were allowed to define new codes, if needed. Once the task was completed, the inspectors planned a final in-person meeting to assess their work, which lasted around 2 hours. Two key insights emerged from this meeting. First, no new codes were identified during the last coding exercise. As such, we reached the so-called theoretical saturation, namely the phase where the analysis does not propose newer insights and all concepts are developed. Second, the agreement between the inspectors scored 0.64, which may be interpreted as good. This further corroborates the completion of the data analysis. As a result, all papers were classified according to the concepts of interest.

CRediT authorship contribution statement
4. As a final step, we proceeded with an additional validation of the codes assigned by the inspectors. In particular, the last three authors of the paper went through the papers and codes in an effort to identify possible inconsistencies and/or erroneous interpretations made during the earlier steps. This validation ultimately did not lead to further modifications. Hence, we could consider the data analysis completed.

Data items
To identify the main characteristics of each selected study, we collected the context-related data items listed below.
• Goals of employing vulnerability databases in each selected study were collected and summarized to address RQ2.
• Methods and techniques applied to achieve the goals were identified to address RQ4.
• Vulnerability Databases used in each study were collected and summarized to address RQ1.
• Other databases and information adopted in each study were collected to summarize the range of information sources used for achieving the goals, which addresses RQ3 and complements RQ1.
• The creation of new vulnerability databases, or the merging of existing databases into custom ones, was identified and recorded for each study, addressing RQ2 and complementing RQ1.
• Proposed tools for investigating vulnerability databases and other information sources were collected from each selected study to address RQ5.
The data was collected using the open coding research method, i.e., through an analytic process by which we assign concepts (codes) to the observed data, addressing RQ2, RQ3, and RQ4. The process is described in the Data Collection Process section.
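The bookkeeping behind open coding can be sketched as follows. This is an illustrative sketch only: the paper IDs and codes are hypothetical, and theoretical saturation is modeled simply as a new batch of papers introducing no codes that were not already in the codebook.

```python
# Open coding bookkeeping: each paper is assigned a set of concept codes.
codebook = set()

def code_batch(assignments, codebook):
    """Record the codes for a batch of papers; return the codes not seen before."""
    new_codes = set()
    for paper, codes in assignments.items():
        for c in codes:
            if c not in codebook:
                new_codes.add(c)
                codebook.add(c)
    return new_codes

# First batch introduces the initial codes (hypothetical examples).
batch1 = {"SP1": {"classification", "NVD"}, "SP2": {"merging", "CVE"}}
# A later batch that reuses existing codes only.
batch2 = {"SP3": {"classification", "CVE"}}

code_batch(batch1, codebook)
saturated = len(code_batch(batch2, codebook)) == 0  # no new codes => saturation
print(saturated)  # → True
```

In the actual study, saturation was judged by the inspectors in discussion rather than mechanically, but the underlying criterion is the same: a coding pass that yields no new codes.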

Study Risk Of Bias Assessment
The processes of study selection and data collection were conducted through independent analysis and joint discussion; the details are explained in the Selection Process and Data Collection Process sections. In the data collection process, Cohen's kappa coefficient was calculated at each step to measure the inter-rater agreement on the code assignments by the inspectors. The scores indicated a substantial improvement in the reliability of the code assignments over the course of the process.
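As an illustration, Cohen's kappa for two raters is the observed agreement corrected for the agreement expected by chance. The sketch below computes it from scratch; the inspector labels are hypothetical, not the study's actual codes:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    # Observed agreement: fraction of items given identical codes.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two inspectors to ten papers.
inspector_1 = ["analysis", "merge", "merge", "tool", "analysis",
               "tool", "merge", "analysis", "tool", "merge"]
inspector_2 = ["analysis", "merge", "tool", "tool", "analysis",
               "merge", "merge", "analysis", "tool", "merge"]
print(round(cohens_kappa(inspector_1, inspector_2), 2))  # → 0.7
```

Values above 0.6 are conventionally read as substantial agreement, which is why the 0.64 reached in the final coding round was taken as corroborating the analysis.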

Effect Measures
Since the research questions do not involve identifying outcome metrics used in empirical studies, effect measures were not applied in this systematic mapping study.

Analysis and Summarizing methods
The characteristics of the selected studies were extracted according to the data items and recorded in a shared Excel file that can be revisited at any time. We summarized the data by identifying themes emanating from the identified codes; these themes gave us the categories reported in the Results section. The charts were created from the recorded data using Excel's chart creator, and the tables were likewise built from the recorded data.
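The summarization step, grouping low-level codes under higher-level themes and tallying them, can be sketched as follows. The code-to-theme mapping and the paper codes are hypothetical, standing in for the study's actual coding results:

```python
from collections import Counter

# Hypothetical mapping from low-level codes to higher-level themes.
code_to_theme = {
    "classification": "data-driven methods",
    "prediction": "data-driven methods",
    "merging": "information integration",
    "cross-referencing": "information integration",
}

# Codes assigned to the selected papers (illustrative, not real data).
paper_codes = ["classification", "merging", "prediction",
               "classification", "cross-referencing"]

# Tally papers per theme; these counts would feed the charts and tables.
theme_counts = Counter(code_to_theme[c] for c in paper_codes)
print(theme_counts.most_common())
```

In the study this aggregation was done manually in the shared Excel file, but the logic is the same: each category reported in the Results section is a theme with its count of supporting papers.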

Reporting Bias Assessment
Not relevant for mapping studies.

Certainty Assessment
Not relevant for mapping studies.

Figure 1: The Search and Selection Process

Table 9: SEGRESS checklist for the secondary study methods and our mapping study (continues)

2. On the basis of the discussion held during the first meeting, the inspectors reworked the codes assigned. Then, they took another 20 papers into account and proceeded with a new classification exercise. In this stage, the inspectors mainly attempted to reuse the codes that had previously emerged, yet they were allowed to define new codes whenever needed. At the completion of the task, the two inspectors scheduled another in-person meeting to open a new discussion on their work. The second meeting lasted 1 hour. The Cohen's kappa coefficient scored 0.49 (moderate), hence showing a substantial improvement.

Table 1 :
Inclusion and Exclusion Criteria

Table 2 :
Initial Literature Search by Library

Table 3 :
Initial Literature Search by Library

Table 4: Vulnerability Databases

Table 6 :
Other Information Sources Adopted

Table 7 :
Comparison to Related Systematic Studies