FDA drug labeling: rich resources to facilitate precision medicine, drug safety, and regulatory science

Here, we provide a concise overview of US Food and Drug Administration (FDA) drug labeling, which details drug products, drug–drug interactions, adverse drug reactions (ADRs), and more. Labeling data have been collected over several decades by the FDA and are an important resource for regulatory research and decision making. However, navigating these data is challenging. To aid such navigation, the FDALabel database was developed; it contains approximately 80 000 labeling documents. Its full-text search capability, together with querying based on any combination of specific sections, document types, market categories, market dates, and other labeling information, makes FDALabel a powerful and attractive tool for a variety of applications. Here, we illustrate the utility of FDALabel using case scenarios in pharmacogenomics biomarkers and ADR studies.


Introduction
FDA drug labels contain rich and comprehensive information about drug products, such as disease indications, target populations, drug-drug interactions, and ADRs. The label of a prescription drug is prepared by manufacturers and approved by the FDA and, thus, in its final form, reflects the collective input from regulators, drug manufacturers, and scientific experts. Drug-labeling data have been an important resource for diverse applications, including the support for policy development [1–4], drug discovery and development [5,6], support for pharmacogenomics applications for personalized medicine [7–9], and scientific research [2,3,7,10–13].
Drug labeling is not static: around 400-500 new or updated drug-labeling documents are added every week to the current total of approximately 80 000 structured product labels (SPLs) (www.fda.gov/ForIndustry/DataStandards/StructuredProductLabeling/default.htm). Drug labeling has changed over time because of evolving FDA regulations and has increased in content and length, with a standard format to guide the safe and effective use of drugs [14] (www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm065010.htm). For example, a prescription drug label in the current format is often complex, with over 20 pages of text and tables covering a range of information about the drug product. This rapid pace of change and the complexity in content illustrate the need for an advanced bioinformatics environment with robust and powerful data management and search capabilities to facilitate the application of drug-labeling information.
Here, we describe the FDALabel database developed by the FDA as a web-based application (www.fda.gov/ScienceResearch/BioinformaticsTools/ucm289739.htm). The tool allows access to the most up-to-date drug-labeling data and facilitates their use in regulatory science, drug development, scientific research, and clinical application, such as: (i) enabling easy querying of drug information for research and monitoring of ADRs (e.g., Boxed Warning, drug-induced liver injury (DILI), pharmacogenomics biomarker) to advance pharmacovigilance; (ii) supporting research through integrating drug-labeling data with other drug databases and
Table 1, 'Highlights of Prescribing Information' is a half-page summary of the essential safety and efficacy information for approved human prescription drug and biological products. By contrast, the Full Prescribing Information (FPI) has a total of 17 labeling sections with more detailed content than the 'Highlights'. The additional sections include Drug Abuse and Dependence (Section 9), Overdosage (Section 10), Clinical Pharmacology (Section 12), Nonclinical Toxicology (Section 13), Clinical Studies (Section 14), and Patient Counseling Information (Section 17). The information in each section is structured and specified for certain drug information. Of note, patients receive only a limited portion of the FPI, written specifically for patients (Medication Guides).

Pharmacogenomics data
In January 2013, the guidance for industry on preparing 'Clinical Pharmacogenomics' information for labeling was released [14] (http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM337169.pdf). A new subsection, Pharmacogenomics, was added under Clinical Pharmacology (Section 12). The guidance document indicates that, 'If applicable, a 'Pharmacogenomics' subsection should be included in the CLINICAL PHARMACOLOGY section 12 (e.g., as '12.5 Pharmacogenomics') of the prescribing information (PI) and should include clinically relevant data or information on the effect of genetic variations affecting drug therapy'. Recently, many pharmacogenomics biomarkers have been included in drug labeling (e.g., for the drugs warfarin, boceprevir, Nuedexta®, Brovana®, and pantoprazole) [8], which allows clinicians to direct these drugs to the specific populations most likely to benefit, in line with precision medicine [8,9]. The pharmacogenomics biomarkers found in drug labeling can be categorized as: (i) involved in drug metabolism variability among individuals (e.g., CYP enzymes); (ii) associated with increased risk for adverse events (e.g., G6PD, TPMT, and HLA-B); and (iii) describing the mechanism of action of the drug (e.g., CD30), which likely impacts the effect of the drug on specified patients (www.fda.gov/drugs/scienceresearch/researchareas/pharmacogenetics/ucm083378.htm).
In summary, the drug labeling contains rich information from clinical studies, nonclinical studies, and postmarketing experiences regarding ADRs for pharmacovigilance and pharmacogenomics. At the FDA, drug-labeling development has been a crucial element in the drug review process. Its content can also support diverse research needs. Box 1 summarizes some of the key applications using the drug-labeling data.

FDALabel database
The FDALabel database contains over 80 000 full-text SPLs. The source of FDALabel data is the SPLs of the FDA archived in the FDA Online Labeling Repository (http://labels.fda.gov/) and DailyMed (http://dailymed.nlm.nih.gov/dailymed/index.cfm). FDALabel was implemented as a secure three-tier platform with an Oracle database and is updated quarterly. The database can be accessed through a web-based application. The tool offers a simple query that can be performed intuitively (e.g., full-text search, or product or generic name search, in version 1.0). Importantly, advanced query functions support searching, individually or in combination, by: (i) presence of, or text within, specific sections of the prescribing information; (ii) document types (e.g., Human Rx, Human OTC, Vaccine); (iii) marketing categories (e.g., NDA, ANDA, BLA); (iv) SPL identifiers (e.g., Product NDC Codes and SETIDs); and (v) market start/end date. In addition, the search summary results can be downloaded as a spreadsheet that links to the original SPLs.
Of note, DailyMed is a widely used resource for drug labeling and provides SPL data. However, FDALabel offers many unique functions (e.g., full-text searching and querying based on any combination of specific drug fields and sections) that make it a more powerful, user-friendly, and attractive tool (see Table S1 in the supplemental information online for comparisons between FDALabel and DailyMed). For example, we searched for the keywords 'acute liver failure' in 'Full Text Query', which resulted in 757 labeling hits. We also searched for the same keywords within Boxed Warning using 'Section Present', which resulted in 556 labeling hits. When we added NDA as a filter from 'Marketing Categories', the same two queries resulted in 53 hits and 23 hits, respectively. Thus, a large number of duplications in drug labeling can be easily removed by using NDA as a filter in FDALabel, which is not readily available in DailyMed. The unique functions of the FDALabel database enable the drug-labeling content to be more easily accessed by researchers for ADR study, FDA medical officers for drug review, pharmaceutical companies for drug development and repositioning, and physicians and consumers for drug safety information. Google Analytics has shown that the number of users has increased greatly since the database opened for public access in 2012 (Fig. S1 in the supplemental information online).
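The way such filters narrow a result set can be illustrated with a minimal Python sketch. It is not the FDALabel implementation (which the text describes only as a web application on an Oracle database); the `Label` record, the `search` function, and the four toy labels are hypothetical and invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    """Hypothetical, simplified stand-in for one structured product label."""
    set_id: str
    marketing_category: str                        # e.g., "NDA", "ANDA", "BLA"
    sections: dict = field(default_factory=dict)   # section name -> section text

    def full_text(self):
        return " ".join(self.sections.values())


def search(labels, phrase, section=None, marketing_category=None):
    """Return labels containing `phrase`, optionally limited to one
    section and/or one marketing category (filters combine with AND)."""
    phrase = phrase.lower()
    hits = []
    for label in labels:
        if marketing_category and label.marketing_category != marketing_category:
            continue
        text = label.sections.get(section, "") if section else label.full_text()
        if phrase in text.lower():
            hits.append(label)
    return hits


# Toy records, invented for illustration only.
LABELS = [
    Label("a1", "NDA", {"Boxed Warning": "Acute liver failure has been reported."}),
    Label("a2", "ANDA", {"Boxed Warning": "Acute liver failure has been reported."}),
    Label("b1", "NDA", {"Adverse Reactions": "Rare cases of acute liver failure."}),
    Label("c1", "NDA", {"Adverse Reactions": "Headache and nausea."}),
]

full_text_hits = search(LABELS, "acute liver failure")
boxed_hits = search(LABELS, "acute liver failure", section="Boxed Warning")
boxed_nda_hits = search(LABELS, "acute liver failure",
                        section="Boxed Warning", marketing_category="NDA")
print(len(full_text_hits), len(boxed_hits), len(boxed_nda_hits))
```

In this toy data, labels a1 and a2 carry the same Boxed Warning text under different marketing categories, so restricting to NDA keeps only one of them, mirroring the de-duplication effect described above.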

Pharmacogenomics biomarkers
Some pharmacogenomics biomarkers are associated with ADRs. We queried five ADR-related biomarkers (i.e., G6PD, TPMT, DPD, HLA-B*1502, and HLA-B*5701) in FDALabel (Table S2 in the supplemental information online) and built a network visualization of the drug-ADR relations via these biomarkers (Fig. 1). The results illustrated that patients who carry the HLA-B*5701 allele are at high risk for experiencing a hypersensitivity reaction (HR) to abacavir, while patients who carry the HLA-B*1502 allele are at a high risk of HR to carbamazepine, which could lead to Stevens-Johnson syndrome, a severe ADR mentioned in the Boxed Warning section [17,18].
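A drug-biomarker-ADR network of this kind can be represented as a small graph. The Python sketch below is an illustration under stated assumptions, not the method used to produce Fig. 1: it encodes only the two relations named in the text (abacavir with HLA-B*5701, carbamazepine with HLA-B*1502), and the function names `build_network` and `reactions_for_drug` are hypothetical.

```python
from collections import defaultdict

# (drug, biomarker, adverse reaction) triples for the two examples in the text.
TRIPLES = [
    ("abacavir", "HLA-B*5701", "hypersensitivity reaction"),
    ("carbamazepine", "HLA-B*1502", "hypersensitivity reaction"),
    ("carbamazepine", "HLA-B*1502", "Stevens-Johnson syndrome"),
]


def build_network(triples):
    """Build an undirected graph drug -- biomarker -- adverse reaction.

    Returns (edges, kind): adjacency sets and a node -> node-type map.
    """
    edges = defaultdict(set)
    kind = {}
    for drug, biomarker, reaction in triples:
        kind[drug] = "drug"
        kind[biomarker] = "biomarker"
        kind[reaction] = "adverse reaction"
        edges[drug].add(biomarker)
        edges[biomarker].update((drug, reaction))
        edges[reaction].add(biomarker)
    return edges, kind


def reactions_for_drug(edges, kind, drug):
    """Adverse reactions reachable from a drug through any of its biomarkers."""
    return sorted(
        node
        for biomarker in edges[drug]
        for node in edges[biomarker]
        if kind[node] == "adverse reaction"
    )


edges, kind = build_network(TRIPLES)
print(reactions_for_drug(edges, kind, "carbamazepine"))
```

Keeping the node type in a separate map makes it straightforward to color nodes by type (drug, biomarker, adverse reaction) when the graph is handed to a plotting library.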

ADR study using MedDRA standard
Medical Dictionary for Regulatory Activities (MedDRA; www.meddra.org/) is widely used in the USA, European Union, and Japan for ADR reporting. Extracting MedDRA standard terms from drug labeling will allow researchers, regulators, and healthcare professionals to better understand the trends and frequencies of adverse events for drugs in the current markets [6]. There are five term levels in MedDRA, from lowest to highest: LLT (Lowest Level Term), PT (Preferred Term), HLT (High-Level Term), HLGT (High-Level Group Term), and SOC (System Organ Class), with PTs the most commonly used for ADR study. Our mapping showed that, out of a total of 74 229 MedDRA (version 18.0) LLTs, 11 847 LLTs (about 16%) have appeared in FDA-approved prescription drug labeling, which, in turn, identified 6161 PTs (about 29% of the 21 345 PTs in MedDRA). The top ten labeling sections that contain the most PTs are plotted by counts (Fig. 2).

Example applications using drug labeling data

Drug interactions
Drug labeling has a specific section to summarize the findings relating to drug-drug interactions and their associated adverse events in drug application. Some specific questions can be queried against the labeling data, such as which drugs used to treat HIV (in the Indications and Usage Section) are known to interact with methadone and which drugs will interact with disulfiram (in the Drug Interactions Section).

Drug classification
There are several ways to classify drugs; each one has its intended application (e.g., clinical application, mechanistic study, chemical structure, etc.). For example, one can ask which drugs share the same pharmacologic class, such as kinase inhibitors, HIV protease inhibitors, or beta-adrenergic blockers. The drug-labeling indexing provides classification based on Established Pharmacologic Class (EPC), Mode-of-Action (MoA), Physiologic Effect (PE), and Chemical Ingredient by structure (CI). These classification schemes facilitate the study of drug class effects and evidence-based justification for making a labeling change to a drug class during the review process.

Adverse events
Three sections (Boxed Warning, Warnings and Precautions, and Adverse Reactions) summarize drug-related adverse events and have been widely applied in pharmacovigilance and drug safety research. Although a standard vocabulary for describing adverse events is not mandatory, most adverse event terms follow a standard terminology (such as SNOMED or MedDRA), which facilitates the study of the adverse events data in the drug labeling.

Precision medicine
A large number of pharmacogenomics biomarkers are included in drug labeling. These biomarkers are likely to impact the effectiveness and adverse events for patients from specified subpopulations taking the drugs. Thus, the information facilitates the identification of new trends and frequency of genetic variability associated with increased risks to public health, which is an important goal in precision medicine.
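As an illustration of how a standard terminology supports such studies, the following Python sketch scans section text for lowest-level terms (LLTs) and rolls them up to MedDRA preferred terms (PTs). It is a simplified sketch, not the mapping procedure used by the authors: the `LLT_TO_PT` table is a toy example whose entries are illustrative rather than copied from a MedDRA release, and `find_preferred_terms` is a hypothetical helper.

```python
import re

# Toy LLT -> PT table; entries are illustrative, not taken from a MedDRA release.
LLT_TO_PT = {
    "liver failure": "Hepatic failure",
    "hepatic failure": "Hepatic failure",
    "feeling queasy": "Nausea",
    "nausea": "Nausea",
    "skin rash": "Rash",
}


def find_preferred_terms(section_text, llt_to_pt=LLT_TO_PT):
    """Return {PT: [matched LLTs]} for each LLT found as a whole phrase."""
    text = section_text.lower()
    found = {}
    for llt, pt in llt_to_pt.items():
        # Word boundaries avoid matching an LLT inside a longer word.
        if re.search(r"\b" + re.escape(llt) + r"\b", text):
            found.setdefault(pt, []).append(llt)
    return found


adverse_reactions = (
    "Cases of liver failure were reported; nausea and skin rash were common."
)
print(find_preferred_terms(adverse_reactions))
```

Counting the distinct PTs returned for each labeling section is one simple way to obtain per-section term counts; a production mapping would also need to handle spelling variants and negated mentions, which this sketch ignores.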
have been combined into the Warnings and Precautions section in the new drug-labeling format in PLRs.

Drug-induced liver injury study
We utilized drug-labeling data to study DILI [17], drug safety [19], and drug repositioning [5]. For example, we developed a systematic annotation method using drug-labeling information to annotate the potential of a drug for DILI. Specifically, a combination of keywords about DILI, which reflected not only different types, but also various severity levels of DILI, was used to search against the drug-labeling database. The study enabled the relevant DILI information to be extracted from three labeling sections (Boxed Warning, Warnings and Precautions, and Adverse Reactions) with a DILI classification scheme to define a benchmark DILI data set that is widely used as a model in DILI study [20,21].
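The keyword-based annotation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the published annotation scheme: the keyword list, the numeric severity ranks, and the function `annotate_dili` are hypothetical; only the three section names come from the text.

```python
# Hypothetical keywords with severity ranks (higher = more severe);
# these are placeholders, not the keywords used in the published study.
DILI_KEYWORDS = {
    "acute liver failure": 3,
    "hepatotoxicity": 2,
    "jaundice": 2,
    "elevated transaminases": 1,
}

# The three labeling sections named in the text.
SECTIONS = ("Boxed Warning", "Warnings and Precautions", "Adverse Reactions")


def annotate_dili(label_sections):
    """Return (section, keyword, severity) hits, most severe first.

    `label_sections` maps section names to section text; sections that
    are missing from a label are treated as empty.
    """
    hits = []
    for section in SECTIONS:
        text = label_sections.get(section, "").lower()
        for keyword, severity in DILI_KEYWORDS.items():
            if keyword in text:
                hits.append((section, keyword, severity))
    return sorted(hits, key=lambda hit: -hit[2])


# Toy label text, invented for illustration only.
example = {
    "Boxed Warning": "Acute liver failure, at times fatal, has been reported.",
    "Adverse Reactions": "Elevated transaminases occurred in some patients.",
}
for hit in annotate_dili(example):
    print(hit)
```

Recording the section alongside each keyword matters because the same term carries different weight in a Boxed Warning than in Adverse Reactions; a classification scheme can then combine section and severity into a single DILI category.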

Concluding remarks and future perspectives
FDA drug labeling has accumulated over the past 40 years or so [since the Federal Register of June 26, 1979 (44 FR 37434)] and is an integral part of the FDA review process. In addition, many guidance documents have been issued by the FDA to facilitate its application, such as for drug discovery and development. Similar labeling resources have also been developed around the world, such as in Europe and Japan. Given the recent implementation of data standards and rapid advancement of information technology, drug-labeling data have grown tremendously, truly becoming regulatory big data for knowledge discovery and drug-centric research to improve public health. To fully utilize these regulatory big data, powerful tools and databases with flexible functions are crucial. FDALabel is one such tool, developed by the FDA to specifically support regulatory science. The tool fills the gap in the FDA where large amounts of drug information are available, but few tools are available to take advantage of them.

Figure 1. Five selected pharmacogenomics biomarkers and their associated adverse effects and drugs. A network visualization illustrates the relation among drugs (blue), biomarkers (green), and associated adverse effects (red) based on the information retrieved from FDALabel. For example, patients who carry the HLA-B*1502 allele are at a high risk for experiencing a hypersensitivity reaction (HR: such as Stevens-Johnson syndrome) to carbamazepine.

The drug-labeling resource is not the only one related to drugs within the FDA. Furthermore, many drug-centric databases and data standards have also been developed by the research community. However, these databases, including those developed by the FDA, are often not easily available to inform the FDA review process and drug safety research. We intend to address this challenge by expanding FDALabel and integrating it with the contents of multiple disparate databases to provide comprehensive access to drug-related information.
The expanded FDALabel will make the data accessible in a way that is useful and focused on the questions asked by reviewers and researchers to discover knowledge and fill knowledge gaps. At the time of writing, the information and databases being evaluated for integration were: (i) Drugs@FDA, which provides drug approval history; (ii) the FDA Orange Book (Approved Drug Products with Therapeutic Equivalence Evaluations), which provides patent-related information; (iii) 'Pharmacological Class Indexing' from SPLs, which will enable searches for ADRs across the products in a drug class (drugs from the same pharmacological class often share similar efficacy and safety profiles); (iv) MedDRA, which provides standard terminology for international clinical results used by regulatory authorities and the pharmaceutical industry; (v) RxNorm, which provides normalized names for clinical drugs and links their names to many of the drug vocabularies and databases commonly used; (vi) FAERS, which is the Adverse Event Reporting System of the FDA for drug products; and (vii) Substance Registration System-Unique Ingredient Identifier (SRS-UNII), which provides unique chemical substance information and structure for drugs. Furthermore, we will implement more options for flexible access to this integrated information, such as searches by drug class, chemical structure, topic, and so on. Our ultimate goal is to provide publicly available, rich, accurate, and complete information that facilitates transparent knowledge exchange among the public, pharmaceutical companies, and government regulatory agencies.

Disclaimer
The FDALabel database is not compatible with Internet Explorer. We suggest that users use Firefox or Google Chrome as Internet browsers. FDALabel is not a diagnostic tool and is not intended to inform regarding choice of medicines or therapies for medical conditions. The views presented in this article do not necessarily reflect current or future opinion or policy of the U.S. Food and Drug Administration. Any mention of commercial products is for clarification and not intended as endorsement.