Supervised Relation Extraction Between Suicide-Related Entities and Drugs: Development and Usability Study of an Annotated PubMed Corpus

Background: Drug-induced suicide has been debated as a crucial issue in both clinical and public health research. Published research articles contain valuable data on the drugs associated with suicidal adverse events. An automated process that extracts such information and rapidly detects drugs related to suicide risk is essential but has not been well established. Moreover, few data sets are available for training and validating classification models on drug-induced suicide.

Objective: This study aimed to build a corpus of drug-suicide relations containing annotated entities for drugs, suicidal adverse events, and their relations. To confirm the effectiveness of the drug-suicide relation corpus, we evaluated the performance of a relation classification model using the corpus in conjunction with various embeddings.

Methods: We collected the abstracts and titles of research articles associated with drugs and suicide from PubMed and manually annotated them along with their relations at the sentence level (adverse drug events, treatment, suicide means, or miscellaneous). To reduce the manual annotation effort, we preliminarily selected sentences with a pretrained zero-shot classifier or sentences containing only drug and suicide keywords. We trained a relation classification model using various Bidirectional Encoder Representations from Transformers (BERT) embeddings with the proposed corpus. We then compared the performance of the model across the different BERT-based embeddings and selected the most suitable embedding for our corpus.

Results: Our corpus comprised 11,894 sentences extracted from the titles and abstracts of the PubMed research articles. Each sentence was annotated with drug and suicide entities and the relationship between these 2 entities (adverse drug events, treatment, means, and miscellaneous). All of the tested relation classification models that were fine-tuned on the corpus accurately detected sentences of suicidal adverse events regardless of their pretrained type and data set properties.

Conclusions: To our knowledge, this is the first and most extensive corpus of drug-suicide relations.


Introduction
The DSR (Drug-Suicide Relations) PubMed corpus is an annotated corpus of drug entities (DE), suicide-related entities (SE), and the relation classes between them. The corpus consists of titles and sentences from PubMed abstracts.
Each item in the corpus is a single sentence, so the corpus supports only intra-sentential relations: all the evidence for a relation must appear in the same sentence.
For more information about the corpus constituents and the data collection procedure, please refer to the original paper.
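
As a rough illustration of what one annotated item could look like, the sketch below models a single sentence with one drug entity, one suicide-related entity, and one relation class. The four relation classes come from the abstract; everything else (the names `Relation`, `Span`, `DsrRecord`, `make_record`, and the use of character offsets) is an assumption made for this example and is not the corpus's actual file format.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # The four relation classes named in the abstract; identifiers are illustrative.
    ADVERSE_DRUG_EVENT = "adverse drug event"
    TREATMENT = "treatment"
    SUICIDE_MEANS = "suicide means"
    MISCELLANEOUS = "miscellaneous"


@dataclass(frozen=True)
class Span:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive

    def text(self, sentence: str) -> str:
        return sentence[self.start:self.end]


@dataclass(frozen=True)
class DsrRecord:
    sentence: str
    drug: Span      # DE
    suicide: Span   # SE
    relation: Relation

    def __post_init__(self) -> None:
        # Both entities must lie inside the same sentence (intra-sentential only).
        for span in (self.drug, self.suicide):
            if not (0 <= span.start < span.end <= len(self.sentence)):
                raise ValueError("entity span must lie inside the sentence")


def make_record(sentence: str, drug_text: str, suicide_text: str,
                relation: Relation) -> DsrRecord:
    """Build a record from surface strings (hypothetical helper).

    str.index raises ValueError when a string is absent, so an entity
    that does not occur in the sentence is rejected.
    """
    d = sentence.index(drug_text)
    s = sentence.index(suicide_text)
    return DsrRecord(
        sentence,
        Span(d, d + len(drug_text)),
        Span(s, s + len(suicide_text)),
        relation,
    )


# The relation label chosen here is only for illustration.
record = make_record(
    "The use of paroxetine may increase suicidal behavior and suicidal ideation.",
    "paroxetine",
    "suicidal ideation",
    Relation.ADVERSE_DRUG_EVENT,
)
print(record.drug.text(record.sentence), "->", record.relation.value)
```

Storing offsets rather than plain strings keeps each label tied to one position in the sentence, which matters when the same word occurs more than once.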

Entities
An entity can be a single word or a phrase that appears in the annotated text. These guidelines focus on two types of entities: drug entities and suicidal entities. This document describes the guidelines followed when annotating drug entities, suicidal entities, and their relations.

Drug
Drug entities are pre-annotated in the text by an automatic NER model (Med7 [1]). The model also recognized enzymes, amino acids, hormones, raticides, veterinary drugs, herbicides, and pesticides as drug names. The corrections made by annotators after NER are listed below.

Sentence: Findings suggest that differences exist based on demographic variables (gender, age, race, and sexual orientation), lifetime drug use (inhalants, Valium™, crack cocaine, alcohol, Coricidin™, and morphine), recent drug use (alcohol, ecstasy, heroin, and methamphetamine), mental health variables (suicide attempts, familial history of substance use, and having been in substance abuse treatment), and health variables (sharing needles and having been tested for HIV).
Med7 DE: Valium™

2. A missing part of entities was filled in.

Sentence: Massive ingestion of isosorbide-5-mononitrate and nitroglycerin: suicide attempt by an adolescent girl without previous heart disease.

Sentence: Deaths including those from natural causes, toxicity, accident and suicides with positive forensic toxicology analyses for methamphetamine and its metabolite amphetamine in postmortem samples were investigated.
Med7 DE: amphetamine methamphetamine methamphetamine

5. If a drug and its metabolites are recognized together, the drug is excluded.

Sentence: Deaths including those from natural causes, toxicity, accident and suicides with positive forensic toxicology analyses for methamphetamine and its metabolite amphetamine in postmortem samples were investigated.
Med7 DE: amphetamine methamphetamine methamphetamine

Suicidal Entity
For suicidal entities, mentions of suicide-related events, tendencies, and behaviors, including suicide risk, suicide attempts, completed suicide, and suicidal ideation or behavior disorders, were annotated. The entities are nominals (nouns or noun phrases) related to suicide. Controversial cases are elaborated below.
1. Expressions such as "risk" are not included in a label if they appear before the suicide-related word or phrase.

Sentence: To compare the risk of suicide in adults using the antidepressant venlafaxine compared with citalopram, fluoxetine, and dothiepin.
SE: suicide

Sentence: In treatment responsive BD patients, lithium (Li) stabilizes mood and reduces suicide risk.
SE: suicide risk

2. In case of concatenation by "/", "or", or "and", separate the labels only if the meaning remains the same after separation. If the meaning changes after separation, the labels are not separated and the conjunctions are included in the label.

Sentence: The use of paroxetine may increase suicidal behavior and suicidal ideation.
SE: suicidal ideation; suicidal behavior

Sentence: Associations of Z-drugs, trazodone, and sedative benzodiazepines (temazepam, triazolam, flurazepam) with suicidal ideation, planning, and attempts were estimated using binomial logistic regression.
SE: suicidal ideation, planning, and attempts

3. Negated entities were annotated as well.

Sentence: Moreover, convincing evidence exists that lithium has added value and benefit for its unique anti-suicidal effects as well as reducing mortality by other causes.
SE: anti-suicidal effects

4. Abbreviations are annotated as well. When an abbreviation appears next to the full expression, the abbreviation is annotated together with the full expression.

Sentence: After adjusting for demographic characteristics, tobacco use, family history of suicide and depression, both SI and SA were positively associated with AUD, cannabis and cocaine use.
SE: SI; SA

Sentence: Suicidal ideation (SI) is often cited as a reason to exclude patients from interferon-based treatment or to terminate antiviral treatment that is in progress.
SE: Suicidal ideation (SI)

5. Expressions that do not contain a stemmed version of "suicide" ('suicid') but suggest suicidal action in the form of self-harm, self-poisoning, or fatal intent are annotated as well.

Sentence: Although anecdotal, our case report points toward safety of pregabalin following deliberate self-poisoning.
SE: self-poisoning
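
The stem criterion in rule 5 can be sketched as a simple lexical check. This is not the authors' procedure (the annotation was manual); it is only a minimal filter, under stated assumptions, that flags phrases an annotator might then review. The function name `is_candidate_se` and the cue list `SELF_HARM_CUES` are hypothetical, and the list holds just the two expressions named in the rule.

```python
import re

# The stem given in rule 5; matching is case-insensitive.
SUICIDE_STEM = re.compile(r"suicid", re.IGNORECASE)

# Hypothetical cue list: only the two expressions named in the guideline.
SELF_HARM_CUES = ("self-harm", "self-poisoning")


def is_candidate_se(phrase: str) -> bool:
    """Return True if the phrase contains the stem 'suicid' or a listed cue."""
    lowered = phrase.lower()
    if SUICIDE_STEM.search(lowered):
        return True
    return any(cue in lowered for cue in SELF_HARM_CUES)


for phrase in ("anti-suicidal effects", "self-poisoning", "depression"):
    print(phrase, is_candidate_se(phrase))
```

A check like this misses standalone abbreviations such as SI or SA (rule 4) and cannot judge "fatal intent", which has no fixed wording, so those cases still depend on the annotator.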