Acquiring Short Scripts and Setting a Case Frame in Each Acquired Script: Toward Random Story Generation

The integrated narrative generation system (INGS), which the authors have developed, generates a story and translates the story into a surface representation. In the story generation process, the INGS uses narrative knowledge that was automatically acquired from existing narrative works. This paper presents a method to acquire short scripts, which are a kind of narrative knowledge, from existing works in Aozora Bunko for story generation, and a mechanism to generate random story-like event sequences by using the 23,751,142 bigram scripts acquired with the proposed method. The authors aim to use the scripts generated by the method as a first set to be revised through a subsequent learning process. This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).


INTRODUCTION
The integrated narrative generation system (INGS) [1] is a system that generates a narrative automatically. The system not only generates a story but also translates the story into a surface representation. In story generation, the INGS uses knowledge about story structure and dictionaries [2] that systematically store nouns, verbs, and so on. The authors call the knowledge stored in the knowledge bases of the INGS story content knowledge, and call the dictionaries conceptual dictionaries. The variety of generated stories depends on the scale of the knowledge bases. If a knowledge base is small, story generation by the INGS fails frequently. However, expanding the knowledge bases manually has limitations. Therefore, the authors have attempted to acquire knowledge automatically [3]. This paper details an attempt to acquire short scripts, based on bigrams acquired from existing narrative works, for story generation.
A script, as Schank [4] described it, encodes knowledge about human actions in a particular situation. For example, a restaurant script describes the procedures for ordering and eating dinner at a restaurant. We attempt not only to acquire Schank-style scripts but also to acquire temporal sequences of events.
The INGS consists of two parts: mechanisms and knowledge. Knowledge refers to dictionaries for concepts and language notation [1] and to knowledge bases that build partial structures of a story (e.g., the narrative content knowledge base). By using this knowledge, the mechanisms generate the various aspects of a narrative, such as the story, the discourse, and the surface representation (i.e., sentences, music, or images). Figure 1 shows a story structure generated in the INGS. The structure is hierarchical and consists of states, events, and relations. A state describes static information regarding characters, things, locations, and times in a story. An event describes dynamic information that represents the difference between two states. A relation links several events by some relationship, such as a causal relationship.
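The hierarchical structure described above can be sketched as a small tree of relations, events, and states. The class names and the sample story below are our illustration only; the paper does not specify the INGS data structures at code level.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class State:
    # Static information: characters, things, locations, and times
    description: str

@dataclass
class Event:
    # Dynamic information: the difference between two states
    verb: str
    before: State
    after: State

@dataclass
class Relation:
    # Links several events (or nested relations) by a relationship,
    # such as a causal relationship
    kind: str
    children: List[Union["Relation", Event]] = field(default_factory=list)

# A one-relation, one-event toy story for illustration
hungry = State("the character is hungry")
full = State("the character is full")
story = Relation("causal", [Event("eat", before=hungry, after=full)])
assert story.children[0].after.description == "the character is full"
```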

METHOD FOR STORY GENERATION IN THE INGS
Story generation in the INGS is the expansion and transformation of a story structure using story techniques. A story technique is a formal procedure for expanding the structure of a story and draws on a knowledge base that corresponds to the particular technique. Figure 2 shows an example of using a story technique; the technique selects applicable knowledge and expands the structure of the story by using that knowledge. Figure 3 shows an example of automatically acquired knowledge. This knowledge is applied to an event that includes "eat," and the story technique adds an event that includes "drink" to the structure of the story. (&v age1) and (&v obj1) are variables held by the cases. A variable denotes the position where a generated character, thing, or location is inserted, subject to the restrictions. Cases that have the same variable also have the same character, thing, or location.
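The variable-sharing constraint above can be sketched as follows. The variable syntax mimics the (&v age1) notation from Figure 3, but the `instantiate` function and the sample bindings are our illustration, not INGS code.

```python
# Hypothetical sketch: fill case variables such as "&v age1" in an
# acquired knowledge pattern. Cases sharing a variable receive the
# same filler, mirroring the constraint described in the text.
def instantiate(pattern, bindings):
    """Replace every "&v ..." variable with its bound filler."""
    filled = []
    for verb, cases in pattern:
        filled.append((verb, {case: bindings.get(var, var)
                              for case, var in cases.items()}))
    return filled

# The "eat" event and the added "drink" event share the same agent variable.
pattern = [("eat",   {"agent": "&v age1", "object": "&v obj1"}),
           ("drink", {"agent": "&v age1", "object": "&v obj2"})]
bindings = {"&v age1": "Taro", "&v obj1": "rice", "&v obj2": "tea"}
result = instantiate(pattern, bindings)
# Both events received the same agent, because they share "&v age1"
assert result[0][1]["agent"] == result[1][1]["agent"] == "Taro"
```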
Potential selection points in story generation mainly involve the "application point of the story technique," "applying a story technique," and "using knowledge." The "application point of the story technique" and "applying a story technique" steps can be controlled based on a parameter that is input into the INGS. For example, if the "length" parameter is increased, a story technique that increases the number of events within the story is selected.

Received 3 June 2018; Accepted 30 September 2018

Keywords: case frame; integrated narrative generation system; knowledge acquisition; random story generation; script; verb concept

ACQUIRING SHORT SCRIPTS FROM EXISTING WORKS
The mechanism for acquiring short scripts has three steps: acquiring bigrams, making short scripts, and setting a case frame. This section explains each of the three steps.

Acquiring Bigrams from Existing Text
Bigrams that consist of two verbs are acquired by applying morphological analysis to existing text. An acquired bigram is called a verb bigram. For example, verb bigrams are acquired as shown in Figure 4. The procedure for acquiring bigrams is as follows: (i) Morphological analysis: the mechanism analyzes a text using a morphological analyzer. (ii) Extracting verbs: the mechanism extracts the verbs from the results of the morphological analysis. (iii) Making bigrams: the mechanism pairs adjacent extracted verbs into bigrams. (iv) Removing a part of the bigrams: the mechanism removes bigrams that include verbs that are not stored in the conceptual dictionary, because the INGS cannot use such bigrams in story generation.
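The steps above can be sketched as a short function. The toy part-of-speech-tagged input stands in for the output of a real morphological analyzer such as MeCab, which the paper actually uses; the function name and token format are our assumptions.

```python
# Sketch of the bigram-acquisition steps, with pre-tagged tokens
# standing in for real morphological-analysis output.
def extract_verb_bigrams(tagged_tokens, dictionary_verbs):
    """tagged_tokens: (surface, pos) pairs from morphological analysis."""
    # (ii) keep only the verbs, in order of appearance
    verbs = [w for w, pos in tagged_tokens if pos == "verb"]
    # (iii) pair adjacent verbs into bigrams
    bigrams = list(zip(verbs, verbs[1:]))
    # (iv) drop bigrams containing verbs missing from the conceptual
    # dictionary, since the INGS cannot use them in story generation
    return [(x, y) for x, y in bigrams
            if x in dictionary_verbs and y in dictionary_verbs]

tokens = [("Taro", "noun"), ("walk", "verb"), ("eat", "verb"),
          ("then", "adv"), ("sleep", "verb"), ("teleport", "verb")]
result = extract_verb_bigrams(tokens, {"walk", "eat", "sleep"})
assert result == [("walk", "eat"), ("eat", "sleep")]
```

Note that ("sleep", "teleport") is discarded in step (iv) because "teleport" is not in the toy dictionary.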

Making Short Scripts
The acquired verb bigrams are made into short scripts. This step connects verb concepts to the verbs that are included in the verb bigrams. Verb concepts are assigned numbers in the INGS because the INGS distinguishes the plural meanings of a verb. For example, the verb concept "食べる1" has a meaning like "earn," while the verb concept "食べる2" has a meaning like "eat".
Short scripts (Xn × Yk) were created from a verb bigram (X × Y) [(Xn × Yk) consists of verb concepts Xn and Yk; (X × Y) consists of verb X and verb Y]. In this case, if verb X has i kinds of meaning (X1, ..., Xi) and verb Y has j kinds of meaning (Y1, ..., Yj), the number of short scripts is equal to i multiplied by j. In making short scripts, we need to consider the meaning of a verb. However, in this paper, the authors created all patterns of (Xn × Yk) with the procedure shown in Figure 5.
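The i × j expansion can be sketched as a Cartesian product over the numbered senses of each verb. The sense inventory below is a toy example; in the INGS the senses come from the verb conceptual dictionary.

```python
from itertools import product

# Sketch: expand a verb bigram (X, Y) into all short scripts (Xn, Yk)
# over the numbered senses of each verb, as in the paper's example
# where 食べる1 ("earn") and 食べる2 ("eat") are distinct concepts.
def make_short_scripts(bigram, senses):
    x, y = bigram
    return list(product(senses[x], senses[y]))

senses = {"食べる": ["食べる1", "食べる2"], "飲む": ["飲む1"]}
scripts = make_short_scripts(("食べる", "飲む"), senses)
# i = 2 senses of X, j = 1 sense of Y, so i * j = 2 short scripts
assert scripts == [("食べる1", "飲む1"), ("食べる2", "飲む1")]
```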

Setting a Case Frame
By setting a case frame in a short script, the INGS can generate a story using the short script. Figure 6 shows the procedure for setting the case frame in a short script (Xn × Yk). A case frame consists of the cases that a verb requires. For example, a verb concept is described as shown in Figure 7; there, the elements of "caseframe" constitute the case frame. A case frame has restrictions based on the noun conceptual dictionary. The restrictions define the noun concepts that can fill the cases included in a verb concept.
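The restriction check can be sketched as a lookup in a noun-concept hierarchy: a filler is acceptable if the restriction node is among its ancestors. The toy hierarchy and the case frame below are invented for illustration; the INGS draws them from its conceptual dictionaries.

```python
# Toy noun-concept hierarchy: child -> parent
noun_hierarchy = {
    "rice": "food", "tea": "drink", "food": "object",
    "drink": "object", "Taro": "person", "person": "agent-like",
}

def is_a(concept, ancestor):
    """True if `ancestor` is on the path from `concept` to the root."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = noun_hierarchy.get(concept)
    return False

# Hypothetical case frame for an "eat" concept: the agent must be a
# person and the object must be food.
caseframe = {"agent": "person", "object": "food"}

def satisfies(frame, fillers):
    """Check every case filler against its restriction."""
    return all(is_a(fillers[c], r) for c, r in frame.items())

assert satisfies(caseframe, {"agent": "Taro", "object": "rice"})
assert not satisfies(caseframe, {"agent": "Taro", "object": "tea"})
```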

RESULT
In this paper, the authors acquired short scripts from 13,331 works in Aozora Bunko [5]. The acquisition mechanism used MeCab (http://taku910.github.io/mecab/) for the morphological analysis. We acquired short scripts from each work; in analyzing each work, we did not consider structures such as paragraphs. Table 1 shows the numbers of acquired verb bigrams and short scripts.

USING SHORT SCRIPTS FOR STORY GENERATION
The INGS uses techniques that allow a generated story to converge on one result based on several parameters that are set beforehand. Specifically, the INGS performs story generation by using selection rules at several points of the generation process. The authors have adopted random generation in order to change a narrative gradually. A summary of random narrative generation is presented in Ogata et al. [6]. In this section, a generated story is presented.
Currently, the authors have not applied parameters, in order to evaluate whether the acquired knowledge alone suffices for random story generation. This method accomplishes generation by applying the various story techniques that the INGS has available, using the automatically acquired scripts stored in its knowledge base. Figure 8 shows how a generated story is transformed into natural language sentences by the sentence generation mechanism of the INGS. Although the authors have prior experience with random generation, previous attempts to generate a story through random methods failed owing to the lack of a knowledge base.
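Random generation from bigram scripts can be sketched as repeatedly choosing a script whose first verb concept matches the last event generated so far. The five-script set below is a toy stand-in for the 23,751,142 acquired scripts, and the chaining strategy is our simplified reading of the mechanism, not the INGS implementation itself.

```python
import random

# Toy script set: each pair is a bigram short script (first concept,
# second concept), standing in for the automatically acquired scripts.
scripts = {("wake", "eat"), ("eat", "walk"), ("walk", "sleep"),
           ("eat", "drink"), ("drink", "walk")}

def generate(start, length, rng):
    """Randomly chain scripts into a story-like event sequence."""
    events = [start]
    while len(events) < length:
        # candidate continuations: scripts starting with the last event
        nexts = sorted(y for x, y in scripts if x == events[-1])
        if not nexts:
            break  # no script continues this event
        events.append(rng.choice(nexts))
    return events

story = generate("wake", 5, random.Random(0))
assert story[0] == "wake" and len(story) <= 5
# every adjacent pair in the result is a known script
for x, y in zip(story, story[1:]):
    assert (x, y) in scripts
```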

CONCLUSION
This paper showed short script acquisition from existing works. The authors acquired 23,751,142 short scripts from existing works stored in Aozora Bunko. The acquired short scripts were used for story generation in the INGS. In future work, the authors will acquire longer scripts. In addition, they will attempt to acquire scripts semantically, because the results in this paper did not consider the semantic structure of a story.
The authors' goal is the random generation of a story. We presented story generation using knowledge that was obtained automatically. Based on the discussion above, a method to change narratives and to set parameters for narrative generation was presented.
For future research, the aim is to increase automation by using learning mechanisms. For example, we propose a generation strategy composed of preliminary story generation, evaluation of the story, and conjecture of elements, in which operating parameters are not input manually but are instead selected automatically. Through learning operations at some selection points, generation results improve in a bottom-up process. A second aim is maintaining the consistency of a story during the narrative generation process; although this is very difficult, it remains our goal.