A bottom-up method for remixing narratives for virtual heritage experiences

Considering the impacts COVID-19 has had on travel and many economies, developing virtual experiences that are well-received by different publics has become even more prominent. This paper shows how a multimodal discourse analysis can be used as a bottom-up approach to identifying narrative themes that can be used in virtual experiences for cultural heritage sites. A case study on 11 UNESCO World Heritage Australian Convict Sites shows how diverse sources of user-generated content, tourism marketing materials and historical information can be analysed and then remixed into a virtual tour of the sites in the form of an interactive web documentary (iDoc). Although this case study involved a total of seven narrative development phases, this paper focuses on two phases, namely how the user model and content model were determined. These models were later used to develop the resulting iDoc prototype. The user model focused on the prospective audience of cultural heritage tourists, and a content model of narrative themes for the iDoc was developed through a multimodal discourse analysis. This bottom-up approach of analysing existing cultural data allows for the discovery of the prospective audiences’ interests as well as narrative themes that can be included in virtual heritage experiences. It also provides a new creative methodology that can prevent issues that may arise with top-down narratives that focus too heavily on one institutional perspective or national narrative and lack direct engagement with or understanding of today’s publics.


Introduction
As new media technologies have emerged and become more accessible, cultural heritage and memory institutions are increasingly developing virtual visitor experiences for both local and industries and users. The Internet presents a digital world that is wide open to interpretation: it allows for collaborative production without 'official' or sanctioned institutions (i.e. gatekeepers) to make meaning and create knowledge for users, which raises issues of credibility and reliability (Bruns, 2005: 20). Social communities, no longer tied to physical geography, are emerging on the Internet through voluntary, temporary and tactical affiliations based on shared intellectual interests and emotional investments, and they are sustained through the mutual production and exchange of knowledge (Jenkins, 2004: 35). They form groups, new niches, and communities who share and produce content on topics that interest them, such as TripAdvisor travel communities interested in visiting heritage sites. This situation of media divergence, convergence and shifting values towards multiple perspectives rather than one 'official narrative' has resulted in big data that can be difficult to navigate, especially if a person is not an active member of a trusted online community. This increasingly participatory culture is changing the previously well-understood methods of communication and storytelling (Jenkins, 2004: 36). In the case of cultural heritage narratives, it requires a re-examination of existing and emerging digital 'big cultural data' to inform the production of new virtual narrative experiences that can serve communities who are interested in heritage sites, such as local visitors and tourists.
The third challenge is identifying the interests of different public communities (e.g. locals and international visitors) to ensure that a virtual narrative experience engages the targeted audience(s). Cultural tourists are motivated to go 'to cultural attractions away from their normal place of residence, with the intention to gather new information and experiences to satisfy their cultural needs' (Richards and Richards, 1996: 24). Many previous studies show that 'cultural tourists' are domestic rather than international visitors because they have a closer affinity for the cultural experiences they seek. Cultural heritage tourists tend to seek local heritage through archaeological sites, historic landscapes, local architecture, museums, art expressions, traditions and practices of the past (Timothy and Nyaupane, 2009). The concept of cultural heritage tourism refers not only to the sites that tourists visit, but also why they travel to them. Considering these rather broad definitions of the desire for knowledge and to experience locations of history and culture, the following case study further examined the demographics and interests of this potential audience for virtual narratives to identify who is part of the subcommunity of 'cultural heritage tourists' (i.e. a user model). With a clear understanding of this target audience, future creators of digital narratives, virtual experiences, and marketing strategies can better tailor their communications to this subgroup. Therefore, the current climate of digital participatory cultures, the rejection of single-perspective narratives produced by authorities, and the ambiguity of the interests of cultural heritage tourists prompted the development of a bottom-up approach to producing remixed virtual narratives for cultural heritage sites. 
Multiple sources of 'big cultural data' were analysed to develop a model for the potential audience who may be interested in the selected cultural heritage sites and a content model for the resulting iDoc. To demonstrate this approach, a case study on 11 UNESCO World Heritage Australian Convict Sites was conducted and a prototyped interactive web documentary (iDoc) was developed as a proof of concept for this method. The results of this project revealed other implications that will be useful reflections for future creative producers who are targeting cultural heritage audiences with their virtual experiences.

Democratising virtual heritage narratives
Cultural heritage is a rich area for virtual narrative development because institutions, such as museums, galleries, archives and libraries (GLAMs) are increasingly experimenting with digital media exhibitions to communicate history. This move is often referred to as a turn towards 'immersive experiences' (Kidd, 2018), but little research has been conducted to understand current practices for developing these experiences. Heritage can be presented in a myriad of ways through the selective curation of artefacts and narratives, the perspectives presented and the potential use of digital media to display heritage. The challenge for GLAMs, who have not traditionally employed storytellers or creatives, is how to capitalise on digital media to help present different options or narratives to the public. For example, the museum experience is moving towards audience-oriented exhibitions, which shifts focus from individual objects to a 'whole gallery experience' where objects are rarely left to 'speak for themselves' and meaning is made in collaboration with semiotic modalities, such as space, visual images and language (Meng, 2004: 31). Social historians are also supplementing objects with maps, photographs, documents and oral history taken from everyday experience to create a range of stories about an object, not merely the dominant history. The challenge with this increase in digitised history is the ability to narrativize it in a way that engages the public and persuades them to discover more.
For example, Kidd (2014) summarises that contemporary museums are engaging in transmedia because they exhibit their materials across multiple media and use social media to converse with their audiences. These actions shift the dynamic of trust because the audience members become authors through the production of user-generated content (Kidd, 2014). Museums are incorporating personal narratives into some exhibitions (e.g. 15 Second Place project at the Australian Centre for the Moving Image), creating interactive digital displays within the museum buildings, and creating online gaming aspects (Kidd, 2014). Digital exhibitions created by GLAMs have become a genre known as virtual museums, defined as 'a logically related collection of digital objects composed in a variety of media' (Styliani et al., 2009: 521). Many virtual museums provide visitors with a database of their collections and some offer 360-degree virtual tours through the physical galleries. For example, visitors to the websites for the Louvre, the Smithsonian Museum, or the Vatican Museums can take a 360-degree virtual tour of past or current exhibitions. Many other cultural heritage projects have fostered public engagement in the GLAM sector through case studies that have involved crowdsourcing through the digital transcription of historical documents and the creation of digital archives for teaching and learning purposes (Warwick and Bailey-Ross, 2020: 93-96). However, these projects often have more limited abilities to include narrative because the primary focus is on preserving and increasing access to heritage documents.
In the heritage tourism sector, gamification has been used as a method of marketing locations and brands to potential visitors to create awareness and enhance their on-site experiences at a destination (Xu et al., 2017). Gamification employs game design elements in digital applications to engage users and motivate desired behaviours (Andreoli et al., 2017: 3). For example, Musica Romana, intended for use by tourists, is a website for mobile phones that allows users to experience classical music associated with specific churches in Rome, Italy, with the music playable on location (Fagerjord, 2017). A large number of mobile apps have also been created for heritage trails to promote them and engage tourists/visitors using gamification (Basaraba et al., 2019). Finally, examples of cultural heritage narratives that fall within the iDoc genre thematically tend to focus on specific locations, such as a nation, historic area or city. For example, iDocs filed in the MIT Docubase on topics of heritage focus on preserving cultural heritage (e.g. Gallery of Lost Art, 2012), national heritage (Zikr: A Sufi Revival, 2018), and specific heritage sites (City of Memory, 2008). These examples also show that iDocs for heritage are well-suited for location-based narratives and that they evoke a more thematic experience through diverse viewpoints, voices, and modes, where many narratives are presented rather than a chronological or single-story path. The allowance of multiple perspectives and narratives is especially applicable for a bottom-up approach to including public-produced content within a remixed iDoc.
Since narratives in digital forms allow for a more democratic process of creation, this presents an opportunity for developing new virtual heritage experiences. The Internet has forced historians to confront the multiplication of popular productions, and they use the concept of shared authority to describe the democratisation of the knowledge-building process where audiences are never passively consuming knowledge produced by expert historians (Cauvin and O'Neill, 2017: 5). Not only have GLAMs moved towards democratising access to history using digital media, but a plethora of user-generated content (UGC) on cultural heritage experiences has been and continues to be produced by tourists through social media. For example, TripAdvisor, Instagram and Facebook allow people to share their travels through photos and stories, which created major shifts in the travel industry (May, 2014). Tourism blogs proliferated as people shared personal narratives and experiences that are often tailored to niche travel communities, such as solo, women's, adventure, budget, and arts and culture travel (TBEX, 2017). The social web provides people with access to specific travel information from people who have similar interests and whom they tend to trust more than official or professional sources. Multiple studies show that tourists refer to and trust UGC when making decisions about trip planning (Mendes-Filho et al., 2017; Ukpabi and Karjaluoto, 2018; Yu et al., 2014). As GLAMs and academics are developing and experimenting with new media and digital storytelling for heritage applications, tourists are a key audience of virtual experiences.

Tourists as a primary audience for virtual heritage narratives
Considering the impacts on travel due to the COVID-19 pandemic, virtual heritage experiences have become even more pertinent so that physically unvisitable places can still be seen by the public. The UNWTO (2018: 18) defines cultural tourism as an activity where the visitor's 'motivation is to learn, discover, experience and consume the tangible and intangible cultural attractions/products in a tourism destination. These attractions/products relate to a set of distinctive material, intellectual, spiritual and emotional features of a society'. To date, many scholars seeking to understand heritage tourists as a subgroup have discussed their demographics, motivations for travel and cultural experiences (Frank and Medaric, 2018). Since this paper focuses on a case study of 11 UNESCO Australian Convict World Heritage Sites (WHSs), it was important to distinguish WHS tourists from more general heritage tourists for the purposes of developing an iDoc. For example, Ramires et al. (2018: 56-57) found that 'absorptive cultural tourists' (51%) were on average 37 years of age, were employed, had higher education degrees, were repeat visitors and placed a high importance on gastronomy and value for money. Another Europe-based study found that visitors to UNESCO monasteries in Northeast Romania were mostly couples (42%) and families (29%) and came from European countries (68%) (Lupu et al., 2019). Lupu et al.'s (2019: 12) analysis of TripAdvisor reviews highlighted 10 main themes that were of interest to those who visited the Romanian monasteries, which showed that both religious and non-religious themes (e.g. paintings, architecture, history) were important to them. Finally, King and Prideaux's (2010: 244) survey of visitors to five natural UNESCO WHSs in Australia showed that 40% of visitors did not know they were visiting a World Heritage area before or after their visit.
King and Prideaux (2010: 245) argue overall that the World Heritage brand signals that the site is a 'must-see' location for visitors, cues the public on acceptable behaviours and expectations while on-site, and is a visible symbol of national commitment to quality recreational opportunities, site protection and conservation. Summarising the findings, most cultural tourists participate for recreation and pleasure rather than for deep learning experiences (McKercher and Du Cros, 2003: 56). Therefore, the purpose of travel can inform how much entertainment or educational content different tourists may desire. Based on these findings, McKercher and Du Cros (2003: 57) recommend that cultural tourism content must be presented in an easily consumable and enjoyable manner that may contain elements of learning, but it should firstly entertain. Therefore, the interests of cultural heritage tourists as a primary target audience are informative for developing virtual narrative experiences.

Rhetoric, discourse and top-down narrative composition
The composition of digital narratives can be approached from a multitude of communication, media, narrative, and discourse theories. The theoretical framework from which the following narrative invention methodology stemmed was based on a transdisciplinary convergence of digital rhetoric (Eyman, 2015), transmedial narratology (Ryan, 2008) and interactive narrative studies (Koenitz et al., 2013). The seven phases of Basaraba's (2018) resulting narrative creation framework, which includes updates to digital rhetorical theory, are: (1) know the audience, (2) define the communication goals, (3) consider the delivery medium/media, (4) invent the digital narrative system, (5) arrange the narrative(s) into a non-linear structure, (6) design the system and (7) make revisions and updates to the system and narrative as needed.
To provide some additional context to Phase Four of 'inventing' or narrative composition, which is exemplified in this paper, it is important to consider key differences in terms of fiction and nonfiction as well as the impacts of the digital medium. For example, one of the first scholars to address the non-linear dramatic structure afforded by computers, Laurel (1991), applied Aristotle's Poetics to computational narratives in fictional dramas of tragedy, comedy and melodrama. Cultural heritage virtual experiences are a nonfiction genre (in most cases); they often utilise a non-linear structure and also require different compositional considerations than fictional narrative genres. Many nonfiction narratives are designed with a particular rhetorical purpose (e.g. to educate or elicit civic action) and thus, rhetorical composition techniques can be modified, updated and applied to digital narratives. Digital rhetoric, Zappen (2005) explains, is an amalgam of more-or-less discrete components, such as self-expression and collaboration, the affordances and constraints of digital media, and the formation of identities and communities. Digital rhetoric considers how the digital medium affects the ability to persuade or achieve a narrative goal with the audience (e.g. public). Although digital rhetoric has resulted in ongoing dialogue and negotiations among writers, audiences and institutions, 'it focuses on the multiple modalities available for making meaning using new communication and information technologies' (Hocks, 2003: 632). A key impact on digital rhetoric today is the fact that new media audiences participate in production; thus, invention becomes a work in progress (Pfister, 2014) and this has changed the way digital narratives are created.
As digital narrative experimentation is becoming more common and growing across the creative industries and GLAMs, the merger of digital rhetoric and narrative theory into the seven-phase creation framework (Basaraba, 2018) can aid future creative practice. Existing practices of nonfiction narrative composition for GLAMs and tourism industry experiences have largely been top-down, where the goals are often educational and financial as the institutions wish to attract more visitors. Many virtual experiences have been tested on users after they are produced (and completed), but this again is a top-down approach. Beginning with the targeted audience (i.e. Phase One, 'know the audience') is a way to develop the narrative (i.e. Phase Four, narrative 'invention') from the bottom up, drawing from existing datasets in both digital media and print media (Basaraba, 2018). Therefore, the following methodology is detailed to show how public interests and participation in narrative development for virtual heritage experiences can be achieved using existing cultural data.

Case study on UNESCO World Heritage Australian Convict Sites
The UNESCO World Heritage Sites were selected as a case study because they are recognised as culturally significant to the world and their designation is based on a specific set of criteria and rigorous selection process. The 11 UNESCO World Heritage Australian Convict Sites, designated in 2010, were selected as they include physical buildings, intangible cultural heritage content, and are a substantive group of thematically related, yet diverse sites. This more recent history (1788-1868) presents opportunities for vested personal interest by the public today who may have ancestors who were transported or went as free immigrants to Australia. The history offers many lesser explored perspectives (e.g. Indigenous Australians, forced labour, female imprisonment) and it impacted Australian national identity. Thus, there is immense narrative potential for exploration to allow for the creation of different emergent narratives according to the public's interests. Furthermore, this case study actually consists of 11 smaller studies connected by the central theme of convict transportation, which provides enough cross-media cultural content to analyse. The 11 UNESCO World Heritage Australian Convict Sites include four sites located in New South Wales -Hyde Park Barracks, Cockatoo Island, Old Government House and Old Great North Road; five in Tasmania

A narrative approach to virtual heritage experiences
For this case study, an iDoc was developed using Basaraba's (2018) seven-phase creation framework, but many different virtual narrative experiences can be created based on the framework and remixing invention process. This paper focuses on two out of the seven phases, namely on Phase One of 'know the audience' (i.e. develop user model) and Phase Four of 'invention', which is the bottom-up approach that was used to remix existing cross-sector and cross-media narratives into an iDoc on the 11 UNESCO WHS. The iDoc, titled Sentences for Transportation: A Virtual Tour of Australia's Convict Past (about the 11 UNESCO Australian Convict WHS), was developed and remixed over a 14-week period based on the results from the following bottom-up 'big data' analysis. It was created using Klynt software and published online as a prototype for user testing with two different audiences, for which the results are available in Basaraba (2020). The iDoc format was chosen because it is more easily accessible for a potential global user base (over a location-based mobile application, for example), it allowed for the inclusion of multiple different narrative perspectives and being published on a website allowed for a rather high level of interactivity to be incorporated into a non-linear narrative structure. This remixing approach drew upon existing datasets across three sectors, namely content produced by (1) the tourism industry, (2) Internet users and more specifically social media users, and (3) subject-matter experts, because these groups produce a huge amount of content about the selected heritage sites. This represents a bottom-up approach to narrative design because the topics included in the resulting creative project (i.e. an iDoc) emerged from the datasets rather than being pre-determined by the author's personal interests.
A more common approach to developing narrative experiences is top-down because heritage institutions, such as GLAMs, or documentary film makers have a story already in mind that they want to communicate, and this can lead to certain perspectives, groups, and public interests or needs being overlooked. This is especially true in the case of history and heritage, where national narratives are often prioritised, and marginalised stories do not appear. Therefore, big data can be very informative and allow for the remixing of new 'untold narratives' and the incorporation of topics of interest to different audiences, which can be communicated through a variety of possible virtual narrative experiences (e.g. virtual exhibitions, transmedia stories, mobile applications, etc.).
Three datasets were used to develop the user model and nine datasets were used to develop the content model. Since the datasets were multimodal, they each required a different process of analysis and cross comparison. The datasets were selected to determine who is interested in the 11 UNESCO WHS, to get an overview of which narratives already exist, and of those, what would be of interest to the prospective iDoc audience. Although a systematic method of analysing these complex datasets was sought, there are currently no techniques/tools that 'combine multimodal analysis, data mining, and information visualisation simultaneously, due to the inherent complexity and the challenges of disciplinary and theoretical integration' (O'Halloran et al., 2018: 23). However, O'Halloran et al. (2018: 24) developed a process that involves (1) determining the multimodal dataset including metadata and contextual information, (2) automated data processing using algorithms and manual analysis to identify key systems and (3) identifying discourse patterns in interactive visualisations to explore the content and tone of messages over time and space. This three-step process was applied since it allows for the analysis of a variety of different multimodal datasets and the use of different computational tools for analysis, which are supplemented by manual analyses to identify discourse (and in this case thematic narrative) patterns.

Determining multimodal datasets for user and content modelling
For the purposes of modelling the potential user base for a virtual tour of Australian convict sites, existing data for the 11 UNESCO WHSs was collected from three sources: (1) official tourism statistics reported by the Australian Government, (2) visitor statistics from the WHS management reports and (3) TripAdvisor. Tourism statistics were gathered from: the Australian Government, the New South Wales (NSW) Government, the Government of Western Australia, Tourism Tasmania (the Tasmanian Government's tourism marketing agency) (Tourism Tasmania, 2018) and the Department of Regional Australia for Norfolk Island (Department of Regional Australia, 2012). The visitor statistics to the WHS were collected from the management authorities and other publicly available reports for: Port Arthur Convict Historic Site, Cockatoo Island, Hyde Park Barracks Museum, Fremantle Prison and Cascades Female Factory. TripAdvisor reviews were selected as a source because TripAdvisor is one of the largest travel review websites, with 661 million reviews and an average monthly unique visitor count of 456 million as of October 2018 when the data were collected (About TripAdvisor, 2018); because the reviews provide a convenience sample of visitors to the sites (which are not easily accessible to the author); and because they are user-generated rather than directed by academic questioning. TripAdvisor reviews were available for eight of the 11 Australian Convict Sites (pages for Brickendon and Woolmers Estates, Darlington Probation Station and the Old Great North Road did not exist). The data variables collected for each TripAdvisor review included: the number of reviews, the location where each reviewer lives (e.g. city and country), type of traveller (e.g. couples, families, friends, solo, business), the date, title and the textual content of the review. To better understand who is visiting the WHS, TripAdvisor reviews were analysed to reveal common content themes and possibly unexpected aspects of visitor interests.
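The review-level variables listed above can be sketched as a simple record type. This is only an illustrative sketch: the field names, the `count_by` helper and the sample records are assumptions made for the example, not the schema or data used in the study.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Review:
    """One review record; field names are illustrative, not the study's schema."""
    site: str
    reviewer_city: Optional[str]      # None when the reviewer gave no location
    reviewer_country: Optional[str]
    traveller_type: Optional[str]     # e.g. "couples", "families", "friends", "solo", "business"
    review_date: date
    title: str
    text: str


def count_by(reviews, field):
    """Tally the non-missing values of one field across all reviews."""
    values = (getattr(r, field) for r in reviews)
    return Counter(v for v in values if v is not None)


# Hypothetical records for illustration only; not data from the study.
reviews = [
    Review("Fremantle Prison", "Perth", "Australia", "couples",
           date(2018, 5, 2), "Great tour", "The guide was excellent."),
    Review("Fremantle Prison", None, None, "solo",
           date(2018, 6, 9), "Eerie", "Worth the visit."),
    Review("Port Arthur Historic Site", "London", "UK", None,
           date(2018, 7, 1), "Moving", "Beautiful and sad."),
]

country_counts = count_by(reviews, "reviewer_country")
type_counts = count_by(reviews, "traveller_type")
```

Treating missing locations and traveller types as `None` keeps them out of the tallies, which mirrors the fact that only a subset of reviews can be analysed for each variable.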
Taking a bottom-up approach to narrativizing a cultural heritage tour of the 11 WHS, nine different data sources were used to inform the content model (see Table 1). The nine datasets were grouped into three corpora, namely, tourism industry content, user-generated content and expert-produced content. These three corpora were identified before the respective datasets were selected for the purpose of analysing a cross-section of content produced by different 'authors' for different audiences who may overlap or have multiple common interests in the context of consuming cultural heritage virtual experiences. These three corpora also provide a bottom-up perspective from the tourism industry, which targets spontaneous or casual heritage consumers; social media or user-generated content, which covers topics of interest to WHS visitors; and cultural heritage experts, who have their own research interests and share insights into the historical context through various publications. Dataset nine, consisting of songs and ballads, was grouped into the expert corpus because it serves as close to a 'primary' account, or oral history, of the events and experiences of the convict period. The sample size for each dataset was determined using mixed purposeful sampling, which depends more on quality than quantity (Koerber and McMichael, 2008: 467-468) and involves choosing more than one sampling strategy and comparing the results (Collins et al., 2007: 85). The nature of each dataset called for one of two sampling strategies, which were critical case and stratified sampling. Critical case sampling involves 'choosing settings, groups, and/or individuals based on specific characteristic(s) because their inclusion provides the research with compelling insight about a phenomenon of interest' (Collins et al., 2007: 82).
Stratified purposeful sampling involves dividing the sample into 'strata to obtain relatively homogeneous subgroups and a purposeful sample is selected from each stratum' (Collins et al., 2007: 85). Table 1 provides an overview of the multimodal contents analysed in each dataset and the respective sampling technique applied.
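As a rough illustration of the stratified purposeful strategy, the sketch below divides items into strata and then selects deliberately (not randomly) within each stratum. The function name, the use of the heritage site as the stratum, and the word-count selection criterion are all assumptions made for this example; the study does not state that these criteria were used.

```python
from collections import defaultdict


def stratified_purposeful_sample(items, stratum_of, score, per_stratum):
    """Group items into strata, then keep the top-scoring items in each stratum.

    `score` stands in for whatever purposeful criterion a researcher applies;
    here it is a hypothetical numeric ranking, not the study's criterion.
    """
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)

    sample = {}
    for name, members in strata.items():
        ranked = sorted(members, key=score, reverse=True)
        sample[name] = ranked[:per_stratum]
    return sample


# Hypothetical items for illustration only; not data from the study.
items = [
    {"site": "Port Arthur", "id": 1, "words": 120},
    {"site": "Port Arthur", "id": 2, "words": 45},
    {"site": "Port Arthur", "id": 3, "words": 300},
    {"site": "Cockatoo Island", "id": 4, "words": 80},
]

sample = stratified_purposeful_sample(
    items,
    stratum_of=lambda r: r["site"],   # assumed stratum: the heritage site
    score=lambda r: r["words"],       # assumed criterion: longer texts first
    per_stratum=2,
)
```

Because the selection within each stratum is criterion-based, every stratum is represented even when one site dominates the raw data.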

Automated data processing tools
The two main types of processing software used to complete the multimodal discourse analysis were Voyant Tools and PixPlot. Although there are many different software tools available for topic modelling, Voyant Tools was selected for this study because it accumulates perspectives from many tools (Sinclair and Rockwell, 2015: 288) and could highlight areas for further investigation in terms of possibilities for narrative development. The textual data for the Convict Sites were uploaded into Voyant Tools (Sinclair and Rockwell, 2016). Multiple individual tools within Voyant Tools were used as each provides different output, and the outputs were grouped and compared to aid analysis of the results. For example, topic modelling is a method of finding meaning in a large volume of text, and topic models generate new ways of looking at content that emerge from the data rather than seeking to prove that a preconceived idea is correct (Graham et al., 2012: 119-120). Topic models represent the probability distribution of topics and they 'infer the hidden structure based on the resulting high co-occurrence of groups of words' (Gunther and Quandt, 2016: 11). These tools, in addition to a supplementary process of content analysis, allowed discourse patterns to be identified in the resulting interactive visualisations and the content and tone of messages to be coded. The discourse patterns were summarised into a user model and a content model, which were subsequently used to develop the iDoc.
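To make the idea of word co-occurrence concrete, the following minimal sketch counts how often pairs of terms appear together in the same document. It is not Voyant Tools' implementation and it is not a topic model; it only illustrates the kind of co-occurrence signal that topic models build on. The stopword list and sample texts are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# A tiny, assumed stopword list; real analyses use much longer ones.
STOPWORDS = {"the", "a", "of", "and", "was"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cooccurrence(docs):
    """Count, for each unordered term pair, the number of documents containing both."""
    pairs = Counter()
    for doc in docs:
        terms = sorted(set(tokenize(doc)))   # unique terms, in a stable order
        pairs.update(combinations(terms, 2))
    return pairs


# Hypothetical review snippets for illustration only.
docs = [
    "The prison tour was fascinating",
    "A fascinating tour of the convict prison",
    "Great gardens and history",
]

pair_counts = cooccurrence(docs)
top_pairs = pair_counts.most_common(3)
```

Pairs that recur across many documents (here, terms about the prison tour) hint at a shared theme, which a researcher would then verify through manual content analysis.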

Results of the case study multimodal analysis
The prospective user model: Know the target audience
As for TripAdvisor, there were a total of 11,295 reviews across the eight web pages for the Australian Convict Sites as of 17 October 2018 (see Figure 1). Based on the number of reviews posted, it can be inferred that the most popular Convict Sites for travellers are the Port Arthur Historic Site (Hobart, Tasmania) and the Fremantle Prison (Perth, Australia), which differs in terms of the second most-visited convict site based on the officially reported visitor numbers (Figure 2). The Government of Australia's regional statistics showed that New South Wales, Western Australia and Tasmania received over 22 million domestic visitors and 4.8 million international visitors combined. This shows that Australians are active domestic travellers, making up 82% of total visitors to the regions containing the WHSs, and that there are fewer international travellers (18%), which provides insight into the potential demographics for an IDN on the Australian Convict Sites. The government's overall international tourism statistics show that the top five countries of origin are China, New Zealand, USA, UK and South Korea (Tourism Research Australia, 2018). Of the 11,295 TripAdvisor reviews posted (as of 17 October 2018) on the Australian Convict Site web pages, 10,181 reviews could be analysed for country locations. Australian national tourism statistics for Sydney (see Figure 3) and Perth (see Figure 4) showed what type of travellers visited (e.g. alone, couples, families, family group, business associates, other). Over 40% of travellers to both Sydney and Perth travelled alone. There were 9,607 TripAdvisor reviewers out of 11,295 who identified what type of traveller they were (e.g. couples, families, friends, solo, or business), and the majority of reviewers travelled in couples (47%) or with family/friends (43%) (see Figure 3).
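The percentages reported above follow directly from the stated counts. The sketch below reproduces the arithmetic using the figures given in the text (treating 'over 22 million' as 22 million, which is an approximation); the function and variable names are illustrative only.

```python
def shares(counts):
    """Return each category's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {name: round(100 * value / total) for name, value in counts.items()}


# Visitors in millions, as reported in the text ("over 22 million" taken as 22).
regional_visitors = {"domestic": 22.0, "international": 4.8}
visitor_shares = shares(regional_visitors)   # domestic 82, international 18

# Share of the 11,295 reviews usable for each variable.
total_reviews = 11295
with_country = 10181
with_traveller_type = 9607

country_coverage = round(100 * with_country / total_reviews, 1)
type_coverage = round(100 * with_traveller_type / total_reviews, 1)
```

The coverage figures show that roughly nine in ten reviews carried a country location and somewhat fewer carried a traveller type, so the demographic findings describe most, but not all, reviewers.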
As cultural tourism scholars have noted (McKercher and Du Cros, 2003; Espelt and Benito, 2006; Ramires et al., 2018), cultural heritage tourism segmentation is more effective when it is based on desired motivation/interest and experience in a place; thus, a qualitative analysis of the content of the written TripAdvisor reviews was conducted to reveal common topics or themes that interested visitors. According to Australian national statistics, the main purpose of visits to Sydney (25%), Perth (25%) and Tasmania (50%) was for holidays and secondarily to visit friends and relatives (NSW Government, 2018; Tourism Tasmania, 2018). Travellers were interested primarily in dining and shopping, but this may be due to the limited options provided by the tourism survey (NSW Government, 2018; Tourism Tasmania, 2018). On the other hand, the advantage of analysing voluntary TripAdvisor reviews is that reviewers' interests are not directed by a narrowed list of options, and they can freely write about what interested them and what they enjoyed or did not enjoy about their experience. Comparing the demographics from TripAdvisor to the government-reported tourism statistics for Australia, there are some similarities and differences between the general tourist population and those interested in visiting the Australian Convict Sites. The similarities included the higher number of domestic (Australian) visitors (60%) than international visitors (40%). The main difference was that the top nations of international travellers who wrote reviews were the USA, UK and New Zealand, rather than China as the international statistics showed. The fact that 40% of visitors to the UNESCO Australian Convict WHS are international shows that there is significant world interest in the sites, compared to the 28% of overall international visitors to Australia in 2018. The main findings from the three datasets are summarised in the following user model (see Table 2).

Content modelling: Bottom-up narrative theme identification
The following cross-comparison summarises the main findings from each dataset within each corpus to highlight the topics covered, the modalities used, the perspectives included, and the mentions of UNESCO WHS status and conservation measures. Table 3 provides an overview of the quantitative size of each corpus and shows that the UGC corpus was the largest, followed by expert-produced content, and finally tourism industry content. Across the nine datasets, the amount of content published on each Australian Convict Site is indicative of the level of interest tourists and experts collectively have for each site. The WHS that received the most attention in terms of the existing content was Hyde Park Barracks, followed by Port Arthur, and then Fremantle Prison. The Sites receiving the least attention in terms of existing content were Brickendon, KAVHA and the Old Great North Road. The results also indicate that some Convict Sites will have more existing content available for use in a content model for developing a virtual experience. The richest datasets for the purposes of content modelling were the Australian Convict Sites' websites in the tourism industry corpus, the TripAdvisor reviews in the UGC corpus and the academic publications in the expert-produced corpus. Each of the corpora contributed specific topics that were cross-compared and grouped into the larger themes that made up the resulting content model. Rather than providing a complete overview of all the findings from each dataset, the key findings are summarised to show how they contributed to the development of this thematic content model. The themes were determined through a detailed cross-comparison of the multimodal discourse analysis that was conducted on each dataset separately.
The first theme in the content model focused on the 'sense of place' (i.e. infrastructure and atmosphere) and natural heritage (i.e. landscape), which emerged primarily, but not exclusively, from the UGC corpus. A key finding from the analysis of the Australian Convict Sites' websites was the level of dark tourism aesthetics. The websites were categorised from dark to light with respect to the first visual impression, taglines or image captions appearing on the homepage. As a baseline of qualities contributing to a dark tourism aesthetic versus a light website, Krisjanous (2016) found that more serious (i.e. dark) websites used solid, sharp, and formal, black-coloured font; landscape/building photography with no people; muted colours and sepia tones to signify distance from the present; and more empty space between text or imagery. She found that less serious (i.e. lighter) websites used rounded and colourful fonts, more social photos (of tour groups, for example), and a more cluttered layout connoting playfulness or informality (Krisjanous, 2016: 348).
A key finding from the analysis of the Instagram dataset using PixPlot was seven clusters of photos taken at each of the 11 WHSs (see Figure 4). Photos of the buildings and infrastructure showed ruins of buildings at the Coal Mines site and the main penitentiaries at Port Arthur, Hyde Park Barracks and Fremantle Prison. Fremantle Prison's photo clusters within PixPlot showed the highest level of consistency in subject matter, with nearly identical image compositions. The photos featuring the most equipment, such as furniture and machinery, were taken at the Old Government House and Cockatoo Island. For example, Cockatoo Island had a small cluster of black and white photos of the cranes/machinery and a cluster of interior structures, which gave a dark tourism aesthetic to the photos. Maria Island's and KAVHA's (Norfolk Island) photos focused mostly on the beach, ocean, nature and landscape. Notably, the UNESCO-designated portion of Maria Island, the Darlington Probation Station buildings, was not photographed (except for the governor's house that sits atop a hill farther away from the convict area), signalling that visitors were more interested in the landscape than the convict history associated with the infrastructure. KAVHA was similarly photographed for its sunny tree-lined landscape rather than its convict-built historical buildings.
In regard to themes two and three of the content model, chronological information about the historic sites was common in the tourism datasets, including the tourism brochures and guidebooks. Due to the limited space of the printed medium, chronological timelines provided key information points and associated dates. Similarly, the travel guidebooks often provided a chronology of events and included further explication due to having more space than a double-sided brochure. Looking at the overall historical context of convictism, four of the five guidebooks dedicated more than 60% of the selected content to general Australian history, with the remaining analysed content covering the individual WHS. By using key dates as signposts for the linear progression of time and narrative, the brochures and guidebooks concisely cover a long period of history. However, a key finding across the guidebooks was that not all publishers included the 11 UNESCO World Heritage Australian Convict Sites and, further to that, UNESCO designation was rarely mentioned (see Table 4). The sampled datasets suggested that both members of the public and scholars are less inclined to mention the UNESCO brand, which raises the question for future research as to what world heritage and the UNESCO brand mean to visitors.
At the Cascades Female Factory, there are actors on-site who re-enact the history and provide an immersive experience into the stories of the women who were imprisoned there. Hyde Park Barracks Museum offers audio tours, and the term 'immigrants' referred to those registered at the Barracks upon entry to Sydney. For example, reviewers wrote: 'You need to allow yourself 2-3 h to listen to the audio and have a look at all the exhibits' (Kazmam, 2018). Reviewers who visited Cockatoo Island mentioned the need to take a ferry from mainland Sydney and the camping options available on-site.
The most common theme of interest for bloggers was the history of sites, which included many non-UNESCO sites related to the convict penal system. The majority of blogs in this sample (53%) covered stories of convicts as a process of genealogy tracing; they discussed related non-UNESCO-designated convict sites, the convict ships, and books and artworks related to convict history and the heritage sites. A few blogs also provided practical tourist information about transportation to and from the heritage sites, the on-site or nearby accommodation, and places to dine, which also contributed to the modern-day site usage theme. The overall thematic content model (see Table 6) was based on these findings and was used to develop the virtual narrative experience for the UNESCO World Heritage Australian Convict Sites. The virtual experience, an iDoc titled 'Sentenced to Transportation: A Virtual Tour of Australia's Convict Past,' was constructed based on the topics that emerged from the multimodal discourse analysis of each corpus.
Theme six on 'Convict life and Australian history and identity' was included last in the iDoc production because it resulted from connecting different narratives (phase five of the creative framework (Basaraba, 2018)) from the other themes within the system and adding supplemental video content produced by experts. For example, a footer menu titled 'Convict Life' was included to curate the multiple themes that were represented in the expert-produced corpus. The most prominent topics in the expert corpus were convict life and treatment and Australian identity and history, including discussion of the morals, values and religion of the society at the time. In addition to these, other topics included investigations into convict labour and industry; violence and punishment; transportation; and heritage and tourism. The specific Australian Convict Sites receiving more attention were Port Arthur (appearing in eight articles) and Hyde Park Barracks (appearing in four). A number of convict sites beyond the 11 WHS were also discussed, including Point Puer, Carters' Barracks, Moreton Bay Penal Settlement, Port Macquarie, Sarah Island, Port Phillip and Ross Female Factory. These topics demonstrate the breadth of concepts covered in the academic dataset and also highlight topics that were not widely addressed in the tourism industry and UGC datasets.

Discussion on remixing narratives from multiple data sources
The multimodal discourse analysis of the three corpora added value to the iDoc creation in different ways. To make sense of and utilise the results of this large-scale data analysis in the development of the remixed narratives for the iDoc, a cross-comparison of the datasets and emergent themes was a necessary next step. Narrative composition is a creative (artistic) practice, which can be achieved in many ways. Since this was a scholarly project with specific narrative communication objectives, rhetorical theory was drawn upon (i.e. the seven-phase creation framework) to strategically develop a new virtual experience that communicated stories that emerged from the datasets. Rhetorical theory was applied as part of the manual analysis to understand the existing content and to draw from the results to communicate specific narratives about convictism in Australia and associate this history with the 11 UNESCO World Heritage Australian Convict Sites. The nine datasets were analysed under these three components and the larger rhetorical context of ethos, pathos, and logos to understand the genre conventions in each corpus and systemise the cross-comparison of the resulting narrative themes as summarised in the section above. The ethos was used to consider who the content producers are, their desired target audience(s), and their communication goals or reasons for producing the content. The pathos was largely determined by the medium (print or digital), the modalities used, and stylistic choices such as tone of writing and design layout. The logos was considered in terms of the content themes, the perspectives included/excluded, and the significance of the UNESCO World Heritage designation.
The volume of data collected was much larger than what is reported in the following sections; thus, only the most informative elements regarding ethos, pathos and logos for each dataset are discussed below in the context of how the analysis of the three corpora informed the development of the narratives for the iDoc.

Tourism industry corpus: A dark tourism visual aesthetic
The ethos across the three selected datasets from the tourism industry shows that the industry targets a general audience, with some focus towards the USA and UK in this case study. The tourism industry relies more on brand recognition than on the authority of specific writers or editors. The information is primarily created to provide pre-visit or on-site visitor information that highlights the features of the locations. The print medium itself acts as evidence of gatekeeping and a certain level of trust that the content has been vetted and verified. However, the guidebooks often include a caveat in the first few pages that they do not take responsibility for incorrect information as it was accurate at the time of publication, to their knowledge. The content analysis showed that some guidebooks do include incorrect or incomplete information. In terms of pathos, the tourism industry corpus highlighted a dark tourism aesthetic in some of the marketing brochures and website designs; text is the primary modality of communication; and the Australian Convict Sites as a collective group do not have a common visual brand identity, attributable to the fact that the sites have individual management authorities. As for the logos in this corpus, the datasets provided a wide breadth of surface-level history, giving a brief overview of dates, and they focused on available tourism amenities for the WHS. The guidebooks highlighted the history of Australia's founding and settlement and developments in infrastructure, but omitted the less accessible world heritage sites that have less tourism infrastructure. Similarly, the printed tourism brochures focused on visitor experiences (e.g. tours), buildings/infrastructure, and educational programmes. The analysis also uncovered that the websites included information on modern-day usage of the Australian Convict Sites as well as concerns for the natural environment and site conservation.
On average, UNESCO WHS designation was included in 68% of the corpus. Therefore, key findings from the tourism corpus that contribute to the iDoc protostory invention were that there is a medium level of a dark tourism aesthetic, tourists are interested in key dates and statistics (e.g. number of convicts transported, dates buildings were constructed), and the guided tours were pivotal to the visitors' experiences. This manifested in the iDoc in the selection of accent colours used for text overlays and hyperlinked buttons, the background music for each narrative branch, and in writing the script for the voice-over narration that introduces the history and significance of each of the 11 WHSs.
UGC findings: Modern-day site amenities
In the UGC corpus, the ethos lies in the idea that the content is created by tourists for tourists, and it serves as a form of electronic word-of-mouth marketing that consumers view as more authentic or truthful. Therefore, UGC presents alternative authorities on the subject matter. The UGC corpus shows more than it tells and thus there is more emphasis on the pathos in terms of visually capturing the sites, as evidenced by much heavier use of photography and video compared to the other datasets and by the emotive responses in the comments sections. The text in the UGC corpus was written in the first person, and the imagery focused heavily on the infrastructure and made up a substantial portion of the communications. Interestingly, the themes emerging from Instagram posts closely aligned with each site management authority's vision as stated in the public reports and the rhetoric on the official websites, which suggests that their communication/marketing goals were achieved. The modern-day and multi-purpose uses of the WHS were evidenced in the photos and TripAdvisor reviews (e.g. weddings, camping, amenities, temporary exhibits), which adds another, newer cultural-use theme to the heritage sites. UGC particularly exposed the unique marketable features of each site and pointed to places where the narrative could potentially be expanded beyond the 11 UNESCO WHS. The corpus also highlighted other related information, such as related convict sites that are not UNESCO-designated (as seen in the blog posts), books and art related to convict history, and the surrounding nature and landscape, especially for the Coal Mines, Cockatoo Island and Norfolk Island Sites. However, UNESCO designation was rarely mentioned, appearing only 11.3% of the time across the datasets.
Based on these results, the iDoc invention incorporated heavy use of photography and some video in terms of modalities, and it provided narrative paths for produsers to learn more about nearby related attractions (beyond the WHS) as well as hyperlinks to practical tourism information as transmedia extensions to the narrative.

Expert corpus findings: Heritage meanings
The expert-produced content has the highest level of ethos because the content is written by authors who are named front and centre on the publications; they are the authorities on convict history. This corpus focused more on convict life and the groups impacted by the penal settlements and colonisation of Australia. The expert-produced corpus provides more details and context on the wider historical impacts, including maps of the sites' infrastructure and archaeological digs. It ties the convict history into Australia's wider history, which could be of interest to iDoc produsers who are 'absorptive cultural tourists' (Ramires et al., 2018) and want deeper contextual information about the convict past. The pathos in the expert-produced corpus was stronger than expected since it had the most diverse usage of modalities, including text, figures, maps, artwork and music. In terms of logos, the academic web pages and publications discussed convict treatment, transportation, the labour system, morals/values/religion and Australian identity. The songs/ballads expressed the convicts' feelings about being transported and leaving their homeland. The song lyrics mentioned Botany Bay and Moreton Bay rather than the specific prisons (i.e. the world heritage sites), considering that the infrastructure was not yet well-established or known by name in the UK and Ireland when they were written. The corpus also includes lesser-known perspectives of those impacted by transportation, including juvenile offenders, Canada, Jamaica, India, New Zealand, Cape Town (aka The Cape) and the West Indies. UNESCO designation is collectively not often mentioned across the corpus (an average of 14.5%) and thus, while the brand itself does not seem to interest scholars working on Australian convict history to date, they frequently mention world heritage.
Importantly, the expert corpus provided a larger macro-narrative theme of Australia's history and national identity for the iDoc prototype, which helps connect the micro-narratives.

Conclusion
This case study demonstrates a bottom-up approach to remixing cultural heritage narratives by using multiple existing datasets, namely content produced by the tourism industry, internet users (i.e. UGC on social media), and experts (e.g. scholars). The six identified narrative themes that emerged across the datasets (i.e. content model in Table 6) were used to create an iDoc with over 20 narrative branches (i.e. story paths) and 290 'film scenes'. The remixed narrative communicated the themes of Australian heritage and identity, the historical significance of each WHS, the impact on specific groups including female convicts and displaced Indigenous groups, and transmedia extensions that directed users to other content sources where they could learn more about the themes they were interested in pursuing beyond the iDoc narrative. This data analysis also showed that heritage sites, particularly UNESCO WHSs, can improve their narrative communications. It highlighted that many WHS visitors are unaware that they are visiting a UNESCO site or of its historical significance. The close reading showed that since the Australian Convict UNESCO WHSs are individually managed, they do not cross-market the other WHS to visitors, which offers an opportunity for future heritage-focused transmedia narrative creators to consider when determining their narrative communication/marketing priorities. Therefore, an identified area for future research on developing nonfiction virtual narrative experiences is to consider how to persuade positive behavioural change at UNESCO World Heritage Sites. The virtual narratives could better promote conservation and preservation efforts by the WHS management authorities, advocate respect and appreciation for the sites to visitors, and encourage (i.e. persuade) civic action and more sustainable tourism practices. New virtual narrative experiences developed for heritage sites going forward could also have a more educational focus (e.g. 'edutainment'), with the communication goals designed towards desired learning outcomes for students and/or specific civic actions for members of the public.
This case study also serves as another step towards understanding the demographics of the public as cultural heritage site visitors (and as a tourist sub-community) and how to address their interests and/or knowledge gaps, which may be useful to museums, heritage institutions, scholars and the tourism industry. The multimodal discourse analysis of existing data sources showed that most visitors (60%) to the 11 UNESCO Australian Convict Sites are domestic, that international visitors tend to come from countries that are close in distance and culture, and that they often visit in couples and families. The UGC, including TripAdvisor reviews and Instagram photos, showed that visitors were interested in the tours, place/location, history, and the modern cultural uses of space (e.g. Cockatoo Island for art festivals and the Old Government House as the filming site of a TV series). Thus, there are different layers and aspects of culture and history that visitors may be interested in within one cultural heritage site, and this combination and branching approach to diverse themes should be considered in future virtual experiences. Virtual narratives present an opportunity to increase the breadth of the social groups interested in heritage and the types of histories (e.g. different social groups) that are shared in the digital space, and to allow for evolving interpretations and public contributions to cultural heritage narratives. These themes can be identified using multimodal discourse analysis of existing datasets.