More than a Feeling: Insights and Information from a Sentiment Analysis Study

Conferences have gained a significant role within the scholarly communication ecosystem, as they contribute to the immediate dissemination of scientific results and to professional socialization. LIBER’s Annual Conference constitutes one of the main channels of communication for the information professionals of European research libraries. Each year, a parallel universe arises on social media during the conference’s activities, especially in the micro-blogging sphere of Twitter, where conference attendees express their opinions about trending topics, organizational aspects and conference talks. The current study proposes that applying sentiment analysis to the data generated in that sphere could be a useful decision-making tool for both the organizational committee and the host organization. By applying sentiment analysis to some 5,500 tweets regarding the LIBER 47th Annual Conference in Lille, France, we highlight the positive and negative aspects of the specific event as they were mentioned by Twitter users.


Introduction
Academic conferences constitute venues where intellectual discussions emerge and frame the future of a scientific domain. This exchange takes place formally through the scheduled talks, presentations, workshops and other activities, as well as informally via the dynamic backchannel of social media communication, such as micro-blogging. It has been found that Twitter activity for scholarly purposes is more intense during a conference than on any other day (Ross, Terras, Warwick, & Welsh, 2011). The LIBER Annual Conference attendees seem to be 'hyper-active' on Twitter, as previous conferences have shown. Recent observations show that the Annual Conference hashtags, such as #Liber2017 for Patras, #liber2018 for Lille and #liber2019 for Dublin, have topped the national Twitter trends. Such an amount of data cannot be analyzed manually in an effective way, so any researcher interested in detecting, classifying and correlating entities and statements has to apply machine learning techniques. Sentiment analysis is an emerging Natural Language Processing (NLP) method for detecting people's emotions in texts (Cambria, Schuller, Xia, & Havasi, 2013). It can thus capture opinions, preferences and interest about products, events, marketing campaigns, etc.
The current study attempts not to analyze Twitter users' behaviour for scholarly purposes, but to explore the use of their tweets for "organizational enhancements" (Reinhardt, Ebner, Beham, & Costa, 2009). An effective exploitation of these spontaneous textual data would improve decision making from two viewpoints. First, from the macro viewpoint, LIBER governance could consider the opinions posted on Twitter and redirect the spotlight of its activities, thereby democratizing the decision-making process and materializing one of its core values, inclusivity. Second, from the micro viewpoint, it would give each local conference organiser the opportunity to evaluate the opinions expressed, especially those regarding organizational issues of the events, allowing them to anticipate areas of interest and organise their event better.

Background
Conferences serve as nodal points of information dissemination, professional advancement (scholars may use them as a starting point for their academic career and others as an opportunity for skill development) and networking (Seidel, 2018). As Mahrt, Weller and Peters (2014) mention, informal communication activities, like tweeting, take place in parallel with these formal communication channels. Tweets have been analysed from different perspectives, such as the level of activity, participation and message distribution, using both qualitative (Borgmann et al., 2016) and quantitative methods (Mishori, Levy, & Donvan, 2014).
It is evident that these datasets need a different methodological approach in order to overcome issues of size and complexity; machine learning methods, such as sentiment analysis, can be an answer. Kimmons and Veletsianos (2016) focus on the tweeting activity at the American Educational Research Association annual conferences of 2014 and 2015, inferring that participants with different roles behave differently. The same study applied sentiment analysis to the conference hashtags, concluding that academics generally exhibit positive sentiments. Desai et al. (2012) also tried to detect sentiment polarity in a medical conference's tweets and found that, in general, informative tweets had more negative sentiment scores than uninformative tweets. They also found that tweets were more negative during the conference than in the period leading up to it.
The use of sentiment analysis as a decision-making tool has lately been recognised by researchers from diverse fields. The detection of investors' sentiments can serve as a reliable indicator of stock market performance (Wu, Zheng, & Olson, 2014), while other researchers suggest that such methods would improve supply management decisions (Wood, Reiners, & Srivastava, 2015). Additionally, big data analytics with sentiment analysis can improve brand managers' and consumers' decisions (Çalı & Balaman, 2019; Kauffmann et al., 2019).
In the library and information science field, sentiment analysis has lately been used for managerial purposes, such as performance evaluation. Papachristopoulos, Ampatzoglou, Seferli, Zafeiropoulou, and Petasis (2019) tried to compensate for the absence of a dedicated evaluation questionnaire for the Hellenic Open University Distance Library and Information Center by applying sentiment analysis to user comments on the library's social media accounts, while Moore (2017) applied sentiment analysis to comments that originated from the widely accepted library evaluation tool LibQUAL+.

Liber Quarterly Volume 30 2020
To our knowledge, this is the first application of sentiment analysis to a library and information science conference's tweets that examines tweeting activity from a managerial perspective. Twitter comments could be used creatively, under the lens of sentiment analysis, as a tool for updating the topics agenda of the LIBER Conference Program Committee or as an organizational compass for the hosts of the conference.

Methodology
The proposed work applies sentiment analysis to 5,530 tweets regarding the LIBER 47th Annual Conference in Lille, France, which was held on 4-6 July 2018 and was entitled "Research libraries as an Open Science hub: from strategy to action". Tweets published in the period 18/6/2018 to 9/7/2018 were identified by the hashtag #liber2018 and mined via the TAGS tool (v6.1.9). Before the analysis, we detected and removed many intruding hashtags that were irrelevant to the scope of the conference but were taking advantage of the attention drawn by the #Liber2018 hashtag.
For the purposes of our study, we used Semantria, a platform which enables researchers to handle data from diverse channels, including Twitter, Facebook and WordPress, in multiple languages. Our decision to choose this specific tool is supported by other studies that compare the performance of similar software (Serrano-Guerrero, Olivas, Romero, & Herrera-Viedma, 2015). According to these tests, Semantria was found to behave stably, with few errors. The software detects sentiment at various levels, such as paragraphs, sentences or proper nouns that represent an entity, performing the necessary Part Of Speech (POS) tagging. Semantria is available both as a cloud API and as an Excel plug-in. For the needs of the current study, we used the latter version, modifying the sentiment polarity scale. The scale for sentiment ranking ranged from "−2: completely negative" to "2: completely positive". Intervals at "−1.2: mostly negative", "−0.6: slightly negative", "0.6: slightly positive" and "1.2: mostly positive" covered the intermediate levels of sentiment intensity, while "0" corresponded to entirely neutral statements. Scores close to 0 are characterised as neutral. It should be noted that negative and positive polarities have various nuances and should not be considered one-dimensional, e.g. bad or good. A positive comment may reflect happiness, calmness, interest in something, etc., and, accordingly, a negative one may reflect sadness, anger, boredom and so on (Poria et al., 2012). We configured the system to detect the sentiment of queries. Queries were specific, preferred terms that were imported in order to be matched together (query co-occurrence) and have their sentiment polarity compared. The queries imported to the system were a set of broad terms relevant to the themes of the conference, such as 'Citizen Science', 'Copyright', 'FAIR', 'Infrastructures', 'Metrics', 'Open Access', 'Open Science', 'Research Data', 'Science' and 'Skills'.
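To make the polarity scale concrete, the following minimal Python sketch maps a numeric score to one of the seven labels listed above. The anchor values and labels come from the text; the nearest-anchor rule, the function names and the sample scores are illustrative assumptions, since the text does not state how Semantria bins scores that fall between anchors.

```python
# Anchor points and labels as described in the text. The nearest-anchor
# rule below is an illustrative assumption, not Semantria's documented logic.
SCALE = [
    (-2.0, "completely negative"),
    (-1.2, "mostly negative"),
    (-0.6, "slightly negative"),
    (0.0, "neutral"),
    (0.6, "slightly positive"),
    (1.2, "mostly positive"),
    (2.0, "completely positive"),
]


def label_for(score: float) -> str:
    """Return the label of the anchor closest to `score` (hypothetical rule)."""
    if not -2.0 <= score <= 2.0:
        raise ValueError("score must lie within [-2, 2]")
    _, label = min(SCALE, key=lambda pair: abs(pair[0] - score))
    return label


def polarity(score: float) -> str:
    """Collapse the seven labels into negative / neutral / positive."""
    label = label_for(score)
    if label == "neutral":
        return "neutral"
    return "negative" if score < 0 else "positive"


if __name__ == "__main__":
    for sample in (-1.7, -0.1, 0.5, 1.5):  # made-up scores, not study data
        print(sample, label_for(sample), polarity(sample))
```

Under this assumed rule, scores close to 0 fall into the neutral class, which matches the description above; a different binning rule would move the class boundaries.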

Queries
The queries were detected in 1,876 tweets, which constitute 33.92% of the total number of tweets. Of these 1,876 tweets, 64 (3.41%) were classified as negative, 1,542 (82.20%) as neutral and 270 (14.39%) as positive. The sentiment analysis of our queries (see Table 1) indicated that the most negative comments were found in copyright-related commentary, while the most positive ones concerned the FAIR principles. It was evident that the majority of the tweets were categorised as neutral. It was also interesting that, although the 'Open Science' query gathered many comments (it was the second most popular), only one of the tweets was labeled as negative.
Additionally, we detected query co-occurrence. Co-occurrence happens when topic a and topic b, regardless of their sentiment polarity, appear in the same document, i.e. tweet. Table 2 shows that otherwise conceptually distant topics, such as FAIR, Metrics and Copyright, are likely to co-occur. This suggests, however, that the LIBER community is concerned about the copyright and impact aspects of FAIR data management. Similarly, Open Science was associated with the human and technical dimensions that its advocates take into account, as professional skills and infrastructures were in the spotlight of the discussion.
In NLP, phrases that are recognised as important are classified as themes. Semantria detected the most descriptive noun phrases and assigned sentiment polarities to them. Table 3 presents the top themes according to their polarities. As one can see, the theme of 'Open Access' has been associated with both positive and negative mentions, indicating that there is a range of reactions to the developments in the field.
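The shares reported above follow directly from the counts. The short Python sketch below recomputes them; the variable and function names are illustrative, and values are rounded to two decimals, so the last digit can differ by one from figures that were truncated rather than rounded.

```python
# Counts taken from the text; names are illustrative.
TOTAL_TWEETS = 5530
counts = {"negative": 64, "neutral": 1542, "positive": 270}

# Tweets in which at least one query was detected.
matched = sum(counts.values())


def share(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


coverage = share(matched, TOTAL_TWEETS)
distribution = {label: share(n, matched) for label, n in counts.items()}

if __name__ == "__main__":
    print(f"query tweets: {matched} ({coverage}% of all tweets)")
    for label, pct in distribution.items():
        print(f"{label}: {pct}%")
```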
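Co-occurrence as defined above can be sketched in a few lines of Python. This is not Semantria's implementation: the whole-word, case-insensitive matching, the function names and the sample tweets are all assumptions made for illustration; only the query list comes from the text.

```python
import re
from collections import Counter
from itertools import combinations

# Query terms as listed in the Methodology section.
QUERIES = [
    "Citizen Science", "Copyright", "FAIR", "Infrastructures", "Metrics",
    "Open Access", "Open Science", "Research Data", "Science", "Skills",
]

# Assumed matching rule: whole words, case-insensitive. Note that 'Science'
# therefore also matches inside 'Open Science' and 'Citizen Science'.
PATTERNS = {
    q: re.compile(r"\b" + re.escape(q) + r"\b", re.IGNORECASE) for q in QUERIES
}


def queries_in(tweet: str) -> set:
    """Return the set of queries mentioned in one tweet."""
    return {q for q, pattern in PATTERNS.items() if pattern.search(tweet)}


def cooccurrence(tweets: list) -> Counter:
    """Count how often each unordered pair of queries shares a tweet."""
    pairs = Counter()
    for tweet in tweets:
        found = sorted(queries_in(tweet))
        pairs.update(combinations(found, 2))
    return pairs


if __name__ == "__main__":
    sample = [  # hypothetical tweets, not taken from the dataset
        "FAIR data needs better metrics and clear copyright rules",
        "Copyright reform matters for FAIR repositories",
        "Great keynote this morning",
    ]
    for pair, n in cooccurrence(sample).most_common():
        print(pair, n)
```

Sorting the detected queries before pairing keeps each pair in a single canonical order, so (Copyright, FAIR) and (FAIR, Copyright) are counted together.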

Hashtags
Regarding the hashtag #Liber2018, there were only 19 negative comments out of a total of 5,530 comments. The comments can be grouped into five categories, concerning users' impressions of the keynote speakers, the experience of the conference's events, the support that the organisational team offered, the general atmosphere and the performance of the facilities. Table 4 provides indicative tweets for each of these categories. One of the most positive hashtags was #saveyourinternet. This was related to the Digital Single Market (DSM) Directive and the risks for internet freedom that the Directive was introducing. We infer that the hashtag was associated with acts of encouragement and the fight to avert the proposed articles. The respective hashtags of these articles, #article11 and #article13, were detected as negatively charged. The most positive hashtag was associated with the concept of innovation, namely #innovativedissemination, while a similar hashtag, that of #disruption, evoked negative feelings. Another positively charged hashtag was that of #collaboration, a concept central to the LIBER strategy.
We noticed that initially positively charged hashtags could shift to the other extreme and become negatively charged. For instance, #openscience was a positively expressed hashtag, but its co-occurrence with a mention of the complaint filed against Elsevier, a publisher heavily criticised for 'closed' practices, gave it negative attributes, and this sentiment was propagated through a much favourited retweet during the conference timeline. This shows that hashtags can be interpreted differently in different contexts. Two specific sessions, namely the Knowledge Cafe and the Library Innovation Award, were highly appreciated by the attendees, as their respective hashtags, #knowledgecafe and #libraryinnovationsaward, received positive comments. While the positive comments on the latter are quite natural, given the rewarding nature of the session, the organisational complexity and intensity of both parts of the Knowledge Cafe seem to have gone unnoticed by the participants, who were left with good impressions.

Language constructs
As almost two thirds of the tweets were in English, it was evident that tweeting in the context of an international conference dictated the use of the English language. English was followed by French, the language of the host country, and German, which was spoken by a considerable number of participants.
Additionally, we analysed the syntactic context of the tweets and detected negation and intensification terms. Negation refers to a term that assigns a negative tone to the words in its context, whereas intensification refers to adverbs or adverbial phrases that give emphasis to their context. Participants used more intensifying terms in positive statements, such as 'absolutely', 'admittedly', 'all', 'already', 'easily' and so on. On the other hand, negation was expressed with a limited set of words, namely 'no', 'not' and 'nothing', while the term 'really' belonged to both sides.
Finally, we explored the use of emoticons, which are very popular on the Twitter platform, as they can condense sentiment information in a visual way. However, we found that they rarely produce the desired outcome and that most of the time they are placed in tweets with neutral sentiment content. The emoticons used most often portray happy faces, see :-) and :3, a grinning face (:d), winking faces, see ;) and ;-), and a heart (<3).
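The detection of negation and intensification terms can be approximated with a simple lexicon lookup. The Python sketch below is a simplification and not the tool's actual method: the word lists contain only the examples quoted above, the tokenizer is deliberately naive, and the function name and sample sentence are hypothetical.

```python
import re

# Word lists limited to the examples quoted in the text. 'really' is left
# out here because the text reports it on both sides.
NEGATIONS = {"no", "not", "nothing"}
INTENSIFIERS = {"absolutely", "admittedly", "all", "already", "easily"}

# Naive tokenizer: runs of lowercase letters and apostrophes.
TOKEN = re.compile(r"[a-z']+")


def cue_counts(tweet: str) -> dict:
    """Count negation and intensification cues in one tweet."""
    tokens = TOKEN.findall(tweet.lower())
    return {
        "negation": sum(t in NEGATIONS for t in tokens),
        "intensification": sum(t in INTENSIFIERS for t in tokens),
    }


if __name__ == "__main__":
    # Hypothetical example sentence, not taken from the dataset.
    print(cue_counts("Absolutely loved it, not a dull moment at all"))
```

A lookup of this kind only counts cue words; deciding which neighbouring words a negation or intensifier actually modifies requires the syntactic analysis described above.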

Conclusions
The current study attempted to shed light on what Reinhardt et al. (2009) called "organisational enhancements". Sentiment analysis can be used as both a conference assessment and a planning tool. Conference committees could foster dialogue among the conference participants on topics that are loaded with both negative and positive polarities, in order to give the community the necessary platform to express and mature its opinions.
Our analysis indicated that 'Copyright' and 'Open Access' issues, although constantly present in the conference agenda, are so polarised that they have the potential to remain within the focal lens of LIBER's community. The copyright-related tweets carried a lot of intense sentiment, as at the time of the conference there was a heated discussion about the new Directive on the Digital Single Market. On the other hand, comments on Research Data were quite few, with rather lukewarm reactions, indicating that LIBER's community has a rather concrete opinion about this topic. Similarly, Open Science issues are of concrete interest to the LIBER community, without, however, arousing strong feelings. The co-occurrence of themes shows that several of the topics high on LIBER's agenda can be intertwined. The case of FAIR, Metrics and Copyright, which belong to different strategic directions, shows that there is room for joint work among LIBER's working teams.
Furthermore, we must keep in mind that a conference is not only a venue of discussion, but also an experience. Such an experience is affected by organisational issues, which include the peripheral events, the facilities and equipment, the support of staff and volunteers and the overall atmosphere. Every host organization should bear in mind previous malfunctions and work towards their resolution, which will lead to positive testimonials. Any problem that occurs may affect the image of the conference and, accordingly, have an impact on the retention of attendees (Lee & Back, 2008).