Identifying the Relevance of Research Goals through Collecting Citizens’ Voices on Social Media

Recent debates on the meaning and use of science focus on addressing the needs of citizens and the concerns of society in different fields. Researchers have developed different methodologies for capturing and mapping the relevance of the topics that research should address. This article proposes a new methodology for identifying the relevance of research goals by collecting citizens’ voices on Twitter and Facebook, combining two approaches: top down, starting from already defined research goals and priorities, and bottom up, starting from social media. The article presents the results of applying this methodology to the Sustainable Development Goals in order to identify their relevance and whether some topics are not covered by them. Researchers could thus integrate this methodology into their daily work and be more in line with the needs that citizens express on social media.

One of the key aspects discussed internationally by the scientific community is how research can respond to citizens' concerns in different fields and how this knowledge can be made available to any citizen in any part of the world (Miyairi, 2014; Molloy, 2011; Whyte & Pryor, 2011; Woelfle, Olliaro & Todd, 2011). Hence, one current line of contributions addresses how different research areas can ensure the social impact of their research, considering social improvements (Flecha, Soler-Gallart & Sorde, 2015) as one of the criteria for evaluating this impact. For instance, one of the priorities that society is concerned about is violence against women, and one research trend focuses on finding ways to overcome it. A first step is to break the silence in those spaces where silence still prevails. This is the case of violence against women in Spanish universities (Valls, Puigvert, Melgar & Garcia-Yeste, 2016), where research evidence is contributing to this aim. There is a long list of examples of research in different scientific fields that addresses priorities set out by society, with corresponding evidence of success. For instance, research on improving soil quality while reducing soil nitrous oxide emissions (Cayuela et al., 2014) addresses a concern of the farming community: maintaining soil quality and reducing the negative effects of artificial products on agricultural fields.
A prior step, however, is to identify the relevance of research goals in order to address them (Schulz, 2016). The scientific community has developed different methodologies for identifying this relevance (Altmann, Whichard & Motter, 2013; Camarinha-Matos & Afsarmanesh, 2004; Valkila & Saari, 2013). These methods are mainly based on experts' views or on documentation and are useful for identifying current research trends. However, the dialogic turn in our societies (Aubert & Soler, 2006) is a cross-cutting development that affects all areas of society, including research. This means that including citizens' voices could help identify the relevance of research goals. In fact, the egalitarian participation of citizens together with researchers has a long trajectory in the communicative methodology (Gómez, 2014) as well as in fields such as dialogic leadership (Redondo, 2016).
One method for including the citizens' voices in the identification of relevance of research goals is through data collection in social media.
Citizens currently use social media and other relevant online sources to express their opinions and interests (Wandhöfer et al., 2012). Yet it is important to bear in mind the limitation that these interests represent only those citizens who use social media and online sources. Therefore, this article presents a methodology for identifying the relevance of research goals that considers citizens' voices collected on social media through two approaches. The first (top down) departs from the goals set out by supranational organizations such as the UN and collects social media data related to their corresponding keywords. The second (bottom up) gathers the most pressing issues and concerns present on social media. To exemplify this methodology, the article provides a comparative analysis of the results obtained through social media and other relevant online sources in relation to the priorities defined under the Sustainable Development Goals (SDG, hereinafter) of the United Nations.
International and Multidisciplinary Journal of Social Sciences, 6(1)

Methodology for the Identification of the Relevance of Research Goals through Social Media
The use of social media data as an information source for research purposes has increased in recent years across scientific fields and subject matters (Ngai, Tao & Moon, 2015; Wu, Sun & Tan, 2013). The number of citizens who use social media is increasing year by year. According to Statista,² there are an estimated 2.51 billion social media users around the globe; Facebook has 1.87 billion monthly active users and Twitter has 319 million. The content and communication that citizens share through social media therefore influence different sectors. For instance, this information is taken into account in business and marketing (Khang, Ki & Ye, 2012), in the political agenda (Bastos, Raimundo & Travitzki, 2013; Torres-Nabel, 2015), in news coverage (Broersma & Graham, 2013) and as a crucial means of saving lives in natural disasters (Bruns & Liang, 2012), among others. Along these lines, and with the aim of serving societal needs, research funded by different institutions must respond to the challenges that those institutions have set out in their priorities, which also need to be aligned with citizens' needs. Hence, this methodology contrasts data obtained through social media with the priorities designed by the corresponding institution or researchers, in order to identify the major trends among citizens' interests and to find out whether some of these trends are not covered by the priorities. The top down and bottom up approaches are designed to obtain these results.
Cabré-Olivé et al - Relevance of Research Goals on Social Media

Top Down Approach
The top down approach consists in defining keywords extracted from the research goals set by the corresponding institution, in order to check whether these goals are present in the opinions that citizens express on social media. Once the keywords are selected, they are converted into terms that can be searched on social media. In the case of Twitter, this conversion implies using hashtags. The scientific literature has found that the use of hashtags is one of the criteria for identifying relevant topics in the Twitter community (Grasso & Crisci, 2016; Small, 2011).
To provide an example, the 17 Sustainable Development Goals defined by the United Nations¹ have been converted into 30 searchable keywords on Twitter. This conversion consists in finding the most suitable hashtag for each of the goals.
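As a rough illustration of the keyword-to-hashtag conversion, the following Python sketch builds a candidate hashtag by joining the words of a goal keyword. The function name `to_hashtag` is hypothetical. The article selects the most suitable hashtag by judgement rather than by a mechanical rule, so the output should be read only as a starting candidate to be checked against actual usage on Twitter.

```python
import re

def to_hashtag(keyword: str) -> str:
    """Build a candidate hashtag by joining the capitalised words of a keyword."""
    # Only ASCII letters and digits are kept; this is a simplifying assumption.
    words = re.findall(r"[A-Za-z0-9]+", keyword)
    return "#" + "".join(word.capitalize() for word in words)

# Keywords named in the article; the resulting hashtags are candidates only.
keywords = ["Climate Action", "Gender Equality", "Clean Water"]
candidates = {keyword: to_hashtag(keyword) for keyword in keywords}
print(candidates)
```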
Once the list of keywords is ready, the next step is to capture the data from the selected social media and online sources. In this case, four sources are selected because they are popular among citizens and are included as information sources in Altmetrics. These sources are two social media (Facebook and Twitter) and two relevant online sources: Wikipedia (a free Internet encyclopaedia) and YouTube (a video sharing website). Contrasting information across different social media and online sources is considered an important step towards a more consistent map of the information obtained (Nam, Lee & Park, 2014).

Data Collection & Analysis
To capture and process data from Twitter and Facebook, a combination of two software programmes is used: the R-program and NVivo Plus. Wikipedia has an internal statistics tool, "Page view Statistics", and YouTube has its own search tool with a view count filter.

Facebook
Data collection on Facebook is carried out using the Rfacebook package installed in the R-program. In this case, the data obtained is the number of public pages on Facebook that contain the searchable keyword in their public name. Here "N" is also set to N = 10,000, but there is no time limitation. The "talking about" value indicates whether more or fewer people are interacting with a page.

Wikipedia
Data on Wikipedia are extracted through Wikipedia's internal statistics tool, "Page view Statistics". To analyse the presence of a keyword, it is first necessary to find out whether the keyword has a Wikipedia page. If it does, the number of page visits determines the value of its presence. Another indicator is the number of languages in which the definition of the keyword is available. The tool also provides the option to select the period to be analysed; in this case, the range chosen is 30 days. When some keywords are entered in the search tool, one is redirected to other keywords that are synonyms and that are quoted in parentheses. For instance, quality education is part of the more general keyword education, and the results of the latter are collected.
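The article reads page visits from Wikipedia's own statistics tool. As an alternative sketch, and not the procedure used here, daily page views can also be requested from the public Wikimedia pageviews REST endpoint. The Python code below only builds the request URL for a 30-day window and sums the `views` field of a decoded response; no network call is made. The URL layout reflects our understanding of that endpoint and should be checked against its documentation. The end date is arbitrary and the sample response is invented to show the expected shape.

```python
from datetime import date, timedelta
from urllib.parse import quote

API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"

def pageviews_url(article: str, end: date, days: int = 30,
                  project: str = "en.wikipedia") -> str:
    """URL for daily page views of `article` over the `days` days ending on `end`."""
    start = end - timedelta(days=days - 1)
    title = quote(article.replace(" ", "_"), safe="")
    return (f"{API_ROOT}/{project}/all-access/user/{title}/daily/"
            f"{start:%Y%m%d}/{end:%Y%m%d}")

def total_views(response: dict) -> int:
    """Sum the daily `views` values of a decoded JSON response."""
    return sum(item["views"] for item in response.get("items", []))

# "Education" is the general keyword mentioned in the text; the date is an assumption.
url = pageviews_url("Education", end=date(2017, 3, 13))
# Invented response fragment, NOT real page view data.
sample_response = {"items": [{"timestamp": "2017021200", "views": 100},
                             {"timestamp": "2017021300", "views": 150}]}
print(url)
print(total_views(sample_response))
```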

YouTube
Data collection on YouTube is conducted using YouTube's internal search engine and applying the "view counts" filter. Because YouTube hosts a very large number of videos, it does not indicate an exact number of results but an approximation, shown as "About X videos". To extract the data, the approximate number of videos is noted and the view counts of the five most viewed videos are added up. One strategy for refining results is to enter the keywords in double quotes. The relevance is therefore determined by the sum of the view counts of the five most viewed videos for each keyword. It is noteworthy that this is the only source where all the keywords searched are present.
Results obtained from each social media and online source analysed are ranked from larger to smaller presence. The ranking is useful for analysing which keywords are most used by the citizens who interact on these social media or online sources, and allows these keywords to be mapped. Finally, a global overview of the keywords' presence is obtained through the Ranking of Total online interactions, which is based on the sum of online interactions on the two social media selected (Twitter and Facebook) and the two relevant online sources (YouTube and Wikipedia).
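The YouTube relevance measure and the Ranking of Total online interactions can be sketched in a few lines of Python. The function names and all figures below are hypothetical placeholders, not results from this study; the sketch assumes that the counts from the four sources are simply added, as the text describes.

```python
def youtube_relevance(view_counts):
    """Sum the view counts of the five most viewed videos for one keyword."""
    return sum(sorted(view_counts, reverse=True)[:5])

def rank_total_interactions(counts):
    """Sum interactions per keyword over all sources, then sort from larger to smaller."""
    totals = {keyword: sum(by_source.values()) for keyword, by_source in counts.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Placeholder numbers for illustration only; they are NOT results from the article.
counts = {
    "Climate Action": {"twitter": 900, "facebook": 300, "wikipedia": 200,
                       "youtube": youtube_relevance([50, 40, 30, 20, 10, 5])},
    "Justice": {"twitter": 800, "facebook": 700, "wikipedia": 50, "youtube": 100},
    "Clean Water": {"twitter": 100, "facebook": 20, "wikipedia": 10, "youtube": 30},
}
ranking = rank_total_interactions(counts)
for position, (keyword, total) in enumerate(ranking, start=1):
    print(position, keyword, total)
```

Because raw counts are added, a source with much larger figures (for instance video views) can dominate the total, which is worth keeping in mind when reading such a ranking.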

Bottom Up Approach
The bottom up approach consists in identifying topics that emerge from the keywords most used by citizens on different social media and online sources. Once the list of topics is obtained, the results are contrasted with the priorities defined by the corresponding research institution to analyse whether some issues are not covered by the institution. In this case, the list of topics obtained is contrasted with the Sustainable Development Goals defined by the United Nations.

Data Collection & Analysis
There are two strategies for collecting data in this approach. The first is to analyse secondary reports published by social media that list the topics with a large presence on them. For instance, Facebook produces a report on the most talked about global topics of the year; for the present article the 2016 report is analysed.³ Twitter also produces a list of the trending topics (TT) of the year, so the 2016 trending topics of Twitter are analysed.⁴ Lastly, the report on Wikipedia's popular pages⁵ during a year is also used to identify the topics receiving most online attention. In this case the report on the 5,000 popular pages of 2016 is selected in order to analyse the first 500 popular pages. The second strategy is to monitor in real time the Trending Topics of Twitter in selected countries during a defined period. For this monitoring, the R-program is used to extract the 50 Trending Topics. For the present research the code was designed to extract the 50 Trending Topics at two moments of the day (noon and night) during one week (March 7-March 13, 2017) in 14 countries (Argentina, Belgium, Brazil, Canada, France, India, Ireland, Italy, Kenya, Mexico, Nigeria, Spain, United Kingdom, United States).
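To make the size of the monitoring design concrete, the following Python sketch enumerates the collection snapshots implied by the description: 14 countries, 7 days and 2 moments per day, with up to 50 Trending Topics per snapshot. It only builds the schedule and does not call Twitter. The variable names are hypothetical, and the total is an upper bound, since the same topic can recur across snapshots.

```python
from datetime import date, timedelta
from itertools import product

COUNTRIES = ["Argentina", "Belgium", "Brazil", "Canada", "France", "India", "Ireland",
             "Italy", "Kenya", "Mexico", "Nigeria", "Spain", "United Kingdom",
             "United States"]
MOMENTS = ["noon", "night"]
FIRST_DAY = date(2017, 3, 7)
DAYS = [FIRST_DAY + timedelta(days=offset) for offset in range(7)]
TOPICS_PER_SNAPSHOT = 50

# One snapshot per (country, day, moment) combination.
schedule = list(product(COUNTRIES, DAYS, MOMENTS))
max_entries = len(schedule) * TOPICS_PER_SNAPSHOT

print(len(schedule))   # 196 snapshots
print(max_entries)     # at most 9800 topic entries, including repetitions
```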
In both strategies, the selection of topics includes all keywords related to citizens' interests and excludes topics related to TV shows, music, sports, religion, geography and entertainment, which would be a subject matter for other research. Once the list is obtained, results are contrasted with the Sustainable Development Goals in order to identify whether some issues are not covered by the institution's priorities.
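The selection and contrast steps can be sketched as follows in Python, assuming that each topic has already been assigned a theme by hand. The theme labels, the function name `split_by_coverage` and the entry `#SomeTVShow` are hypothetical, and only a subset of the goals is listed for brevity.

```python
EXCLUDED_THEMES = {"tv shows", "music", "sports", "religion", "geography", "entertainment"}
# Subset of goal names used as themes, for illustration only.
SDG_THEMES = {"gender equality", "quality education", "climate action"}

def split_by_coverage(tagged_topics):
    """Drop excluded themes, then separate topics covered by the goals from the rest."""
    covered, uncovered = [], []
    for topic, theme in tagged_topics:
        theme = theme.lower()
        if theme in EXCLUDED_THEMES:
            continue
        (covered if theme in SDG_THEMES else uncovered).append(topic)
    return covered, uncovered

# Hand-tagged examples; the tagging itself is an assumption of this sketch.
tagged = [
    ("#IWD17", "Gender Equality"),
    ("#opendata", "Open Data"),
    ("#SomeTVShow", "TV shows"),
    ("#Lecturestrike", "Quality Education"),
    ("#privacy", "Privacy"),
]
covered, uncovered = split_by_coverage(tagged)
print(covered)
print(uncovered)
```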

Results
Results are classified in two sections: those obtained through the top down approach and those obtained through the bottom up approach. The first section analyses which keywords of the Sustainable Development Goals have a larger presence on each social media and online source selected, and gives an overview of total online interactions for each keyword. This comparative analysis shows which topics receive most attention from citizens. The results obtained through the bottom up approach, in turn, are useful for identifying emerging topics that are not yet covered under the Sustainable Development Goals and that could be relevant information for developing future strategies.

Top Down Results
The keywords with the highest presence on Twitter are Climate Action and Justice. They are followed by keywords with medium-high presence: social topics such as gender equality or the eradication of hunger around the globe, concerns related to the preservation of the Earth and its natural resources (Climate Change, Clean Energy, Clean Water, Biodiversity), and infrastructure. On the other hand, the five keywords with lowest presence are Inclusive Societies, Ocean Sustainability, Peaceful Society, and Responsible Production. Combat Desertification and Global Partnership do not have any tweets.
The keywords appearing in the public names of the Facebook pages with most "talking about" are Industry and Justice. The latter coincides with one of the keywords with the highest presence on Twitter. Innovation, Infrastructure and Good Health are keywords with a medium-high presence on Facebook pages. On the other hand, the five keywords with lowest presence are Reduce inequalities, Global Partnership for Sustainable Development, Decent Work and Inclusive Societies. The keyword Ocean Sustainability appears in the names of two Facebook pages, but nobody is talking about it.
As a result of this comparison, the relevance of the Sustainable Development Goals for citizens, considering the online presence obtained, is the following.

Bottom Up Results
Bottom Up results are obtained through two strategies: secondary analysis of the social media reports and analysis of 50 Trending Topics of Twitter in 14 countries during one week. The topics of the secondary analysis were extracted from the yearly reports of Facebook, Twitter and Wikipedia.

Topics Emerged through Secondary Analysis
Facebook and Twitter provide ten issues as main topics of the year 2016. Among them, politics is prominent: the US Presidential Election 2016 is present in the three sources analysed, Brazilian politics is a topic collected in Facebook's report, and Euro2016 comes from Twitter's report. Brexit is also a recurrent issue on Facebook and Twitter; it was a relevant concern for citizens of the United Kingdom, but citizens of other countries also paid attention to it. Two controversial presidential candidates were topics during 2016: Rodrigo Duterte, running in the Philippines' presidential election, was a topic on Facebook, and Trump was a topic on Twitter. Both candidates won their elections. Black Lives Matter was also a relevant topic on Facebook and Twitter during 2016. This international activist movement began on Twitter under the hashtag #BlackLivesMatter and aims to combat violence against black people resulting from systemic racism. In the case of Wikipedia, most attention is paid to violent historical events (World War I, World War II, September 11 attacks, Syrian Civil War, and Ku Klux Klan); the list of stock market crashes and bear markets is also a relevant topic. Social media and other online sources are themselves a relevant topic on Wikipedia: Facebook, Google, YouTube and Gmail are among the most accessed pages, as are other technological concepts such as Java (programming language). This matches the current "learn to code" movement and the aim of teaching young people to program from childhood.
There are three more topics: Earth, which reflects curiosity about the place where humans live and how it is defined; Millennials, a topic that defines a generation that is currently changing the work culture, with familiar use of social media and technology and with greater social consciousness; and the Zika virus, which was an important health concern during 2016.

Topics Emerged from the Monitoring of Trending Topics in Twitter
The topics that emerged from the analysis of Trending Topics on Twitter are diverse, but there is a high presence of the women's movement, probably because the week chosen included March 8, International Women's Day. At the same time, other topics are present in these Trending Topics, such as violence against women, fraud and civil movements. Results are introduced in two steps: first, an overview of the topics selected in each country and, second, a table comparing the topics selected with the Sustainable Development Goals in order to find out whether some topics are not yet taken into account by the United Nations. For this second step, a classification of Trending Topics (TT, hereinafter) is developed that distinguishes between those topics that are included among the 17 SDG and those that are not.
Three TT are directly related to the goal of No Poverty, for instance the hashtag #pobreza. There are 34 TT related to the goal of Gender Equality. Most of them focus on the dissemination of International Women's Day and equality (ex: #IWD17, #JourneeDesDroitsDesFemmes, #genderequality, among others); others focus on the sexual and reproductive rights of women (ex: #March4Repeal) and on violence against women (ex: #ReportItToStopIt). Clean Water and Sanitation has only 1 TT, #Right2Water. Industry, Innovation, Infrastructure is another SDG with many TT, specifically 24, with hashtags such as #innovation for the keyword innovation; the TT related to infrastructure and transport (ex: #Viasexpressas, #Transporte) are the most used. Good Health and Well-being has 7 TT on health from a global perspective and on mental health in particular (ex: #united4cymh). Climate Action has one TT, #NousAccusons. There are 28 TT related to the goal of Justice, Peace and Strong Institutions. Some concern terrorism (ex: #BokoHaram #ISIS); others are against fraud (ex: #NoFrauds, #STOPCorrupció) or relate to political elections (ex: #VoteNowKenyaDecides), justice (ex: Justiça Federal, #SupremeCourt) and peace (ex: #Kenyans4peace). There are 7 TT related to Quality Education, most of them mobilizations to claim rights in education, including teachers' rights (ex: #Parodocentenacional, #Lecturestrike). There are 10 TT related to the SDG Affordable and Clean Energy, focused mostly on clean energy (ex: #cleanenergyEU) and on specific energy sources (ex: #solarpower). Only one TT is related to Life on Land, #biología. There are 15 TT related to the goal Decent Work and Economic Growth, with topics focused on self-employment (ex: #lavoroautonomo, #IniciativaPymes, #selfemployed), workers' rights (ex: #MarchaCGT) and investment in future work (ex: #futureofwork); TT related to the general economy and budgets are also present (ex: #Transatlantic2017, #budget2017). There are two TT related to Zero Hunger, focused on food and waste (ex: #UCCFoodmatters, #foodwaste). There are six TT related to Sustainable Cities and Communities, focused on smart cities and innovation, fair cities (ex: #smartcitybru, #CITInnov8) and the claim for the right to housing (ex: #ViviendaPorMéxico, #LaPAHsePlanta). There are 7 TT related to Reduced Inequalities (ex: #equality, #equidadenpuebla, #egalitécompte). Lastly, no Trending Topics were identified for the following three SDG: Responsible Consumption and Production, Life below Water, and Partnerships for the Goals.
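The counts reported in this paragraph can be tallied and ranked with a short Python sketch. The figures are those stated in the text; the variable names are hypothetical, and the three goals without Trending Topics are entered with a count of zero.

```python
# Number of Trending Topics (TT) per goal, as reported in the text above.
tt_per_goal = {
    "No Poverty": 3,
    "Zero Hunger": 2,
    "Good Health and Well-being": 7,
    "Quality Education": 7,
    "Gender Equality": 34,
    "Clean Water and Sanitation": 1,
    "Affordable and Clean Energy": 10,
    "Decent Work and Economic Growth": 15,
    "Industry, Innovation, Infrastructure": 24,
    "Reduced Inequalities": 7,
    "Sustainable Cities and Communities": 6,
    "Responsible Consumption and Production": 0,
    "Climate Action": 1,
    "Life below Water": 0,
    "Life on Land": 1,
    "Justice, Peace and Strong Institutions": 28,
    "Partnerships for the Goals": 0,
}

# Rank goals from most to fewest TT and list the goals without any TT.
ranking = sorted(tt_per_goal.items(), key=lambda item: item[1], reverse=True)
total_tt = sum(tt_per_goal.values())
goals_without_tt = [goal for goal, count in tt_per_goal.items() if count == 0]

print(ranking[:3])
print(total_tt)          # 146 TT classified under the goals
print(goals_without_tt)
```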
The Trending Topics identified that are not included in the Sustainable Development Goals are the following. There are topics related to Open Data (ex: #opendata, #openbelgium, #opensource) and debates focused on privacy and transparency of information (ex: #privacy #wikileaks). Debates around opening code or learning how to code are also present among citizens (ex: #Rewritingthecode, RailsgirlsDUB). There is concern about racism (ex: #muslimban, #JusticePourSofiane) and about the most vulnerable people living in bad conditions, for instance refugees (ex: #refugees #patera), as well as about the recognition of minorities' rights (ex: #viacampensina, #NativeNationsRise) and solidarity itself (ex: #solidarity). Another trending topic is health; particularly striking are topics such as cancer (ex: #bloodcanferconf, #cancernoonshot) and kidneys, but also issues related to the health system (ex: #HealthStrikeDay94, #Trumpcare). Concerning the economy, there are trending topics on new developments such as bitcoin or the circular economy, and on peer-to-peer investment for implementing new initiatives (ex: #PeerFundingPays, #PeerFundYourProject). There are worries related to disasters, for instance the consequences of Fukushima, as well as the scandal over not having a team to save Nyamai, and disagreement with the information disseminated by the media (ex: #ShameonKmedia). It is also important to highlight two trending topics defending the need to keep Europe united and the benefits of doing so (ex: #FutureofEurope, #pulseofeurope). There are also concerns related to the consequences of breaking away from Europe, as with Brexit and the Trending Topic #brexitbill. And the last trending topic collected, which is