Identification of mobile development issues using semantic topic modeling of Stack Overflow posts

Background Increasing demands for mobile apps and services have recently led to an intensification of mobile development activities. With the proliferation of mobile development, there has been a major transformation in the architectures, paradigms, knowledge domains and skills of traditional software systems towards mobile development. Therefore, mobile developers experience a wide spectrum of issues specific to development processes of mobile apps and services. Methods In this article, we conducted a semantic content analysis based on topic modeling using mobile-related questions on Stack Overflow, a popular Q&A site for developers. With the aim of providing an understanding of the issues and challenges faced by mobile developers, we used a semi-automated methodology based on latent Dirichlet allocation (LDA), a probabilistic and generative approach for topic modeling. Results Our findings revealed that mobile developers’ questions focused on 36 topics in six main categories, including “Development”, “UI settings”, “Tools”, “Data Management”, “Multimedia”, and “Mobile APIs”. Besides, we investigated the temporal trends of the discovered issues and their relationships with mobile technologies. Our findings also revealed which issues are the most popular and which issues are the most difficult for mobile development. The methodology and findings of this study have valuable implications for mobile development stakeholders including tool builders, developers, researchers, and educators.


INTRODUCTION
In today's digital world, with the spread of mobile devices and technologies, the demands for mobile-oriented services and apps in all industrial and social fields are increasing exponentially day by day. Every day, a large number of mobile apps are presented for users in the Android Play store and Apple App store (Jabangwe, Edison & Duc, 2018). This strong increase in demand for mobile services and apps has led to the intensification of mobile software development activities and the emergence of mobile software engineering as a contemporary discipline (Ahmad et al., 2018b). The advent of mobile software engineering has also led to a major transformation in traditional software engineering architectures, methodologies, knowledge domains, and skills (Dhillon & Mahmoud, 2015; Nagappan & Shihab, 2016; Ahmad et al., 2018b). For this reason, an increasing number of identified. More specifically, the methodology of this study was designed to seek answers to the following research questions (RQ):
RQ1. What issues are asked about mobile development?
RQ2. What are the most difficult and most popular issues for mobile development?
RQ3. How have the trends of mobile issues changed over time?
RQ4. What are the most asked mobile technologies?
RQ5. How have the trends of mobile technologies changed over time?

BACKGROUND AND RELATED WORK
We base the background of our study on three fundamental pillars and discuss the related work under three headings: mobile development, Stack Overflow, and latent Dirichlet allocation.

Mobile development
Mobile developers design, develop, and execute applications for smartphones and other mobile devices. They usually develop mobile applications for a specific operating system, such as Android or iOS, or for cross-platform environments (Nagappan & Shihab, 2016). Therefore, mobile development involves paradigms and methodologies that differ from those in traditional software engineering (Elsayed et al., 2019; Gurcan et al., 2022b). In the process of developing mobile applications that have become a part of our lives, mobile developers face a wide spectrum of experiences, issues, and challenges unlike common developer issues (Ahmad et al., 2018b). From this perspective, the processes, experiences, and practices of mobile development contain domain-specific issues and challenges faced by the developers (Nagappan & Shihab, 2016).
Given the issues and challenges of mobile development, we have observed that the studies so far have focused on issues and challenges in specific contexts of mobile development, such as platform-specific issues including native, cross-platform, or hybrid development (El-Kassas et al., 2017), mobile cloud computing (Malik et al., 2021), development process and life-cycle (Jabangwe, Edison & Duc, 2018), testing (Zein, Salleh & Grundy, 2016), usability and UI design (Taba et al., 2017), app security and privacy (Gurcan et al., 2022a), and tools and frameworks (El-Kassas et al., 2017).
More specifically, a number of empirical studies, partly similar to our current study, have been conducted with the aim of identifying mobile development issues. In an empirical study, Rosen & Shihab (2016) analyzed the Stack Overflow data dump using a topic modeling approach to examine what mobile developers are asking about. They revealed 40 topics and 32 different categories mapping mobile development issues. In a study with a similar perspective, Linares-Vásquez, Dit & Poshyvanyk (2013) applied topic modeling to the Stack Overflow data dump to extract the trending topics from mobile development questions. Beyer & Pinzger (2014) performed a manual categorization of Android app development issues on Stack Overflow considering problem types. Villanes et al. (2017) conducted a study using Stack Overflow data in order to analyze and cluster the main topics on Android testing. Ahmad et al. (2019) identified the topics and trends of non-functional requirements for development of iOS applications using Stack Overflow data. Also, Fontão et al. (2018) explored main topics and indicators in the mobile software ecosystem by analyzing technical questions about mobile platforms on Stack Overflow. Apart from the aforementioned studies using Stack Overflow data, in a qualitative study, mobile challenges were identified through a systematic literature review and then validated by interviewing practitioners (Ahmad et al., 2018b). Besides, Pandey, Litoriya & Pandey (2018) identified 14 mobile issues and, using an interpretive structure modeling (ISM) approach, categorized them into four groups: dependent, driving, linkage, and autonomous.
Furthermore, researchers have so far conducted many empirical studies using Stack Overflow data in order to shed light on many aspects of software development issues and challenges (Ahmad et al., 2018a). In particular, some remarkable studies which use Stack Overflow data have been carried out in order to discover main issues and challenges in specific sub-domains of software development such as testing (Kochhar, 2016), security (Yang et al., 2016), mobile development (Linares-Vásquez, Dit & Poshyvanyk, 2013; Rosen & Shihab, 2016), programming languages (Chakraborty et al., 2021), requirements (Zou et al., 2017), concurrency development (Ahmed & Bagherzadeh, 2018), IoT development (Uddin et al., 2021), and machine learning (Alshangiti et al., 2019). Beyond the aforementioned studies, a comprehensive collection of research which uses Stack Overflow data from 2009 to date is provided in more detail by Vasilescu on Stack Exchange Meta (Vasilescu, 2014). Also, Ahmad et al. (2018a) conducted a comprehensive literature review and categorized 166 research articles (from 2008 to June 2016) using Stack Overflow data. In summary, our current study complements the aforementioned work since our methodology focuses on in-depth analysis of the mobile-related posts shared on Stack Overflow in order to identify mobile development issues and their trends.

Latent Dirichlet allocation
Topic modeling encompasses a set of methods, procedures, and tools that enable the discovery of hidden semantic structures, called topics, in large collections of textual information (Blei, 2012). In text mining and natural language processing implementations, topic modeling approaches are widely used for semantic content analysis of document collections. Among topic modeling algorithms, latent Dirichlet allocation (LDA) is a generative probabilistic approach widely used for topic modeling of document collections (Blei, Ng & Jordan, 2003). The intuitive idea behind LDA is based on the assumption that each document is characterized by more than one topic, and each topic is characterized by a distribution of words in an empirical corpus. LDA treats the words and documents observed in a corpus as being created by an underlying topic structure.
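The generative assumption described above is commonly written as follows. This is the standard smoothed LDA formulation rather than anything specific to this study; the symbols D (number of documents), N_d (number of words in document d), θ_d, φ_k, z and w are notation introduced here for illustration, while α, β and K match the prior parameters used later in the method.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K
  && \text{(word distribution of topic } k\text{)} \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D
  && \text{(topic distribution of document } d\text{)} \\
z_{d,n} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), & n &= 1,\dots,N_d
  && \text{(topic assigned to the } n\text{-th word)} \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Categorical}(\phi_{z_{d,n}})
  &&&& \text{(observed word)}
\end{align*}
```

Only the words w are observed; inference recovers the latent θ, φ and z from them.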
Computing the exact posterior distribution needed to extract the hidden topic structure of the documents is intractable. Therefore, various techniques have been developed for approximate inference, including Gibbs sampling (Griffiths & Steyvers, 2004) and variational Bayes approximation (Blei, Ng & Jordan, 2003). Each of these inference techniques has distinct advantages and disadvantages that are traded off in terms of speed, complexity, accuracy, and simplicity (Vayansky & Kumar, 2020; Gurcan, 2023).
Since the LDA model is based on unsupervised machine learning, it enables the discovery of semantic topics in a short time without the need for any labeled training data (Blei & Lafferty, 2007; Gurcan & Cagiltay, 2022). Beyond textual data, the LDA model can be effectively applied to different types of data, such as genetic data, software code, images, videos, forums, blogs, and social networks (Blei, 2012; Silva, Galster & Gilson, 2021). Because of these supportive features, the LDA model is widely considered a robust and efficient approach for semantic content analysis that automates the detection of latent topics in the textual contents of a huge corpus (Blei, 2012; Gurcan et al., 2022b).
Since its emergence, the LDA model has also often been used in software engineering research to analyze structured or unstructured data in software repositories, such as natural language texts, web archives, log files, source code, mailing list archives, bug reports, Git repositories, Q&A posts, and requirements documents (Silva, Galster & Gilson, 2021; Gurcan et al., 2022a; Gurcan, 2023). Considering the studies specific to mobile development, a number of studies used the LDA model to investigate mobile development issues asked on Stack Overflow (Linares-Vásquez, Dit & Poshyvanyk, 2013; Rosen & Shihab, 2016; Villanes et al., 2017; Fontão et al., 2018; Ahmad et al., 2019); to analyze users' feedback, reviews and ratings for mobile apps (McIlroy et al., 2016; Hu et al., 2019; Noei et al., 2019); to extract features from mobile app descriptions and recommend new features for similar apps (Jiang et al., 2019); to detect permission reauthorization vulnerabilities in Android apps (Demissie, Ceccato & Shar, 2020); to reveal the usage of common interface elements in Android apps (Taba et al., 2017); and to distinguish malicious Android apps (Yang et al., 2017).
As seen from the aforementioned studies, the LDA-based topic modeling approach is widely used in software engineering research. Considering this background, Silva, Galster & Gilson (2021) conducted a comprehensive literature review which revealed the usage of topic modeling in software engineering research. From a similar point of view, Chen, Thomas & Hassan (2016) performed a survey on the use of topic models when mining software repositories. Apart from these, a large number of other studies are based on LDA. As a result, the effectiveness and suitability of this topic modeling approach for software engineering research further increases our motivation to use the LDA model for investigating mobile development issues.

METHOD
Data collection and extraction
With the intention of achieving an objective methodology, we used the Stack Overflow (SO) data dump, which is publicly available as XML files (Internet Archive, 2023). In the first step, we downloaded the SO data dump in XML format (posts.xml, last updated March 21, 2022) and parsed it into a PostgreSQL database. This parsed data file contained a total of 55,027,254 posts (22,100,401 questions and 32,926,853 answers) from July 2008 to March 2022. Each post in the data dump contains various metadata elements such as title, body, tags, and so on. The datasets generated and analyzed during the current study are publicly available in the Internet Archive repository (Internet Archive, 2023), as a data dump including XML files.

Identification of mobile-related posts
Posts on Stack Overflow cover a wide range of expertise, experience, and knowledge domains for developers. The posts recorded in the database may relate to any subject, whereas in this study we are only interested in mobile-related posts and aim to extract posts within this scope. From this perspective, we aimed to present an effective and objective approach to identify only mobile-related posts in a systematic way. To achieve this, we created a first draft list containing the keywords related to mobile development and presented them in Table A1.
At this stage, we identified the primary mobile keywords, taking into account previous studies (Linares-Vásquez, Dit & Poshyvanyk, 2013; Rosen & Shihab, 2016) and Stack Overflow's annual developer surveys (Stack Overflow, 2022). Although the list in Table A1 does not include all mobile-related keywords, as a first draft, these keywords consist of fundamental components of mobile development such as operating systems, hardware, development platforms and SDKs. Namely, beyond the mobile-related posts obtained using these initial keywords, we envisaged that mobile-related tags cover a wider range and that there may be many mobile tags that are not included in Table A1. Therefore, we performed a number of sequential procedures to identify additional mobile-related tags in a systematic way. Initially, we identified all Stack Overflow posts that contain any of the initial keywords listed in Table A1. Then, we extracted the tags for each of these posts. The tags represent keywords that users associate with their questions. Thus, we extracted all the tags for these mobile posts and obtained a larger set of tags. This approach allowed us to discover new tags and thus get a richer set of mobile posts. On the other hand, the drawback of this approach was that it could also bring a large number of posts not related to mobile into the dataset. For example, consider a post with the three tags "android", "testing", and "unit-testing". Suppose we then include all posts that contain any of these three tags. In such a case, some of the included posts may be related to the testing process of a desktop or web application, because the "testing" tag is not specific to mobile.
In another example, although Java is a common language for Android apps, posts with the tag "java" may not always be mobile related, because Java is also used on many platforms other than mobile. Therefore, many of the posts with the "java" tag may be related to development issues outside mobile. In such cases, adding all posts having a "java" tag would cause posts unrelated to mobile to be included in the dataset. Accordingly, this process would lead to significant noise in the dataset.
In order to overcome this problem, we employed a set of procedures based on quantitative approaches. In the first step, we extracted all the tags of the posts containing the keywords in Table A1 and defined these tags as candidate tags ($C_t$). At this stage, we aimed to identify the mobile-related ones among these candidate tags and thereby obtain more tags. To do so, and to calculate how relevant each candidate is to mobile, we defined three variables, $Var_A$, $Var_B$, and $Var_C$, for each candidate tag $C_t$. Specifically, $Var_A$ indicates the number of posts that contain both the candidate tag $C_t$ and at least one of the mobile keywords in Table A1 within their tags. $Var_B$ indicates the number of posts that contain the candidate tag $C_t$ in their tags among all posts. Considering $Var_A$ and $Var_B$, we defined a tag relevance score ($TRS_t$) for each candidate tag $C_t$ as follows:

$$TRS_t = \frac{Var_A}{Var_B}$$

$TRS_t$ indicates how relevant the candidate tag $C_t$ is to mobile. The value of $TRS_t$ ranges from 0 to 1. The greater the value of $TRS_t$, the more relevant the candidate tag $C_t$ is to mobile. A value of $TRS_t$ equal to 1 indicates that the candidate tag $C_t$ appears only together with the mobile tags in Table A1. With this in mind, we performed a number of experiments with different $TRS_t$ thresholds. We manually evaluated the tags listed as the output of each experiment and concluded that a $TRS_t$ threshold of 53% produced optimal results without being too restrictive. After excluding irrelevant tags using $TRS_t$, we detected very low-importance tags that were related to very rare issues appearing in only one or two posts. The value of $TRS_t$ is 1 for a candidate tag $C_t$ that appears in only one post (e.g., "android-iconics", "android-content-capture" or "android-device-controls"). In this case, the inclusion of such low-frequency tags would increase the amount of unimportant data in the dataset. In order to solve this problem, we set another threshold value called the tag significance score ($TSS_t$) for each candidate tag $C_t$, calculated as follows:

$$TSS_t = \frac{Var_A}{Var_C}$$

In this formula, $Var_C$ is the number of mobile posts containing the most popular mobile tag. In the corpus of our study, the tag "android" was the most common, contained within 465,178 posts (from 2017 to 2021). We tried different $TSS_t$ thresholds and evaluated the tags listed as the outputs of each experiment. After that, we concluded that the inclusion of candidate tags $C_t$ with $TSS_t$ values of 0.5% and above gives optimal results. Finally, the list of 66 identified tags and their $TRS_t$ and $TSS_t$ values are given in Table A2. The tags in Table A2 are sorted by $TSS_t$ in descending order. Approaches similar to the one we used to identify mobile-related tags have also been used in previous studies for investigation of different sub-contexts of Stack Overflow (Rosen & Shihab, 2016; Yang et al., 2016; Uddin et al., 2021).
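The two-threshold tag filter described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `score_candidate_tags`, the in-memory list-of-tag-lists input, and the reading of TRS as Var A / Var B and TSS as Var A / Var C (inferred from the variable definitions in the text) are all assumptions made here.

```python
from collections import Counter

def score_candidate_tags(posts, seed_keywords, trs_min=0.53, tss_min=0.005):
    """Return {tag: (TRS, TSS)} for candidate tags passing both thresholds.

    posts          -- iterable of tag lists, one list per post (hypothetical input shape)
    seed_keywords  -- initial mobile keywords (the role Table A1 plays in the text)
    trs_min/tss_min -- thresholds; 53% and 0.5% are the values reported in the text
    """
    seeds = set(seed_keywords)
    var_a = Counter()  # posts with the tag AND at least one seed keyword
    var_b = Counter()  # posts with the tag, among all posts
    for tags in posts:
        tag_set = set(tags)
        is_mobile = bool(tag_set & seeds)
        for tag in tag_set:
            var_b[tag] += 1
            if is_mobile:
                var_a[tag] += 1
    if not var_a:
        return {}
    var_c = max(var_a.values())  # mobile posts carrying the most popular mobile tag
    selected = {}
    for tag, a in var_a.items():
        trs = a / var_b[tag]  # relevance: share of the tag's posts that are mobile
        tss = a / var_c       # significance: size relative to the most popular tag
        if trs >= trs_min and tss >= tss_min:
            selected[tag] = (trs, tss)
    return selected

# Toy data (invented for illustration only).
toy_posts = [
    ["android", "java"], ["android", "kotlin"], ["java", "spring"],
    ["java", "swing"], ["android", "android-iconics"], ["ios", "swift"],
]
kept = score_candidate_tags(toy_posts, {"android", "ios"}, trs_min=0.53, tss_min=0.3)
```

On the toy data, "java" is rejected by the relevance threshold because only one of its three posts is mobile, which mirrors the "java" example in the text; raising `tss_min` would additionally drop one-off tags such as "android-iconics".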

Creation of empirical corpus of mobile posts
In this study, we aimed to reveal the landscape of themes and trends in mobile development in more detail, especially in recent years, so we included the posts covering the last 5 years, from January 1, 2017 to January 1, 2022, in our experimental dataset. To this end, we extracted mobile-related posts shared in the last 5 years using the final set of mobile tags given in Table A2. Initially, we identified all question posts containing the tags in Table A2 within the tags assigned to each question post. Next, we extracted the answers and descriptive indicators (title, body, creation date, favorite count, comment count, view count, score, answer count, etc.) of these questions. These extracted question and answer posts constitute the final empirical dataset used in our experiments. In total, our empirical corpus contains 2,242,504 posts including 1,036,682 questions and 1,205,822 answers. The monthly distribution of the number of Stack Overflow posts over the last 5 years is given in Fig. A1. According to Fig. A1, the number of questions and answers has been decreasing over time, and since the end of 2020 the number of answers has fallen below the number of questions. During this period, mobile-related post counts compared to all posts ranged from 8% to 14% per month, as shown in Fig. A2. The significant decrease in the quantity of posts observed in the latter period of 2020 can be attributed to the onset of the COVID-19 pandemic, as evidenced by Figs. A1 and A2.

Topic modeling using LDA
At this stage, we conducted a semantic content analysis on SO mobile posts using latent Dirichlet allocation (LDA), a probabilistic approach for topic modeling, in order to reveal the most common issues faced by mobile developers. In text mining and natural language processing research, topic modeling provides a systematic methodology to discover the latent semantic structure of a document collection. In this respect, a number of topic modeling approaches are available, such as Latent Dirichlet Allocation (LDA), Latent Semantic Indexing (LSI), Hierarchical Dirichlet Process (HDP), Non-Negative Matrix Factorization (NMF), and Dirichlet Multinomial Regression (DMR) (Gurcan & Cagiltay, 2022). Among the topic modeling approaches, LDA is a generative model whose capability and efficiency are widely accepted for research based on semantic text mining, and therefore it is extensively preferred in software engineering research (Silva, Galster & Gilson, 2021; Gurcan et al., 2022b). In addition, a remarkable body of work used LDA to implement topic modeling on a number of sub-contexts of Stack Overflow (Silva, Galster & Gilson, 2021). LDA discovers the topics by grouping words that tend to co-occur in text documents within the experimental corpus and that together form a semantic integrity (Blei, 2012). It uses the frequencies of words in documents and their co-occurrence frequencies in order to create a topic model of related words. The LDA model also provides a number of well-organized methods for estimating the optimal number of topics, calculating the coherence score of discovered topics, and optimizing the topic-term distribution (Blei, Ng & Jordan, 2003).
Therefore, the LDA model was used in this study for the topic modeling analysis of our experimental corpus containing a very large number of mobile-related posts on Stack Overflow. In the following, we describe how the LDA model was fitted and applied to our corpus. Initially, the preprocessing steps necessary to increase the success of the topic modeling analysis were applied to the corpus (Gurcan, 2023). In the first step, we included only the titles of the question posts in our corpus for topic modeling analysis, disregarding all other metadata (Rosen & Shihab, 2016). This is because the titles are the part that best demonstrates the focal points and concepts of the issue emphasized in the posts. On the other hand, the body of the questions may contain extra information that is irrelevant to the main idea of the question (the questioner's previous experiences and comments on the problem, previously tried methods and code snippets, other factors that triggered the problem, and so on) (Rosen & Shihab, 2016). This extra information creates noise in the empirical data. Consequently, as we focused on what issues developers were asking about, we excluded the body of the questions as well as the answer posts, and created a corpus containing only the titles of the question posts for the topic modeling analysis (Rosen & Shihab, 2016). In the second step, tokenization, lowercase conversion, removal of numbers and punctuation, removal of stop words, and lemmatization were performed on this corpus using Gensim (Řehůřek & Sojka, 2011), a pure Python library. Thus, data preprocessing was completed and the empirical corpus was brought into the form required for LDA-based topic modeling analysis (Řehůřek & Sojka, 2011; Gurcan et al., 2023).
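The title-level preprocessing steps listed above can be illustrated with a small standard-library sketch. It is a stand-in for illustration, not the Gensim pipeline the study used: the function name `preprocess_title` and the tiny `STOP_WORDS` set are hypothetical, only ASCII letters are kept, and lemmatization is deliberately omitted because it needs an external lexical resource.

```python
import re

# Tiny illustrative stop-word list; a real run would use a full list.
STOP_WORDS = {
    "a", "an", "the", "to", "in", "on", "of", "for", "how", "is",
    "with", "and", "i", "my", "do", "can", "not", "when", "from",
}

def preprocess_title(title, stop_words=STOP_WORDS):
    """Lowercase, strip digits/punctuation, tokenize, and drop stop words.

    Lemmatization (the last step named in the text) is NOT performed here.
    """
    text = title.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # digits and punctuation become spaces
    tokens = text.split()                  # whitespace tokenization
    return [t for t in tokens if t not in stop_words and len(t) > 1]

tokens = preprocess_title("How to fix NullPointerException in Android Studio 4.1?")
# tokens == ["fix", "nullpointerexception", "android", "studio"]
```

Each question title becomes one short token list, which is the document unit the topic model is later fitted on.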
In the topic modeling stage, we used Gensim (Řehůřek & Sojka, 2011), a pure Python library developed for text preprocessing and topic modeling, to fit the LDA-based topic model to our corpus. Firstly, the values of the prior parameters (α, β, and K) were specified to fit and optimize the LDA model on the empirical corpus. The α parameter, which governs the distribution of topics in documents, was set to α = 0.1, and the β parameter, which governs the distribution of words in topics, was set to β = 0.01, considering previous work on the topic modeling of short texts (Zuo et al., 2016; Vayansky & Kumar, 2020; Gurcan & Cagiltay, 2022). The other parameter used to obtain the ideal model was the K parameter, which indicates the number of topics. The higher the K value, the more fine-grained the topics obtained, while the lower the K value, the more coarse-grained the topics obtained. With the aim of choosing the ideal number of topics, the LDA model was fitted with each K value in the range K ∈ {10, 11, 12, …, 50}. Concurrently with this process, a coherence score ($C_V$) was calculated for the topic model fitted with each K value (Řehůřek & Sojka, 2011). As a result, the maximum coherence score ($C_V$ = 0.4189) was obtained for K = 36 topics (see Fig. A3), which was therefore selected as the number of topics for the final model.
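A sweep over K of the kind described above might look like the following with Gensim. This is a hedged sketch rather than the study's actual script: the helper names `coherence_by_k` and `best_k`, the fixed random seed, and the mapping of the paper's β onto Gensim's `eta` argument are assumptions made here; Gensim is imported lazily so that the selection helper can be used on its own.

```python
def coherence_by_k(token_lists, k_values, alpha=0.1, eta=0.01, seed=0):
    """Fit one LDA model per K and return {K: C_V coherence}.

    token_lists -- preprocessed documents, each a list of tokens
    alpha, eta  -- symmetric priors (eta plays the role of beta in the text)
    Requires the gensim package.
    """
    from gensim.corpora import Dictionary
    from gensim.models import CoherenceModel, LdaModel

    dictionary = Dictionary(token_lists)
    bow = [dictionary.doc2bow(tokens) for tokens in token_lists]
    scores = {}
    for k in k_values:
        lda = LdaModel(corpus=bow, id2word=dictionary, num_topics=k,
                       alpha=alpha, eta=eta, random_state=seed)
        cm = CoherenceModel(model=lda, texts=token_lists,
                            dictionary=dictionary, coherence="c_v")
        scores[k] = cm.get_coherence()
    return scores

def best_k(scores):
    """Return the K whose model has the highest coherence score."""
    return max(scores, key=scores.get)

# Hypothetical usage on a preprocessed corpus:
#   scores = coherence_by_k(token_lists, range(10, 51))
#   k_opt = best_k(scores)
```

Fitting 41 models on a corpus of about a million titles is expensive, so in practice one would likely tune Gensim's training settings (for example `passes` and `chunksize`); those settings are not reported in the text and are left at their defaults here.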

CASE STUDY AND RESULTS
RQ1: what issues are asked about mobile development?
As a result of the LDA-based analysis, 36 topics were discovered, in which each topic was described by 15 descriptive keywords. After examining the consistency of the topics, each topic was named taking into account the descriptive keywords of the topics. Then, we calculated the percentages of each topic in the entire corpus, considering the dominant topic to which each document was assigned. For example, if a topic has a 5% rate, 5% of all question posts are assigned to that topic.
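The dominant-topic percentages described above can be computed as in the sketch below. It assumes, for illustration only, that the fitted model's output is available as one dense list of topic probabilities per document; the function name `dominant_topic_shares` is hypothetical.

```python
from collections import Counter

def dominant_topic_shares(doc_topic_probs):
    """Return {topic index: percentage of documents whose dominant topic it is}.

    doc_topic_probs -- list of per-document probability lists, one value per topic
    """
    # Dominant topic = index of the largest probability in each document.
    dominant = [max(range(len(p)), key=p.__getitem__) for p in doc_topic_probs]
    counts = Counter(dominant)
    n_docs = len(dominant)
    return {topic: 100.0 * c / n_docs for topic, c in counts.items()}

toy_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
]
shares = dominant_topic_shares(toy_probs)
# shares == {0: 75.0, 1: 25.0}
```

Because every document is assigned to exactly one dominant topic, the resulting percentages sum to 100 across topics.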
The 36 topics discovered by LDA-based topic modeling, with their names, descriptive keywords, and rates, are presented in Table 1. The topics in Table 1 illustrate main issues specific to mobile development, so the terms topic and issue are used interchangeably throughout this article. As seen in Table 1, the topics (issues) are listed in descending order by their percentages. Accordingly, "Android Studio", "Kotlin", and "Arrays" emerged as the top three most frequently asked topics, respectively. On the other hand, "Xamarin", "Dialog Alerts", and "Testing" were the least asked topics.
The discovered topics indicated that mobile developers faced a wide range of issues, from app development tools to debugging, database services to UI settings, casting to threading. With the aim of understanding the main knowledge domains of mobile development, we categorized the discovered topics and found that the topics fall under the following six categories: "Development", "UI Settings", "Tools", "Data Management", "Multimedia", and "Mobile APIs" (see Table 2).
Leveraging these indicators, we performed a set of computational analyses to identify the difficulty and popularity of each topic. Firstly, for each topic, we identified the questions in which the topic was dominant and calculated the question count related to each topic. We then calculated the average view count for each topic (dividing the total number of views by the total number of questions). In this way, we revealed the popularity of the topics. From a similar perspective, total question count, average view count, average favorite count, average voting score, average answer count, and average accepted answer count were computed and presented in Table 3. The topics in this table are sorted in descending order by their percentages.
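The per-topic indicators described above reduce to grouped averages. The sketch below is illustrative only: the record keys (`topic`, `views`, `answers`, `accepted`) and the function name `topic_indicators` are hypothetical, and the average accepted answer count is interpreted here as the fraction of a topic's questions that have an accepted answer, since a question can have at most one.

```python
from collections import defaultdict

def topic_indicators(questions):
    """Aggregate popularity and difficulty indicators per dominant topic.

    questions -- iterable of dicts with keys 'topic', 'views', 'answers',
                 and 'accepted' (True if the question has an accepted answer)
    """
    totals = defaultdict(lambda: {"n": 0, "views": 0, "answers": 0, "accepted": 0})
    for q in questions:
        t = totals[q["topic"]]
        t["n"] += 1
        t["views"] += q["views"]
        t["answers"] += q["answers"]
        t["accepted"] += 1 if q["accepted"] else 0
    return {
        topic: {
            "questions": t["n"],
            "avg_views": t["views"] / t["n"],        # popularity proxy
            "avg_answers": t["answers"] / t["n"],    # difficulty proxy
            "avg_accepted": t["accepted"] / t["n"],  # difficulty proxy
        }
        for topic, t in totals.items()
    }

# Toy records (invented for illustration only).
toy_questions = [
    {"topic": "Flutter", "views": 100, "answers": 2, "accepted": True},
    {"topic": "Flutter", "views": 300, "answers": 0, "accepted": False},
    {"topic": "Connection", "views": 50, "answers": 1, "accepted": False},
]
stats = topic_indicators(toy_questions)
```

Sorting topics by `avg_views` gives a popularity ranking, while sorting by ascending `avg_accepted` gives a difficulty ranking; favorite count and voting score could be added as further fields in the same way.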
With the aim of revealing a more understandable landscape of the difficulty and popularity of the topics, we summarized some indicators given in Table 3. Next, we depicted the first five and last five topics in Fig. 1, taking into account the average view count (for popularity). According to Fig. 1, the top five most viewed (most popular) topics are "Flutter", "Project Building", "Error Handling", "Android Studio", and "React-Native", respectively. On the other hand, the least viewed (least popular) topic is "Firebase", followed by "App Crash" and "Dialog Alerts".
In addition, in order to reveal the difficulty level of the topics, we showed the first five and the last five topics in Fig. 2, considering the average number of accepted answers. As seen in Fig. 2, the "Connection" topic, which has the lowest rate (0.27) according to the accepted answer count, emerges as the most difficult topic. This is followed by "Media Streaming", "Notifications", "Web View", and "Emulator", respectively. According to the average answer count (see Table 3), the three most difficult topics are "Notifications", "Connection", and "Media Streaming", which is similar to the ranking by accepted answer count. Furthermore, in order to reveal other dimensions of developers' interest in topics, we presented the voting score and favorite count for each topic in Table 3.

RQ3: how have the trends of mobile issues changed over time?
At this stage, we analyze how mobile issues have changed in the last 5 years. To achieve this, we consider the distribution of the number of questions for each topic in these years. Namely, we calculated the percentage rate of the number of questions pertaining to each topic in each year. Then, we subtracted the percentages of the topics in the previous period from the percentages in the current period. Accordingly, we calculated how much the topics changed in the current year compared to the previous year. Finally, we calculated the total temporal trend of the topics at the end of these 5 years by summing the percentage changes for each topic in each period. Overall trends and annual percentages of the topics are presented in Table 4. The topics in this table are given in descending order according to their overall trend values in the last column. Among the topics, it was observed that 11 topics had an increasing trend, seven topics had a constant trend (i.e., trend values between −0.2 and 0.2), and 18 topics had a decreasing trend. As seen in Table 4, "Flutter", "React-Native", "Kotlin", "Error Handling" and "Project Building" are the top five topics with the most increasing trend, while topics "Layout Settings", "Fragment Activity", "Event Issues", "Map API", and "Database Tasks" have the most decreasing trend.

RQ4: what are the most asked mobile technologies?
Each question post shared on Stack Overflow contains specific tags that reflect the context of the issue mentioned in that question. These tags are added to that question by the user asking the question. The tags are descriptive keywords that reveal the main themes, technologies, and tools that users associate with their questions. In order to reveal the tags related to mobile issues, we initially separated the tags of each post into singular tags and calculated the frequencies of the tags for all posts. Then, we identified the top 20 tags with the highest frequency among them. Following this, we calculated the distribution of the tags of the posts according to the topics and identified the top ten tags for each topic. By analyzing the tags of all the posts in the corpus, we found that mobile developers have used 22,868 different unique tags in the last 5 years. The total number of occurrences of these unique tags was found to be 3,411,771. The average number of different unique tags used for each year was found to be 7,684. Considering the frequencies of the tags in the corpus, the top 20 tags with the highest frequency were identified and given in Fig. 3 in descending order by their percentages. As seen in Fig. 3, mobile technologies indicated by the tags include a wide spectrum of modules such as platforms, programming languages, development tools, and database services. Android and iOS, the two main mobile platforms, are in the first and second places, respectively. They are followed by Swift and
In the next step, with the aim of revealing the mobile technologies related to each topic, we further expanded our analysis and identified the top ten tags for each topic and presented them in Table 5. The topics are listed in descending order by their percentage in this table, where likewise the top ten tags for each topic are in descending order. In this way, we revealed a number of mobile technologies (i.e., platforms, programming languages, app development tools, data services, etc.). As seen in Table 5, Android is featured as the first tag in 29 of 36 topics. In other words, it is seen that the Android platform is dominant in mobile problems. Although Android and iOS are seen together in many topics, iOS is in first place in only two topics ("Layout Settings" and "Event Issues").
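The tag analysis described above (splitting tags, counting frequencies, and listing top tags overall and per topic) can be sketched as follows. Several details are assumptions for illustration: the `<tag1><tag2>` string layout for raw tags, the `(topic, raw_tags)` input pairs, the function names, and reporting each tag's percentage as its share of all tag occurrences.

```python
import re
from collections import Counter, defaultdict

def split_tags(raw):
    """Split a raw tag string such as '<android><kotlin>' into single tags."""
    return re.findall(r"<([^<>]+)>", raw)

def top_tags(posts, n_overall=20, n_per_topic=10):
    """Return (top tags overall with percentages, top tag names per topic).

    posts -- iterable of (dominant topic name, raw tag string) pairs
    """
    overall = Counter()
    per_topic = defaultdict(Counter)
    for topic, raw in posts:
        tags = split_tags(raw)
        overall.update(tags)
        per_topic[topic].update(tags)
    total = sum(overall.values())  # total tag occurrences in the corpus
    top_overall = [(tag, 100.0 * c / total)
                   for tag, c in overall.most_common(n_overall)]
    top_by_topic = {topic: [tag for tag, _ in counts.most_common(n_per_topic)]
                    for topic, counts in per_topic.items()}
    return top_overall, top_by_topic

# Toy posts (invented for illustration only).
toy_tagged = [
    ("Kotlin", "<android><kotlin>"),
    ("Kotlin", "<android>"),
    ("Layout Settings", "<ios><swift>"),
]
overall_top, by_topic = top_tags(toy_tagged)
# overall_top[0] == ("android", 40.0); by_topic["Kotlin"] == ["android", "kotlin"]
```

`len(overall)` and `sum(overall.values())` inside `top_tags` correspond to the unique-tag count and total tag occurrences that the text reports for the corpus.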

RQ5: how have the trends of mobile technologies changed over time?
In this trend analysis, taking into account the top 20 tags, we calculated how each tag changed in each period compared to the previous period. In this way, we found the amount of increase or decrease of the tags for each year. Finally, by summing up the change amounts for each year, we found the overall trend of each tag for the last 5 years. Our findings on trends in mobile technologies include the annual percentages of the top 20 mobile technologies and their trend values, which are presented in Table 6. In this table, the mobile technologies are sorted by their trend values in descending order. Android and iOS, the two main mobile platforms with the highest rates, stand out as the ones with the
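The trend calculation described in this section can be illustrated with a short Python sketch. It assumes that the yearly change is the plain difference between consecutive annual percentages; the article does not state whether absolute or relative changes were used, so this is an assumption. The function names and all numbers below are hypothetical and are not values from Table 6.

```python
def yearly_changes(percentages):
    """Return the change of each year relative to the previous year.

    Assumption: the change is a plain difference of annual percentages.
    """
    return [curr - prev for prev, curr in zip(percentages, percentages[1:])]


def overall_trend(percentages):
    """Sum the yearly changes to obtain one trend value per tag."""
    return sum(yearly_changes(percentages))


# Hypothetical annual percentages for five consecutive years.
annual_share = {
    "flutter": [0.5, 1.5, 3.0, 4.5, 5.0],
    "android": [30.0, 28.0, 25.0, 24.0, 22.0],
    "kotlin": [1.0, 2.0, 2.5, 3.0, 3.0],
}

trends = {tag: overall_trend(values) for tag, values in annual_share.items()}

# Sort by trend value in descending order.
ranked = sorted(trends.items(), key=lambda item: item[1], reverse=True)
```

Under this assumption, the sum of yearly differences equals the last annual percentage minus the first, so a tag with a large but shrinking share can still receive the most negative trend value.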

DISCUSSION
The wide spectrum of principal issues in domain-specific contexts
Mobile software engineering has highly dynamic and competitive working environments where paradigms, tools, technologies, skills, and experiences are constantly changing and evolving (Rosen & Shihab, 2016). Our analysis revealed the issues and challenges most discussed by mobile developers as 36 separate topics. The findings of our analysis clearly showed that mobile development encompasses a wide spectrum of principal issues and challenges in six domain-specific contexts including "Development", "UI settings", "Tools", "Data Management", "Multimedia", and "Mobile APIs". In order to compare the topics from our analysis with those from other studies (Linares-Vásquez, Dit & Poshyvanyk, 2013; Beyer & Pinzger, 2014; Rosen & Shihab, 2016; Fontão et al., 2018), we presented a comparative list of topics revealed by these studies in Table 7.
Considering the results in Table 7, we found that the topics of "Layout Settings", "Database Tasks", "Media Streaming", "Error Handling", "View Controller", "Web View", "API Requests", "Casting", and "Fragment Activity" were discussed in at least three studies. These nine topics therefore stand out as the mobile issues that have received the most attention. On the other hand, the topics "Arrays", "Emulator", "Button Actions", "Text Settings", "Style-Theme", "Event Issues", "File Settings", "Firebase", "Threading", and "Dialog Alerts" were covered in only one of these four studies. Unlike other studies, topics "Android Studio", "Kotlin", "Flutter", "React-Native", "Functions", "Cloud Firestore", "Xamarin", and
Insight into the use of mobile platforms, tools, and technologies
Mobile developers effectively use a wide-ranging collection of platforms, tools, and technologies covering programming languages, SDKs, IDEs, frameworks, APIs, databases, data services, and cloud-based resources in order to develop mobile apps in a more proficient way (Jabangwe, Edison & Duc, 2018). Our findings provide notable implications for time-dependent trends in mobile development issues, tools, and technologies (see Tables 4 and 6). Mobile development is a dynamic area where the technologies and tools used are constantly updated. Therefore, some paradigms, tools, and technologies used by the developers remain up-to-date over the years, while others become outdated in a very short time. As can be understood from our findings, while we have witnessed the dominance of Android and iOS among mobile platforms in the last 5 years, we have experienced the withdrawal of other platforms such as Windows-Phone. When evaluating our analysis and interpreting the results, it is necessary to take into account the fact that fewer new questions are asked about combined technologies or general framework topics, as answers to many previously asked questions are already available on Stack Overflow. Considering the top increasing and decreasing trends, our findings make it clear that certain topics and tags have reached saturation. For example, although Android and iOS have the highest percentages (see Fig. 3), they have the most decreasing trends (see Table 6). In fact, fewer new questions can be asked on older topics. This is not because the topics have diminished in importance, but because many of the questions on older topics have already been answered, and repetition of questions that already exist on Stack Overflow is not allowed. Consistent with this inference, more questions are asked on topics related to newer technologies (e.g., "Flutter", "React-Native", "Kotlin", and "Android Studio", see Tables 4 and 6), as they are new and most of the questions have not been asked before (Biørn-Hansen et al., 2020). Many of the increasingly trending topics presented in Tables 4 and 6 are fairly new technologies that became popular after 2015.
Our findings also indicated that Flutter, React-Native, Xamarin, Ionic, and Cordova are the most used cross-platform development tools for mobile apps (see Fig. 3). The findings reveal remarkable progress from native app development to cross-platform development.
The strongest evidence for this insight is that the two topics with the highest increasing trend are "Flutter" and "React-Native", respectively (see Table 4), which are considered the two most effective tools for cross-platform development. In contrast to other studies (Linares-Vásquez, Dit & Poshyvanyk, 2013; Beyer & Pinzger, 2014; Rosen & Shihab, 2016; Fontão et al., 2018), topics such as "Flutter", "React-Native", and "Xamarin" first appeared in our current study (see Table 7). As a result, these empirical findings make it clear that today's mobile developers are increasingly embracing cross-platform development. Android Studio, Xcode, and Xamarin are the most preferred mobile IDEs (see Fig. 3). Swift, Java, Kotlin, Dart, JavaScript, Objective-C, and C# are the most popular programming languages for mobile development (see Fig. 3).
Another remarkable finding of our analysis is the transition from traditional databases to cloud-based data services. Mobile developers are increasingly utilizing Firebase and its extensions, such as Google-Cloud-Firestore and Firebase-Realtime-Database (Ozyurt et al., 2022). As indicated by our findings, the topics of "Firebase" and "Cloud Firestore" (see Table 1) are closely related to cloud-based data services. Also, the firebase-realtime-database tag is among the top 20 most used tags (see Fig. 3).
One of the important findings of our study is that mobile developers face wide-ranging issues related to UI design, development, and usability. Accordingly, we identified ten topics under the "UI Settings" category. The broad scope of these UI-related topics highlights the importance of UI design, development, and optimization for mobile apps, which has also been discussed in a number of previous studies (Punchoojit & Hongwarittorrn, 2017; Taba et al., 2017). In particular, the topic of "Layout Settings" was emphasized in all of the studies compared in Table 7. These studies also indicated the necessity of the "View Controller", "Web View", and "Fragment Activity" topics for UI settings. Our findings, supported by other studies, suggest that UI design and development is a common issue for mobile developers across various platforms (Punchoojit & Hongwarittorrn, 2017; Al-Razgan et al., 2021).
In our analysis, "API Requests" and "Map API" emerged as the two main topics that revealed mobile API issues. Besides, a number of studies strongly highlighted the problems of mobile developers regarding the topic of "API Requests" (see Table 7). With regard to imaging and streaming issues, we identified two topics, "Imaging" and "Media Streaming". In particular, the topic of "Media Streaming" was also specifically discussed in all of the studies we compared in Table 7. Among the topics discovered, "Media Streaming" has emerged as the second most difficult topic since the number of accepted answers on this topic is very low. Based on this finding, we can say that the topic of "Media Streaming" contains problems that are relatively difficult for mobile developers to solve compared to other problems.

Implications for researchers and practitioners
Insights and implications from the analysis of posting data shared on Stack Overflow and similar Q&A websites can provide motivation for researchers and practitioners to create solutions for the prominent problems of mobile development. The empirical background, methodology, and findings of this study can serve as a guide for mobile development communities with diverse profiles, such as developers, researchers, practitioners, educators, and enthusiasts, to understand and contribute to the field. Revealing the wide range of development challenges faced by mobile developers, our findings provide important insights for researchers into potential research gaps that could address these issues. Each of the 36 topics discovered can be prioritized by researchers according to the rate at which it is asked, viewed, and answered, and can be considered a sub-research topic in its own context. For example, an experimental study focusing on the problems of mobile developers only in the context of UI development, taking into account the topics about UI settings (e.g., "View Controller", "Web View", "List View", "Button Actions", "Text Settings", and so on), can be carried out using Stack Overflow data. Although each of the mobile development issues discovered deals with a different problem in its own right, we conclude from our findings that field researchers should prioritize, in particular, the most viewed issues and those still awaiting solutions. Research aimed at solving the prominent problems of mobile development can take into account the most viewed (e.g., "Flutter", "Project Building", and "Error Handling") or most difficult issues (e.g., "Connection", "Media Streaming", and "Notifications") raised by our findings. One potential approach to tracking the evolution and progression of mobile development trends in the future is to repeat the present study periodically at more frequent intervals. The methodology we have developed can also be utilized by researchers for experimental analysis of various textual settings.
Practitioners can contribute to the development and innovation of the field by creating useful tools and applications to solve the dominant problems of mobile development revealed by our findings. Mobile developer candidates who are new to the field can pursue a career in these areas by considering which areas have talent gaps and which topics and tools are popular. For example, "Flutter", "Android Studio", and "React-Native", which are in the top five of the most popular (most viewed) topics, offer an important perspective for practitioners on which development tools they should focus on. Tool builders can create more supportive libraries, frameworks, or guidelines for the development tools that practitioners commonly use. The "Firebase" and "Cloud Firestore" topics, which are closely related to cloud data services, highlight the need for practitioners to focus on cloud-based data services rather than traditional databases to develop data-driven mobile apps and services. Another implication for practitioners that needs to be highlighted is that the top two trending topics, "Flutter" and "React-Native", point to remarkable progress from native app development to cross-platform development. Furthermore, tool developers can use our findings to fine-tune existing problematic development tools and provide more effective support and documentation. For example, the fact that "Android Studio" ranks first among the discovered topics reveals the need for tool support for Android developers. In this context, preparing new helpful tools and guiding documentation for Android developers on problematic matters emerges as an important requirement for practitioners.
As a final word, educators can use our findings for developing curricula and training strategies that are in line with current mobile development trends. Enthusiasts and general readers within the mobile development ecosystem can refer to our findings to follow emerging developments and trends in the mobile development industry and communities. Stack Overflow and other Q&A platforms can also leverage our analysis to better categorize and tag user posts based on a more structured taxonomy. We hope that our work will guide further research in this area.

CONCLUSIONS
In this study, we analyzed mobile-related posts shared on Stack Overflow using semantic text mining and LDA-based topic modeling to identify the most common issues and challenges for mobile development. In addition, we investigated the most popular mobile technologies and their temporal tendencies. The findings of our study revealed that mobile developers most frequently asked questions related to six main categories: "Development", "UI settings", "Tools", "Data Management", "Multimedia", and "Mobile APIs". The topics of "Flutter", "Project Building", and "Error Handling" emerged as the most popular topics. On the other hand, the most difficult topics were "Connection", "Media Streaming", and "Notifications". Our study also found that Android and iOS are the two most used platforms for mobile development. One of the key findings of our analysis was the observation of a strong transition from native app development to cross-platform development. Another notable finding was the rapid movement from traditional databases to cloud-based data services. Our analysis provides many insights into the up-to-date perspectives, issues, and needs of mobile developers. Our findings can help researchers, practitioners, and educators by revealing a wide spectrum of issues faced by mobile developers over time.
Like all studies, this one has some limitations. First, our research was limited to the Stack Overflow dataset. Although Stack Overflow is one of the most widely used Q&A websites among developers, it should be highlighted that focusing on a single data source may limit the scope of outcomes. Second, our analysis includes post data provided between 2017 and 2021, and the results of our trend analysis are exclusively based on the timeline of the Stack Overflow dataset we are working with. Third, as in other clustering techniques, the process of identifying the topic labels is subject to the authors' perspective and interpretation of the results. Fourth, the topic estimation parameters for the LDA topic modeling approach used in this study may vary depending on the data type and context used. Finally, because our study incorporates inductive and probabilistic exploratory procedures, future confirmatory research is required to test and enhance our findings.
Future work can extend this study along many avenues. Researchers can leverage our methodology to analyze trending topics and movements specific to the different research contexts they are interested in. The methodology employed in this study can be extended to encompass further developer Q&A websites, such as Kaggle, Reddit, GitHub, and Quora. Our methodology can also be applied to other data resources such as web portals, social networks, developer blogs, and forums, and our findings can be compared with those obtained in these environments for compatibility. Different data processing methods, preprocessing stages, and semantic text mining approaches can be combined to develop new hybrid models. The present methodology can be enhanced with new supportive approaches for topic discovery in different domains. Studies planned for the future will extend our methodology using different topic modeling approaches such as Hierarchical Latent Dirichlet Allocation (HLDA), Hierarchical Dirichlet Process (HDP), Non-Negative Matrix Factorization (NMF), and Dirichlet Multinomial Regression (DMR).

Table 1
The 36 topics discovered by LDA.

Table 2
Taxonomy of the topics.
RQ2: what are the most difficult and most popular issues for mobile development?
A question post on Stack Overflow has descriptive indicators such as the view count, answer count, accepted answer count, score, favorite count, and comment count.

Table 4
Yearly trends of the topics.

Table 5
Related tags of the topics.

Table 6
Temporal trends of the top 20 mobile technologies.
PeerJ Comput. Sci., DOI 10.7717/peerj-cs.1658
fewer new questions are asked about combined technologies or general framework topics, as answers to many previously asked questions are already available on Stack Overflow. Considering the top increasing and decreasing trends, our findings make it clear that certain topics and tags have reached saturation. For example, although Android and iOS

Table 7
Common issues in the current study and previous studies.