
1 Introduction

An important trend in the area of digital government is the exploitation of ICT for supporting higher-level functions of government agencies, with the main emphasis on public policy making, leading to the gradual development of ‘policy analytics’ (sometimes also termed ‘policy informatics’) [1,2,3,4,5]. It can be defined as the exploitation of existing data of government agencies, possibly in combination with data possessed by private sector firms (such as business information and consulting ones), using advanced quantitative analysis techniques, in order to support various stages of public policy making for the complex problems that modern societies face. Policy analytics represents a major and highly ambitious expansion of digital government towards higher-level functions of government, beyond the support of internal processes and operations, as well as transactions and consultations with citizens and firms, which were the main objectives of its first generations [6, 7]. Some initial research has been conducted in this area, which has developed useful knowledge concerning approaches and methodologies for exploiting various sources of public sector data, in order to support various stages of the policy making cycle in some domains of government intervention, such as the economy, social insurance, the environment, energy, justice and the management of emergency crises [8,9,10,11,12,13]. However, this promising area is still in its infancy, and the potential of the large quantities of data that government possesses for supporting and enhancing policy making has been exploited only to a limited extent. So it is necessary to conduct extensive research in order to exploit these data for supporting the highest-level government functions of policy making, focusing on the most important problems/challenges our societies face.

Another important trend in the area of digital government is the increasing exploitation of artificial intelligence (AI) techniques by government agencies. According to a recent relevant report of the EU [14], AI can be defined as a group of technologies that enable computers to become more intelligent, by learning from their environment, gaining knowledge from it, and using this knowledge for taking intelligent action or proposing decisions. There are several technologies in this group that provide such learning capabilities, with the most important of them being Machine Learning (ML). Though most of the AI technologies, and in particular the ML algorithms, have existed for several decades, it is only recently that an increasing ‘real life’ exploitation of them has started, mainly by private sector firms, and to a lower extent by government agencies, due to: (a) the availability of large amounts of data enabling more effective training of AI algorithms (i.e. extraction of more reliable rules and models); (b) advances in computing power and reduction of its cost; (c) substantial improvements of the AI algorithms [14, 15]. The multiple success stories of the first experimentations of AI exploitation in the private sector have generated high levels of interest in exploiting and applying AI techniques in the public sector as well, in order to automate or support more sophisticated mental tasks than the simpler routine ones automated or supported by the traditional operational IS of government agencies [16,17,18,19]. However, the exploitation of AI in the public sector so far has been mainly for the automation, support and enhancement of daily operational tasks and lower-level decision making (e.g. [20,21,22,23,24]), but only to a limited extent for the support and enhancement of higher-level functions, and especially policy making. So it is necessary to conduct extensive research in order to exploit AI for supporting this highest-level government function of policy making.

Our study contributes to the advancement and the combination of these two important and promising digital government trends, policy analytics and AI exploitation, towards filling the abovementioned research gaps, focusing on one of the most serious problems that governments face: the economic crises [25, 26]. In particular, it proposes a policy analytics methodology, which exploits existing data of taxation authorities, statistical agencies, and also of private sector business information and consulting firms, in order to support the design of policies for reducing the negative consequences of the economic crises, which repeatedly occur with varying intensities in market economies, leading to serious contractions of economic activity, with quite negative consequences for huge numbers of citizens (see Sect. 2.1). Our methodology enables the identification of characteristics of a firm (e.g. with respect to strategic directions, resources, capabilities, practices, etc.) as well as its external environment (e.g. with respect to competition, dynamism, etc.) that affect (positively or negatively) its resilience to economic crisis with respect to sales revenue (i.e. the degree of sales revenue reduction due to economic crisis). For this purpose we are using a ‘big data oriented’ AI technique, feature selection (FS), which enables filtering from a large number of potential independent variables (contained in the available high-dimensionality big datasets) the ones that actually affect a dependent variable of interest; in particular, we are using the Boruta ‘all-relevant’ variables identification FS algorithm [27, 28] (see Sect. 2.2). The proposed policy analytics methodology can provide substantial support for the design of public policies for reducing the negative impact of economic crises on firms, which can significantly impair both short-term and also medium- and long-term performance and competitiveness. This is highly beneficial, as economic crises of various origins and intensities frequently occur in market economies, being an inevitable trait of them, and have quite negative consequences for the economy and the society, so they are among the toughest challenges that governments face. Furthermore, an application of our methodology is presented, which is based on Greek firms’ data concerning the economic crisis period 2009–2014, leading to interesting conclusions/insights, and providing a first validation of the usefulness of our methodology.

In the following Sect. 2 the background of our economic crisis policy analytics methodology is outlined, while in Sect. 3 the methodology is described, and in Sect. 4 its abovementioned application is presented. The final Sect. 5 summarizes conclusions and proposes directions for future research.

2 Background

2.1 Economic Crises

One of the most serious problems of market-based economies is the fluctuations of economic activity that repeatedly appear, which cause many problems to the economy and the society in general, so they have to be addressed by government through appropriate policies, aiming on one hand at reducing their intensities and durations, and on the other hand at mitigating their negative consequences on firms and citizens [25, 26]. Economic crises can be defined as significant reductions of economic activity, which can be due to the ‘business cycles’ (i.e., the fluctuations that economic activity usually exhibits in market-based economies), or caused by various kinds of events in the society or economy (such as the oil crisis in the early 1970s, or banking crises) [26]. The economic crises have negative short-term as well as medium- and long-term consequences for the economy and the society. The short-term consequences usually are reductions of the demand for many goods and services, which result in serious decreases of firms’ sales, production and profits, as well as of personnel employment (increasing unemployment) and materials’ procurement (propagating the crisis towards the suppliers). Furthermore, during economic crises firms usually reduce capital investment in production equipment, ICT, buildings, etc., and also in product, service and process innovations, which reduces the degree of renewal and improvement of their equipment, products, services and operations, as well as the exploitation of emerging new technologies, and this has serious medium- and long-term consequences on their efficiency and competitiveness [25, 26, 31]. Therefore it is imperative that government agencies, especially the ones having competences and responsibilities for the domains of economy and social welfare, design and implement public policies for reducing the above negative short-term as well as medium- and long-term consequences of the economic crises.

However, it should be noted that the above negative consequences of the economic crises are not the same for all firms: the more efficient and effective firms, which offer higher value-for-money products and services, and have higher capacity to make the required adaptations, are more resilient to the crisis, and have less negative consequences on their sales revenue, and therefore on their employment, procurement, as well as their capital investment and innovation, than the less efficient and effective ones. Therefore, it is important and highly useful to develop policy analytics methodologies for identifying characteristics of the firm, as well as its external environment, that affect positively or negatively its resilience to economic crisis. Our study makes a contribution in this direction, based on the Boruta ‘all-relevant’ FS algorithm, which is outlined in the following section.

2.2 Artificial Intelligence Feature Selection – The Boruta Algorithm

The FS algorithms are an important class of ‘big data oriented’ AI algorithms, which aim at determining from a large number of features (potential independent variables) the ones that actually affect a dependent variable of interest [30]. They can be divided into two main categories: (i) the ‘minimal-optimal’ ones, which determine a small, minimal set of features affecting the dependent variable, which can provide optimal prediction accuracy for it (most traditional FS algorithms belong to this category); (ii) the ‘all-relevant’ ones, which determine all the features that affect the dependent variable, not only the non-redundant ones, as happens in ‘minimal-optimal’ FS algorithms (there is a smaller number of novel algorithms belonging to this category) [27,28,29,30]. Therefore, if there are a number of features that affect the dependent variable, which are to some extent redundant (i.e. there is some degree of association among them), the ‘minimal-optimal’ FS algorithms will select only some of them (a minimal subset), which have low levels of redundancy (association) among them; however, they will not select some other features, which affect the dependent variable, but have high levels of association with the selected ones, as the latter features do not increase further the prediction accuracy of the dependent variable, beyond the accuracy achieved based on the former features. On the contrary, the ‘all-relevant’ FS algorithms will select all the features affecting the dependent variable, regardless of possible associations among them.
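The contrast between the two categories can be illustrated with a minimal Python sketch. It is only an illustration under strong assumptions: a thresholded Pearson correlation stands in for both families of algorithms, the thresholds `min_relevance` and `max_redundancy` are arbitrary, and the toy feature names (`exports`, `innovation`, `unrelated`) are hypothetical rather than taken from any dataset used in this study.

```python
def pearson(xs, ys):
    """Pearson correlation of two equally long numeric lists (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def all_relevant(features, target, min_relevance=0.3):
    """'All-relevant' style: keep every feature associated with the target."""
    return [name for name, vals in features.items()
            if abs(pearson(vals, target)) >= min_relevance]


def minimal_optimal(features, target, min_relevance=0.3, max_redundancy=0.8):
    """'Minimal-optimal' style: also skip features redundant with already chosen ones."""
    ranked = sorted(all_relevant(features, target, min_relevance),
                    key=lambda name: abs(pearson(features[name], target)),
                    reverse=True)
    chosen = []
    for name in ranked:
        if all(abs(pearson(features[name], features[c])) < max_redundancy
               for c in chosen):
            chosen.append(name)
    return chosen


# Hypothetical toy data: two relevant but strongly associated features, one unrelated.
target = [float(i % 10) for i in range(110)]
features = {
    "exports": [t + 0.5 * (i % 3) for i, t in enumerate(target)],
    "innovation": [2 * t + 0.5 * (i % 4) for i, t in enumerate(target)],
    "unrelated": [float((4 * i) % 11) for i in range(110)],
}

print(all_relevant(features, target))     # both relevant features are kept
print(minimal_optimal(features, target))  # only one of the two redundant features is kept
```

In this toy case the redundancy filter discards one of the two associated features even though both are related to the target, which is exactly the information an ‘all-relevant’ approach is meant to preserve.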

Since the objective of this study is to extract from available datasets as much knowledge and insight as possible concerning characteristics of a firm as well as its external environment that affect the degree of sales revenue reduction due to economic crisis, we are using an ‘all-relevant’ FS algorithm. In particular, we are using the Boruta ‘all-relevant’ variables identification algorithm [28, 29], which is an FS approach particularly useful when one is interested in understanding the mechanisms related to a dependent variable of interest, rather than just building a ‘black box’ predictive model of it with good prediction accuracy. The basic idea of the Boruta algorithm is that, based on the original feature set, another artificial set of features is created, which consists of shuffled copies of all features (which are called ‘shadow features’). This shadow set is then merged with the original one, a Random Forest classifier is constructed based on the merged data set, and for each feature an importance measure is calculated (the default one is the ‘Mean Decrease Impurity’ (MDI) of the feature). At each iteration, the Boruta FS algorithm assesses whether a real feature has a higher importance than the best of the shadow features, and if this does not happen the feature is removed as unimportant for the dependent variable. Finally, the algorithm stops when either all features have been confirmed or removed, or it reaches a specified limit of runs.
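The shadow-feature idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the Boruta implementation: an absolute Pearson correlation replaces the Random Forest importance measure, a fixed share of ‘hits’ (`min_hit_share`) replaces Boruta's statistical test, and no features are removed between runs. The function name, its parameters and the toy features are all hypothetical.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else abs(sxy) / (sxx * syy) ** 0.5


def shadow_feature_selection(features, target, n_runs=50, min_hit_share=0.9, seed=0):
    """Keep features whose score beats the best shadow feature in most runs."""
    rng = random.Random(seed)
    # With a deterministic score the real importances do not change between runs;
    # a Random Forest importance would be recomputed on every run.
    real_scores = {name: abs_corr(vals, target) for name, vals in features.items()}
    hits = {name: 0 for name in features}
    for _ in range(n_runs):
        # Shadow features: shuffled copies, so any link to the target is broken.
        best_shadow = 0.0
        for vals in features.values():
            shadow = list(vals)
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, abs_corr(shadow, target))
        for name in features:
            if real_scores[name] > best_shadow:
                hits[name] += 1
    return sorted(name for name in features if hits[name] >= min_hit_share * n_runs)


# Hypothetical toy data: two associated relevant features and one constant feature.
target = [float(i % 10) for i in range(110)]
features = {
    "exports": [t + 0.5 * (i % 3) for i, t in enumerate(target)],
    "innovation": [2 * t + 0.5 * (i % 4) for i, t in enumerate(target)],
    "flat": [1.0] * 110,
}
selected = shadow_feature_selection(features, target)
print(selected)
```

Both associated features are retained because each is compared with the shadow features rather than with the other real features.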

The Boruta FS algorithm offers three crucial advantages:

  (I) It can handle large numbers of features without performance and reliability deterioration, so it is appropriate for exploiting really ‘big data’; this does not happen in other techniques that might be used for the same purpose, such as regression analysis, in which, when the number of independent variables increases, the confidence intervals of the estimated b_i coefficients increase as well, so some statistically significant ones may incorrectly be found non-significant.

  (II) If there are associated (correlated) features that all affect the dependent variable, the Boruta FS algorithm will not omit some of them due to their association with other selected features; again this does not happen in other techniques that might be used for the same purpose: for instance in regression analysis, if some independent variables that actually affect the dependent variable have high levels of correlation, then for some of them their b_i coefficients might be incorrectly found statistically non-significant (multi-collinearity problem).

  (III) The Boruta FS algorithm can identify not only the features that have linear effects on the dependent variable, but also the ones having non-linear effects.
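The regression-related limitations mentioned in advantages (I) and (II) can be made explicit with the standard expression for the sampling variance of an ordinary least squares coefficient (in a model with an intercept). The symbols below are introduced here only for illustration and do not appear elsewhere in this paper: \sigma^2 is the error variance, x_{ij} is the value of the i-th independent variable for observation j, and R_i^2 is the coefficient of determination obtained when the i-th independent variable is regressed on all the other independent variables.

```latex
\operatorname{Var}(\hat{b}_i)
  = \frac{\sigma^2}{\left(1 - R_i^2\right)\sum_{j=1}^{n}\left(x_{ij} - \bar{x}_i\right)^2}
```

As correlated independent variables are added, R_i^2 approaches 1, the variance of the estimated coefficient grows, and its confidence interval widens, so a variable that actually affects the dependent variable may be found statistically non-significant.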

3 The Proposed Methodology

The proposed economic crisis policy analytics methodology aims to identify characteristics of the firm and its external environment that affect its resilience to economic crisis, with respect to the most important negative consequence of economic crisis: the reduction of firms’ sales revenue. Therefore the dependent variable of our methodology is the degree of the firm’s sales revenue reduction due to the crisis. The capabilities and advantages offered by the abovementioned Boruta AI FS algorithm (outlined in Sect. 2.2) allow us to examine a large number and a wide thematic range of potential independent variables, in order to identify all the relevant and influential ones. The above can provide substantial support for the design of policies for reducing this negative impact of economic crisis on firms, which can significantly impair their short-term and also their medium- and long-term performance and competitiveness.

Our methodology exploits existing data from two main sources:

  (I) from taxation authorities: data about firms’ sales revenue before and during the economic crisis, from which sales revenue reduction due to crisis can be calculated;

  (II) from statistical agencies, and also from private sector business information and consulting firms: data concerning various characteristics of firms (concerning strategic directions, resources, capabilities, practices, etc.) and their external environment (concerning the intensity of competition, the degree of dynamism, etc.).

These data undergo two stages of processing:

  (a) An initial selection of potential independent variables (characteristics of the firm as well as its external environment) from the numerous variables that might be available in the large government and private sector datasets we are using, based on theoretical foundations from previous management science research. One of them is the classical ‘Leavitt’s Diamond’ framework [32], which defines four main elements of a firm, which should be strongly interconnected: (i) task (=the strategies as well as the administrative and production processes of the firm); (ii) people (=the skills of the firm’s human resources); (iii) technology (=the technologies used for implementing the above processes); and (iv) structure (=the organization of the firm in departments, and the communication and coordination patterns among them). Also highly useful can be theoretical foundations developed in previous research concerning the firm’s resources as well as capabilities [33, 34], both ordinary and dynamic ones [35, 36]. With respect to the selection of external environment characteristics, useful foundations can be drawn from previous research concerning ‘generalised competition’ (such as Porter’s Five Forces Framework [33]) and environmental dynamism [37].

  (b) Processing of the selected variables through the abovementioned Boruta AI FS algorithm in order to identify the ‘all-relevant’ ones (i.e. all the variables that actually affect the degree of sales revenue reduction due to the crisis).

In particular, we can select potential independent variables of the following eight categories:

  • Strategic Orientations: this category can include variables concerning the degree of adopting the main strategies described in relevant strategic management literature [33], such as cost leadership, differentiation, focus, innovation, export, etc.

  • Processes: it can include various characteristics of firm’s processes, such as complexity, efficiency, formality, flexibility, etc.

  • Human Resources: it can include variables concerning the general education/skills level of firm’s human resources (e.g. shares of firm’s personnel having tertiary education, vocational/technical education, etc.), as well as the possession of specific skills concerning various ICT, the provision of relevant training, etc.

  • Technology: variables concerning the use of various important ICTs (such as Enterprise Resource Planning (ERP) systems, Customer Relationships Management (CRM) systems, Supply Chain Management (SCM) systems, Business Intelligence/Business Analytics (BI/BA) systems, Collaboration Support (CS) systems, e-sales, social media, cloud computing, etc., or the use of various production technologies).

  • Structure: variables concerning various aspects of the structure of the firm, such as main structural design (functional, product/service based, geographic, matrix), degree of differentiation, specialization, centralization/decentralization, use of organic structural forms (such as teamwork, job rotation), etc. [38].

  • Ordinary Capabilities: variables concerning the levels of the firm’s capabilities to perform efficiently and effectively its main functions, such as the ones proposed by Porter’s Value Chain Model (Inbound Logistics, Operations, Outbound Logistics, Marketing and Sales, Service (primary ones), and Human Resource Management, Technology Development, Procurement, Infrastructure) [33]; and also the levels of various ICT capabilities of the firm [34].

  • Dynamic Capabilities: variables concerning various aspects of the firm’s absorptive capacity (ACAP) (such as recognition and acquisition of relevant external knowledge, assimilation of it, integration/combination of it, and exploitation for innovations in its processes, products and services) [39, 40], and agility (e.g. with respect to emergence of new technologies, new suppliers, new products and services, as well as change of prices by competitors) [37, 41].

  • External Environment: variables concerning the intensity of the five aspects of the ‘generalized competition’ proposed by Porter’s Five Forces Framework [33]: price and non-price competition, bargaining power of buyers, bargaining power of suppliers, threat of new entrants and threat of substitutes; and also variables concerning various aspects of dynamism of the firm’s external environment [35,36,37].

4 Application

An application of the economic crisis policy analytics methodology described in the previous section has been made for the identification of characteristics of Greek firms as well as their external environment that affect the degree of their sales revenue reduction due to the long and intensive economic crisis that Greece has experienced since 2009. For this purpose, we have used existing Greek firms’ data for the period 2009–2014 from three sources: (i) the Ministry of Finance – Taxation Authorities; (ii) the Hellenic Statistical Authority; and (iii) ICAP S.A., a well-known business information and consulting firm. In particular, we have used data from these sources for 363 Greek firms; 40.2% of them were manufacturing ones, 9.4% construction ones, and 50.4% services ones; 52.6% of them were small, 36.1% medium and 11.3% large ones.

Our dependent variable was the percentage of sales revenue reduction due to the economic crisis in the period 2009–2014, which was discretized by the Ministry of Finance (in order to avoid providing too detailed data about this critical topic) into a variable with 13 possible discrete values (SALREV_RED): increase by more than 100%; increase by 80–100%; increase by 60–80%; increase by 40–60%; increase by 20–40%; increase by 1–20%; unchanged sales; decrease by 1–20%; decrease by 20–40%; decrease by 40–60%; decrease by 60–80%; decrease by 80–100%; decrease by more than 100%. We selected the following 64 independent variables, from 7 out of the 8 categories described in the previous section (with the only exception of the ‘Processes’ category, for which we did not find any variable in the available datasets):

  • Strategic Orientations: degree of adopting a cost leadership strategy (STRAT_CL), a differentiation strategy (STRAT_DIF), a product/service innovation strategy (STRAT_INNOV) (five-level ordinal variables); introduction of product/service innovations in the last three years (INNOV_PRS), introduction of process innovations in the last three years (INNOV_PROC) (binary variables); percentage of sales revenue coming from new products/services introduced in the last three years (NEW_PRS_P), percentage of sales revenue coming from products/services significantly improved in the last three years (IMPR_PRS_P) (continuous variables); introduction of innovations in the production processes or in the services delivery processes (INN_PRSD), introduction of innovations in the sales, shipment or warehouse management processes (INN_SSWM), introduction of innovations in support processes (such as equipment maintenance) (INN_SUPP); conduct of R&D (R&D) (binary variables); exports as percentage of firm’s sales revenue (EXP_P) (continuous variable).

  • Human Resources: number of employees (EMPL); percentage of firm’s employees having tertiary education (EMPL_TERT), vocational/technical education (EMPL_VOT), high school education (EMPL_HIGH), elementary school education (EMPL_ELEM); percentage of firm’s employees using for their work computer (EMPL_COM), firm’s Intranet (EMPL_INTRA), Internet (EMPL_INTER); ICT personnel as a percentage of firm’s total workforce (EMPL_ICT) (continuous variables).

  • Technology: degree of ERP systems use (D_ERP), CRM systems use (D_CRM), SCM use (D_SCM), BI/BA use (D_BI_BA), CS systems use (D_CS) (five-level ordinal variables); conduct of e-sales (E-SAL) (binary variable); use of social media for sales promotion (SM_SALPRO), for collecting customers’ opinions and complaints about firm’s products and services (SM_OPCOM), for collecting ideas for improving products and services (SM_IMPR), for finding personnel (SM_PERS), for internal co-operation within the firm (SM_INT), for information exchange with other partner firms (SM_PART) (three-level ordinal variables); use of cloud computing (CLOUD) (binary variable); degree of using cloud computing IAAS (CL_IAAS), cloud computing PAAS (CL_PAAS), cloud computing SAAS (CL_SAAS) (five-level ordinal variables).

  • Structure: use of organic structural forms (teamwork, job rotation) (ORG) (binary variable).

  • Ordinary Capabilities: six variables concerning the main ICT capabilities [34] for: ICT strategic planning (ICT_PLAN), cooperation between ICT and business units (ICT_BUS), cooperation with ICT vendors (ICT_VEND), development of ICT applications (ICT_DEV), modification of ICT applications (ICT_MOD), integration of ICT applications (ICT_INT) (five-level ordinal variables).

  • Dynamic Capabilities: four variables concerning the main aspects of ACAP [39, 40]: external relevant knowledge recognition and acquisition (ACAP_ACQ), dissemination and analysis (ACAP_DIS), assimilation and integration in firm’s knowledge base (ACAP_INT) and exploitation for process, product and service innovations (ACAP_EXP); and six variables concerning the main aspects of organizational agility [37, 41] with respect to introduction of new products and services by competitors (AG_PRS), new pricing policies of them (AG_PRI), changes of the demand for its products and services (AG_DEM), customization of products and services to customers’ special needs (AG_CUST), expansion to new markets (AG_EXP) and change of suppliers (AG_SUP) (five-level ordinal variables).

  • External Environment: number of competitors (N_COMP) (continuous variable); intensity of price competition (INT_PCOM), non-price competition (INT_NPCOM); and also four environmental dynamism variables concerning changes in products and services (DYN_PRS), technologies (DYN_TECH), competitors’ movements (DYN_COMP) and demand for our products/services (DYN_PRS) (five-level ordinal variables) [35,36,37].

  • General: sector (SECT) (binary variable: manufacturing/services).
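The 13-value coding of SALREV_RED described above can be reproduced from raw revenue figures with a short Python sketch. It rests on assumptions that the source data do not specify: changes smaller than 1% in absolute value are treated as ‘unchanged sales’, and each band includes its upper bound. The function names are hypothetical.

```python
# Band limits (in percent) shared by the 'increase' and 'decrease' sides.
BANDS = [(1, 20), (20, 40), (40, 60), (60, 80), (80, 100)]


def pct_change(revenue_before, revenue_during):
    """Percentage change of sales revenue; negative values are reductions."""
    return (revenue_during - revenue_before) / revenue_before * 100.0


def salrev_band(change):
    """Map a percentage change to one of the 13 discrete values of SALREV_RED."""
    size = abs(change)
    if size < 1:  # assumption: changes below 1% count as unchanged
        return "unchanged sales"
    direction = "increase" if change > 0 else "decrease"
    if size > 100:
        return f"{direction} by more than 100%"
    for lower, upper in BANDS:
        if size <= upper:  # assumption: the upper bound belongs to the band
            return f"{direction} by {lower}–{upper}%"


print(salrev_band(pct_change(200.0, 150.0)))  # a 25% reduction
```

For an analysis such as the comparison of averages of SALREV_RED, the 13 labels would additionally need a numeric coding; which coding was used is not stated here.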

The results from processing these variables using the Boruta AI FS algorithm are shown in Table 1, in which we can see the identified ‘all-relevant’ variables affecting the degree of sales revenue reduction due to the crisis, in order of importance. In particular, ten variables have been identified that actually affect the degree of sales revenue reduction due to the crisis (SALREV_RED). For each of them we examined whether it has a positive or negative effect: for the binary and ordinal variables this was done by calculating and comparing the averages of SALREV_RED for all their discrete values; for the continuous variables we first discretized them (initially we recoded them into corresponding binary variables based on their median values; and then we recoded them into corresponding four-level variables based on their quartile values) and followed the same procedure. The results are shown in the second column of Table 1.
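The procedure for determining the direction of each effect can be sketched as follows. This is an illustrative Python sketch under stated assumptions: SALREV_RED is assumed to be coded numerically so that higher values mean a larger reduction, the direction is read off by comparing only the lowest and highest level of the variable (a simplification of comparing the averages for all discrete values), and the function names are hypothetical.

```python
from statistics import mean, median, quantiles


def group_means(values, outcome):
    """Average outcome for each discrete value of a variable, in sorted order."""
    groups = {}
    for v, y in zip(values, outcome):
        groups.setdefault(v, []).append(y)
    return {v: mean(ys) for v, ys in sorted(groups.items())}


def median_split(values):
    """Recode a continuous variable as binary: 1 if above the median, else 0."""
    m = median(values)
    return [1 if v > m else 0 for v in values]


def quartile_levels(values):
    """Recode a continuous variable into four levels (1..4) using its quartiles."""
    cuts = quantiles(values, n=4)
    return [1 + sum(v > c for c in cuts) for v in values]


def effect_direction(values, outcome):
    """Compare the average outcome at the lowest and highest level of a variable."""
    means = list(group_means(values, outcome).values())
    if means[-1] < means[0]:
        return "negative"  # higher level goes with a smaller reduction
    if means[-1] > means[0]:
        return "positive"
    return "none"


# Hypothetical example: a binary variable and a numeric coding of SALREV_RED.
uses_feature = [0, 0, 1, 1]
salrev_red = [8, 10, 6, 4]
print(group_means(uses_feature, salrev_red))
print(effect_direction(uses_feature, salrev_red))
```

A continuous variable would first pass through `median_split` or `quartile_levels` and then through the same comparison of group averages.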

Table 1. Relevant variables affecting the degree of sales revenue reduction due to the crisis.

We remark that the most important of the examined variables for the degree of sales revenue reduction due to the crisis is the use of organic structural forms (such as teamwork, job rotation), which belongs to the structural characteristics category; it has a negative impact on SALREV_RED, so it reduces the negative consequences of the crisis on the firm’s sales revenue. The economic crises give rise to big changes in firms’ external environment (e.g. decrease of the demand for their products and services, changes in customers’ needs and preferences, new products and service offerings by competitors, etc.) and increase its complexity; the adoption of organic structures (mainly horizontal teams) allows a more intensive exchange and synthesis of information and knowledge among employees from different functions, departments and geographic locations, so it enables a better understanding of these changes/complexities, and a more effective design and implementation of actions for responding to them (such as new pricing policies, new products/services with higher value-for-money, expansions to new markets, both domestic and foreign ones, etc.).

Also, we remark that four out of the ten identified relevant variables belong to the strategic orientations’ category, and all of them have negative impact on SALREV_RED, so they represent strategies that increase firms’ resilience to economic crisis. Three of them concern innovation strategies: percentage of sales revenue coming from new products and services, and process innovations concerning sales, shipment, warehouse management and support activities (such as equipment maintenance). Therefore, the introduction of new products and services creates new markets and sales opportunities, so it reduces the negative impact of crisis on sales revenue; also the above process innovations increase efficiency, which increases resilience to the difficult conditions of economic crisis. The fourth one concerns export strategies: it indicates that exports generate sales revenue from foreign markets, and reduce the reliance on firm’s domestic market, so they decrease negative consequences of crises in the latter.

Furthermore, two of the identified relevant variables belong to the technology category and concern ICTs, both of them having a negative impact on SALREV_RED: the use of ERP systems and the capabilities for integration of existing ICT applications. This reveals two important technological characteristics of a firm that increase its resilience to economic crisis. The use of ERP systems provides comprehensive and integrated electronic support of all the firm’s functions, so it enhances their efficiency, which is quite useful for coping with the crisis. Also, a high level of capability for integrating existing ICT applications enables isolated ‘islands of automation’ (belonging to the same or different departments) to be interconnected and evolve towards an integrated ICT infrastructure, enabling data and functionality of one ICT application to be exploited by others as well; this improves cooperation and coordination between the firm’s departments, and enhances the firm’s efficiency, which increases its resilience to crisis. Finally, two of the identified relevant variables belong to the human resources category: the employment of personnel with tertiary education increases the firm’s ability to cope with the crisis; however, the employment of personnel having lower vocational/technical level education (though less costly) has the opposite effect. The number of the firm’s employees also has a negative impact on SALREV_RED, indicating that larger firms have lower reductions of sales revenue due to the crisis.

The above findings indicate that the Greek government agencies in order to reduce the negative consequences of the economic crisis on firms should design and implement effective public policies (such as legislation, financial support, provision of training and consulting, etc.) for promoting firms’ innovation and export activities. Furthermore, it is necessary to design and implement effective public policies for promoting the adoption of ERP systems, organic structural forms (complementing their hierarchical structures with horizontal teamwork), and for employing personnel of higher educational level. These public policies should be focused mainly on the small and medium firms.

5 Conclusions

The previous sections have presented a policy analytics methodology, which exploits existing public and private sector data, based on an advanced big data oriented AI FS algorithm, in order to support policy making concerning one of the most serious problems that governments repeatedly face: the economic crises. It allows identifying firms’ characteristics that affect their resilience to economic crisis. Furthermore, a first application of it has been presented, which provides a first validation of the usefulness of this methodology, and leads to interesting conclusions and insights.

Our research has interesting implications for research and practice. With respect to research, it creates new knowledge in two emerging digital government research domains that are highly important for the society but minimally researched: policy analytics and government AI exploitation. With respect to practice, it provides support to government agencies for designing policies for reducing the negative impact on firms of one of the most important problems of our market economies: the economic crises. In general, it provides an approach for combining and exploiting multiple sources of public and private sector data in order to understand the characteristics of firms exhibiting various positive behaviours/evolutions (e.g. export, expansion to other countries, adoption of new technologies, etc.) or negative ones (e.g. reduction of personnel employment, disinvestment, etc.), which will be useful for the design of relevant public policies. However, further application of our methodology is required in various national contexts, at both national and sectoral level, using a wider range of potential independent variables, followed by improvements of the methodology based on the experience gained.