Elsevier

Knowledge-Based Systems

Volume 110, 15 October 2016, Pages 49-59
Improving business process retrieval using categorization and multimodal search

https://doi.org/10.1016/j.knosys.2016.07.014

Abstract

Enterprises use repositories of Business Processes to standardize and adapt their operations in order to reuse them for new functional requirements. However, the disorganized growth of these repositories has hampered the search for Business Processes, which is fundamental for reusing them. This paper proposes an approach for organizing and searching Business Processes that is composed of two phases: first, an automatic and semantic categorization phase that classifies Business Processes based on their functionality, and second, a multimodal search phase that ranks Business Processes based on structural and textual features. The proposed approach was evaluated over a closed repository collaboratively built by 20 expert evaluators. Initially, evaluators were asked to rate the categories assigned by our approach to each Business Process in order to assess our results against the user perspective. Later, evaluators were asked to compare six queries against the repository in order to obtain a set of relevant Business Processes for each query. With these results, precision, recall and F-measure were calculated to evaluate the relevance and ranking concordance of the proposed approach against state-of-the-art algorithms for Business Process similarity search. Additionally, we applied the Friedman and Wilcoxon signed-rank tests to the precision and F-measure results obtained for each query in order to evaluate the statistical significance of these results. The results obtained demonstrated the effectiveness of the proposed approach for categorizing and retrieving Business Processes.

Introduction

Many companies have adopted Web Services (WS) and Business Processes (BP) as modular components to represent their organizational operations or procedures in order to automate the management of such activities [1], [2]. A BP captures a set of interrelated activities within an organizational structure to achieve a common business objective, so that the information on these activities becomes explicit knowledge accessible to all members of the organization [3].

BP are stored in large repositories because of the volume and importance of the information represented within them. These repositories may contain hundreds or even thousands of models. Therefore, it is important to understand and discover similarities between BP in order to identify common tasks that can be reused as components for future implementations [4]. This may help companies improve decision-making about how BP should be merged, normalized and consolidated in order to increase the efficiency of their organizational operations [5].

Reusing software components, which may be activities or even complete BP, helps companies to deploy new and value-added services in order to attract and retain customers. In this way, these companies can introduce a competitive differentiation and increase the level of service offered to their customers [6].

However, BP repositories have grown in a significant and disorganized manner [7], with little attention to classifying and organizing BP according to their purposes [8]. Accordingly, reusing BP has become a complex and time-consuming task because of the difficulty of finding BP with specific functionalities and of keeping different versions of a BP coherent when various users try to edit the same model [9].

To address this problem, several authors have developed mechanisms to search and retrieve BP [10]. The aim of those mechanisms is to find a set of BP that are similar to a query, which can be expressed in different ways, for example as a set of keywords, a complete BP or a fraction thereof. This paper proposes an approach for the automatic categorization of BP that is useful for organizing repositories and for reducing the search space to a subset of BP belonging to a common set of categories. This approach forms a structure of categories that links each BP to the context in which it was created within the organization. Searching BP based on a specific set of categories lets users obtain lists of results that are coherent with respect to the textual information within related functionalities. Furthermore, automatic categorization approaches have been used in different fields, for example in managers for software quality models [11], recommender systems for digital libraries [12], [13], [14], textual collections [15], [16], and social media [17], among others.

The following contributions are highlighted in this paper: 1) an automatic semantic categorization approach and 2) a multimodal search algorithm that integrates textual and structural information to rank BP retrieved for a specific set of categories. To validate the categorization approach, a multimodal search algorithm (which searches for BP based on textual and structural information) was integrated, and the results of multimodal search with and without categorization were compared.

The paper is structured as follows: Section 2 summarizes related works, Section 3 presents the architecture of the categorization approach, Section 4 describes experiments and results obtained, and Section 5 discusses the main conclusions and future directions.

Related works

In the past, many BP retrieval techniques have been proposed that take into account one or more of the following BP properties: linguistics, structure, and behavior [18]. Linguistic-based approaches use BP textual features, for example, name or description of its activities or events. In these proposals, the techniques used include vector space representation with term frequency (TF) and cosine similarity for ranking results [10]. Structure-based approaches take into account the topology of the

Architecture of the approach for automatic categorization and recovering BP

The proposed approach is composed of two phases: a categorization phase and a search phase. The categorization is an automatic and semantic approach, which classifies BP based on their functionalities. The search phase is an approach to retrieve BP taking advantage of the reduced search space obtained by the categorization phase.
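The two-phase idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the `Process` record, the rule that a candidate must share at least one category with the query, raw term-frequency cosine similarity as the textual score, Jaccard overlap of control-flow edges as the structural score, and the weight `alpha` that combines them.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical BP record: name, category labels, activity text, control-flow edges."""
    name: str
    categories: set
    text: str
    edges: set = field(default_factory=set)


def tf_cosine(a: str, b: str) -> float:
    """Cosine similarity between raw term-frequency vectors of two texts."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())
    norm_a = math.sqrt(sum(v * v for v in ta.values()))
    norm_b = math.sqrt(sum(v * v for v in tb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a: set, b: set) -> float:
    """Share of edges common to both models (assumed structural score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def search(query: Process, repository: list, alpha: float = 0.5) -> list:
    """Filter by shared categories, then rank by a weighted textual/structural score."""
    # Phase 1 (categorization as a filter): keep BP sharing a category with the query.
    candidates = [p for p in repository if p.categories & query.categories]
    # Phase 2 (multimodal ranking): combine the textual and structural scores.
    scored = [
        (alpha * tf_cosine(query.text, p.text)
         + (1 - alpha) * jaccard(query.edges, p.edges), p.name)
        for p in candidates
    ]
    # Highest score first; ties are broken by name to keep the order deterministic.
    return [name for _, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


query = Process("q", {"sales"}, "receive order check stock",
                {("receive order", "check stock")})
repository = [
    Process("order_handling", {"sales"}, "receive order check stock ship order",
            {("receive order", "check stock"), ("check stock", "ship order")}),
    Process("invoice", {"sales", "finance"}, "create invoice send invoice",
            {("create invoice", "send invoice")}),
    Process("hiring", {"hr"}, "receive order check stock",
            {("receive order", "check stock")}),
]
ranking = search(query, repository)
```

In this toy repository, `hiring` has the same text and edges as the query but never reaches the ranking step, because it shares no category with the query. This is the sense in which the categorization phase reduces the search space before ranking.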

The categorization phase may be considered as a filter for reducing the search space that a further searching algorithm, in the search phase, may use to produce a

Experimentation and results

The evaluation of our proposal was conducted in three phases involving 20 expert evaluators in BPMN modeling, a repository with 100 BP (6 of which were selected as test queries) and three other BP similarity search algorithms. In the first phase the relevance of the categorization of the BP stored in the repository was considered. In the second phase the relevance of the whole search process, i.e., the search of BP plus their categorization, was addressed. In the third phase the

Conclusions and future work

This paper presents an approach for improving Business Process (BP) retrieval using categorization and multimodal search. Categorizing BP is useful to create an organized repository according to the categories covering the functionality of BP. In this sense, it was possible to identify BP sharing a similar set of categories with a query (a BP used as input) to generate a consistent ranking.

BP in the ranking not only share textual and structural features, but also functional purposes represented

References (63)

  • H.-R. Zhang et al.

    Three-way recommender systems based on random forests

    Knowl.-Based Syst.

    (2016)
  • Q. Zhu et al.

    Harmonization and semantic annotation of data dictionaries from the Pharmacogenomics Research Network: A case study

    J. Biomed. Inf.

    (2015)
  • C. Figueroa et al.

    A multilevel approach for business process retrieval

    Revista Ingenierías Universidad de Medellín

    (2015)
  • R. Dijkman et al.

    Similarity of business process models: metrics and evaluation

    Inf. Syst.

    (2011)
  • C.D. Maio et al.

    A framework for context-aware heterogeneous group decision making in business processes

    Knowl.-Based Syst.

    (2016)
  • J. Melcher et al.

    Visualization and clustering of business process collections based on process metric values

    Symbolic and Numeric Algorithms for Scientific Computing, 2008. SYNASC ’08. 10th International Symposium on

    (2008)
  • D.G. Ferrari et al.

    Clustering algorithm selection by meta-learning systems: A new distance-based problem characterization and ranking combination methods

    Inf. Sci.

    (2015)
  • Q. Zhong et al.

    Moving object tracking based on codebook and particle filter

    Procedia Eng.

    (2012)
  • Y.-C. Hu et al.

    Fast VQ codebook search algorithm for grayscale image coding

    Image Vision Comput.

    (2008)
  • M.L. Rosa et al.

    Apromore: An advanced process model repository

    Expert Syst. Appl.

    (2011)
  • J. Kekäläinen et al.

    Using graded relevance assessments in IR evaluation

    J. Am. Soc. Inf. Sci. Technol.

    (2002)
  • A. Jiménez-Ramírez et al.

    Generating optimized configurable business process models in scenarios subject to uncertainty

    Inf. Software Technol.

    (2014)
  • H. Reijers et al.

    Improved model management with aggregated business process models

    Data Knowl. Eng.

    (2009)
  • A. Koschmider et al.

    Recommendation-based editor for business process modeling

    Data Knowl. Eng.

    (2011)
  • Q. Wu et al.

    Forestexter: An efficient random forest algorithm for imbalanced text categorization

    Knowl.-Based Syst.

    (2014)
  • J. Bernabé-Moreno et al.

    Caresome: a system to enrich marketing customers acquisition and retention campaigns using social media information

    Knowl.-Based Syst.

    (2015)
  • R. Dijkman et al.

    Graph matching algorithms for business process model similarity search

    Business Process Management: 7th International Conference, BPM 2009, Ulm, Germany, September 8–10, 2009, Proceedings

    (2009)
  • B. Cao et al.

    Mapping elements with the Hungarian algorithm: an efficient method for querying business process models

    Web Services (ICWS), 2015 IEEE International Conference on

    (2015)
  • C.J. Turner et al.

    A genetic programming approach to business process mining

    Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation

    (2008)
  • D.A. Rosso-Pelayo et al.

    Business process mining and rules detection for unstructured information

    Proceedings of the 2010 Ninth Mexican International Conference on Artificial Intelligence

    (2010)
  • H. Leopold et al.

    On the refactoring of activity labels in business process models

    Inf. Syst.

    (2012)