Constraint-handling through multi-objective optimization: The hydrophobic-polar model for protein structure prediction
Introduction
Evolutionary computation methods and other metaheuristic algorithms have been successfully used to solve complex optimization problems which arise in a diversity of scientific and engineering applications. Often, however, optimization involves not only reaching the best value for a given objective function (or set of objective functions), but also satisfying a set of predefined requirements called constraints. Therefore, additional mechanisms need to be implemented within metaheuristic algorithms in order to search effectively through constrained solution spaces of this kind.
The hydrophobic-polar (HP) model [1], [2] is an abstract formulation of the protein structure prediction (PSP) problem, where hydrophobicity is assumed to be the main stabilizing force in the protein folding process. Under this model, PSP is defined as the problem of finding a self-avoiding embedding of the protein chain on a given lattice, such that the interaction among hydrophobic amino acids is maximized. From the computational point of view, the HP model entails a challenging problem in combinatorial optimization [3], [4]. One of the main sources of difficulty in this problem lies in the fact that, using the existing problem representations, a significant portion of the solution space encodes infeasible (non-self-avoiding) protein structures. Hence, it is important to devise effective mechanisms for handling the constraints that this problem presents. Two main research directions have been adopted to cope with this issue. On the one hand, the search can be confined to the space of only feasible, self-avoiding protein conformations. On the other hand, infeasible protein conformations can also be taken into consideration, which has been achieved in the literature by implementing a penalty strategy. From the literature, however, it is not possible to identify a clear consensus on which of the two directions, i.e., to avoid or to consider infeasible conformations, could lead to the development of more efficient metaheuristics for solving this problem [5], [6], [7], [8], [9].
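To make the problem definition above concrete, the following is a minimal sketch of how a conformation could be evaluated on the two-dimensional square lattice. It assumes an absolute-move encoding (`U`, `D`, `L`, `R`) and hypothetical helper names (`decode`, `count_collisions`, `hh_contacts`); these are illustrative choices and not necessarily the representation used in this study.

```python
from collections import Counter

# Absolute moves on the 2D square lattice (an assumed encoding for illustration).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves):
    """Map a move string of length n-1 to the n lattice coordinates of the chain."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def count_collisions(coords):
    """Count residues placed on an already occupied site (0 means self-avoiding)."""
    return sum(c - 1 for c in Counter(coords).values())


def hh_contacts(sequence, coords):
    """Count H-H pairs that are lattice neighbors but not consecutive in the chain."""
    contacts = 0
    n = len(sequence)
    for i in range(n):
        if sequence[i] != "H":
            continue
        for j in range(i + 2, n):  # skip chain neighbors
            if sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    contacts += 1
    return contacts


# A feasible fold of the toy sequence HPPH: the two H residues end up adjacent.
feasible = decode("RUL")
print(count_collisions(feasible), hh_contacts("HPPH", feasible))

# An infeasible fold: the chain steps back onto sites it already occupies.
infeasible = decode("RLR")
print(count_collisions(infeasible))
```

In this sketch, maximizing hydrophobic interaction corresponds to maximizing `hh_contacts`, while a conformation is feasible only when `count_collisions` returns zero.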
Premised upon the belief that infeasible conformations can provide valuable information for guiding the search process, this research work investigates the use of multi-objective optimization as an alternative constraint-handling strategy for the HP model. In particular, constraints in the HP model are treated as a supplementary optimization criterion, leading to an unconstrained multi-objective problem. Under this alternative formulation, infeasible solutions can become incomparable with respect to feasible ones and thus have better opportunities to participate throughout the search process. In contrast to the penalty strategy, which is one of the most widely used techniques in the constraint-handling literature, the multi-objective (MO) method in essence does not require fine-tuning of penalty parameters; in the penalty strategy, finding the right balance between objective function and penalty values has been regarded as a difficult optimization problem in itself [10], [11]. The use of multi-objective optimization for handling constraints is not a novel idea; recent reviews on this topic can be found in [11], [12]. Nevertheless, it was not until recently that the preliminary results of this research reported for the first time, to the best of the authors' knowledge, the application of the MO constraint-handling strategy to the particular HP model of the PSP problem [13].
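The contrast between the two strategies can be sketched in a few lines. The example below assumes that each conformation is summarized by a pair (energy, collisions), both to be minimized, and that the penalty strategy adds a weighted collision count to the energy; the function names and the weight values are hypothetical and serve only to illustrate the idea.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def incomparable(a, b):
    """True if a and b differ and neither one dominates the other."""
    return a != b and not dominates(a, b) and not dominates(b, a)


def penalized(energy, collisions, weight):
    """Single-objective fitness under an assumed additive penalty (minimized)."""
    return energy + weight * collisions


# (energy, collisions): a feasible conformation and a lower-energy infeasible one.
feasible = (-2, 0)
infeasible = (-4, 1)

# Under the MO formulation the two are incomparable, so the infeasible
# conformation is not automatically discarded.
print(incomparable(feasible, infeasible))

# Under the penalty strategy the preferred solution depends on the weight.
for weight in (1, 3):
    print(weight, penalized(*feasible, weight) < penalized(*infeasible, weight))
```

With the small weight the infeasible conformation obtains the better penalized value, and with the larger weight the feasible one does, which illustrates why the penalty parameter needs tuning while the dominance-based comparison has no such parameter.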
Building further on this research, the primary aim of this study is to contribute to the general understanding of how the MO constraint-handling technique works. First, a detailed analysis is conducted to investigate the potential effects of the problem transformation from the perspective of the fitness landscape. More specifically, it is evaluated how the use of the MO problem formulation affects an important property of the fitness landscape: neutrality. It has been argued that the MO approach to constraint-handling could be rather ineffective if a search bias towards the feasible region is not introduced [14]. Therefore, the second part of this document concerns the study of different mechanisms which can be employed to provide the MO strategy with such a search bias. The last part of this research work extends the comparative analysis reported in [13], where the MO approach is evaluated with respect to commonly adopted techniques from the specialized literature. While the preliminary results presented in [13] assumed a fixed biasing scheme for the MO method and focused only on the performance of a population-based algorithm, the present study includes the different biasing mechanisms analyzed in its second part, as well as both single-solution-based and population-based algorithms. Likewise, only 15 test instances for the two-dimensional HP model (based on the square lattice) were used in [13]. In contrast, the present study also covers the three-dimensional case (based on the cubic lattice), and a total of 30 test cases have been considered.
The remainder of this document is organized as follows. Section 2 provides background concepts and sets the notation used in this study. Section 3 reviews related work on constraint-handling methods for the HP model as well as on the topic of single-objective to multi-objective transformations. The studied MO constraint-handling approach is described in Section 4. Section 5 presents the analysis with regard to the fitness landscape transformation. The search bias issue is addressed in Section 6. The comparative study which focuses on search performance is covered in Section 7. Finally, Section 8 discusses the main findings and presents the conclusions of this study. Appendices at the end of this document contain supplementary information with regard to implementation details of the considered search algorithms, performance measures, test instances, the methodology followed for the statistical significance analyses, and the utilized experimental platform.
Section snippets
Single-objective and multi-objective optimization
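Without loss of generality, a single-objective optimization problem can be formally stated as follows:

$$\text{minimize } f(x) \quad \text{subject to } x \in \mathcal{X}_F,$$

where $x$ is a solution vector; $\mathcal{X}_F$ denotes the feasible set, i.e., the set of all feasible solution vectors in the search space $\mathcal{X}$, $\mathcal{X}_F \subseteq \mathcal{X}$; and $f: \mathcal{X} \to \mathbb{R}$ is the objective function to be optimized. The aim is thus to find the feasible solution(s) yielding the optimum value for the objective function; that is, to find $x^* \in \mathcal{X}_F$ such that $f(x^*) = \min\{f(x) \mid x \in \mathcal{X}_F\}$.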
Similarly, a
Constraint-handling in the HP model
In the literature, two basic directions have been taken to address the self-avoidance constraint which relates to the feasibility of protein conformations in the HP model of the PSP problem. On the one hand, the search can concentrate on the feasible space; that is, considering only solutions encoding self-avoiding protein conformations. This is usually accomplished (i) by adapting the variation operators to iterate until new feasible conformations are generated, i.e., infeasible conformations
Handling constraints in the HP model by multi-objective optimization
It is the authors' belief that considering infeasible protein conformations during optimization can boost the performance of metaheuristics for solving PSP under the HP model (arguments in this respect have also been given in the literature [5]). Therefore, it is important to devise new constraint-handling mechanisms which allow these algorithms to exploit the vast amount of infeasibility that the HP model involves, as a means of steering the search process more effectively.
The use
Fitness landscape transformation
Whereas infeasible solutions are usually regarded and treated as inferior, or even as inadmissible solutions during the search process, such a distinction between feasible and infeasible solutions is not captured when handling constraints by multi-objective optimization. As discussed in Section 4, the multi-objective strategy allows infeasible solutions to become incomparable, under certain conditions, with respect to feasible ones. Such an effect of the problem transformation leads to an
Introducing a search bias
By defining trade-offs between the quality and feasibility of solution candidates, the multi-objective (MO) approach to handling constraints allows for the exploitation of useful information from infeasible areas of the fitness landscape. Despite the potential advantages of the MO strategy in terms of the landscape transformation, as analyzed in Section 5.3, its lack of a proper search bias may also have detrimental effects on the ability of search algorithms to locate promising regions of
Impact on search performance
This section investigates the suitability of the multi-objective optimization (MO) strategy for handling constraints in the HP model. To this end, the MO strategy is evaluated and compared with respect to two different constraint-handling approaches usually adopted in the specialized literature, namely, the rejection of infeasible protein conformations and the application of penalties. These approaches are to be referred to as the reject (RJ) and penalty function (PF) strategies and are
Conclusions
The multi-objective (MO) approach to constraint-handling has been investigated in the context of the HP model for protein structure prediction (PSP). The HP model was reformulated as an unconstrained multi-objective problem by treating constraints as an additional objective function. Rather than discriminating feasible from infeasible solutions, the MO strategy defines trade-offs between quality (original objective) and feasibility. This gives infeasible solutions the opportunity to be
Acknowledgment
The first author acknowledges support from CONACyT through a scholarship to pursue graduate studies at the Information Technology Laboratory, CINVESTAV-Tamaulipas. Also, the authors acknowledge support from CONACyT through projects 205060 and 99276.
References (97)
- et al. Constraint-handling in nature-inspired numerical optimization: past, present and future. Swarm Evol Comput (2011)
- et al. Fitness landscape of the cellular automata majority problem: view from the Olympus. Theor Comput Sci (2007)
- et al. A study of the neutrality of boolean function landscapes in genetic programming. Theor Comput Sci (2012)
- et al. A survey of techniques for characterising fitness landscapes and some possible ways forward. Inf Sci (2013)
- et al. A multiple minima genetic algorithm for protein structure prediction. Appl Soft Comput (2014)
- et al. Helper-objective optimization strategies for the job-shop scheduling problem. Appl Soft Comput (2011)
- et al. Neutrality in fitness landscapes. Appl Math Comput (2001)
- et al. An improved PSO for dynamic load dispatch of generators with valve-point effects. Energy (2009)
- An efficient constraint handling method for genetic algorithms. Comput Methods Appl Mech Eng (2000)
- Theory for the folding and stability of globular proteins. Biochemistry (1985)
- A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules
- Stochastic ranking for constrained evolutionary optimization. IEEE Trans Evol Comput
- Using multi-objective evolutionary algorithms for single-objective optimization. 4OR
- Search biases in constrained evolutionary optimization. IEEE Trans Syst Man Cybern C: Appl Rev
- Cours d'Economie Politique
- Principles that govern the folding of protein chains. Science
- Clustered memetic algorithm with local heuristics for ab initio protein structure prediction. IEEE Trans Evol Comput
- Spiral search: a hydrophobic-core directed local search for simplified PSP on 3D FCC lattice. BMC Bioinform
- Protein conformation of a lattice model using tabu search. J Global Optim
- An ant colony optimization algorithm for the 2D and 3D hydrophobic polar protein folding problem. BMC Bioinform
- An immune algorithm for protein structure prediction on lattice models. IEEE Trans Evol Comput
- Particle swarm optimization approach for protein structure prediction in the 3D HP model. Interdiscip Sci: Comput Life Sci
- Enhanced hybrid search algorithm for protein structure prediction using the 3D-HP lattice model. J Mol Model
- A differential evolution approach for protein folding using a lattice model. J Comput Sci Technol
- Protein folding in simplified models with estimation of distribution algorithms. IEEE Trans Evol Comput
- Hydrophobic-polar model structure prediction with binary-coded artificial plant optimization algorithm. J Comput Theor Nanosci
- A firefly-inspired method for protein structure prediction in lattice models. Biomolecules
- On discrete models and immunological algorithms for protein structure prediction. Nat Comput