Constraint-handling through multi-objective optimization: The hydrophobic-polar model for protein structure prediction

doi:10.1016/j.cor.2014.07.010

Computers & Operations Research

Volume 53, January 2015, Pages 128-153

https://doi.org/10.1016/j.cor.2014.07.010

Abstract

In the multi-objective approach to constraint-handling, a constrained problem is transformed into an unconstrained one by defining additional optimization criteria to account for the problem constraints. In this paper, this approach is explored in the context of the hydrophobic-polar model, a simplified yet challenging representation of the protein structure prediction problem. Although focused on this particular case study, this research work is intended to contribute to the general understanding of the multi-objective constraint-handling strategy. First, a detailed analysis was conducted to investigate the extent to which this strategy affects the characteristics of the fitness landscape. As a result, it was found that an important fraction of the infeasibility translates into neutrality. This neutrality defines potentially shorter paths to move through the landscape, which can also be exploited to escape from local optima. By studying different mechanisms, the second part of this work highlights the relevance of introducing a proper search bias when handling constraints by multi-objective optimization. Finally, the suitability of the multi-objective approach was further evaluated in terms of its ability to effectively guide the search process. This strategy significantly improved the performance of the considered search algorithms when compared with commonly adopted techniques from the literature.

Introduction

Evolutionary computation methods and other metaheuristic algorithms have been successfully used to solve complex optimization problems which arise in a diversity of scientific and engineering applications. Often, however, optimization involves not only reaching the best value for a given objective function (or set of objective functions), but also satisfying a certain set of predefined requirements called constraints. Therefore, additional mechanisms need to be implemented within metaheuristic algorithms in order to search effectively through this kind of constrained solution space.

The hydrophobic-polar (HP) model [1], [2] is an abstract formulation of the protein structure prediction (PSP) problem, where hydrophobicity is assumed to be the main stabilizing force in the protein folding process. Under this model, PSP is defined as the problem of finding a self-avoiding embedding of the protein chain on a given lattice, such that the interaction among hydrophobic amino acids is maximized. From the computational point of view, the HP model entails a challenging problem in combinatorial optimization [3], [4]. One of the main sources of difficulty in this problem lies in the fact that, using the existing problem representations, a significant portion of the solution space encodes infeasible (non-self-avoiding) protein structures. Hence, it is important to devise effective mechanisms for handling the constraints that this problem presents. Two main research directions have been adopted to cope with this issue. On the one hand, the search can be confined to the space of only feasible, self-avoiding protein conformations. On the other hand, infeasible protein conformations can also be taken into consideration, which has been achieved in the literature by implementing a penalty strategy. From the literature, however, it is not possible to identify a clear consensus on which of the two directions, i.e., to avoid or to consider infeasible conformations, could lead to the development of more efficient metaheuristics for solving this problem [5], [6], [7], [8], [9].
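To make the problem definition above concrete, the following is a minimal sketch of how a conformation on the two-dimensional square lattice might be evaluated. It assumes a hypothetical absolute-move encoding (`U`, `D`, `L`, `R`) and hypothetical helper names (`decode`, `collisions`, `hh_contacts`); none of these come from the paper, and the paper's own representation and its treatment of contacts in infeasible conformations may differ.

```python
from collections import Counter

# Hypothetical absolute-move encoding on the 2D square lattice (an assumption).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves):
    """Map a move string to lattice coordinates, starting at the origin."""
    coords = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def collisions(coords):
    """Count residues placed on an already occupied site (0 means self-avoiding)."""
    return sum(n - 1 for n in Counter(coords).values())


def hh_contacts(sequence, coords):
    """Count pairs of H residues that are lattice neighbours but not chain neighbours."""
    total = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
                if dist == 1:
                    total += 1
    return total


square = decode("RUL")    # self-avoiding fold of a 4-residue chain
overlap = decode("RLR")   # chain folds back onto itself (infeasible)
```

Under these assumptions, `collisions(square)` is zero and `hh_contacts("HPPH", square)` counts the single contact between the two terminal H residues, whereas `overlap` revisits lattice sites and therefore encodes an infeasible conformation.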

Premised upon the belief that infeasible conformations can provide valuable information for guiding the search process, this research work inquires into the use of multi-objective optimization as an alternative constraint-handling strategy for the HP model. Particularly, constraints in the HP model are treated as a supplementary optimization criterion, leading to an unconstrained multi-objective problem.¹ Using such an alternative formulation of the HP model, infeasible solutions can become incomparable with respect to feasible ones, thus having better opportunities to participate throughout the search process. In contrast to the penalty strategy, which represents one of the most widely used techniques in the constraint-handling literature, in essence the multi-objective (MO) method does not require the fine-tuning of the penalty parameters²; in the penalty strategy, finding the right balance between objective function and penalty values has been regarded as a difficult optimization problem in itself [10], [11]. The use of multi-objective optimization for handling constraints is not a novel idea; recent reviews on this topic can be found in [11], [12]. Nevertheless, it was not until recently that the preliminary results of this research reported for the first time, to the best of the authors' knowledge, the application of the MO constraint-handling strategy to the particular HP model of the PSP problem [13].
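The contrast between the two strategies can be illustrated with a small sketch. It assumes, hypothetically, that each conformation is summarized by a pair (energy, violations), both minimized, where energy is the negated number of hydrophobic contacts and violations counts lattice collisions. The function names, the objective values, and the penalty weights are illustrative assumptions, not the paper's actual formulation.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def penalized(energy, violations, weight):
    """Single-objective penalty fitness; `weight` is the parameter that needs tuning."""
    return energy + weight * violations


# Illustrative (energy, violations) pairs, not taken from the paper.
feasible = (-2, 0)    # two H-H contacts, no collisions
infeasible = (-4, 1)  # four H-H contacts, one collision

# Under Pareto dominance neither solution is better than the other.
incomparable = not dominates(feasible, infeasible) and not dominates(infeasible, feasible)

# Under the penalty strategy the preferred solution depends on the weight.
prefers_infeasible = penalized(*infeasible, weight=1) < penalized(*feasible, weight=1)
prefers_feasible = penalized(*infeasible, weight=3) > penalized(*feasible, weight=3)
```

In this toy setting the infeasible conformation survives as a trade-off solution under Pareto dominance, while under the penalty function its ranking flips with the chosen weight, which is the tuning difficulty the text refers to.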

Building further on this research, the primary aim of this study is to contribute to the general understanding of the functioning of the MO constraint-handling technique. First, a detailed analysis is conducted in order to investigate the potential effects of the problem transformation from the perspective of the fitness landscape. More specifically, it is evaluated how the use of the MO problem formulation affects an important property of the fitness landscape: neutrality. It has been argued that the MO approach to constraint-handling could be rather ineffective if a search bias towards the feasible region is not introduced [14]. Therefore, the second part of this document concerns the study of different mechanisms which can be employed for providing the MO strategy with such a search bias. The last part of this research work extends the comparative analysis reported in [13], where the MO approach is evaluated with respect to commonly adopted techniques from the specialized literature. While the preliminary results presented in [13] assumed a fixed biasing scheme for the MO method and focused only on the performance of a population-based algorithm, the different biasing mechanisms analyzed in the second part of this study, as well as both single-solution-based and population-based algorithms, have been included in the present study. Likewise, only 15 test instances for the two-dimensional HP model (based on the square lattice) were used in [13]. In contrast, the present study also covers the three-dimensional case (based on the cubic lattice) and a total of 30 test cases have been considered.

The remainder of this document is organized as follows. Section 2 provides background concepts and sets the notation used in this study. Section 3 reviews related work on constraint-handling methods for the HP model as well as on the topic of single-objective to multi-objective transformations. The studied MO constraint-handling approach is described in Section 4. Section 5 presents the analysis with regard to the fitness landscape transformation. The search bias issue is addressed in Section 6. The comparative study which focuses on search performance is covered in Section 7. Finally, Section 8 discusses the main findings and presents the conclusions of this study. Appendices at the end of this document contain supplementary information with regard to implementation details of the considered search algorithms, performance measures, test instances, the methodology followed for the statistical significance analyses, and the utilized experimental platform.

Section snippets

Single-objective and multi-objective optimization

Without loss of generality, a single-objective optimization problem can be formally stated as follows: $\text{Minimize } f(x), \text{ subject to } x \in X_{F},$ where $x$ is a solution vector; $X_{F}$ denotes the feasible set, i.e., the set of all feasible solution vectors in the search space $X$, $X_{F} \subsetneq X$; and $f : X \to \mathbb{R}$ is the objective function to be optimized. The aim is thus to find the feasible solution(s) yielding the optimum value for the objective function; that is, to find $x^{*} \in X_{F}$ such that $f(x^{*}) = \min \{ f(x) \mid x \in X_{F} \}$.

Similarly, a

Constraint-handling in the HP model

In the literature, two basic directions have been taken to address the self-avoidance constraint which relates to the feasibility of protein conformations in the HP model of the PSP problem. On the one hand, the search can concentrate on the feasible space; that is, considering only solutions encoding self-avoiding protein conformations. This is usually accomplished (i) by adapting the variation operators to iterate until new feasible conformations are generated, i.e., infeasible conformations

Handling constraints in the HP model by multi-objective optimization

It is the authors' belief that considering infeasible protein conformations during optimization can boost the performance of metaheuristics for solving PSP under the HP model (arguments in this respect have also been given in the literature [5]). Therefore, it is important to devise new constraint-handling mechanisms which allow these algorithms to exploit the vast amount of infeasibility that the HP model involves, as a means of steering the search process in a more effective manner.

The use

Fitness landscape transformation

Whereas infeasible solutions are usually regarded and treated as inferior, or even as inadmissible solutions during the search process, such a distinction between feasible and infeasible solutions is not captured when handling constraints by multi-objective optimization. As discussed in Section 4, the multi-objective strategy allows infeasible solutions to become incomparable, under certain conditions, with respect to feasible ones. Such an effect of the problem transformation leads to an

Introducing a search bias

By defining trade-offs between the quality and feasibility of solution candidates, the multi-objective (MO) approach to handling constraints allows for the exploitation of useful information from infeasible areas of the fitness landscape. Despite the potential advantages of the MO strategy in terms of the landscape transformation, as analyzed in Section 5.3, its lack of a proper search bias may also lead to detrimental effects on the ability of search algorithms to locate promising regions of

Impact on search performance

This section investigates the suitability of the multi-objective optimization (MO) strategy for handling constraints in the HP model. To this end, the MO strategy is evaluated and compared with respect to two different constraint-handling approaches usually adopted in the specialized literature, namely, the rejection of infeasible protein conformations and the application of penalties. These approaches are to be referred to as the reject (RJ) and penalty function (PF) strategies and are

Conclusions

The multi-objective (MO) approach to constraint-handling has been investigated in the context of the HP model for protein structure prediction (PSP). The HP model was reformulated as an unconstrained multi-objective problem by treating constraints as an additional objective function. Rather than discriminating feasible from infeasible solutions, the MO strategy defines trade-offs between quality (original objective) and feasibility. This gives infeasible solutions the opportunity to be

Acknowledgment

The first author acknowledges support from CONACyT through a scholarship to pursue graduate studies at the Information Technology Laboratory, CINVESTAV-Tamaulipas. Also, the authors acknowledge support from CONACyT through projects 205060 and 99276.

References (97)

E. Mezura-Montes et al.
Constraint-handling in nature-inspired numerical optimization: past, present and future
Swarm Evol Comput
(2011)
S. Verel et al.
Fitness landscape of the cellular automata majority problem: view from the Olympus
Theor Comput Sci
(2007)
L. Vanneschi et al.
A study of the neutrality of boolean function landscapes in genetic programming
Theor Comput Sci
(2012)
K. Malan et al.
A survey of techniques for characterising fitness landscapes and some possible ways forward
Inf Sci
(2013)
F. Custódio et al.
A multiple minima genetic algorithm for protein structure prediction
Appl Soft Comput
(2014)
D. Lochtefeld et al.
Helper-objective optimization strategies for the job-shop scheduling problem
Appl Soft Comput
(2011)
C. Reidys et al.
Neutrality in fitness landscapes
Appl Math Comput
(2001)
X. Yuan et al.
An improved PSO for dynamic load dispatch of generators with valve-point effects
Energy
(2009)
K. Deb
An efficient constraint handling method for genetic algorithms
Comput Methods Appl Mech Eng
(2000)
K. Dill
Theory for the folding and stability of globular proteins
Biochemistry
(1985)

K. Lau et al.

A lattice statistical mechanics model of the conformational and sequence spaces of proteins

Macromolecules

(1989)

Berger B, Leighton T. Protein folding in the hydrophobic–hydrophilic (HP) model is NP-complete. In: International...

Crescenzi P, Goldman D, Papadimitriou C, Piccolboni A, Yannakakis M. On the complexity of Protein Folding. In: ACM...

Krasnogor N, Hart W, Smith J, Pelta D. Protein structure prediction with evolutionary algorithms. In: Genetic and...

Duarte-Flores S, Smith J. Study of fitness landscapes for the HP model of protein structure prediction. In: IEEE...

Cotta C. Protein structure prediction using evolutionary algorithms hybridized with backtracking. In: Artificial neural...

de Almeida C, Gonçalves R, Delgado M. A hybrid immune-based system for the protein folding problem. In: Evolutionary...

Santos J, Diéguez M. Differential evolution for protein structure prediction using the HP model. In: Foundations on...

T. Runarsson et al.

Stochastic ranking for constrained evolutionary optimization

IEEE Trans Evol Comput

(2000)

C. Segura et al.

Using multi-objective evolutionary algorithms for single-objective optimization

4OR

(2013)

Garza-Fabre M, Toscano-Pulido G, Rodriguez-Tello E. Handling constraints in the HP model for protein structure...

T. Runarsson et al.

Search biases in constrained evolutionary optimization

IEEE Trans Syst Man Cybern C: Appl Rev

(2005)

V. Pareto

Cours d'Economie Politique

(1896)

Wright S. The roles of mutation, inbreeding, crossbreeding and selection in evolution. In: Proceedings of the sixth...

Stadler P. Fitness landscapes. In: Biological evolution and statistical physics. Lecture notes in physics, vol. 585....

Pitzer E, Affenzeller M. A comprehensive survey on fitness landscape analysis. In: Recent advances in intelligent...

C. Anfinsen

Principles that govern the folding of protein chains

Science

(1973)

Lopes H. Evolutionary algorithms for the protein folding problem: a review and current trends. In: Computational...

Krasnogor N, Blackburne B, Burke E, Hirst J. Multimeme algorithms for protein structure prediction. In: parallel...

M. Islam et al.

Clustered memetic algorithm with local heuristics for ab initio protein structure prediction

IEEE Trans Evol Comput

(2013)

M. Rashid et al.

Spiral search: a hydrophobic-core directed local search for simplified PSP on 3D FCC lattice

BMC Bioinform

(2013)

P. Pardalos et al.

Protein conformation of a lattice model using tabu search

J Global Optim

(1997)

A. Shmygelska et al.

An ant colony optimization algorithm for the 2d and 3d hydrophobic polar protein folding problem

BMC Bioinform

(2005)

Nardelli M, Tedesco L, Bechini A. Cross-lattice behavior of general ACO folding for proteins in the HP model. In: ACM...

V. Cutello et al.

An immune algorithm for protein structure prediction on lattice models

IEEE Trans Evol Comput

(2007)

N. Mansour et al.

Particle swarm optimization approach for protein structure prediction in the 3D HP model

Interdiscip Sci: Comput Life Sci

(2012)

C. Zhou et al.

Enhanced hybrid search algorithm for protein structure prediction using the 3D-HP lattice model

J Mol Model

(2013)

H. Lopes et al.

A differential evolution approach for protein folding using a lattice model

J Comput Sci Technol

(2007)

R. Santana et al.

Protein folding in simplified models with estimation of distribution algorithms

IEEE Trans Evol Comput

(2008)

B. Chen, L. Li, J. Hu, A novel EDAs based method for HP model protein folding. In: IEEE congress on evolutionary...

X. Cai et al.

Hydrophobic-polar model structure prediction with binary-coded artificial plant optimization algorithm

J Comput Theor Nanosci

(2013)

B. Maher et al.

A firefly-inspired method for protein structure prediction in lattice models

Biomolecules

(2014)

Patton A, Punch III W, Goodman E. A standard GA approach to native protein conformation prediction. In: International...

Unger R, Moult J. Genetic algorithm for 3d protein folding simulations. In: International conference on genetic...

Chira C, Horvath D, Dumitrescu D. An evolutionary model based on Hill-Climbing search operators for protein structure...

Chira C. A hybrid evolutionary approach to protein structure prediction with lattice models. In: IEEE congress on...

V. Cutello et al.

On discrete models and immunological algorithms for protein structure prediction

Nat Comput

(2011)

Lesh N, Mitzenmacher M, Whitesides S. A complete and effective move set for simplified protein folding. In:...

