Scholar iON
Academic Synthesis
This collection of scholarly papers explores various computational approaches to studying protein folding, highlighting both methodological innovations and theoretical insights. Dokholyan et al. (1998) and Klimov & Thirumalai (1998) emphasize novel computational techniques, such as discrete molecular dynamics and lattice models with side chains, to improve the efficiency and accuracy of folding simulations. Veitshans et al. (1996) and Garel (2003) delve into the kinetic and thermodynamic aspects of protein folding, focusing on sequence-dependent properties and the heteropolymeric nature of proteins. A key theme across these studies is the correlation between folding parameters, such as the folding transition temperature (T_F), and cooperativity measures, suggesting a unified framework for understanding folding kinetics and thermodynamics. Collectively, these works underscore the complexity of protein folding and the potential for advanced modeling techniques to unravel its mechanisms.
Background: Many attempts have been made to resolve in time the folding of model proteins in computer simulations. Different computational approaches have emerged. Some of these approaches are insensitive to the geometrical properties of proteins (lattice models), while others are computationally heavy (traditional MD).
Results: We use a recently proposed approach of Zhou and Karplus to study the folding of a protein model based on the discrete time molecular dynamics algorithm. We show that this algorithm resolves the folding-unfolding transition in time. In addition, we demonstrate the ability to study the core of the model protein.
Conclusion: The algorithm along with the model of inter-residue interactions can serve as a tool to study the thermodynamics and kinetics of protein models.
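Discrete molecular dynamics is commonly implemented as an event-driven scheme: rather than integrating with a fixed small time step, the simulation jumps from one collision to the next. As a minimal sketch of the event-prediction step, assuming two hard-sphere beads of equal diameter (the function name `collision_time` and the hard-sphere simplification are illustrative assumptions, not the inter-residue interaction model used in the paper), the time until contact can be computed as follows.

```python
import math


def collision_time(r, v, diameter):
    """Return the time until two hard spheres touch, or None if they never do.

    r: relative position (r_j - r_i) as a 3-tuple
    v: relative velocity (v_j - v_i) as a 3-tuple
    diameter: contact distance between the two sphere centres
    """
    b = sum(ri * vi for ri, vi in zip(r, v))  # r . v
    if b >= 0.0:
        return None  # the spheres are not approaching each other
    v2 = sum(vi * vi for vi in v)
    r2 = sum(ri * ri for ri in r)
    # Discriminant of |r + v t|^2 = diameter^2
    disc = b * b - v2 * (r2 - diameter ** 2)
    if disc < 0.0:
        return None  # the trajectories miss each other
    return (-b - math.sqrt(disc)) / v2


# Two beads 2 units apart, approaching head-on at unit relative speed
print(collision_time((2.0, 0.0, 0.0), (-1.0, 0.0, 0.0), 1.0))
```

An event-driven loop would evaluate this for candidate pairs, advance all beads to the earliest event, update the velocities of the colliding pair, and repeat.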
Different aspects of protein folding are illustrated by simplified polymer models. Stressing the diversity of side chains (residues) leads one to view folding as the freezing transition of a heteropolymer. Technically, the most common approach to diversity is randomness, which is usually implemented in two-body interactions (charges, polar character, ...). On the other hand, the (almost) universal character of the protein backbone suggests that folding may also be viewed as the crystallization transition of a homopolymeric chain, the main ingredients of which are the peptide bond and chirality (proline and glycine notwithstanding). The model of a chiral dipolar chain leads to a unified picture of secondary structures, and to a possible connection of protein structures with ferroelectric domain theory.
The folding kinetics of a number of sequences for an off-lattice continuum model of proteins is studied using Langevin simulations at two values of the friction coefficient. We show that there is a remarkable correlation between folding times, $\tau_{F}$, and $\sigma = (T_{\theta} - T_{F})/T_{\theta}$, where $T_{\theta}$ and $T_{F}$ are the equilibrium collapse and folding transition temperatures, respectively. The microscopic dynamics reveals several scenarios for the refolding kinetics depending on the value of $\sigma$. Proteins with small $\sigma$ reach the native conformation via a nucleation collapse mechanism and their energy landscape is characterized by a single dominant native basin of attraction. Proteins with large $\sigma$ get trapped in competing basins of attraction, in which they adopt misfolded structures. In this case only a small fraction of molecules, $\Phi$, access the native state rapidly; the majority approach the native state by a three-stage multipathway mechanism. The partition factor $\Phi$ is determined by $\sigma$: the smaller the value of $\sigma$, the larger $\Phi$ is. The qualitative aspects of our results are found to be independent of the friction coefficient. Estimates for time scales for folding of small proteins via a nucleation collapse mechanism are presented.
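The parameter used above is the relative gap between the collapse and folding transition temperatures, (T_theta - T_F)/T_theta. A minimal sketch of computing it is given below; the function name `sigma` and the example temperatures are hypothetical and are not taken from the paper.

```python
def sigma(t_theta, t_f):
    """Relative gap between collapse (t_theta) and folding (t_f) temperatures.

    Both temperatures must be in the same units; t_theta must be positive.
    """
    if t_theta <= 0:
        raise ValueError("collapse temperature must be positive")
    return (t_theta - t_f) / t_theta


# Hypothetical sequences: (collapse temperature, folding temperature)
examples = {"seq_small_gap": (1.00, 0.95), "seq_large_gap": (1.00, 0.40)}
for name, (t_theta, t_f) in examples.items():
    print(name, round(sigma(t_theta, t_f), 3))
```

In the terms of the abstract, the first hypothetical sequence (small value) would be the candidate for nucleation collapse folding, and the second (large value) the candidate for trapping in misfolded basins.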
We consider equilibrium folding transitions in lattice protein models with and without side chains. A dimensionless measure, $\Omega_{c}$, is introduced to quantitatively assess the degree of cooperativity in lattice models and in real proteins. We show that larger values of $\Omega_{c}$ resembling those seen in proteins are obtained in lattice models with side chains (LMSC). The enhanced cooperativity in LMSC is due to the possibility of denser packing of side chains in the interior of the model protein. We also establish that $\Omega_{c}$ correlates extremely well with $\sigma = (T_{\theta} - T_{f})/T_{\theta}$, where $T_{\theta}$ and $T_{f}$ are collapse and folding transition temperatures, respectively. These theoretical ideas are used to analyze folding transitions in various real proteins. The values of $\Omega_{c}$ extracted from experiments show a correlation with $\sigma$. We conclude that the degree of cooperativity can be expressed in terms of the single parameter $\sigma$, which can be estimated from experimental data.
We demonstrate that the recently proposed pruned-enriched Rosenbluth method PERM (P. Grassberger, Phys. Rev. E 56 (1997) 3682) leads to very efficient algorithms for the folding of simple model proteins. We test it on several models for lattice heteropolymers, and compare to published Monte Carlo studies of the properties of particular sequences. In all cases our method is faster than the previous ones, and in several cases we find new minimal energy states. In addition to producing more reliable candidates for ground states, our method gives detailed information about the thermal spectrum and thus allows one to analyze static aspects of the folding behavior of arbitrary sequences.
An exactly solvable model based on the topology of a protein native state is applied to identify bottlenecks and key sites for the folding of HIV-1 Protease. The predicted sites are found to correlate well with clinical data on resistance to FDA-approved drugs. It has been observed that drug therapy induces multiple mutations on the protease. The sites where such mutations occur correlate well with those involved in folding bottlenecks identified through the deterministic procedure proposed in this study. The high statistical significance of the observed correlations suggests that the approach holds promise for use in conjunction with traditional techniques to identify candidate locations for drug attacks.
mRNA technology has revolutionized vaccine development, protein replacement therapies, and cancer immunotherapies, offering rapid production and precise control over sequence and efficacy. However, the inherent instability of mRNA poses significant challenges for drug storage and distribution, particularly in resource-limited regions. Co-optimizing RNA structure and codon choice has emerged as a promising strategy to enhance mRNA stability while preserving efficacy. Given the vast sequence and structure design space, specialized algorithms are essential to achieve these qualities. Recently, several effective algorithms that share similar underlying principles have been developed to tackle this challenge. We call these specialized algorithms "mRNA folding" algorithms as they generalize classical RNA folding algorithms. A comprehensive analysis of their underlying principles, performance, and limitations is lacking. This review aims to provide an in-depth understanding of these algorithms, identify opportunities for improvement, and benchmark existing software implementations in terms of scalability, correctness, and feature support.
This paper presents a two-phase protein folding optimization on a three-dimensional AB off-lattice model. The first phase is responsible for forming conformations with a good hydrophobic core or a set of compact hydrophobic amino acid positions. These conformations are forwarded to the second phase, where an accurate search is performed with the aim of locating conformations with the best energy value. The optimization process switches between these two phases until the stopping condition is satisfied. An auxiliary fitness function was designed for the first phase, while the original fitness function is used in the second phase. The auxiliary fitness function includes an expression about the quality of the hydrophobic core. This expression is crucial for leading the search process to the promising solutions that have a good hydrophobic core and, consequently, improves the efficiency of the whole optimization process. Our differential evolution algorithm was used for demonstrating the efficiency of two-phase optimization. It was analyzed on well-known amino acid sequences that are used frequently in the literature. The obtained experimental results show that the employed two-phase optimization improves the efficiency of our algorithm significantly and that the proposed algorithm is superior to other state-of-the-art algorithms.
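The abstract does not spell out the original fitness function. As a minimal sketch, the energy below follows the form commonly used for the three-dimensional AB off-lattice model in the literature: a bending term over consecutive bond vectors plus a Lennard-Jones-like term over non-adjacent residue pairs. The coupling constants (1 for A-A, 0.5 for B-B, -0.5 for A-B) and the names `coupling` and `ab_energy` are assumptions made for illustration, and the auxiliary hydrophobic-core fitness function is not reproduced.

```python
import math


def coupling(a, b):
    """Assumed AB-model coupling: A-A strong attraction, B-B weak, A-B weak repulsion."""
    if a == "A" and b == "A":
        return 1.0
    if a == "B" and b == "B":
        return 0.5
    return -0.5


def ab_energy(sequence, coords):
    """Energy of a conformation; coords is a list of (x, y, z), one per residue."""
    n = len(sequence)
    bend = 0.0
    for i in range(1, n - 1):
        # Consecutive bond vectors meeting at residue i
        u = [coords[i][k] - coords[i - 1][k] for k in range(3)]
        w = [coords[i + 1][k] - coords[i][k] for k in range(3)]
        dot = sum(uk * wk for uk, wk in zip(u, w))
        cos_theta = dot / (math.hypot(*u) * math.hypot(*w))
        bend += 0.25 * (1.0 - cos_theta)  # zero for a straight chain
    nonbonded = 0.0
    for i in range(n - 2):
        for j in range(i + 2, n):  # skip directly bonded neighbours
            r = math.dist(coords[i], coords[j])
            c = coupling(sequence[i], sequence[j])
            nonbonded += 4.0 * (r ** -12 - c * r ** -6)
    return bend + nonbonded


# Straight three-residue chain with unit bond lengths
print(ab_energy("AAA", [(0, 0, 0), (1, 0, 0), (2, 0, 0)]))
```

An optimizer such as differential evolution would minimize this value over the chain's angles; the coordinate-based form is used here only to keep the sketch short.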
The computer artificial intelligence system AlphaFold has recently predicted previously unknown three-dimensional structures of thousands of proteins. Focusing on the subset with high-confidence scores, we algorithmically analyze these predictions for cases where the protein backbone exhibits rare topological complexity, i.e. knotting. Amongst others, we discovered a $7_1$-knot, the most topologically complex knot ever found in a protein, as well as several 6-crossing composite knots composed of two methyltransferase or carbonic anhydrase domains, each containing a simple trefoil knot. These deeply embedded composite knots evidently arise by gene duplication and interconnection of knotted dimers. Finally, we report two new five-crossing knots including the first $5_1$-knot. Our list of analyzed structures forms the basis for future experimental studies to confirm these novel knotted topologies and to explore their complex folding mechanisms.
Despite considerable progress, ab initio protein structure prediction remains suboptimal. One crowdsourcing approach is the online puzzle video game Foldit, which has provided several useful results that matched or even outperformed algorithmically computed solutions. Using Foldit, the WeFold crowd had several successful participations in the Critical Assessment of Techniques for Protein Structure Prediction. Based on the recent Foldit standalone version, we trained a deep reinforcement neural network called DeepFoldit to improve the score assigned to an unfolded protein, using the Q-learning method with experience replay. This paper focuses on model improvement through hyperparameter tuning. We examined different model architectures and hyperparameter values to improve the accuracy of the model. The new hyperparameters also improved the model's ability to generalize. Initial results from the latest implementation show that, given a set of small unfolded training proteins, DeepFoldit learns action sequences that improve the score both on the training set and on novel test proteins. Our approach combines the intuitive user interface of Foldit with the efficiency of deep reinforcement learning.
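Q-learning with experience replay stores past transitions in a buffer and updates value estimates from randomly sampled batches rather than only from the most recent step. The sketch below is a deliberately simplified tabular version: DeepFoldit uses a deep neural network, whose architecture is not described here, and the names `ReplayBuffer` and `q_update`, the state and action labels, and the learning rate and discount values are all hypothetical.

```python
import random
from collections import defaultdict, deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def q_update(q, transition, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update to the table q for a single transition."""
    state, action, reward, next_state, done = transition
    if done:
        target = reward
    else:
        target = reward + gamma * max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (target - q[(state, action)])


# Toy usage with made-up states, actions, and score changes as rewards
actions = ["shake", "wiggle"]
q = defaultdict(float)
buffer = ReplayBuffer(capacity=100)
buffer.add("unfolded", "shake", 1.0, "partly_folded", False)
buffer.add("partly_folded", "wiggle", 2.0, "folded", True)
for transition in buffer.sample(batch_size=2):
    q_update(q, transition, actions)
```

In a deep variant, the table lookup would be replaced by a network that maps a state to one value per action, and the same target would serve as the regression label.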