Chapter IV: An efficient representation for statistical mechanics of multi-
4.1 Introduction
Biology runs on multimeric complexes of proteins. Across all domains of life, less than half of proteins are found as monomers, with the rest existing as dimers or higher-order complexes [1]. And this accounting is merely of proteins’ “default” configurations. It neglects the even larger transient complexes in which they participate, which are often their raison d’être: the preinitiation complexes for transcription and replication, spliceosomes, and GPCRs are prominent examples. In addition to protein-protein interactions, proteins’ small-molecule binding partners are just as important to consider. Witness the power of allostery, as codified in, e.g., the MWC model [2, 3], where the binding of a regulatory small molecule at one site can change a protein’s binding affinities or enzymatic rates at other active sites by orders of magnitude, effectively “transmuting” a protein from one species to another. From a theoretical view, the proliferation of possible biochemical complexes threatens to be overwhelming, and existing mathematical methods, at best, struggle to meet this challenge. How do we begin to model such complexity?
The chemical master equation is one natural starting point for modeling many biochemical systems when our goal is a coarse-grained picture that neglects atomistic-level details. An exact solution specifies the probability distribution over available states of the system as a function of time, given some initial condition, but finding such a solution is rarely possible. Generating function methods, such as those we used in Chapter 2, can provide such a solution, but they generally apply only to limited classes of master equations [4]. In some cases, though an exact calculation of the full distribution may be impossible, an exact analytical calculation of the distribution’s moments is possible (e.g., [5, 6]). And if merely specifying mean quantities of the steady-state distribution is sufficient, the King-Altman diagram method, independently discovered by Hill, solves this problem intuitively by recasting it in graphical form [7, 8].
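For reference, a master equation over discrete states can be written in the generic gain-loss form sketched below. The symbols P(n,t) (the probability of state n at time t) and W(n | n') (the transition rate from state n' to state n) are notation assumed here for illustration; they are not defined elsewhere in this chapter.

```latex
\frac{\mathrm{d}P(n,t)}{\mathrm{d}t}
  = \sum_{n' \neq n}
    \left[ W(n \mid n')\, P(n',t) - W(n' \mid n)\, P(n,t) \right]
```

In this notation, an exact solution is a closed form for P(n,t) at all times given P(n,0), while a steady-state calculation only requires setting the left-hand side to zero.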
For systems with a variable number of (classical) identical particles, one novel alternative formulation of a master equation uses Fock spaces, the vector spaces normally associated with quantum field theories, which offer a natural representation of the variable particle number. Several flavors of such Fock space formalisms have been developed. Doi offered the original formulation as a second-quantized description of reaction-diffusion systems [9, 10]. Several variants followed [11, 12], as well as a path integral description [13]. In particular, the methods of Grassberger and Scheunert [11] as well as Peliti [13] have been widely adopted [4, 14], especially in studies of diffusion-limited processes [15]. A sampling of more recent work has covered such diverse topics as population and epidemic models [16], predator-prey interactions [17], stem cell differentiation [18], and neural networks [19] with self-organized criticality [20] or transitions between metastable states [21]. Many such studies have taken advantage of the path-integral formulation to address the dynamic phase transitions, and their critical behavior, arising in such models.
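To make the Fock-space idea concrete, the following is a minimal sketch of the standard single-species construction in the style of Doi, written in one common normalization (conventions differ between references). The symbols a, a†, |n⟩, and the per-particle decay rate γ are illustrative assumptions and are not the operators used later in this chapter.

```latex
% Ladder operators acting on occupation-number states (one common convention)
a^\dagger \lvert n \rangle = \lvert n+1 \rangle, \qquad
a \lvert n \rangle = n \lvert n-1 \rangle, \qquad
[a, a^\dagger] = 1 .

% The whole probability distribution is packaged as a single state vector
\lvert \psi(t) \rangle = \sum_{n=0}^{\infty} P(n,t)\, \lvert n \rangle .

% Example: decay A -> (nothing) at rate \gamma per particle
\frac{\mathrm{d}P(n,t)}{\mathrm{d}t}
  = \gamma (n+1) P(n+1,t) - \gamma n P(n,t)
\quad \Longleftrightarrow \quad
\frac{\mathrm{d}}{\mathrm{d}t} \lvert \psi(t) \rangle
  = \gamma \left( a - a^\dagger a \right) \lvert \psi(t) \rangle .
```

The equivalence can be checked by applying a and a†a term by term to the sum defining |ψ(t)⟩ and relabeling the summation index. In a construction of this kind, each chemical species carries its own pair of ladder operators.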
However, for treating chemical systems that generate large multi-particle complexes, these formalisms rapidly become intractable. Many interesting chemical systems in biology and polymer physics are capable of generating vast (or even infinite) numbers of distinct complexes from a relatively small number of components and interaction rules. In principle, existing Fock space methods can be used to describe such systems, but in practice this is not feasible: existing formalisms treat each distinct multi-particle complex as its own species of particle, requiring manual enumeration of all possible complexes, along with their corresponding free energies, formation and decay rates, and so on. Often, however, these complexes are built up from smaller molecules, a structure that these methods fail to exploit, leading to enormous redundancy in merely defining the system. Some authors have undertaken the Herculean effort of enumerating lists of complexes, sometimes running to hundreds of entries [22, 23]. To our knowledge, there has been only one attempt to treat more than three species with the Doi-Peliti formalism, and the resulting field theory had so many fields that the only tractable approach was a computer simulation of the field theory, rather than the underlying master equations [18].
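As a rough illustration of how quickly manual enumeration gets out of hand, the sketch below counts species in two toy scenarios. Both scenarios and the function names are hypothetical examples chosen for this illustration, not systems analyzed in this chapter: a scaffold with distinguishable binding sites, each either empty or holding one of several ligand types, and directed linear chains built from a few monomer types.

```python
def occupancy_states(num_sites: int, num_ligand_types: int) -> int:
    """Count occupancy patterns of one scaffold with distinguishable sites.

    Toy assumption: every site is independently either empty or bound by
    exactly one of `num_ligand_types` ligands.
    """
    return (num_ligand_types + 1) ** num_sites


def linear_chains(num_monomer_types: int, max_length: int) -> int:
    """Count directed linear chains of length 1 through `max_length`.

    Toy assumption: any monomer type may follow any other, and a chain read
    in the opposite direction counts as a different chain.
    """
    return sum(num_monomer_types ** length for length in range(1, max_length + 1))


if __name__ == "__main__":
    # Each line below is one "species" count that a species-per-complex
    # formalism would have to enumerate by hand.
    for sites in (2, 5, 10):
        print(f"{sites} sites, 2 ligand types: {occupancy_states(sites, 2)} states")
    for length in (3, 10):
        print(f"chains up to length {length}, 4 monomer types: {linear_chains(4, length)}")
```

Even these simple counts grow exponentially with the number of sites or the maximum chain length, although each scenario is fully specified by one or two rules.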
The intractable proliferation of complexes is a well-recognized problem in the context of molecular systems biology [24–26]. To address this issue computationally, formal grammars [27–31] and accompanying software [32–36] have been developed that enable “rule-based” simulations of biochemical systems. While such techniques are undoubtedly useful, we hold out hope for an intuitive “rule-based mathematics,” not yet described in the literature, that allows one to work with such systems analytically.
Aside from the inelegance of enumerating all complexes, as the number of components grows there is a real danger that some complexes will be overlooked in the combinatorial explosion. While their neglect may or may not cripple the resulting theory, it is impossible to know a priori what the effect might be. Such situations demand a theory that can self-consistently account for all the necessary complexes.
Such a formalism is the goal of the present work.
Sketch of the method
Our formalism introduces a Fock space similar in spirit, though not in detail, to Doi’s [9] and Park and Park’s [12]. Although our method is motivated by these Fock space methods for nonequilibrium problems, and although a treatment of the full nonequilibrium problem would be the ultimate goal, we have found that even a simpler equilibrium formalism remains full of subtleties and surprises. Therefore, in this work we present an equilibrium treatment only and defer a nonequilibrium formalism to future work. The present work introduces two powerful innovations.
The first key idea is to model every particle in this formalism as existing in one of a large number of internal states. These internal states uniquely identify each particle and are essential for representing multi-particle complexes in terms of their components.
These degrees of freedom are not in themselves of interest. They merely serve to enforce a classical notion of distinguishability while retaining the point particle idealization. Therefore, multiple molecules of the same type can exist at the same “point” in space, forming bound complexes with each other and with other species.
This leads us to the second key idea, which is to track binding sites on, interactions between, and conformational states of molecules separately from the molecules themselves. In other words, while one field’s creation operator “creates” the molecule, another field’s creation operator might “occupy” a binding site on it, another might represent pairwise bonds between two such molecules, and yet another could transform its conformational state. This approach allows complexes to be assembled naturally by specifying intuitive “rules,” rather than introducing whole complexes as entirely new species produced by reactions. For the present work, we will view monomers and complexes as zero-dimensional point-like idealizations, but this restriction can be relaxed. We believe occupied volume and steric interactions can be incorporated in the theory as another auxiliary field, though we also leave that for future work.
We emphasize two important features of our approach relative to existing work.
First, previous Fock-space formalisms are only capable of creating and destroying particles, so complexes must be created “whole-cloth.” In contrast, our approach naturally and intuitively allows complexes to be assembled, disassembled, or have conformational states modified. Second, our approach offers a natural way to coarse-grain multi-particle complexes, as we will see in detail below.
Carrying out calculations directly with our operator formalism remains algebraically daunting for all but the simplest examples. To assist, we have developed a convenient diagrammatic approach that is equivalent, yet far more transparent. Our diagram techniques, though obviously inspired by the familiar Feynman diagrams of quantum field theory, are not equivalent to them, unlike the diagrams arising from the Doi-Peliti approach. This distinction will be made clearer below.
4.2 Formalism