I. INTRODUCTION
Biophysical education is usually understood in terms of how to teach certain topics to students enrolled in biophysics majors or graduate programs. However, as an interdisciplinary subject par excellence, biophysics is often studied at later stages in a research career. Students join the field coming from biology, chemistry, or physics, sometimes already as postdoctoral researchers. We hope this article will be useful for them as well.
As an instructional module, the material presented here is most appropriate for first-year graduate students in biophysics, biochemistry, or physical chemistry. We suggest teaching it in 2 stages. The approach to binding of a ligand to one site in terms of probabilities of the free and bound states can be introduced early, in the junior year of an undergraduate degree. Binding to 2 or more sites would be approached qualitatively at this point, emphasizing thinking about affinities of the ligand for binding sites beyond the first. In most undergraduate biophysics or biochemistry programs, this topic could be covered in a biophysics, biochemistry, or interdisciplinary lecture or in a lecture–lab combo. If taught in a biophysics class, the material should be presented in a way that the connections to physical structure and mechanism are direct. Therefore, binding should be approached in terms of microscopic models. The second stage would be in a course in biophysics or biophysical chemistry in the first year of a graduate program or the senior year of a biophysics or biochemistry major. Here, the focus would shift to the interactions between binding sites in a more quantitative way. The pH titration of ethylenediaminetetraacetic acid (EDTA) is useful because of its familiarity to students. At this point, however, students would already be acquainted with proteins and enzymes, from which additional examples could be drawn.
The concept of binding is of central importance in biophysical and biochemical education (1). In most biophysics and biochemistry undergraduate programs, the problem of ligand binding to macromolecules is typically taught first in the third year, in an introductory biochemistry or biophysics course. The classical approach to ligand binding, as described in standard textbooks, such as Lehninger Principles of Biochemistry (2), is essentially an algebraic approach, where a binding curve is derived from the expression for the equilibrium constant. Textbooks more geared to biology programs, such as Stryer’s Biochemistry (3), entirely avoid the algebra or any mathematic representation of binding curves until a later chapter on enzyme kinetics, where substrate binding to enzymes is introduced. That text is clearly inadequate for a biophysics program. More complex texts, such as the classic Cantor and Schimmel’s Biophysical Chemistry (4) or van Holde’s Physical Biochemistry (5), use a more complex mathematical formalism, still based on algebra and calculus, but are clearly not appropriate for an undergraduate course: the concepts tend to be lost in the mathematics.
Furthermore, those approaches run into difficulties when expanding the concept of binding to include interactions. An entirely different way of thinking is to start with the mechanism on the basis of structure and approach binding from this perspective (6). This idea is consistent with the new pedagogic emphasis on causal mechanistic reasoning (7). Fully grasping the problem of binding of a ligand to a macromolecule opens the gates to understanding essential phenomena in several areas of biophysics and biochemistry: in enzyme kinetics, as substrate binding to an enzyme; in receptor theory of signal transduction, as the binding of a ligand elicits a signal; in the regulation of protein or enzyme activity by activators and inhibitors; in the assembly of protein complexes; and in interactions of proteins with nucleic acids, for example, in DNA replication and transcription, or in the translation of mRNA. Those topics can be approached with variable emphasis, either more on the biochemical side, in terms of reactions and regulation, or on the biophysical side, in terms of mechanism and interaction, and the translation to mathematical expressions that can be quantitatively compared with experiment. This is the approach we advocate. Recently, ligand binding to a protein was the topic of choice in building a causal mechanistic way of reasoning in chemical education (7). Novel approaches have been tried and adapted to teaching the concept of binding, given its central importance and broad applicability. For example, the idea of saturation in binding has been recently tackled as playing a game of cards (8). Several technology-based aids have been developed to help students understand binding through visualization (9, 10).
II. SCIENTIFIC AND PEDAGOGIC BACKGROUND
Here, we discuss a different way of teaching binding. We advocate approaching the problem from a probabilistic and mechanistic point of view instead of algebra. To be clear, we mean algebra in the technical sense, as the field of mathematics that bears its name, not in the common sense where it is essentially synonymous with math or calculations. EDTA and even simpler molecules are used to introduce the ideas. EDTA is not the most exciting molecule, but it is useful because the mechanism of the mutual influence of binding sites is easy to grasp, such as when 2 positive charges repel each other. Any molecules that a student may easily relate to would be adequate as well. For example, if the students had prior acquaintance with a protein and one of its ligands, that example could be used, either as the subject itself or as motivation.
Ligand binding to macromolecules, for example, binding of oxygen to hemoglobin or signal transduction by a membrane receptor upon binding its ligand, is one of the topics that have occupied biophysicists most extensively. Typically, we teach binding beginning with the chemical equation for the association of a ligand (L) to a protein (P) to form a complex (PL),
$$\mathrm{P} + \mathrm{L} \rightleftharpoons \mathrm{PL} \tag{1}$$
The next step is to derive the equation for the binding curve. The standard approach to solve this problem is to use algebra (11). For one binding site, this is simple enough. When more ligands bind or the same ligand binds to several sites on the protein, the problem is supplemented with additional binding reactions, and additional equations for the equilibrium constants, mass balance, and charge balance. The problem is then set up as a system of n equations for n unknowns, which are the equilibrium concentrations of the different chemical species (free ligand, free protein, and protein with 1, 2, or more ligands bound). Solving the algebraic problem that results is tedious and not easy for most students. The presentation that follows delineates an alternative pathway of teaching binding to a molecule in the presence of defined interactions. This pathway is formulated as a sequence of ideas. Each idea suggests a motivation or focus concept.
A. Idea: begin with familiar representations
The proton binding equilibrium of a weak polyprotic acid, such as EDTA, provides an example where interactions are easy to understand, and to which students can relate at earlier stages of their learning, because this is one of the most studied problems in analytical chemistry. To explain how to teach these concepts is the first motivation for this article. When 2 ligands, such as 2 substrate molecules or a substrate and an inhibitor, bind to a macromolecule, such as a protein, they often interact. By interact, we mean that, in general, binding of one ligand changes the binding of the other. Classical examples of these interactions are the binding of oxygen and protons (H+) to hemoglobin, manifested as the Bohr effect (12–14); binding of lactate and NAD+ to lactate dehydrogenase (15, 16); and binding of the various nucleotides to aspartate transcarbamylase (17). These interactions are crucial to the function of biochemical systems. However, explaining them to advanced undergraduate or even beginning graduate students using those biochemical systems may be overwhelming. It is always advantageous to teach new concepts in a familiar context, where a student can make an intuitive connection to the system. Many of the concepts required to understand complex cases are already present in simpler ones. Therefore, instead of teaching ligand interactions by using the large and unfamiliar structure of a protein, we can start with a simple molecule, whose structure can be drawn on a piece of paper. The ability to write the molecular structure on a piece of paper makes a direct connection to how most students learned organic chemistry.
B. Idea: think in terms of probabilities of microscopic states
A second motivation is to present an approach that is conceptually simpler and more powerful than the traditional one. The usual approach to proton or calcium binding to EDTA is as a problem in algebra. The chemical constraints, that is, the binding constants, mass, and charge balance, are expressed as a set of equations to be solved for the set of unknowns of interest, which are the concentrations of the various chemical species. Here, instead, we use the concept of the probability of observing the various chemical species, or microscopic states, of EDTA. To make the connection to structure, we emphasize microscopic, or molecular, interactions rather than phenomenological dissociation constants. This approach has great advantages from the point of view of understanding and teaching. Once mastered, it becomes intuitive, and can easily be extended to more complex equilibria, which students will encounter in a biochemistry course (6). To explain the concepts involved in this approach, we begin with the simple case of a diprotic acid. This system illustrates all the important aspects of proton binding to any molecule or, indeed, of ligand binding to a macromolecule. This stepwise approach, from simple to complex molecules, can be combined with a laboratory experiment (10).
C. Idea: the titration curve is the number of protons bound as a function of pH
The most informative way of displaying a pH titration is by plotting the average number ($\bar{n}_{\mathrm{H}}$) of protons (H+) bound to a polyprotic acid molecule as a function of pH (Fig 1). This function is known by different names, such as formation curve, difference plot, Bjerrum plot, binding isotherm, or adsorption isotherm, depending on the culture of the area of chemistry (18). We will simply call it a binding curve.
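In general, a binding curve is a plot of the average number of ions or molecules bound to another molecule as a function of the chemical potential (μ) of the species that binds (Eq. 2). In this case, that species is H+.
$$\mu_{\mathrm{H}} = \mu_{\mathrm{H}}^{\circ} + RT \ln a_{\mathrm{H^+}} \tag{2}$$
The pH is the negative of the chemical potential of the proton, measured from the standard state in units of $2.303\,RT$. Inverting Eq. 2, approximating activities by concentrations, and using $\ln x = 2.303 \log x$, we obtain
$$\mathrm{pH} = -\log [\mathrm{H^+}] = -\frac{\mu_{\mathrm{H}} - \mu_{\mathrm{H}}^{\circ}}{2.303\,RT} \tag{3}$$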
One unfortunate point of confusion arises from the language in different cultures of chemistry. Inorganic chemists call ligand the larger molecule, such as EDTA, to which ions bind. Biochemists and biophysicists call ligand the smaller species, such as a substrate that binds to a protein. The approach described here, however, is completely general. It can be used to determine the binding constants of a ligand to a macromolecule by a determination of the average degree of binding as a function of the concentration, or the chemical potential, of the ligand.
Figure 1 shows titration curves of a diprotic acid with different proton dissociation constants, labeled according to the corresponding pKa values. A proton dissociation constant, also called an acidity constant, is designated here by $K_a$ (lowercase a as subscript). It defines the $\mathrm{p}K_a = -\log K_a$. The curve in black is calculated for a diprotic acid with microscopic pKa of 3.00 and 6.00. If we draw a line at an average protonation level of $\bar{n}_{\mathrm{H}} = 0.5$, the first protonation is half completed. If we draw a line at $\bar{n}_{\mathrm{H}} = 1.5$, the second protonation is half completed. It is usually stated that the intersection of those horizontal lines with the binding curve allows for the determination of the apparent pKa of the acid (18). That is true, but the estimate is accurate only if the 2 pKa differ by at least 2 pH units. In Figure 1, the red line is calculated with microscopic pKa of 4.00 and 6.00. The intersections of the horizontal lines with the binding curve occur at pH 3.98 and 6.01, which still provide accurate estimates of the 2 pKa. However, when the separation of the 2 pKa decreases, the discrepancy between the estimate from the line intersection and the real pKa increases significantly, as shown by the blue and turquoise curves. In those cases, to obtain the true pKa of the acid, it is necessary to fit a theoretical equation directly to the experimental titration curve.
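The line-intersection estimate can be checked numerically. The following is a minimal sketch, not the authors' calculation: it assumes 2 independent sites, so that the macroscopic binding constants are $K_1 = K_A + K_B$ and $K_2 = K_AK_B/(K_A + K_B)$, and the function names (`n_bar`, `ph_at_level`, `macroscopic`) are introduced here only for illustration.

```python
def n_bar(pH, K1, K2):
    """Average number of protons bound to a diprotic acid (macroscopic constants)."""
    h = 10.0 ** (-pH)
    q = 1.0 + K1 * h + K1 * K2 * h * h
    return (K1 * h + 2.0 * K1 * K2 * h * h) / q


def ph_at_level(level, K1, K2, lo=0.0, hi=14.0):
    """Bisection for the pH at which n_bar equals `level`.

    n_bar decreases monotonically as the pH increases.
    """
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if n_bar(mid, K1, K2) > level:
            lo = mid  # still too protonated: move to higher pH
        else:
            hi = mid
    return 0.5 * (lo + hi)


def macroscopic(pKA, pKB):
    """Macroscopic binding constants from 2 microscopic pKa (independent sites)."""
    KA, KB = 10.0 ** pKA, 10.0 ** pKB
    return KA + KB, KA * KB / (KA + KB)


# Microscopic pKa of 6.00 and 4.00, as for the red curve of Figure 1.
K1, K2 = macroscopic(6.0, 4.0)
ph_first = ph_at_level(0.5, K1, K2)   # first protonation half completed
ph_second = ph_at_level(1.5, K1, K2)  # second protonation half completed
print(f"intersections at pH {ph_second:.3f} and {ph_first:.3f}")
```

With pKa values 2 units apart, the intersections fall within a few hundredths of a pH unit of 4 and 6. Repeating the calculation with `macroscopic(6.0, 5.5)` moves the intersections away from the microscopic pKa by a few tenths of a pH unit, which is why a fit is needed when the pKa are close.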
D. Idea: teach new concepts in familiar scenarios
To write the proton binding curve, that is, the average degree of protonation ($\bar{n}_{\mathrm{H}}$) as a function of pH, we need to derive expressions for the concentrations of the various chemical species of the acid molecule, which differ in the number of protons bound. Traditionally, this is approached as an algebraic problem. However, a much deeper understanding is obtained by using the central concept of the binding partition function from statistical thermodynamics (19, 20). This requires thinking in terms of the probabilities of observing the various chemical species of the acid (6). Those probabilities are then combined to obtain the partition function, differentiation of which yields the average degree of protonation. This method is completely general; it can be used for any molecule that binds any number of protons or other ligands.
Typically, the first time undergraduate students are exposed to the concept of the partition function is in a physical chemistry or quantum chemistry course. That course may be taken at the junior or senior level, depending on the program. The partition function is usually introduced as a sum or an integral over translational, vibrational, or electronic energy levels. Those are fairly advanced concepts in the undergraduate curriculum. On the other hand, pH titrations of acids and bases are introduced much earlier, often in high school, and students relate to those ideas in a more familiar way. Because a new concept is best introduced in a familiar context, it is advantageous to introduce the partition function through pH titrations.
E. Idea: use probability instead of algebra
We now present the essence of the approach. In doing so, we stress the importance of the graphical description, not just illustration, shown in Figure 2 for a diprotic acid in aqueous solution. Let us assume that the acid is weak and that the conjugate base of the acid is also weak. This means that proton binding is reversible under experimental conditions. The base can bind a maximum number of 2 protons. In general, the 2 binding sites A and B for the proton are different. An example is the amino acid glycine, which can bind a proton on its amino group or on its carboxylic acid group. The affinity of the proton for each site is characterized by a microscopic binding constant, KA or KB . They are called microscopic because they are specific to the molecular site to which the proton binds. They are identified here by capital letters as subscripts.
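Let us now treat proton binding with the partition function method. Assume that binding sites A and B are independent, which means that binding H+ to one site does not change the binding affinity to the other. We want to know how the acid molecules are partitioned among the possible protonation states as a function of the concentration of H+, that is, the pH. There are 4 possible states of the acid molecule: with no protons bound ($\mathrm{X^{2-}}$); with one proton bound to site A ($\mathrm{H_AX^-}$, where the subscript indicates site A); with one proton bound to site B ($\mathrm{H_BX^-}$); and with 2 protons bound ($\mathrm{H_2X}$). The partition function will contain 4 terms that are essentially the probabilities of observing each of those 4 states. Those probabilities are directly proportional to the corresponding concentrations. Now, we choose the completely deprotonated state ($\mathrm{X^{2-}}$) as the reference state. The probabilities of the other states will be expressed relative to this one. However, instead of writing them immediately relative to the reference, we first write them relative to the preceding state in a protonation sequence. Figure 2 shows how each state is obtained from the preceding one. To calculate the probabilities, we use the binding constants. The equilibrium constant $K_A$ controls binding to site A and $K_B$ controls binding to site B,
$$K_A = \frac{[\mathrm{H_AX^-}]}{[\mathrm{X^{2-}}][\mathrm{H^+}]} \tag{4}$$
$$K_B = \frac{[\mathrm{H_BX^-}]}{[\mathrm{X^{2-}}][\mathrm{H^+}]} \tag{5}$$
The probabilities of $\mathrm{H_AX^-}$ and $\mathrm{H_BX^-}$ relative to $\mathrm{X^{2-}}$ are
$$\frac{[\mathrm{H_AX^-}]}{[\mathrm{X^{2-}}]} = K_A[\mathrm{H^+}], \qquad \frac{[\mathrm{H_BX^-}]}{[\mathrm{X^{2-}}]} = K_B[\mathrm{H^+}] \tag{6}$$
If the first proton binds to site A, the probability that a second proton binds to site B is obtained from the equilibrium constant $K_B$,
$$K_B = \frac{[\mathrm{H_2X}]}{[\mathrm{H_AX^-}][\mathrm{H^+}]} \tag{7}$$
because binding to site A and B is assumed to be independent. The probability of $\mathrm{H_2X}$ relative to $\mathrm{H_AX^-}$ is then
$$\frac{[\mathrm{H_2X}]}{[\mathrm{H_AX^-}]} = K_B[\mathrm{H^+}] \tag{8}$$
We could also have reached the state $\mathrm{H_2X}$ by binding first to site B and then to site A. To express the probability of $\mathrm{H_2X}$ relative to the common reference state, $\mathrm{X^{2-}}$, recall that the combined probability of 2 independent events is the product of the probabilities of the 2 events,
$$\frac{[\mathrm{H_2X}]}{[\mathrm{X^{2-}}]} = \frac{[\mathrm{H_AX^-}]}{[\mathrm{X^{2-}}]} \cdot \frac{[\mathrm{H_2X}]}{[\mathrm{H_AX^-}]} = K_A[\mathrm{H^+}] \, K_B[\mathrm{H^+}] = K_AK_B[\mathrm{H^+}]^2 \tag{9}$$
Let us return now to Figure 2. The factors $K_A[\mathrm{H^+}]$ and $K_B[\mathrm{H^+}]$ connect the probability of each state to that of the preceding one in the diagram. They are written over the branches that connect states. To reach $\mathrm{H_AX^-}$ from $\mathrm{X^{2-}}$, we multiply the relative probability of $\mathrm{X^{2-}}$, which is 1 (reference), by $K_A[\mathrm{H^+}]$, which gives the relative probability of $\mathrm{H_AX^-}$. To obtain $\mathrm{H_2X}$ from $\mathrm{H_AX^-}$, multiply the relative probability of $\mathrm{H_AX^-}$, which is $K_A[\mathrm{H^+}]$, by $K_B[\mathrm{H^+}]$; this yields $K_AK_B[\mathrm{H^+}]^2$ for $\mathrm{H_2X}$. We can summarize those relative probabilities as a set of relations that express correspondence:
$$\mathrm{X^{2-}} \leftrightarrow 1 \tag{10}$$
$$\mathrm{H_AX^-} \leftrightarrow K_A[\mathrm{H^+}] \tag{11}$$
$$\mathrm{H_BX^-} \leftrightarrow K_B[\mathrm{H^+}] \tag{12}$$
$$\mathrm{H_2X} \leftrightarrow K_AK_B[\mathrm{H^+}]^2 \tag{13}$$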
Note that we obtain the last state by going through the upper or the lower branch of the diagram of Figure 2 but not both simultaneously. The terms in Eqs. 10–13 are the relative probabilities of the 4 states, relative to the reference state X2−.
F. Idea: the partition function is a list of the probabilities of all microscopic states of the molecule
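The partition function is the sum of the relative probabilities of all states available in the chemical system. We choose to express those probabilities relative to the completely deprotonated state ($\mathrm{X^{2-}}$),
$$Q = 1 + K_A[\mathrm{H^+}] + K_B[\mathrm{H^+}] + K_AK_B[\mathrm{H^+}]^2 = 1 + (K_A + K_B)[\mathrm{H^+}] + K_AK_B[\mathrm{H^+}]^2 \tag{14}$$
The absolute probabilities of each protonation state are then calculated by dividing each term of Q by the entire Q. Those are the fractions ($f$) of each state that would be observed experimentally,
$$f_{\mathrm{X^{2-}}} = \frac{1}{Q}, \qquad f_{\mathrm{H_AX^-}} = \frac{K_A[\mathrm{H^+}]}{Q}, \qquad f_{\mathrm{H_BX^-}} = \frac{K_B[\mathrm{H^+}]}{Q}, \qquad f_{\mathrm{H_2X}} = \frac{K_AK_B[\mathrm{H^+}]^2}{Q} \tag{15}$$
If we do not know, or do not need to specify, whether the first proton is bound to site A or B, we can write
$$f_{\mathrm{HX^-}} = f_{\mathrm{H_AX^-}} + f_{\mathrm{H_BX^-}} = \frac{(K_A + K_B)[\mathrm{H^+}]}{Q} \tag{16}$$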
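The construction of the partition function from microscopic states can be sketched in a few lines of code. This is an illustration under stated assumptions, not part of the original treatment: the sites are independent, the function names (`microstate_weights`, `fractions`) are hypothetical, and the binding constants are arbitrary illustrative values.

```python
from itertools import product


def microstate_weights(site_constants, h):
    """Relative probability of every occupancy pattern of independent sites.

    Each occupied site contributes a factor K_site * h; the empty state
    (the reference) has a relative probability of 1.
    """
    weights = {}
    for occupancy in product((0, 1), repeat=len(site_constants)):
        w = 1.0
        for occupied, K in zip(occupancy, site_constants):
            if occupied:
                w *= K * h
        weights[occupancy] = w
    return weights


def fractions(site_constants, pH):
    """Absolute probabilities: each term of Q divided by Q."""
    h = 10.0 ** (-pH)
    weights = microstate_weights(site_constants, h)
    Q = sum(weights.values())  # partition function
    return {state: w / Q for state, w in weights.items()}


# Illustrative values only (binding constants in 1/M): pK_A = 6, pK_B = 3.
KA, KB = 1.0e6, 1.0e3
f = fractions([KA, KB], pH=6.0)
# Keys: (0, 0) is X2-, (1, 0) is H_A X-, (0, 1) is H_B X-, (1, 1) is H2X.
for state, value in f.items():
    print(state, round(value, 4))
```

At a pH equal to the pKa of site A, the empty state and the state with a proton on site A are equally probable, as expected. The same function works for any number of independent sites; interactions would enter as extra factors on the states in which the interacting sites are both occupied.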
G. Idea: observable, macroscopic binding constants are combinations of microscopic constants
Usually, we only know how many protons are bound to the molecule but do not know if they are bound to site A or site B. We abbreviate the acid with 2 protons bound as $\mathrm{H_2X}$; the intermediate, amphiprotic form, with one proton bound, as $\mathrm{HX^-}$; and the fully deprotonated base as $\mathrm{X^{2-}}$. In a typical experiment, we can measure the equilibrium binding constant $K_1$ for the first proton. However, because we do not actually know if, microscopically, the proton is bound to site A or to site B, we call this observable a macroscopic binding constant. Because the proton can bind to either site, $K_1 = K_A + K_B$. That is why the combination $K_A + K_B$ appears in the partition function (Eq. 14). The second proton has to bind to the empty site available. We usually characterize the system by 2 macroscopic equilibrium binding constants, $K_1$ and $K_2$, corresponding to the 2 equilibria,
$$\mathrm{X^{2-}} + \mathrm{H^+} \rightleftharpoons \mathrm{HX^-} \tag{17}$$
$$\mathrm{HX^-} + \mathrm{H^+} \rightleftharpoons \mathrm{H_2X} \tag{18}$$
The 2 macroscopic equilibrium constants are given by
$$K_1 = \frac{[\mathrm{HX^-}]}{[\mathrm{X^{2-}}][\mathrm{H^+}]} \tag{19}$$
$$K_2 = \frac{[\mathrm{H_2X}]}{[\mathrm{HX^-}][\mathrm{H^+}]} \tag{20}$$
When they are written in this manner, we are not specifying to which of the 2 binding sites each proton binds. If we want to refer to binding to specific molecular binding sites, we use the microscopic binding constants $K_A$ and $K_B$. We can describe binding equally well by using microscopic or macroscopic binding constants, but it is essential to be clear about which type of equilibrium constant we are referring to. If $K_1$ is the first macroscopic binding constant and $K_2$ is the second, the binding diagram can be drawn as in Figure 3, where we again write the binding constants and the proton concentrations above the branches representing the 2 equilibria.
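Let us now write the partition function by using macroscopic binding constants. This time, we are going to use the diagram of Figure 3 directly, because we already learned what it represents and how it works. The factors over the branches connect the probability of a state to that of the preceding one, and the relative probability of the deprotonated state is 1. Thus, we have the set of correspondences between concentrations and relative probabilities:
$$\mathrm{X^{2-}} \leftrightarrow 1, \qquad \mathrm{HX^-} \leftrightarrow K_1[\mathrm{H^+}], \qquad \mathrm{H_2X} \leftrightarrow K_1K_2[\mathrm{H^+}]^2 \tag{21}$$
The partition function is the sum of the relative probabilities,
$$Q = 1 + K_1[\mathrm{H^+}] + K_1K_2[\mathrm{H^+}]^2 \tag{22}$$
and the absolute probabilities, or the observable fractions, of each state are
$$f_{\mathrm{X^{2-}}} = \frac{1}{Q}, \qquad f_{\mathrm{HX^-}} = \frac{K_1[\mathrm{H^+}]}{Q}, \qquad f_{\mathrm{H_2X}} = \frac{K_1K_2[\mathrm{H^+}]^2}{Q} \tag{23}$$
Now, compare the 2 partition functions, written with microscopic and macroscopic constants, in Eqs. 14 and 22. We see that $K_1 = K_A + K_B$ (because the first binding can occur to either site) and $K_1K_2 = K_AK_B$. From the latter, we can see that $K_2 = K_AK_B/(K_A + K_B)$. Finally, note that if $K_A \gg K_B$, the macroscopic constants are essentially identical to the microscopic ones, $K_1 \approx K_A$ and $K_2 \approx K_B$.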
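The correspondence between the 2 descriptions can be verified numerically. The sketch below is only an illustration with arbitrary values for the binding constants; the function names (`macro_from_micro`, `q_micro`, `q_macro`) are hypothetical.

```python
import math


def macro_from_micro(KA, KB):
    """Macroscopic binding constants of a diprotic acid with independent sites."""
    K1 = KA + KB               # the first proton can bind to either site
    K2 = KA * KB / (KA + KB)   # follows from K1 * K2 = KA * KB
    return K1, K2


def q_micro(h, KA, KB):
    """Partition function written with microscopic constants."""
    return 1.0 + KA * h + KB * h + KA * KB * h * h


def q_macro(h, K1, K2):
    """Partition function written with macroscopic constants."""
    return 1.0 + K1 * h + K1 * K2 * h * h


# Illustrative binding constants (1/M).
KA, KB = 1.0e6, 1.0e4
K1, K2 = macro_from_micro(KA, KB)
for pH in (3.0, 5.0, 7.0):
    h = 10.0 ** (-pH)
    assert math.isclose(q_micro(h, KA, KB), q_macro(h, K1, K2))

# When KA >> KB, the macroscopic constants approach the microscopic ones.
K1_far, K2_far = macro_from_micro(1.0e10, 1.0e2)
print(K1_far, K2_far)
```

The 2 partition functions agree at every pH, so the binding curves calculated from them are identical; only the interpretation of the constants differs.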
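H. Idea: $\bar{n}_{\mathrm{H}}$ is the average number of protons bound
The average degree of protonation, abbreviated as $\bar{n}_{\mathrm{H}}$, is the average number of protons bound per macromolecule. It is a weighted average of the contributions of each state, where the weights are the number of protons bound in each state: 0, 1, or 2, for $\mathrm{X^{2-}}$, $\mathrm{HX^-}$, and $\mathrm{H_2X}$, respectively.
$$\bar{n}_{\mathrm{H}} = 0 \cdot f_{\mathrm{X^{2-}}} + 1 \cdot f_{\mathrm{HX^-}} + 2 \cdot f_{\mathrm{H_2X}} \tag{24}$$
$$\bar{n}_{\mathrm{H}} = \frac{0 \cdot 1 + 1 \cdot K_1[\mathrm{H^+}] + 2 \cdot K_1K_2[\mathrm{H^+}]^2}{Q} \tag{25}$$
$$\bar{n}_{\mathrm{H}} = \frac{K_1[\mathrm{H^+}] + 2K_1K_2[\mathrm{H^+}]^2}{1 + K_1[\mathrm{H^+}] + K_1K_2[\mathrm{H^+}]^2} \tag{26}$$
As the number of binding sites increases, the previous calculation becomes tedious. For example, EDTA binds 6 protons, 2 on the nitrogen atoms and 4 on the carboxylic acids. However, taking the derivative of the partition function with respect to the proton concentration, dividing by Q, and multiplying by $[\mathrm{H^+}]$ immediately yields $\bar{n}_{\mathrm{H}}$,
$$\bar{n}_{\mathrm{H}} = \frac{[\mathrm{H^+}]}{Q} \frac{dQ}{d[\mathrm{H^+}]} \tag{27}$$
This is the origin of the designation difference plot for $\bar{n}_{\mathrm{H}}$ versus pH: it arises by differentiation of Q. This equation can be written more succinctly if we use the formula for differentiation of a logarithm
$$\frac{d \ln x}{dx} = \frac{1}{x} \tag{28}$$
where x is $[\mathrm{H^+}]$ or Q. Rearranging Eq. 28, we obtain
$$\frac{dx}{x} = d \ln x \tag{29}$$
Now, we apply Eq. 29 to $[\mathrm{H^+}]$ and Q in Eq. 27 to obtain the compact form
$$\bar{n}_{\mathrm{H}} = \frac{d \ln Q}{d \ln [\mathrm{H^+}]} \tag{30}$$
For the case that we are considering, with 2 protons, Eq. 27 yields,
$$\bar{n}_{\mathrm{H}} = \frac{[\mathrm{H^+}]}{Q} \left( K_1 + 2K_1K_2[\mathrm{H^+}] \right) = \frac{K_1[\mathrm{H^+}] + 2K_1K_2[\mathrm{H^+}]^2}{1 + K_1[\mathrm{H^+}] + K_1K_2[\mathrm{H^+}]^2} \tag{31}$$
which is easily extended to complex systems, where the algebraic approach (Eqs. 24–26) would be tedious.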
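The equivalence between the weighted average and the logarithmic derivative of Q can be checked numerically. The following sketch uses a central finite difference as a stand-in for the analytical derivative; the binding constants are arbitrary illustrative values and the function names are hypothetical.

```python
import math


def partition_function(h, K1, K2):
    """Q for a diprotic acid written with macroscopic binding constants."""
    return 1.0 + K1 * h + K1 * K2 * h * h


def n_bar_weighted(h, K1, K2):
    """Weighted average of 0, 1, and 2 protons bound."""
    return (K1 * h + 2.0 * K1 * K2 * h * h) / partition_function(h, K1, K2)


def n_bar_derivative(h, K1, K2, step=1e-6):
    """Central-difference estimate of d ln Q / d ln [H+]."""
    ln_h = math.log(h)
    up = math.log(partition_function(math.exp(ln_h + step), K1, K2))
    down = math.log(partition_function(math.exp(ln_h - step), K1, K2))
    return (up - down) / (2.0 * step)


# Illustrative macroscopic binding constants (1/M).
K1, K2 = 1.0e6, 1.0e4
for pH in (3.0, 5.0, 7.0):
    h = 10.0 ** (-pH)
    print(pH, n_bar_weighted(h, K1, K2), n_bar_derivative(h, K1, K2))
```

The 2 columns agree to within the numerical error of the finite difference. Only `partition_function` needs to change to treat a molecule with more sites, which is the practical advantage of the derivative form.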
I. Idea: mathematical expressions are necessary to compare the model with experiment
Typically, to determine the pKa of an acid, we perform a potentiometric titration, in which the fully protonated form of the acid ( ) is titrated with a strong base, such as NaOH. An experimental titration curve of EDTA is shown by the dotted black line in Figure 4. It represents the dependence of pH on the moles of titrant, or the number of strong base equivalents, designated by (32)
Eq. 31 gives the pH dependence of , not of νB . However, the titration curve can be transformed into the average degree of protonation as a function of pH. Either curve can be used to determine the pKa of the acid, in a graphical manner (from the inflection points of the curve), provided the pKa values differ by 2 or more pH units, or by a fit of the mathematical expression to the experimental titration curve. Therefore, we need to derive a mathematical expression that connects (Eq. 31) to νB (Eq. 32). The detailed derivation, which is similar to that of Kraft (18), is provided in the Supplemental Material. The result is (33)
This equation can be rearranged to represent νB as a function of pH (34)
The titration curve is usually shown as a plot of pH as a function of νB (Fig 4), which is just a switch of the 2 axes, or of the dependent and independent variables in Eq. 34.
Finally, a fundamental simplification of Eq. 34 is warranted. If the acid and the base are fairly concentrated, the first term on the right side of Eq. 34 is much smaller than the second (35)
Typically, the order of magnitude of the terms and in the numerator of Eq. 34 is smaller than , and their difference is even smaller. In the denominator, the factor is of order 1. The initial acid concentration is typically larger than M, usually about 0.1 M. The second term in Eq. 34, , is of order 1. Therefore, the inequality (Eq. 35) almost always holds, and we can greatly simplify Eq. 34 to (36)
Figure 4 shows the range of validity of this approximation for EDTA. The approximate titration curve (red dashed line) is compared with the full expression (turquoise solid line). Clearly, it is valid for pH . In this interval, the approximation given by Eq. 36 is valid, and the pH dependence is entirely contained in . For a diprotic acid, is given by Eq. 31, so we have (37)
Thus, Eq. 37 is the explicit binding model we sought. It is usually called a binding curve or binding isotherm. The model represents the dependence of on the binding constants and [H+], and can easily be written from the partition function.
J. Idea: estimate binding constants and interactions from simpler compounds
The fully protonated structure of EDTA is shown in Figure 5. It has 6 binding sites for H+: 4 equivalent carboxylic acid groups and 2 equivalent amino groups (21, 22), thus, 6 pKa values. Here, KA is the microscopic binding constant of H+ to the amino group of EDTA, and KB is the microscopic binding constant to the carboxylate group. The parameter α represents the interaction between 2 adjacent, protonated carboxylic acids. The parameter β represents the interaction between adjacent, protonated amines and carboxylic acids. The parameter ω represents the interaction between the 2 adjacent, protonated amines. To estimate those interaction parameters, we use the simpler model compounds glutaric acid, ethylenediamine, and glycine.
The simpler compounds in Figure 5 resemble different parts of the EDTA molecule. They provide a way to estimate KA , KB , α, β, and ω. The experimental pKa values of EDTA are listed in Table 1 (23–26). Also listed are the 6 macroscopic binding constants that are related to each pKa by expressions such as p for the first site, and similarly for the other sites. Note that is a dissociation constant, whereas the corresponding K 1 is a binding constant ( ).
The binding constants for the amino and carboxylic acid groups are KA and KB . The pKa for primary alkylamines is about 10, whereas the pKa for primary alkyl carboxylic acids is about 5. Thus, the binding constant is much larger for the amino group than for the carboxylic acid. Aliphatic primary amines have pKa of 10.6; in ethylenediamine, the first pKa is 10.0, and in glycine, it is 9.74. We will use p for the amino group, which corresponds to a microscopic binding constant . The pKa of carboxylic acids varies between 4 and 5 and increases with chain length. For acetic acid, pKa = 4.76, and for octanoic acid, pKa = 4.9 (24). The longer chains probably reflect more closely the intrinsic value. Therefore, we use p for the carboxylic acid group, which corresponds to a microscopic binding constant .
K. Idea: Interaction parameters reflect molecular structures
Now, we estimate α, β, and ω. Note that the 3 simpler model compounds in Figure 5 contain, separately, the same types of interactions present in EDTA. Ethylenediamine is used to estimate ω, glutaric acid to estimate α, and glycine to estimate β. The interactions represented by α, β, and ω are essentially electrostatic in origin (27). They arise as a repulsion between 2 H+ bound to 2 nearby sites on the same molecule. They repel each other with an energy E, which results in a decrease in the probability of both protons being bound by a factor of the form $e^{-E/RT}$.
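The link between an interaction factor and a repulsion energy can be made concrete with a short calculation. This is a sketch under the assumption that the factor has the Boltzmann form exp(−E/RT), with E expressed per mole; the temperature and the example factor of 0.1 are illustrative choices, and the function names are hypothetical.

```python
import math

R = 8.314462618  # gas constant, J / (mol K)


def interaction_factor(energy, temperature=298.15):
    """Factor by which a repulsion energy (J/mol) lowers a state's probability."""
    return math.exp(-energy / (R * temperature))


def repulsion_energy(factor, temperature=298.15):
    """Repulsion energy (J/mol) that corresponds to a given interaction factor."""
    return -R * temperature * math.log(factor)


# Illustrative value: a factor of 0.1 at 25 degrees C.
energy = repulsion_energy(0.1)
print(f"{energy / 1000.0:.1f} kJ/mol")
```

A factor of 1 corresponds to no interaction (E = 0), and a factor below 1 to a repulsion, so even modest energies of a few kJ/mol visibly lower the probability of the doubly protonated state.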
L. Idea: write the partition function based on the structure
It is important to have estimates of the binding constants and interaction parameters before writing the partition function (Q) because they allow us to simplify Q. To write the exact partition function for proton binding to EDTA is laborious because all possible protonation states need to be taken into account. The full expression is provided in the Supplemental Material. However, it is not necessary. Instead, we derive a simpler Q, which is easy to write and an excellent approximation. By inspection of the structures of the 3 model compounds (Fig 5), we can write the partition functions. This is also an exercise applying what we have learned so far.
Glutaric acid has 2 identical carboxylic acid groups, so there are 2 identical binding sites for H+, with the same microscopic binding constant, $K_B$. Experimentally, there are 2 macroscopic pKa values, 5.4 and 4.3 (Table 1). Why are there 2 pKa values and why the difference? We can draw a diagram for glutaric acid similar to that of Figure 2, except that $K_A$ and $K_B$ are identical, and the factor α will lower binding affinity of the second proton. The first term of the partition function is 1, representing the probability of the “empty” state (reference: no protons bound). The second term is $2K_B[\mathrm{H^+}]$ and represents the state with 1 proton bound. The larger the binding constant and the proton concentration, the larger the product $K_B[\mathrm{H^+}]$ and the higher the probability of having a proton bound. The factor of 2 occurs because the first proton can bind to either of the 2 carboxylates: this state has a multiplicity of 2. The third term corresponds to protons bound to both sites. There is a factor of $K_B[\mathrm{H^+}]$ for each proton, hence, the square. The new feature is that when both sites are occupied, the protons interact, essentially by electrostatic repulsion. That brings in the factor α, which lowers the probability of both protons binding. Hence, the microscopic partition function for glutaric acid, or indeed for a dicarboxylic acid, in general, is
$$Q = 1 + 2K_B[\mathrm{H^+}] + \alpha K_B^2[\mathrm{H^+}]^2 \tag{38}$$
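Now, the macroscopic partition function for a diprotic acid was given by Eq. 22,
$$Q = 1 + K_1[\mathrm{H^+}] + K_1K_2[\mathrm{H^+}]^2$$
Therefore, the first macroscopic binding constant is $K_1 = 2K_B$, which corresponds to the larger macroscopic $\mathrm{p}K_a$ ($K_1 = 1/K_a$). Taking logarithms, we have
$$\log K_1 = \log 2 + \log K_B$$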
With and ( ), we obtain p , very close to the experimental value of 5.42 (Table 1). Its larger-than-normal value for a carboxylic acid is entirely explained by the multiplicity of 2 for the singly protonated state. The second macroscopic binding constant is obtained from . With p for the second proton, this yields . The value of α is then obtained by comparing the last terms of Q in the micro- and macroscopic versions of the partition function (Eqs. 22 and 38), which show that Using , this yields
Using the values of K 2 and KB for glutaric acid, we obtain . An interesting question is how far apart the 2 carboxylic acid groups would need to be in order for the electrostatic repulsion between their protons to be negligible, that is, for α = 1. It is probably necessary to increase the chain to 10 carbons or more. In heptane-, octane-, nonane-, and decanedioic acids, p (24–26), indicating that electrostatic repulsion is still present ( ). In decanedioic acid, p and p (24). We can probably attribute to experimental error the value of p , slightly larger than the expected 5.3. If we use , as done previously, and p , we now obtain , still <1. This would be an interesting problem to propose to students that could be presented at different levels of difficulty.
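Next, we use ethylenediamine to estimate ω. The problem is exactly the same. Formally, the partition function is identical to that of glutaric acid but with different values for the constants,
$$Q = 1 + 2K_A[\mathrm{H^+}] + \omega K_A^2[\mathrm{H^+}]^2 \tag{39}$$
The 2 experimental pKa values are about 10.0 and 7.25 (Table 1), yielding $K_1 = 10^{10.0}\ \mathrm{M^{-1}}$ and $K_2 = 10^{7.25} \approx 1.8 \times 10^{7}\ \mathrm{M^{-1}}$, for binding the first and second protons to the 2 amino groups. Using the same reasoning as for glutaric acid, that is, $K_1 = 2K_A$ and $K_1K_2 = \omega K_A^2$, we obtain $\omega = 2K_2/K_A$.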
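The estimate of an interaction factor for a molecule with 2 identical sites can be sketched as follows. This is an illustration, not the authors' calculation: the pKa values are the rounded ones quoted in the text, the site constant of ethylenediamine is assumed here to be $K_1/2$, and the intrinsic carboxylate constant of $10^5\ \mathrm{M^{-1}}$ used for glutaric acid is a hypothetical value. The function name `pair_interaction` is also introduced only for this example.

```python
def pair_interaction(pK_second, K_site):
    """Interaction factor for 2 identical sites.

    Comparing Q = 1 + 2 K h + g K^2 h^2 with Q = 1 + K1 h + K1 K2 h^2
    gives K1 = 2 K and K1 K2 = g K^2, hence g = 2 K2 / K.
    `pK_second` is the macroscopic pKa for binding the second proton.
    """
    K2 = 10.0 ** pK_second
    return 2.0 * K2 / K_site


# Ethylenediamine: pKa of about 10.0 and 7.25, as quoted in the text.
# Assumption: the site constant is taken as K1 / 2 (multiplicity of 2).
KA = 10.0 ** 10.0 / 2.0
omega = pair_interaction(7.25, KA)

# Glutaric acid: second pKa of about 4.3, as quoted in the text.
# Assumption: K_B = 1e5 (1/M) is a hypothetical intrinsic carboxylate constant.
KB = 1.0e5
alpha = pair_interaction(4.3, KB)

print(f"omega = {omega:.2e}, alpha = {alpha:.2f}")
```

Under these assumptions, the interaction between the 2 amino groups is much stronger (smaller factor) than that between the 2 carboxylic acids, consistent with the shorter distance between the charged sites in ethylenediamine. A different choice of site constant changes the numbers but not this ordering.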
Finally, we use glycine to estimate β. Again, we use the diagram of Figure 2 to guide us. The problem is similar, except that the 2 binding sites are distinct, with very different binding constants. One is an amino group (site A, with binding constant $K_A$), and the other is a carboxylate group (site B, with binding constant $K_B$). However, because the binding constant $K_A \gg K_B$, we can simplify the treatment and consider that the first proton always binds to site A and the second, to site B. Thus, we can safely assume that there is never a glycine molecule with a protonated COOH and a deprotonated NH2 group, and only the top branch of the graph of Figure 2 applies. Therefore, the partition function for glycine is
$$Q = 1 + K_A[\mathrm{H^+}] + \beta K_AK_B[\mathrm{H^+}]^2 \tag{40}$$
where β represents the repulsive interaction between the 2 protons when both are bound. This repulsion explains why the $\mathrm{p}K_a$ for this carboxylic acid is much smaller than the typical $\mathrm{p}K_a$ of a carboxylic acid. Following the same reasoning as for ethylenediamine and glutaric acid, we compare the macroscopic and microscopic partition functions. Here, $K_1 = K_A$, and $K_1K_2 = \beta K_AK_B$. The experimental $\mathrm{p}K_a$ of the carboxylic acid group of glycine gives $K_2$. Thus, $K_2 = \beta K_B$, which yields the estimate $\beta = K_2/K_B$.
M. Idea: Go from chemical structure to mathematics
We can think of the structure of EDTA as being built from the 3 units represented by the model compounds glutaric acid, ethylenediamine, and glycine (Fig 5). Similarly, the partition function can be conceived as built from the 3 independent partition functions. However, the full partition function of EDTA is complicated because no proton binding is truly independent. The problem lies in the interaction represented by β, which provides a communication between the central ethylenediamine and the 2 glutaric acid-like groups of EDTA. If it were not for β, the ethylenediamine and the 2 glutaric acid-like groups would be independent. The probability of independent events is the product of the probabilities of each event. If that were the case, the partition function of EDTA could be simply approximated as the product of those of ethylenediamine and 2 glutaric acid units (Eqs. 38 and 39),
$$Z_{\mathrm{EDTA}} \approx Z_{\mathrm{en}}\, Z_{\mathrm{glu}}^{2} \tag{41}$$
$$Z_{\mathrm{EDTA}} \approx \left(1 + 2K_A[\mathrm{H}^+] + \omega K_A^2[\mathrm{H}^+]^2\right)\left(1 + 2K_B[\mathrm{H}^+] + \alpha K_B^2[\mathrm{H}^+]^2\right)^{2} \tag{42}$$
That is not the case, though, because β ≪ 1, and it will have to be included in any states where a proton is bound to a nitrogen and to a carboxylic acid on the same side of the EDTA molecule. The exact partition function can be obtained by enumeration of all possible states. However, it is much more instructive to write it using the approximations that can be reasonably made, which means excluding states that are very unlikely. Indeed, proton binding to the amino groups of ethylenediamine is much stronger than to any of the carboxylate groups. Therefore, we can assume that the first 2 protons bind to the amino groups, and we can exclude all states with a protonated carboxylic acid but at least one amino group deprotonated. Figure 6 shows all the states of EDTA that are likely to be observed as [H+] increases. In the first row, we have 3 states. The first is the empty state, our reference, which contributes a term of 1 to the partition function (Eq. 43). The second is a state with 1 proton on a site of type A (amine), which contributes the factor $2K_A[\mathrm{H}^+]$. The factor of 2 occurs because the proton can be on either of the 2 nitrogens. We say that the multiplicity of this state is 2. The third term represents a state with both amines protonated. This corresponds to a factor $K_A[\mathrm{H}^+]$ for each proton, hence, the square. Because they interact repulsively, the interaction factor ω must be included, which reduces the probability of this state. There is only one way of producing it, so the multiplicity is 1.
$$1 + 2K_A[\mathrm{H}^+] + \omega K_A^2[\mathrm{H}^+]^2 \tag{43}$$
In the second row, we have 3 more states. In the first, there is a proton on one of the type B sites (carboxylic acid). This brings in the factor . In addition, the repulsive interaction between a proton on site B and the one on the adjacent site A brings in the repulsive interaction factor β. There are 4 possible structures like this (the proton can be on any of the 4 carboxylic acids), hence, the multiplicity of 4. Next, we have a state with 2 protons bound to B sites on either side of the molecule, each bringing in the factor , hence, the square. There are 4 possible ways of having 2 protons on B sites on opposite sides of the molecule, hence, the multiplicity of 4. Last on this row, we have a state with 2 protons bound to B sites on the same side of the molecule. The 2 protons interact repulsively, bringing in factor α, which was not present in the previous term. Both protons can be on the right or the left, hence, the multiplicity of 2. In the last row, we have a state with 3 protons on B sites, 2 of which interact (α factor), bringing in the factor and the state with all protons bound and all the interactions between them. Finally, collecting terms in equal powers of [H+], (44) where each row in Eq. 44 corresponds to a row in the diagram of Figure 6.
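The state counting described above can be reproduced by brute-force enumeration. The following is a minimal sketch under stated assumptions, not the authors' derivation: the function names are hypothetical, each proton on a carboxylate is assumed to carry one factor β (repulsion with the adjacent protonated amine), each pair of protons on the 2 carboxylates of the same side carries one factor α, and states with a protonated carboxylate are kept only when both amines are protonated. The parameter values in the usage lines are placeholders, not fitted values.

```python
from collections import Counter
from itertools import product


def enumerate_edta_states():
    """Yield (n_amine, n_carboxyl, same_side_pairs) for each allowed microstate.

    Sites: 2 amines (left, right) and 2 carboxylates on each side.
    A state with any protonated carboxylate is allowed only if both
    amines are protonated (the approximation described in the text).
    """
    for n_l, n_r, c_l1, c_l2, c_r1, c_r2 in product((0, 1), repeat=6):
        n_amine = n_l + n_r
        n_carboxyl = c_l1 + c_l2 + c_r1 + c_r2
        if n_carboxyl > 0 and n_amine < 2:
            continue
        same_side_pairs = c_l1 * c_l2 + c_r1 * c_r2
        yield n_amine, n_carboxyl, same_side_pairs


def partition_function(h, k_a, k_b, alpha, beta, omega):
    """Sum the statistical weights of all allowed microstates at [H+] = h."""
    z = 0.0
    for n_amine, n_carboxyl, pairs in enumerate_edta_states():
        weight = (k_a * h) ** n_amine            # one K_A[H+] per amine proton
        if n_amine == 2:
            weight *= omega                      # amine-amine repulsion
        weight *= (beta * k_b * h) ** n_carboxyl  # assumed: one beta per carboxyl proton
        weight *= alpha ** pairs                 # same-side carboxyl-carboxyl repulsion
        z += weight
    return z


# Multiplicity of each kind of state, keyed by
# (protons on amines, protons on carboxylates, same-side carboxyl pairs).
multiplicities = Counter(enumerate_edta_states())
for kind, count in sorted(multiplicities.items()):
    print(kind, count)

# Placeholder parameters (hypothetical, for illustration only).
z_example = partition_function(h=1e-3, k_a=5e9, k_b=1e5,
                               alpha=0.3, beta=0.01, omega=7e-3)
```

The enumeration recovers the multiplicities quoted in the text: 2 for a single amine proton, 4 for a single carboxyl proton, 4 for 2 carboxyl protons on opposite sides, and 2 for 2 carboxyl protons on the same side. Under the assumptions above, the sum also factors as $1 + 2x + \omega x^2(1 + 2y + \alpha y^2)^2$ with $x = K_A[\mathrm{H}^+]$ and $y = \beta K_B[\mathrm{H}^+]$, which is a convenient check on the bookkeeping.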
III. RESULTS AND DISCUSSION
A. Parameters are determined by comparing the model with experiment
In science, understanding always arises from comparison of theory with experiment (28). In our case, the binding constants and interaction parameters are determined by comparing the model with experimental EDTA titration data. Thus, we must first obtain an explicit binding model, or binding isotherm, from the partition function. For example, the binding model for 2 sites was given by Eq. 37. Then, it is necessary to compare the model with the experimental binding data. This comparison is performed by fitting the model to the data, which also determines the parameters. A favorable comparison with experiment lends support to the model. Note, however, that it does not prove it; in fact, a model can be disproved but not proved (28, 29).
Let us recall how to obtain the binding isotherm from the partition function by using Eqs. 30 and 31. First, we differentiate the partition function of EDTA (Eq. 44) with respect to [H+], by using Eq. 30, to obtain the model for the binding of protons to EDTA as a function of pH. Second, we fit the resulting binding isotherm to the experimental titration data. The result is shown in Figure 7. From the fit, the values of the 5 parameters are obtained: (amino group); (carboxylic acid group, fixed); ; and . The apparent, macroscopic pKa values are p , p , p , p , p , and p , in very good agreement with the experimental values. The results are summarized on the left side of Table 2.
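The step from partition function to binding isotherm can be illustrated numerically. This is a minimal sketch, assuming that Eq. 30 has the standard form in which the average number of bound protons is the logarithmic derivative of the partition function with respect to [H+]. The function name `mean_protons_bound` is hypothetical. To avoid inventing EDTA parameters, the example uses the simpler 2-site ethylenediamine model with the pKa values quoted earlier (10.0 and 7.25), not the fitted EDTA model of Figure 7.

```python
def mean_protons_bound(ph, coefficients):
    """Average number of bound protons, <n> = d ln Z / d ln [H+].

    coefficients[i] is the coefficient of [H+]**i in the partition
    function Z, so Z = sum(coefficients[i] * h**i).
    """
    h = 10.0 ** (-ph)
    terms = [c * h ** i for i, c in enumerate(coefficients)]
    z = sum(terms)
    # Each term with i protons bound has probability terms[i] / z.
    return sum(i * t for i, t in enumerate(terms)) / z


# Ethylenediamine-like model: Z = 1 + 2*K_A*h + omega*K_A**2*h**2.
k_a = 10.0 ** 10.0 / 2.0              # from K1 = 2*K_A, with pKa 10.0
omega = 4.0 * 10.0 ** (7.25 - 10.0)   # from omega = 4*K2/K1, with pKa 7.25
coefficients = [1.0, 2.0 * k_a, omega * k_a ** 2]

for ph in (12.0, 10.0, 8.625, 7.25, 5.0):
    print(f"pH {ph:6.3f}: {mean_protons_bound(ph, coefficients):.3f} protons bound")
```

For the EDTA model, the same function can be used with the list of coefficients of Eq. 44, and the resulting curve compared with the titration data by any least-squares routine.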
The values of the Gibbs energy corresponding to the microscopic binding constants ($K_A$, $K_B$) or the interaction parameters between bound protons (α, β, and ω) were calculated from $\Delta G = -RT \ln K_A$ (or $K_B$) or $\Delta G = -RT \ln \alpha$ (or β or ω) and are also listed on the right in Table 2. The interaction between 2 protons bound to carboxylic acid groups, kcal/mol, is very small. However, those involving protons bound to nitrogen are large and comparable, kcal/mol and kcal/mol, indicating a stronger electrostatic repulsion between these bound protons (see Fig 5).
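The conversion from a binding constant or interaction factor to a Gibbs energy can be sketched as follows. This is an illustration only, assuming the standard relation ΔG = −RT ln K and a temperature of 298.15 K, which is our assumption and is not stated in the text. The function name `gibbs_energy_kcal` is hypothetical, and the input values are the model-compound estimates for ethylenediamine derived earlier, not the fitted values of Table 2.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def gibbs_energy_kcal(factor, temperature=298.15):
    """Gibbs energy (kcal/mol) for a binding constant or interaction factor.

    Uses Delta G = -R*T*ln(factor). A factor below 1 (repulsion) gives a
    positive value, and a binding constant above 1 gives a negative value.
    The default temperature is an assumption.
    """
    return -R_KCAL * temperature * math.log(factor)


# Model-compound estimates from the ethylenediamine pKa values (10.0 and 7.25).
omega_estimate = 4.0 * 10.0 ** (7.25 - 10.0)
k_a_estimate = 10.0 ** 10.0 / 2.0

dg_omega = gibbs_energy_kcal(omega_estimate)
dg_k_a = gibbs_energy_kcal(k_a_estimate)
print(f"Delta G (omega) = {dg_omega:.2f} kcal/mol")
print(f"Delta G (K_A)   = {dg_k_a:.2f} kcal/mol")
```

An interaction factor of 1 (independent sites) gives ΔG = 0, which is why a value of α close to 1 corresponds to a very small interaction energy.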
Note that the partition function method allows us to write binding models, not predict the location of binding sites on a chemical structure, such as EDTA, or on a macromolecular structure of a protein, for example. We advocate the use of structural knowledge, if available, to write the model: Go from chemical structure to mathematics. In EDTA, we know the type of proton binding site (carboxylic or amino group), which gives information about the values of the binding constants. However, to write the partition function, it does not matter how or where the ligand binds, but just that it binds and to how many sites. That is how the chemical structure comes into play in every case. In more complex cases, typically in proteins, we may not know exactly how or how well the ligand binds. The binding strength is usually not available from the structure itself, even if we know which amino acid residues interact with the ligand and how, from the 3-dimensional structure of the protein. Sophisticated methods can be used to estimate binding strength, but this information is usually not available.
We have provided an example of how to understand and teach binding reactions on the basis of the origins of the molecular interactions involved. The task of deriving an exact model to describe binding is formulated in terms of the probabilities of occurrence of the various molecular species, instead of as a problem in algebra. The particular case of EDTA was chosen because it illustrates the important points of this approach and provides an example that is already familiar to most students. The method can easily be extended to ligand binding to proteins, including complex cases. For example, oxygen binding to hemoglobin, which includes a conformational transition and cooperativity, can be treated in exactly the same way (6). The important concepts that are relevant at each stage of the approach are highlighted here. The advantage of this approach is that its complexity does not scale with the size and complexity of the molecule to which it is applied. Therefore, it remains conceptually simple when used to understand binding interactions in macromolecules, such as proteins or nucleic acids.
B. Reflection
We used this approach in a first-year biochemistry graduate course that focused on proteins, about 10 years ago. At the time, student perceptions that could be published were not collected. Therefore, we can only offer a personal, and probably biased, reflection. In general, the students reacted well to the approach. It proved important, before proceeding to applications, that the students grasped the fundamental idea of the partition function method: the system is partitioned between several possible states, which are populated according to probabilities. Simple 2-state systems are ideal starting points, even those that do not involve binding, such as protein unfolding, or spin one-half systems.
We are currently experimenting with teaching ligand binding to macromolecules in a biochemistry course at the junior undergraduate level by using both the traditional and the partition function approaches. We ask students to solve a problem by both methods and collect their assessment of each method and of how the two compare. In addition, we plan to use the partition function approach from the start in a junior-level course on biophysical chemistry, to be taught in a bachelor’s degree program in biochemistry. This experience is just beginning. The results will be the subject of future communications.