Chemical Reaction Networks

This chapter covers:

CRN formalism

We have previously looked at molecules as moving within a potential derived from forces between atoms (and possibly an abstracted continuous field modeling the solution). In this chapter we will go one step further and abstract away the internal details of molecules, and of even larger constructs up to whole cells.

A chemical reaction network (CRN) abstracts a biochemical process, describing the interactions and transformations of its components in the form of reactions. A CRN is described by a set \(\mathcal{S}\) of species and a set \(\mathcal{R}\) of reactions. A reaction is a triple \((\mathbf{r}, \mathbf{p}, \alpha)\) where \(\mathbf{r}, \mathbf{p}\in \mathbb{N}_0^{\mathcal{S}}\) and \(\alpha\in\mathbb{R}_{\geq0}\).

The species with positive count in \(\mathbf{r}\) are called the reaction’s reactants and those with positive count in \(\mathbf{p}\) are called its products. Intuitively they describe what is consumed and what is produced in a reaction: they encode the state change of the system.

The parameter \(\alpha\) is called the reaction’s rate constant. Intuitively it tells how fast a reaction happens: it encodes the timing of a reaction.

Example: gene expression

The gene expression system has species \(\mathcal{S} = \{\text{DNA}, \text{mRNA}, \text{protein}\}\) and reactions \(\mathcal{R} = \{R_1, R_2, R_3, R_4\}\) (ordering species as listed):

| | \(\mathbf{r}\) | \(\mathbf{p}\) | \(\alpha\) | Biological meaning |
|---|---|---|---|---|
| \(R_1\) | \((1,0,0)\) | \((1,1,0)\) | \(\beta_m\) | transcription |
| \(R_2\) | \((0,1,0)\) | \((0,0,0)\) | \(\gamma_m\) | mRNA degradation |
| \(R_3\) | \((0,1,0)\) | \((0,1,1)\) | \(\beta_p\) | translation |
| \(R_4\) | \((0,0,1)\) | \((0,0,0)\) | \(\gamma_p\) | protein degradation |

For instance, transcription has DNA as its only reactant (\(\mathbf{r}_{\text{DNA}}=1\), all others zero) and produces one mRNA while returning the DNA (\(\mathbf{p}_{\text{DNA}}=\mathbf{p}_{\text{mRNA}}=1\)).

Instead of writing \(\mathbf{r}\) and \(\mathbf{p}\) vectors, the arrow notation known from chemistry is typically used. The first reaction can thus be written as:

\[\text{DNA} \xrightarrow{\beta_m} \text{DNA} + \text{mRNA}.\]
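To make the formalism concrete, here is one way to encode this CRN as plain data in Python. The numeric rate-constant values are illustrative placeholders, not given in the text:

```python
# species order: (DNA, mRNA, protein); each reaction is a triple (r, p, alpha)
beta_m, gamma_m, beta_p, gamma_p = 2.0, 1.0, 5.0, 1.0   # illustrative values

species = ("DNA", "mRNA", "protein")
reactions = [
    ((1, 0, 0), (1, 1, 0), beta_m),   # transcription
    ((0, 1, 0), (0, 0, 0), gamma_m),  # mRNA degradation
    ((0, 1, 0), (0, 1, 1), beta_p),   # translation
    ((0, 0, 1), (0, 0, 0), gamma_p),  # protein degradation
]

# the net state change of a reaction is p - r
for r, p, alpha in reactions:
    print(tuple(pi - ri for ri, pi in zip(r, p)))
```

This representation separates the state change (\(\mathbf{p}-\mathbf{r}\)) from the timing (\(\alpha\)), mirroring the definition above.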

From Molecular Collisions to Mass-Action Kinetics

For a simple reaction

\[\text{A} + \text{B} \xrightarrow{\alpha} \dots\]

the rate of this reaction is proportional to the product \(A\cdot B\) of the concentrations. But why should this be true? Why should the rate depend on the product of the concentrations, and not on the sum, or some other combination?

The answer, as it turns out, comes from a very simple physical picture that we are already familiar with from the previous chapters: molecules bouncing around and occasionally bumping into each other. Let’s see if we can understand this from the ground up.

The Essential Physics: Collisions with Activation

When we say that A and B react to form C, what we really mean is something quite specific. Two molecules, one of type A and one of type B, have to get close enough to each other that they can rearrange their chemical bonds. But it’s not enough for them just to touch gently — they need to collide with enough energy to get over what we call the “activation energy barrier.”

Think of it like this: imagine you’re trying to push a boulder over a hill. You can’t just nudge it gently; you need to hit it hard enough to get it to the top. Once it’s over the top, it rolls down the other side (that’s your product C forming). But if you don’t hit it hard enough, it just rolls back to where it started.

So a chemical reaction requires two things: a collision, and enough energy in that collision. For now, let’s ignore the energy part — we’ll assume that if molecules collide, they react. The question becomes: how often do molecules collide?

Two Balls in a Box: The Simplest Possible Case

Let’s start with the simplest case we can imagine. Take two balls, let’s call them A and B, and put them in a square box. They’re bouncing around, hitting the walls, ricocheting all over the place.

In this billiard-ball model, molecules are abstracted as hard spheres: particles travel in straight lines until they touch, then bounce off each other. While we have seen that real molecules interact continuously via a potential that attracts at long range and repels at short range, we deliberately make this abstraction.

The question is: when will the balls A and B hit each other?

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib import animation
from IPython.display import HTML

L, R, r = 0.05, 0.95, 0.06  # box bounds and ball radius
dt = 0.01 # time step
POS0 = np.array([[0.30, 0.30], [0.65, 0.62]])  # initial ball centers
VEL0 = np.array([[ 0.40,  0.30], [-0.30, -0.20]])  # initial ball velocities


def make_animation(energy_panel: bool = False, frames: int = 1000, interval: int = 20) -> animation.FuncAnimation:
    """Two-ball simulation.
    Set energy_panel=True to add a KE-vs-time subplot."""
    # fresh state so repeated calls are independent
    pos = POS0.copy()
    vel = VEL0.copy()

    # Newton physics step
    def step() -> None:
        # forward Euler: x(t+dt) = x(t) + v(t)*dt
        pos[:] += vel * dt

        # boundary reflection: clamp centre and flip perpendicular velocity
        for i in range(2):
            for d in range(2):   # d=0: x-wall, d=1: y-wall
                if pos[i, d] - r < L:
                    pos[i, d] = L + r
                    vel[i, d] = abs(vel[i, d])
                elif pos[i, d] + r > R:
                    pos[i, d] = R - r
                    vel[i, d] = -abs(vel[i, d])

        # ball-ball elastic collision
        delta = pos[0] - pos[1]            # vector from ball 1 to ball 0
        dist  = float(np.linalg.norm(delta))
        if dist < 2 * r and dist > 1e-10:
            # collision between balls
            n   = delta / dist             # unit normal along line of centres
            dvn = float(np.dot(vel[0] - vel[1], n))
            if dvn < 0:
                # approaching balls: swap normal velocity components
                vel[0] -= dvn * n
                vel[1] += dvn * n
            # position correction: push apart to exactly touching distance
            pos[0] += (2 * r - dist) * 0.5 * n
            pos[1] -= (2 * r - dist) * 0.5 * n

    # plotting
    if energy_panel:
        fig, (ax_box, ax_e) = plt.subplots(
            1, 2, figsize=(5.5, 3), gridspec_kw={"width_ratios": [1, 1.2]},
        )
    else:
        fig, ax_box = plt.subplots(figsize=(3, 3))

    ax_box.set_aspect("equal")
    ax_box.set_xlim(0, 1)
    ax_box.set_ylim(0, 1)
    ax_box.set_axis_off()
    ax_box.add_patch(patches.Rectangle(
        (L, L), R - L, R - L, linewidth=1.5, edgecolor="black", facecolor="white"
    ))
    circles = [
        ax_box.add_patch(patches.Circle(
            pos[i].copy(), r, linewidth=1.5, edgecolor="black", facecolor="white"
        ))
        for i in range(2)
    ]

    if energy_panel:
        # E = m v^2 / 2, with m=1 and v^2 = vx^2 + vy^2
        total_ke = 0.5 * float(np.sum(vel ** 2))
        ax_e.set_xlim(0, frames * dt)
        ax_e.set_ylim(0, total_ke * 1.2)
        ax_e.set_xlabel("time")
        ax_e.set_ylabel("kinetic energy")
        ax_e.axhline(total_ke, color="gray", lw=0.8, linestyle="--", label="total")
        e_lines = [
            ax_e.plot([], [], color=f"C{i}", label=f"ball {i + 1}")[0]
            for i in range(2)
        ]
        ax_e.legend(fontsize=7, loc="upper right")
        times:  list[float]       = []
        e_data: list[list[float]] = [[], []]

    plt.tight_layout()

    # animation loop
    def update(frame: int) -> list:
        # advance the simulation by one time step
        step()
        # update the plot
        for i, c in enumerate(circles):
            c.set_center(pos[i])
        if energy_panel:
            times.append(frame * dt)
            for i in range(2):
                e_data[i].append(0.5 * float(np.dot(vel[i], vel[i])))
            for i, line in enumerate(e_lines):
                line.set_data(times, e_data[i])
            return circles + e_lines
        return circles

    anim = animation.FuncAnimation(fig, update, frames=frames, interval=interval)
    plt.close(fig)
    return anim
HTML(make_animation(frames=500).to_jshtml())

Now, you might think this is easy to answer, but it’s not! It depends completely on where they start and how fast they’re going in which directions.

If they start right next to each other and they’re moving toward each other, they’ll collide almost immediately. If they start at opposite corners and they’re both moving in the same direction at the same speed, they might never collide at all! One will always be chasing the other around the box.

So the collision time could be anywhere from nearly zero to infinity. That doesn’t seem very helpful for chemistry, does it?

The Thermal Equilibrium

Chemical systems are often considered to be in “thermal equilibrium.” What does this mean? It means that the system has been bouncing around for so long that it has “forgotten” how it started. The positions are random, the directions are random, and the speeds follow a particular distribution (Maxwell-Boltzmann) that depends only on temperature.

This is like taking our two balls and shaking the box for a very long time before we start watching. We don’t know where they are or which way they’re going, but we know something about the statistics of their motion.

In this thermal state:

  • The positions are uniformly distributed (equally likely to be anywhere in the box).
  • The directions are uniformly distributed (equally likely to be going any direction).
  • The speeds follow Maxwell-Boltzmann (most likely to have speeds near some typical thermal speed). This is a central result in thermodynamics. We will give an argument for why this is the case in the next section.

This is the key insight: we can’t predict exactly when our two particular balls will collide, but we can predict the distribution of the random collision time if we have many such boxes, or if we wait long enough and average over many collisions.

Maxwell-Boltzmann distribution for energies

How can we see that speeds are distributed according to a Maxwell–Boltzmann distribution? The result has been shown in several ways, with a well-known proof relying on information-theoretic properties. We give a proof that is conceptually less general but requires no further notation. The proof is also instructive in that it clearly exposes its assumptions, as well as an interesting relation to time-reversal symmetry in Newtonian physics.

Let us assume a set of balls in a volume, with enough time having passed that the system is in equilibrium with respect to the distribution of energy.

Assumption. We are relying on a huge assumption here that we will justify later: we consider the system as a probabilistic one with transition probabilities.

Pick any two balls that are colliding at some time \(t\). Before the collision, the probability of seeing energies \((E_1,E_2)\) is some

\[ P_{\text{before}}(E_1,E_2)=f(E_1)f(E_2), \]

where the product follows from the assumption that the energies of the two balls are distributed independently of each other. This is clearly not true for two balls that have just collided with each other, but it is approached as each ball collides with many other balls.

After collision, the particles may have energies \((E_1',E_2')\), with

\[ E_1+E_2=E_1'+E_2' \]

since the total energy, present only as kinetic energy \(E = \frac{m v^2}{2}\) in our simple system, is conserved. Before we continue, let us watch this happening in our system from before.

HTML(make_animation(energy_panel=True, frames=400).to_jshtml())

We clearly see how energy is transferred from one ball to the other upon collision. You can extend the simulation time in the code to see many more such transfers; note that the transferred amounts are not always equal.

We next show that the following equality must hold between the energies before and after the collision:

\[ f(E_1)f(E_2)=f(E_1')f(E_2') \]

Why is this the case? At equilibrium, every allowed microscopic collision has to be balanced by its reverse collision. Imagine the collision

\[ (E_1,E_2)\to(E_1',E_2'). \]

Microscopically, Newtonian physics is reversible: reversing time by letting \(t \mapsto -t\) leads to the same physical laws, and an outside observer could not distinguish whether time in the system runs forwards or backwards. Thus the reverse collision

\[ (E_1',E_2')\to(E_1,E_2) \]

is also possible with the same collision law.

At equilibrium there can be no net flow of probability from one pair of energies to another. With \(W\) denoting the transition rate for that collision, the steady state of \(f\) requires

\[ f(E_1)f(E_2)\,W((E_1,E_2)\to(E_1',E_2')) = f(E_1')f(E_2')\,W((E_1',E_2')\to(E_1,E_2)). \]

By microscopic reversibility,

\[ W((E_1,E_2)\to(E_1',E_2')) = W((E_1',E_2')\to(E_1,E_2)). \]

So the \(W\)’s cancel, leaving

\[ \boxed{f(E_1)f(E_2)=f(E_1')f(E_2')}. \]

Then since collisions conserve total energy,

\[ E_1+E_2=E_1'+E_2'=S. \]

So for fixed \(S\),

\[ f(E)f(S-E) \]

must be the same for every split \((E,S-E)\). Thus

\[ f(E)f(S-E)=A(S). \]

Take logs:

\[ g(E)+g(S-E)=\ln A(S). \]

Differentiate with respect to \(E\):

\[ g'(E)-g'(S-E)=0. \]

So

\[ g'(E)=g'(S-E). \]

But \(E\) and \(S-E\) can be any two positive energies. Therefore \(g'\) must be constant:

\[ g'(E)=-\beta. \]

Hence

\[ g(E)=C-\beta E. \]

Since \(g(E)=\ln f(E)\),

\[ f(E)=e^{C-\beta E}=Ae^{-\beta E}. \]

So

\[ \boxed{f(E)\propto e^{-\beta E}} \]

This is the Boltzmann distribution with \[ \beta=\frac{1}{k_BT}. \] We do not further derive this relation between \(\beta\) and the temperature \(T\) here.

Finally, since \(f\) must integrate to \(1\), one obtains

\[\boxed{f(E) = \frac{1}{k_BT}e^{-E/k_BT}}\]

Side note: we have given the above proof in the context of molecular dynamics of billiard balls. Similar arguments, however, apply to many probabilistic systems with pairwise energy transfers between species, a fixed global energy, and microscopically reversible dynamics.

Think for example of larger molecules that collide or a society that transfers money.

From the above distribution of energy one can also derive the distribution of velocity vector lengths (i.e., speeds) \(|v|\): use the fact that \(E = \frac{mv^2}{2}\) and, given the distribution of \(E\), derive the distribution of \(|v|\). Interestingly, the result depends on the dimensionality of the space, so it differs between 2D and 3D.
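A quick numerical check of these claims, assuming the 2D Maxwell-Boltzmann form in which each velocity component is Gaussian with variance \(k_BT/m\) (here \(k_BT = m = 1\)): the energies then come out exponential with mean \(k_BT\), and the 2D speeds are Rayleigh-distributed with mean \(\sqrt{\pi k_BT/(2m)}\).

```python
import numpy as np

rng = np.random.default_rng(42)
kT, m, n = 1.0, 1.0, 200_000

# 2D thermal velocities: each component ~ N(0, kT/m)
v = rng.normal(0.0, np.sqrt(kT / m), size=(n, 2))
E = 0.5 * m * np.sum(v**2, axis=1)   # kinetic energies
speed = np.linalg.norm(v, axis=1)    # speeds |v|

print(E.mean())      # close to kT = 1 (exponential energy distribution)
print(speed.mean())  # close to sqrt(pi/2) ≈ 1.2533 (Rayleigh speeds)
```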

Simplifying Even Further: The Point and the Target

To make progress, let’s simplify even more. Instead of two moving balls of different sizes, let’s consider one ball that’s just a point (no size at all) and another ball that sits perfectly still in the center of the box and has radius \(r\).

You might object: “But that’s not realistic! Both molecules are moving!” Don’t worry, we’ll fix that later. There’s a beautiful trick where the motion of two objects can be converted into the motion of one object relative to the other.

Also, let’s assume our point moves at unit speed (speed \(= 1\)) for simplicity. Again, we’ll fix this later.

So now we have a very clean problem: a point bouncing around randomly in a box, and we want to know when it will hit a target of radius \(r\) sitting in the center.

The Scaling Laws: How Big, How Fast?

How does the expected collision time depend on the parameters of our problem?

First, let’s think about the radius \(r\). If we make the target bigger, it should be easier to hit, right? So the collision time should go down. In fact, if you double the radius, you double the cross-section that the point can hit (in 2D, the target presents a width of \(2r\)), so you should halve the collision time. The collision time is inversely proportional to \(r\).

Second, what about the area \(A\) of the box? If we make the box bigger, the point has to search through more space to find the target. If you double the area, you should roughly double the collision time. The collision time is proportional to \(A\).

Third, what about the speed? If the point moves twice as fast, it should find the target in half the time. The collision time is inversely proportional to the speed.

Putting it together: \[\langle t \rangle \propto \frac{A}{r \cdot v}\]

where \(\langle t \rangle\) is the expected collision time, \(A\) is the area, \(r\) is the radius, and \(v\) is the speed.

Why does this work? It comes from a piece of mathematics called “equidistribution theory.” The point’s trajectory, if it bounces around long enough, visits every part of the box with equal probability. It’s like the trajectory is “filling up” the area uniformly. The bigger the area to fill, the longer it takes. The bigger the target, the easier it is to hit.
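We can probe this scaling numerically. The sketch below is a simplified variant of the problem: a point in a periodic box (wrap-around boundaries instead of reflecting walls; the equidistribution argument is the same) flying in a straight line at unit speed. A swept-area argument (the point sweeps a strip of width \(2r\) per unit time) predicts \(\langle t \rangle \approx A/(2rv) = 10\) for the parameters chosen here.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_hit_time(r=0.05, box=1.0, v=1.0, n_trials=1000, dt=0.01, t_max=200.0):
    """Mean time for a unit-speed point with random position and direction
    to hit a disk of radius r at the centre of a periodic box."""
    n_steps = int(t_max / dt)
    pos = rng.uniform(0.0, box, size=(n_trials, 2))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n_trials)
    vel = v * np.stack([np.cos(theta), np.sin(theta)], axis=1)
    centre = np.array([box / 2.0, box / 2.0])
    hit_time = np.full(n_trials, np.nan)
    alive = np.ones(n_trials, dtype=bool)
    for k in range(1, n_steps + 1):
        pos[alive] += vel[alive] * dt       # free flight
        pos[alive] %= box                   # periodic wrap-around
        d = np.linalg.norm(pos[alive] - centre, axis=1)
        idx = np.flatnonzero(alive)[d < r]  # trials that just hit the target
        hit_time[idx] = k * dt
        alive[idx] = False
        if not alive.any():
            break
    return float(np.nanmean(hit_time))      # ignore the rare non-hitters

tbar = mean_hit_time()
print(tbar)   # roughly A/(2 r v) = 10 for these parameters
```

Doubling \(r\) or halving the box area roughly halves this estimate, consistent with \(\langle t\rangle \propto A/(rv)\).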

What About the Probability Distribution?

Now we know the expected collision time, but what’s the full probability distribution? Is it Gaussian? Uniform? Something else?

Here’s where things get really interesting. Let’s think about the following question: suppose our point has been bouncing around for time \(s\) and hasn’t hit the target yet. What’s the probability it will hit in the next time interval \(t\)?

If the motion is truly random and “memoryless,” then this probability should be the same as the probability of hitting in the first time interval \(t\). The past doesn’t matter — only the present state.

In mathematical terms, this means: \[P(\text{hit in next } t | \text{no hit in last } s) = P(\text{hit in first } t)\]

This is called the “Markov property” or “memorylessness.”

The Shadow Problem

But wait — is our system really memoryless?

Suppose the point hasn’t hit the target in the last \(s\) time units. Does this tell us anything about where the point is now?

Yes, it does! There’s a “shadow” region around the target where the point cannot be. If the point were in this shadow region, it would have hit the target sometime in the last \(s\) time units.

The shadow region is roughly a circle of radius \(r + v \cdot s\) around the target. Any point in this region that was moving toward the target would have collided within time \(s\).

This means the position distribution is not uniform anymore — it’s uniform everywhere except in the shadow. So the system is not perfectly memoryless.

However, if the target is very small (\(r \to 0\)), then the shadow region becomes very small compared to the total area. In this limit, the distortion of the uniform distribution is negligible, and the system becomes approximately memoryless.

The Exponential Distribution

If a system is truly memoryless, then its waiting times must follow an exponential distribution. This is a theorem, not just a guess.

The exponential distribution with rate \(\lambda\) is characterized by:

\[P(\text{wait time} > t) = e^{-\lambda t}\]

This is the “survival function” of the exponential distribution. The probability density function is \(\lambda e^{-\lambda t}\), and the expected value is \(1/\lambda\).

Since we already know the expected collision time is proportional to \(A/(r v)\), we can identify: \[\lambda \propto \frac{r v}{A}\]

So our collision times are approximately exponentially distributed with this rate parameter.
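Memorylessness of exponential waiting times is easy to verify numerically; a small sketch with an arbitrary rate \(\lambda\) and arbitrary intervals \(s\), \(t\):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, n = 2.0, 1_000_000
T = rng.exponential(1.0 / lam, size=n)   # exponential waiting times

s, t = 0.4, 0.3
p_uncond = np.mean(T > t)                # P(no hit in first t)
p_cond = np.mean(T[T > s] > s + t)       # P(no hit in next t | no hit by s)
print(p_uncond, p_cond)   # both close to exp(-lam*t)
```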

Bringing Back the Moving Target

Now let’s be more realistic. Both molecules are moving, not just one.

Here’s the beautiful trick we mentioned earlier: instead of tracking two moving objects, we can work in the “relative coordinate system.” We ask: what’s the motion of molecule A relative to molecule B?

In this relative system, molecule B appears stationary (sitting at the origin), and molecule A moves with the relative velocity \(\vec{v}_{\text{rel}} = \vec{v}_A - \vec{v}_B\).

The wonderful thing is that the geometry is exactly the same as our point-and-target problem! The collision occurs when the relative distance equals \(r_A + r_B\).

For molecules in thermal equilibrium, the relative velocity also has a Maxwell-Boltzmann distribution, but with a different parameter. If each molecule has typical speed \(\langle v \rangle\), then the relative motion has typical speed \(\sqrt{2} \langle v \rangle\). (This comes from adding two random velocity vectors.)

So all our previous results carry over, just with \(v\) replaced by \(\langle v_{\text{rel}} \rangle\) and \(r\) replaced by \(r_A + r_B\).
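The \(\sqrt{2}\) factor is easy to confirm numerically, assuming independent Gaussian (Maxwell-Boltzmann) velocity components for both molecules:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
vA = rng.normal(size=(n, 2))   # thermal velocities of A molecules
vB = rng.normal(size=(n, 2))   # thermal velocities of B molecules

mean_speed = np.linalg.norm(vA, axis=1).mean()
mean_rel_speed = np.linalg.norm(vA - vB, axis=1).mean()
ratio = mean_rel_speed / mean_speed
print(ratio)   # close to sqrt(2) ≈ 1.414
```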

Multiple Targets: The Magic of the Exponential

Now for the final step: what if we have many B molecules that our A molecule could hit?

This is where the exponential distribution shows its magic. Suppose there are \(n\) different B molecules, and the collision time with the \(i\)-th one is exponentially distributed with rate \(\lambda_i\). What’s the distribution of the time until the first collision (with any of the B molecules)?

It turns out that this is also exponentially distributed, with rate \(\lambda_{\text{total}} = \lambda_1 + \lambda_2 + \cdots + \lambda_n\).

This is a beautiful property of exponential distributions: the minimum of independent exponentials is exponential with the sum of the rates.

Why is this true? Think of it this way: the probability of avoiding all \(n\) targets for time \(t\) is the product of the probabilities of avoiding each one: \[P(\text{avoid all for time } t) = e^{-\lambda_1 t} \cdot e^{-\lambda_2 t} \cdots e^{-\lambda_n t} = e^{-(\lambda_1 + \cdots + \lambda_n)t}\]

This is exactly the survival function of an exponential with rate \(\lambda_1 + \cdots + \lambda_n\).
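The minimum-of-exponentials property can be checked directly by sampling; the rate values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
rates = np.array([0.5, 1.0, 1.5])
n = 500_000
samples = rng.exponential(1.0 / rates, size=(n, 3))   # one column per rate
first = samples.min(axis=1)                           # time of first collision
print(first.mean())   # close to 1 / rates.sum() = 1/3
```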

The Rate Addition Rule

If each individual B molecule gives a collision rate of \(\lambda_1\) (which is proportional to \(r v/A\) from our earlier calculation), then \(n_B\) identical B molecules give a total collision rate of: \[\lambda_{\text{total}} = n_B \cdot \lambda_1 \propto \frac{n_B}{V}\] where \(V\) is the (3-dimensional) volume.

But by the same reasoning, if we have \(n_A\) different A molecules that could each hit the B molecules, the total rate becomes: \[\lambda_{\text{total}} \propto \frac{n_A \cdot n_B}{V}\]

The Bigger Picture

This derivation shows us something profound: the macroscopic law of mass-action kinetics emerges naturally from the microscopic physics of molecular motion. We don’t need to postulate it; we can derive it from first principles.

Of course, real chemistry is more complicated. Molecules aren’t hard spheres, they have complex shapes, reactions need activation energy, there might be spatial correlations… But the basic scaling law — the product rule — survives all these complications for elementary reactions.

This is the beauty of statistical mechanics: simple microscopic rules lead to universal macroscopic behavior. The same ideas we’ve used here apply to everything from chemical reactions to stellar collisions to radioactive decay.

Stochastic CRN kinetics

Equipped with the understanding for mass-action kinetics, we are ready to complete the CRN formalism and specify not only what happens but also how fast.

We start with an example reaction \[ A + B \rightarrow C. \] With mass-action kinetics and rate constant \(\gamma\) in units of \(\text{L}\,\text{s}^{-1}\), the reaction has rate \[ \gamma \cdot [A] \cdot [B] \] in units of \(\text{L}\,\text{s}^{-1} \cdot \text{L}^{-2} = \text{L}^{-1}\,\text{s}^{-1}\), where the concentrations \([A]\) and \([B]\) are measured in molecules per litre (\(\text{L}^{-1}\)). Concentrations may equally well be given in molar units.

By contrast, the propensity of the reaction is defined as \[ \gamma' \cdot A \cdot B \] in units of \(\text{s}^{-1}\), where \(A\) and \(B\) are molecule counts and \(\gamma' = \gamma / \text{vol}\) is in units of \(\text{s}^{-1}\), with vol the volume of the compartment in which the reactions happen.

We are now in the position to define the general propensities and rates for CRNs.

Stochastic (CME) propensity. The natural semantics of a CRN is stochastic and speed is given in transition rates for reactions. For a reaction \((\mathbf{r}, \mathbf{p}, \alpha) \in \mathcal{R}\) the propensity is defined as

\[ a(\mathbf{x}) = \alpha \prod_{s \in \mathcal{S}} \binom{x_s}{\mathbf{r}(s)}, \]

where \(x_s\) is the current copy number of species \(s\). The binomial coefficient \(\binom{x_s}{\mathbf{r}(s)}\) counts the number of distinct subsets of \(\mathbf{r}(s)\) molecules of type \(s\) that can participate. For \(\mathbf{r}(s) = 1\) this is just \(x_s\), but for \(\mathbf{r}(s) = 2\) it is \(x_s(x_s-1)/2\), not \(x_s^2\).

Note that we did not explicitly make use of the volume \(V\) for the propensity. We implicitly fixed a \(V\) and used counts of species within this volume. This is why, technically, \(a(\mathbf{x})\) is not a rate, but a propensity.
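The propensity formula translates directly into code; a minimal sketch using Python's `math.comb`, with species keyed by name (a representation chosen here for illustration):

```python
from math import comb, prod

def propensity(alpha, reactants, x):
    """Stochastic mass-action propensity a(x) = alpha * prod_s C(x_s, r_s);
    `reactants` maps species to stoichiometric counts, `x` to copy numbers."""
    return alpha * prod(comb(x[s], r) for s, r in reactants.items())

# transcription DNA -> DNA + mRNA: a = beta * C(1, 1) = beta
print(propensity(2.0, {"DNA": 1}, {"DNA": 1, "mRNA": 5}))   # 2.0
# dimerisation A + A -> A: a = alpha * C(5, 2) = 10 * alpha
print(propensity(1.0, {"A": 2}, {"A": 5}))                  # 10.0
```

Note that the \(\mathbf{r}(A) = 2\) case gives \(\binom{5}{2} = 10\) distinct pairs, not \(5^2 = 25\).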

Example: transcription and mRNA degradation

Consider the two-reaction network with species \(\mathcal{S} = \{\text{DNA},\text{mRNA}\}\):

\[ \text{DNA} \xrightarrow{\beta} \text{DNA} + \text{mRNA}, \qquad \text{mRNA} \xrightarrow{\gamma} \emptyset. \]

DNA is present at fixed copy number 1 (a single gene), so the propensities are

| reaction | propensity |
|---|---|
| transcription | \(\beta\,\binom{1}{1} = \beta\) |
| mRNA degradation | \(\gamma\,\binom{m}{1} = \gamma m\) |

Example: dimerisation \(A + A \xrightarrow{\alpha} A\)

Here \(\mathbf{r}(A) = 2\), so the stochastic propensity is

\[ a = \alpha\,\binom{A}{2} = \alpha\,\frac{A(A-1)}{2}. \]

This counts the number of distinct pairs of \(A\) molecules that can react.

The Chemical Master Equation

Given a CRN with mass-action propensities, the stochastic dynamics of the CRN form a famous mathematical object: a so-called continuous-time Markov chain. Exactly like our CRN, a Markov chain is described by its states and by exponential transition rates between the states. The states are exactly the integer copy-number states of the CRN. The so-called Markov property, that the transition rates from a state depend only on this state, holds because the propensity at time \(t\) depends only on the current state, not on the history of states.

Formally, denoting by \(P(\mathbf{x}, t)\) the probability of being in state \(\mathbf{x}\) at time \(t\), the Chemical Master Equation (CME) is the ODE

\[ \frac{\mathrm{d}P(\mathbf{x},t)}{\mathrm{d}t} = \sum_{(\mathbf{r},\mathbf{p},\alpha)\in\mathcal{R}} \Bigl[ a\bigl(\mathbf{x}-(\mathbf{p}-\mathbf{r})\bigr)\,P\bigl(\mathbf{x}-(\mathbf{p}-\mathbf{r}),t\bigr) - a(\mathbf{x})\,P(\mathbf{x},t) \Bigr]. \]

Each reaction contributes an inflow term (arriving from the predecessor state) and an outflow term (leaving the current state).

Example: transcription and mRNA degradation (continued)

The state is \(m \in \{0, 1, 2, \ldots\}\) for the number of mRNAs. Transcription increases \(m\) by 1, degradation decreases \(m\) by 1. The CME is

\[ \frac{\mathrm{d}P(m,t)}{\mathrm{d}t} = \beta\,P(m-1,t) + \gamma(m+1)\,P(m+1,t) - (\beta + \gamma m)\,P(m,t), \]

with \(P(m,t) = 0\) for \(m < 0\). The resulting Markov chain is shown below.

1D Markov chain for transcription and mRNA degradation

The full gene expression system adds a protein dimension \(p\). Its Markov chain is shown below.

Markov chain for the full gene expression system
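The Markov chain underlying the CME can be simulated exactly with the Gillespie stochastic simulation algorithm (SSA). A minimal sketch for the transcription/degradation network, with illustrative parameter values: the stationary distribution of this birth-death chain is Poisson with mean \(\beta/\gamma\), so the time-averaged copy number should come out near \(\beta/\gamma = 10\).

```python
import numpy as np

rng = np.random.default_rng(4)

def gillespie_mrna(beta=10.0, gamma=1.0, t_end=2000.0):
    """Exact SSA for DNA -> DNA + mRNA (propensity beta) and
    mRNA -> 0 (propensity gamma*m); returns the time-averaged mRNA count."""
    t, m, acc = 0.0, 0, 0.0
    while t < t_end:
        a0 = beta + gamma * m              # total propensity
        dt = rng.exponential(1.0 / a0)     # exponential waiting time
        acc += m * min(dt, t_end - t)      # time-weighted accumulation of m
        t += dt
        if rng.uniform() < beta / a0:
            m += 1                         # transcription fires
        else:
            m -= 1                         # degradation fires
    return acc / t_end

avg_m = gillespie_mrna()
print(avg_m)   # close to beta/gamma = 10
```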

Macroscale equations (deterministic, ODE semantics)

CRNs are considered not only under stochastic but also under deterministic semantics. The latter is typically appropriate for large copy numbers and large volumes, where the ratio \(s/V\) for each species \(s\) is kept constant as \(s, V \to \infty\).

Under mass-action kinetics the ODE for species \(s\) is given by

\[ \frac{ds}{dt} = \sum_{(\mathbf{r},\mathbf{p},\alpha)\in\mathcal{R}} (\mathbf{p}(s) - \mathbf{r}(s))\,\alpha \prod_{\tilde s \in \mathcal{S}} \tilde s^{\mathbf{r}(\tilde s)}. \]

Example: transcription and mRNA degradation (continued)

For the transcription and mRNA degradation system, one obtains the linear ODE \[ \frac{\mathrm{d}m}{\mathrm{d}t} = \beta - \gamma m. \]
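With initial condition \(m(0)=0\) (an assumption for illustration), this ODE has the closed-form solution \(m(t) = \frac{\beta}{\gamma}\bigl(1 - e^{-\gamma t}\bigr)\), relaxing to the steady state \(\beta/\gamma\). A quick forward-Euler cross-check with illustrative parameters:

```python
import numpy as np

beta, gamma = 10.0, 1.0
t = np.linspace(0.0, 6.0, 601)
m_exact = beta / gamma * (1.0 - np.exp(-gamma * t))   # solution with m(0) = 0

# forward-Euler integration of dm/dt = beta - gamma * m
m = np.zeros_like(t)
for k in range(len(t) - 1):
    m[k + 1] = m[k] + (t[1] - t[0]) * (beta - gamma * m[k])

err = np.abs(m - m_exact).max()
print(err)   # small discretization error
```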

The Chemical Langevin Equation

The Chemical Master Equation can be difficult to solve, which is why several approximations were developed. One of these is the Chemical Langevin Equation, which we will now derive. It is perfectly useful on its own, in particular from a computational perspective. We present it here for another reason as well: to show that the ODE kinetics of a CRN approximate the expected stochastic kinetics, at least for linear CRNs.

The derivation starts from the exact update equation

\[ X_i(t + \tau) = X_i(t) + \sum_{j=1}^M \nu_{j,i} K_j(X(t), \tau), \] where \(\nu_{j,i}\) is the net change in the count of species \(i\) in reaction \(j\) and \(K_j(X(t), \tau)\) is the random variable that specifies how many times reaction \(j\) occurs in the time interval \([t, t+\tau]\).

Assumption 1 of the chemical Langevin equation is that the propensity functions do not change significantly during the time interval \([t, t+\tau]\), i.e., \(a_j(X(t')) \approx a_j(X(t))\) for all \(t' \in [t,t+\tau]\).

Then, we can rewrite the above equation as \[ X_i(t + \tau) = X_i(t) + \sum_{j=1}^M \nu_{j,i} \mathcal{P}_j(a_j(X(t)) \cdot \tau) \] where the \(\mathcal{P}_j(a_j(X(t)) \cdot \tau)\) are independent Poisson variables with parameter \(a_j(X(t)) \cdot \tau\).

Assumption 2 of the chemical Langevin equation requires the expected number of occurrences of each reaction to be big, i.e., \(a_j(X(t))\cdot \tau \gg 1\).

Note that the two assumptions are in tension: assumption 1 wants \(\tau\) to be small, whereas assumption 2 wants \(\tau\) to be big. It is very well possible that no choice of \(\tau\) satisfies both assumptions for a given system.

It is well-known that the Poisson distribution with parameter \(\lambda\) is well approximated by the normal distribution \(\mathcal{N}(\lambda,\lambda)\) with expected value \(\mu = \lambda\) and variance \(\sigma^2 = \lambda\) for large values of \(\lambda\).

Assumption 2 thus suggests using this approximation to get \[ \mathcal{P}_j(a_j(X(t)) \tau) \approx a_j(X(t))\tau + \sqrt{a_j(X(t))\tau} \cdot \mathcal{N}(0,1) \] where \(\mathcal{N}(0,1)\) is a standard normally distributed random variable.

Now, switching to \(X(t)\) being real-valued and setting \(\tau = dt\), in the limit \(dt \to 0\), we finally get the chemical Langevin equation: \[ \frac{dX_i(t)}{dt} = \sum_{j=1}^M \nu_{j,i} a_j(X(t)) + \sum_{j=1}^M \nu_{j,i} \sqrt{a_j(X(t))} \Gamma_j(t) \] where the \(\Gamma_j\) are independent standard Gaussian white noise processes.
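The CLE can be integrated numerically with the Euler-Maruyama scheme. A sketch for the transcription/degradation network with illustrative parameters; the clamp at zero is a pragmatic guard, since the CLE can otherwise take the real-valued state slightly negative:

```python
import numpy as np

rng = np.random.default_rng(5)

def cle_mrna(beta=10.0, gamma=1.0, dt=0.01, t_end=200.0, m0=10.0):
    """Euler-Maruyama for dm = (beta - gamma*m) dt
                            + sqrt(beta) dW1 - sqrt(gamma*m) dW2."""
    n = int(t_end / dt)
    m = np.empty(n + 1)
    m[0] = m0
    for k in range(n):
        a1, a2 = beta, gamma * max(m[k], 0.0)   # propensities, clamped at 0
        drift = (a1 - a2) * dt
        noise = np.sqrt(a1 * dt) * rng.normal() - np.sqrt(a2 * dt) * rng.normal()
        m[k + 1] = m[k] + drift + noise
    return m

traj = cle_mrna()
print(traj.mean())   # close to beta/gamma = 10
```

The trajectory fluctuates around \(\beta/\gamma\) with a variance comparable to the Poisson statistics of the exact chain.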

From Stochastic to Deterministic Kinetics

If the assumptions of the chemical Langevin equation hold sufficiently well, one can make the connection between the stochastic and the deterministic kinetics of CRNs. In this case, taking the expected value (over an ensemble of sample paths) on both sides of the equation, we get: \[ \frac{d \mathbb{E} X_i(t)}{dt} = \sum_{j=1}^M \nu_{j,i} \mathbb{E} a_j(X(t)) \]

If the propensities follow a mass-action law, the expected value \(\mathbb{E}\, a_j(X(t))\) is in general only an approximation to \(a_j(\mathbb{E}\, X(t))\). If there are only unary reactions, the propensities are linear and the two are equal.

The detour via the chemical Langevin equation is not the only way to show the above formula. For a large-volume (or large-species-count) limit, it can sometimes be derived directly from the Chemical Master Equation.


License & Attribution: This page is adapted from material by Michael Elowitz and Justin Bois (© 2021–2025), licensed under CC BY-NC-SA 4.0.