Theory

Cross-correlation of point processes

In fluorescence correlation spectroscopy (FCS) the (normalized) cross-correlation function (CCF) of two continuous signals \(I_1(t)\) and \(I_2(t)\) is defined as:

\[G(\tau) = \frac{\langle I_1(t)\; I_2(t+\tau) \rangle} {\langle I_1(t)\rangle\langle I_2(t) \rangle}\]

The auto-correlation function (ACF) is just a special case where \(I_1(t) = I_2(t)\).

In actual experiments, signals are not continuous but come from single-photon detectors that produce a pulse for each photon. These pulses are usually timestamped with ~10 ns resolution. The series of photon arrival times is used as input for ACF or CCF computations.

In principle, timestamps can be binned to produce a discrete-time signal. In signal processing, the (non-normalized) cross-correlation of two real discrete-time signals \(\{A_i\}\) and \(\{B_i\}\) is defined as

\[c[k] = \sum_{i=0}^{N} A[i]\ B[i+k].\]

The previous formula is implemented by ucorrelate() and numpy.correlate. The difference is that ucorrelate() only computes positive lags and allows setting a max lag for efficiency.
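As a minimal sketch of the sum above (not the library's implementation), the following NumPy function computes \(c[k]\) for non-negative lags only, up to a chosen maximum lag. The name `positive_lag_correlate` and its `maxlag` parameter are hypothetical, introduced here for illustration. The sketch assumes two signals of equal length and drops the terms where \(i+k\) runs past the end of B. The usage lines bin a few made-up timestamps into a time trace before correlating.

```python
import numpy as np

def positive_lag_correlate(a, b, maxlag=None):
    """c[k] = sum_i a[i] * b[i + k] for k = 0 .. maxlag - 1 (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("a and b must be 1-D arrays of equal length")
    n = a.size
    maxlag = n if maxlag is None else min(maxlag, n)
    # For lag k, only the first n - k samples of `a` overlap `b` shifted by k.
    return np.array([np.dot(a[:n - k], b[k:]) for k in range(maxlag)])

# Bin photon timestamps (arbitrary units, made-up values) into a time trace
# with unit bin width, then correlate the binned counts with themselves.
timestamps = np.array([0.5, 1.2, 1.7, 3.1, 3.3, 3.9, 6.4])
trace, _ = np.histogram(timestamps, bins=np.arange(0, 9))
acf = positive_lag_correlate(trace, trace, maxlag=4)
```

For equal-length inputs of size `n`, `np.correlate(b, a, mode="full")[n - 1:]` returns the same values for all non-negative lags; the loop above only avoids computing lags that are not needed.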

Binning timestamps to obtain time traces would be very inefficient for FCS analysis, where time-lags span many orders of magnitude. It is much more efficient to compute the cross-correlation function directly from timestamps. The popular multi-tau algorithm allows computing the correlation directly from timestamps on a fixed arrangement of quasi-log-spaced bins. More generally, the Laurence algorithm (Laurence et al. Optics Letters (2006)) allows computing the cross-correlation from timestamps on arbitrary bins of time-lags, with performance similar to the multi-tau algorithm.

Computing the cross-correlation \(C(\tau)\) from timestamps is fundamentally a counting task. Given two timestamp arrays t and u and considering the k-th time-lag bin \([\tau_k, \tau_{k+1})\), \(C(k)\) is obtained by counting the pairs where:

\[\tau_k \le t_i - u_j < \tau_{k+1}\]

for all the possible i and j combinations. The full expression for \(C(k)\) is:

(1)\[C(k) = \frac{n(\{(i,j) \ni \tau_k \le t_i - u_j < \tau_{k+1}\})}{\Delta\tau_k}\]

where n({}) is the operator counting the elements in a set and \(\Delta\tau_k = \tau_{k+1} - \tau_k\) is the duration of the k-th time-lag bin. For FCS we normally want the normalized CCF. Let T be the measurement duration, with timestamps measured from the start of the measurement. Only photons in t with \(t_i \ge \tau_k\) and photons in u with \(u_j \le T - \tau_k\) can form a pair at lag \(\tau_k\), so the normalized CCF is:

(2)\[G(k) = \frac{n(\{(i,j) \ni \tau_k \le t_i - u_j < \tau_{k+1}\})} {n(\{i \ni t_i \ge \tau_k\})\:n(\{j \ni u_j \le T - \tau_k\})} \frac{(T-\tau_k)}{\Delta\tau_k}\]

Both eq. (1) and (2) are implemented by pcorrelate(). You can choose between the normalized and unnormalized version with the input argument normalize.
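The pair counting described above (pairs with \(\tau_k \le t_i - u_j < \tau_{k+1}\), divided by the bin width) can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the code behind pcorrelate(): the name `count_pairs_per_lag` is hypothetical, `u` is assumed sorted in increasing order, `bins` is assumed increasing, and only the unnormalized quantity is computed.

```python
import numpy as np

def count_pairs_per_lag(t, u, bins):
    """Unnormalized correlation of two timestamp arrays (hypothetical helper).

    For each lag bin [bins[k], bins[k+1]) count the pairs (i, j) with
    bins[k] <= t[i] - u[j] < bins[k+1], then divide by the bin width.
    """
    t = np.asarray(t)
    u = np.asarray(u)        # assumed sorted in increasing order
    bins = np.asarray(bins)  # assumed increasing lag-bin edges
    counts = np.zeros(bins.size - 1, dtype=np.int64)
    for ti in t:
        # t_i - u_j >= tau  <=>  u_j <= t_i - tau, so searchsorted with
        # side="right" gives, for each edge tau, the number of u_j whose
        # lag from t_i is at least tau.
        n_ge = np.searchsorted(u, ti - bins, side="right")
        # Pairs in [tau_k, tau_{k+1}) = (# lag >= tau_k) - (# lag >= tau_{k+1}).
        counts += n_ge[:-1] - n_ge[1:]
    return counts / np.diff(bins)

# Lags t_i - u_j here are 4, 1, -1, 6, 3, 1: two fall in [0, 2), one in [2, 4).
t = np.array([5, 7])
u = np.array([1, 4, 6])
C = count_pairs_per_lag(t, u, bins=[0, 2, 4])  # -> array([1. , 0.5])
```

The sketch runs one binary search per photon and per bin edge. It is meant to make the counting explicit, not to match the performance of the algorithms cited above.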

Note

In Laurence 2006, due to a typo, the expression for G(k) (which they call \(C_{AB}(\tau)\)) is missing the term \(\Delta\tau_k\) in the denominator.

References