Light Microscopy at Maximal Precision

Microscopy is the workhorse of the physical and life sciences, producing crisp images of everything from atoms to cells well beyond the capabilities of the human eye. However, the analysis of these images is frequently little better than automated manual marking. Here, we revolutionize the analysis of microscopy images, extracting all the information theoretically contained in a complex microscope image. Using a generic, methodological approach, we extract the information by fitting experimental images with a detailed optical model of the microscope, a method we call Parameter Extraction from Reconstructing Images (PERI). As a proof of principle, we demonstrate this approach with a confocal image of colloidal spheres, improving measurements of particle positions and radii by 100x over current methods and attaining the maximum possible accuracy. With this unprecedented resolution, we measure nanometer-scale colloidal interactions in dense suspensions solely with light microscopy, a previously impossible feat. Our approach is generic and applicable to imaging methods from brightfield to electron microscopy, where we expect accuracies of 1 nm and 0.1 pm, respectively.


I. INTRODUCTION
Microscope technology has progressed to near perfection. Crisp images speak of precisely engineered microscope components: large-aperture and nearly aberrationfree lenses, high-frame-rate and low noise cameras, powerful and uniform light sources. Nanometer-scale details boast of super-resolution techniques thought impossible mere decades ago: PALM [1], STORM [2], STED [3]. The continued development of ever more powerful techniques -SIM [4], Lattice-light sheet microscopy [5] -reassures that resolution will continue to improve.
However, our ability to extract quantitative information from microscopy images has not kept pace. In fields from electron microscopy to super-resolution localization, current methods mimic human perception with heuristic approaches, such as looking for the centers of bright spots or regions of contrast in an image [6][7][8][9][10][11]. The simplicity of these methods necessarily ignores physical complexities in the image formation. As a result, systematic errors and inefficient estimates plague these techniques [12,13].
In this paper, we present a universal method of scientific image analysis that extracts all the information theoretically contained in a complex image. Our method, dubbed parameter extraction from reconstructing images (PERI), uses a detailed model of the physics of image formation to fit experimental images. From the fit, we then extract information about the image at the informationtheoretic limit. We illustrate this approach on confocal images of colloidal spheres, measuring each particle's position and radius to within 3 nm, a 100x improvement over current methods. We use this extreme accuracy to measure colloidal interactions at the nanometer scale, measuring deviations from hard-sphere interactions for * These two authors contributed equally to this work. † Work done while at Cornell, currently affiliated with Google Inc.
the first time with light microscopy. Our method does not require modifying the microscope or the image acquisition. As a result, any researcher with a microscope can readily apply our technique to push their data to the information-theoretic limit. How precisely can an object be located in an image? The fundamental limitation in locating an object arises from statistical noise in the image formation, not directly from diffraction or optical limitations [14]. This limit is determined through the interplay of the image signal and noise, as described by the Cramér-Rao Bound. Specifically, the Cramér-Rao Bound states that the covariance matrix of the estimated parameters is always larger than the inverse of the Fisher information matrix of the noise distribution [15]. For an image with Gaussian white noise of variance σ 2 , sampled at points x k , the minimum uncertainty in the parameters θ measured from the image is where I(x) is the image that would be measured in the absence of noise. We can use this equation to estimate the minimum uncertainty in measuring a colloidal sphere's radius and position from a three-dimensional confocal image. For a particle of radius R blurred by diffraction over a width w, the derivatives with respect to particle radius in equation (1) are only nonzero on a shell at the particle's edge of approximately 4πR 2 w voxels. At the particle's edge, the intensity changes from a characteristic brightness ≈ I to ≈ 0 over a width ≈ w, and the derivatives are thus of magnitude ≈ I/w. Substituting these values gives a minimum uncertainty in a particle's radius as σ R ∼ w/4πR 2 /SNR, where SNR = I/σ is the signalto-noise ratio. Likewise, changing the particle's position only affects the edge voxels in the direction of the par-0.1 px FIG. 1: PERI overview -A demonstration of model information recovered from real confocal microscope images of a = 1.343 (8) µm colloidal spheres at a volume fraction of φ = 0.130 (5). ticle's motion. The positional derivatives will thus be of magnitude ≈ I/w only on a projected shell of ≈ πR 2 w voxels, giving the minimal uncertainty in the particle's position as σ x ∼ w/πR 2 /SNR. For a colloidal particle of diameter 1 µm, imaged with a confocal microscope with voxel size of 100 nm and diffractive blur of w ≈ 200 nm at an SNR = 25, these uncertainties correspond to σ R ≈ 1.5 nm and σ x ≈ 3 nm, a fantastically high precision.

II. RESULTS
Actually achieving this localization without serious systematic errors requires a detailed knowledge of the image formation process. To incorporate this knowledge, we create a generative model of the microscope image based on the physics of the light interacting with the sample and with the microscope's optical train. We then fit every parameter in the model by comparing the image produced by the model to the experimental image. Our model describes the physics of image formation in the order that it occurs: (1) fluorescent dye is distributed unevenly throughout the sample, (2) the dyed sample is illuminated unevenly by the laser, (3) the resultant image is blurred due to diffraction, and (4) the final image is noisy.
Dye Distribution: To reconstruct the image, we start with the continuous distribution of the fluorescent dye in the sample. For the image in Fig. 1, the dye is distributed everywhere except in a slab, representing the glass cover-slip slide, and in a collection of spherical lacunae, representing the colloidal particles. To represent this continuous dye distribution on a pixelated grid, we draw these objects in real-space using a function that is tuned to match the exact Fourier representation of a sphere (see SI for an extensive discussion of this and the rest of the generative model). We call this correctly-aliased representation on a pixelated grid the Platonic image. While we focus on featuring only spheres in this work, PERI is flexible enough to include any parameterizable object in the generative model, such as ellipsoidal [16,17], rodlike [18], or polyhedral [19] particles.
Illumination field and background: This distribution of dye is illuminated by a scanned laser. Due to imperfections and dirt in the optics, the illumination is not uniform but instead varies in space. For instance, our line-scanning confocal's illumination field is highly striped, as any imperfections in the line illumination are dragged across the field of view. We describe this spatially-varying illumination as a continuous field that varies throughout the image. Empirically, we find that combining a Barnes interpolant along the scan direction and Legendre polynomials in the perpendicular directions accurately describes both the rapidly-varying stripes and the slowly-varying changes in the illumination of our line-scan confocal. Additionally, the microscope always registers a non-zero background signal, which we include in our model. We parameterize this background similarly to the illumination field.
Point spread function: Diffraction prevents the illuminated dye from being imaged exactly onto the detector. Instead, each dye molecule in the sample projects a comparatively large blur, known as the point-spread function (PSF), onto the imaging camera. As a result, the image captured on the camera is a convolution of the illuminated Platonic image with the PSF, and not simply the illuminated dye itself. While complicated, this PSF has been calculated exactly by many researchers for different geometries [20][21][22][23][24][25][26][27]. For microscope samples with a refractive index different from what the optical train is designed for, the PSF worsens with depth, becoming significantly broader and more aberrated. We use an adaptation of these exact PSF calculations for a line-scanning confocal as our PSF model, optimizing over parameters such as the numerical aperture of the lens and the index mismatch of the sample to the optics.
Putting these components together as shown in Fig. 1a, our model image M sampled at pixels x is described by (2) where I is the illumination field, B is the background, Π is the platonic image, and P is the spatially-varying PSF; we include a constant offset c to partially capture rapidlyvarying variations in the background. The model image is highly realistic, as shown by the comparison with real data in Fig. 1b.
Noise: Finally, noise degrades the image recorded on the camera. We treat the noise using a Bayesian framework, and look for the maximum-likelihood model given the microscope data, complete with possible priors on parameter values. Since the noise is empirically Gaussian (see SI), the most likely model is the least-squares fit of the model to the microscope image.
To find the most likely model, we least-squares fit every parameter in our generative model to find the correct particle positions, radii, illumination field, and point-spread function, as illustrated in Fig. 1c. A typical confocal image contains a few times 10 3 particles, each with 4 fit parameters (x, y, z, R). In addition, there are a few hundred global parameters to optimize, such as the illumination and PSF parameters and the lens's z-step size along the optical axis, resulting in ≈ 10 4 parameters per image -a daunting optimization problem. We begin with an initial guess for the positions using standard particle locating techniques [28], and we simultaneously fit the particle positions and the global variables using a Levenberg-Marquardt algorithm modified for large parameter spaces [29][30][31][32]. From here, we ensure that we have correctly identified every particle in the image by automatically adding and subtracting particles based on the the difference between the model and the microscope image. After finding the best-fit parameters, we sample from the log-likelihood using standard Monte Carlo techniques [33] to estimate the errors in the image reconstruction. (See SI for a detailed description of the fitting method and numerical optimizations.) It is important to note that this fit is over all the pixels in the image -to get a meaningful extraction of parameters, every pixel must be described accurately. Imperfectly fit regions -due to e.g. deformed particles or PSF leakage from objects outside the image -can bias the extracted positions of particles in the region and even affect the entire image reconstruction through the influence on image-scale variables.
Using PERI to measure positions with nanometer accuracy requires rigorous checks on our method, with both generated and experimental data. We first generate images with a detailed physical model, employing an exact, spatially-varying point-spread function [20], experimentally-measured spatially-varying illumination, dense collections of particles with varying radii, and a realistic amount of noise. PERI successfully fits these generated data, converging to the global fit minimum in the extremely large dimensional parameter space despite a host of possible numerical complications, such as local minima in the fit space or a failure of the fit to converge. From this fit, PERI extracts both the particle positions and radii at the Cramér-Rao bound (≈ 2 nm and ≈ 1 nm, respectively). In contrast, current heuristic-based algorithms cannot measure the particle positions to better than 60 nm on realistically generated datasets. (See SI for a detailed comparison of PERI to other featuring algorithms.) Emboldened by this success, we next test PERI on real experimental data. We take fast, three-dimensional movies of a suspension of 1.34 µm diameter silica spheres suspended in a glycerol and water mixture and feature these images using PERI. By analyzing each frame in the movie independently, we can extract systematic errors from PERI's featuring.
We first analyze the residuals of our fits to the experimental data. Fig. 2(a,b) shows these residuals in both real-and Fourier-space. If our fit to the experimental image were perfect, the residuals would be perfectly Gaussian white noise. Instead, while the overall probability distribution of the residuals is nearly Gaussian in both domains (see SI), in Fourier-space there are distinct wave vectors above the noise floor. Comprising roughly 10 −5 of the power in the experimental image, the extremely small size of this remaining signal demonstrates the quality of our generative model. The deviations of our model from the experimental data occur at length scales slightly larger than the particle diameter but smaller than typical illumination variations. These unexplained residuals most likely arise from approximations in models of linescanning point spread function, excess aberrations in the microscope, and the artificially finite but large size we use in our PSF calculation to speed up optimization. Additionally, sharp peaks at high wave-vectors can be seen in one slice of the Fourier-space residuals, which arise from noise in the scanning of the lens and the line illumination. The remaining question is how much these residuals affect the parameters of interest, the particle positions and radii.
We can use the extracted particle positions and radii over time to test the accuracy of PERI. During the movies, the particles diffuse about, sampling different regions of the spatially-varying illumination and pointspread function and changing the configuration of neighboring particles. However, the true particle radii remain FIG. 2: Fitting the generative model to experimental data -(a) A representative image (lower left), its best-fit model (center), and the difference between the two (upper right), shown as cross sections in the xy, xz, and yz planes. The residuals show nearly perfect Gaussian white noise at the expected signal-tonoise ratio, demonstrating the quality of the generative model. (b) The Fourier power spectrum of the same residuals, displayed as three orthogonal slices in the qxqy, qxqz, and qyqz planes. In addition to scanning noise, visible as the stripes along qx = 0 and qz = 0 as well as the isolated poles, excess power is visible at scales larger than the particles themselves but smaller than the features given by the ILM. These residuals are associated with the incomplete description of the point spread function. (c) PERI measures the particle radius within an uncertainty of 3 − 4 nm, as estimated from changes in featured particle radii with time (red histogram). Improving the description of the PSF would allow for radii to be featured at the CRB (green histogram), with a precision of 1 nm. (d) The experimental mean-squared displacement ∆x 2 (black dots; error bars smaller than symbol size) provides an estimate of PERI's average positional errors. Extrapolating the fitted mean-square displacement (red curve; the shaded band denotes the fit uncertainty) to t = 0 gives a positional error indistinguishable from 0 nm. constant in time. Measuring individual radii fluctuations over time provides a stringent model-independent measurement of errors in PERI, as the changing configuration of the particles includes all the possible sources of systematic error. Tracking these radii fluctuations over time suggests that we can measure the particle radius to within 3-4 nm (Fig. 2c), a fantastically high precision compared to the 672 nm particle radius and even the 125 nm pixel size. A better understanding of the image formation in the microscope could increase this precision even further, to the 1.5 nm minimal error from the Cramér-Rao Bound. We can also constrain the positional errors. Since the particle positions undergo Brownian motion, their mean-square displacement grows linearly in time ∆x 2 (t) = 2Dt [34]. Any featuring error that is uncorrelated with the particle position will manifest itself as a nonzero intercept when the fitted mean-square displacement is extrapolated to t = 0. By extrapolating to zero (panel d), we find that PERI's positional errors are indistinguishable from zero and are less than 10 nm, with this constraint being limited only by statistics. Additionally, we check PERI on a dataset of 2 µm diameter particles fixed in place via strong interactions -a less demanding test since immobilizing the particles also fixes most of the sources of systematic error. In this data, we find x and y errors of 1-2 nm, z errors of 3 nm, and radii errors of 0.8 nm (see SI). Combined, these measurements demonstrate that we are able to measure particle positions and radii to within 3 nm.
Why is PERI able to measure particle positions and radii so accurately while heuristic methods fail? Heuristic methods produce poor measurements with large systematic errors simply because they ignore complexities of the image formation, such as the spatially-varying illumination and point-spread function. In contrast, PERI includes these complexities. Fitting the entire image ensures that all the complexities are accounted for -any portion of the image formation not included in the model will manifest itself as strong residuals in the fit, declaring that the model is incomplete and suggesting what additional effect must be included. This process of model selection is described in detail in the SI.
This extraordinary accuracy in measuring particle positions from microscopy images creates a new window into nanometer-range particle interactions in dense suspensions. When colloidal particles are suspended in an aqueous solution, the particles charge, as the polar solvent dissociates ions on the particles' surface groups. This charge results in an electrostatic repulsion, which is in turn screened by counterions in the bulk [35,36]. The screening creates an interparticle potential that deviates from a hard-sphere potential only at nanometer separations. This potential ever so slightly biases the distribution of particle positions away from that expected for a hard sphere suspension.
Previous efforts measured these interactions only in idealized, isolated surfaces such as a between two surfaces [37] or a single colloidal particle interacting with a wall [38,39]. However, by their nature these idealized measurements frequently cannot include possible complications present in a real suspension, such as manybody interactions, realistic surface asperities, or increases in dissolved ion concentration from dissociated surface groups on multiple particles. Measuring the interaction potential in a dense colloidal suspension includes these and many other possible complications in the interaction.
We measure these nanometer-scale interactions by using PERI to analyze a large set of images of 1.3 µm silica spheres suspended in a water-glycerol mixture. To prevent kinetic effects from confounding our measurements, we allow the sample to fully sediment for an hour. This produces an open layer of sediment approximately 2-3 particle layers deep, shown in Fig. 3a. We then image this suspension repeatedly over the course of several hours, extracting simulation-level detail of ≈ 720,000 particle positions and radii over all the images. The particle interactions determine the structure of the suspension. We quantify this structure with the probability P s (δ) of finding a pair of particles with surface-to-surface separation δ, accounting for radii polydispersity and sedimentation in a manner preferable to the usual pair-correlation function. To reconstruct the interparticle potential, we use the extracted particle radii and particle number from the data and we simulate the particle dynamics using Brownian dynamics. We incorporate both gravitational settling and the interparticle potential, which we model as an exponentially-decaying electrostatic repulsion. We then fit the potential by simulating, reconstructing P s (δ) from the simulation at each set of potential parameter values, and iterating to find the best P s (δ) that matches experiment (Fig. 3b).
The P s (δ) from the best-fit simulation and from the experimental data analyzed by PERI agree excellently, at both large and small separations. At small separations, P s (δ) rises rapidly over the first ≈ 0.1 µm near contact in both the simulation data and the data extracted by PERI, as shown in figure 3b. At longer distances (inset), the probability grows due to the increased volume where particles can be located, with slight oscillations reflecting second-and third-nearest-neighbor interactions. In contrast, previous centroid-based methods produce a P s (δ) with nonsensical features, such as significant overlaps, that cannot be fit by a simulation.
We use the extracted P s (δ) to measure nanometerscale interactions in dense colloidal suspensions for the first time. The P s (δ) measured by PERI is well-fit by an exponentially-decaying repulsive potential, as expected from electrostatic repulsion in standard colloidal theory [35] (figure 3c). From the fit, we measure the potential's screening length as 10.1 ± 2.5 nm and the repulsion strength near contact as 100 ± 30 kT, corresponding to surface potentials and screening lengths similar to that previously measured from the interaction of a single particle with a wall [38]. Our data strongly excludes hardsphere interactions as the interparticle potential. Importantly, this resolving power between potentials results from the values of P s (δ) near contact. Without the accurate localization provided by PERI, it is impossible to measure the potential at these separations.

III. DISCUSSION
Our technique and the ideas within it provide more than just a description of colloidal interactions. Nanometer accuracy in locating colloidal particle positions would revolutionize fields as diverse as the study of colloidal glasses and the measurement of biological forces with force-traction microscopy. With our open-source code 1 , other researchers can immediately analyze existing images of these systems. Moreover, the principle of accurately reconstructing an image to extract parameters applies to a wide range of fields. Extending PERI to analyze brightfield microscopy images would provide nanometerscale precision for a simpler and more widespread imaging setup than confocal microscopy. Applying these ideas to imaging modalities such as STEM or STM will usher in a new era of precision measurements, for objects whose sizes range from microns to angstroms.

IV. MATERIALS AND METHODS
The microscope is a Zeiss LSM 5 Live inverted confocal microscope, used in conjunction with an infinity-corrected 100x immersion oil lens (Zeiss Plan-Apochromat, 1.4 NA, immersion oil with index n = 1.518). The LSM 5 Live confocals operate by linescanning. Rather than rastering a single point at a time T i m e We use PERI to analyze a large ensemble of three-dimensional images of a dilute suspension of ≈1200 Brownian particles; a small section of these images is shown in the upper left. From this data, we extract an experimental Ps(δ). (b) We use molecular dynamics to create a simulated Ps(δ), and we iteratively update the interaction potential V (δ) to find the Ps(δ) that best fits the experimental data. (c) The extracted Ps(δ) from PERI (gray dashed line) and from the fitted potential (solid cyan line) agree excellently. In contrast, the Ps(δ) from a strictly hard-sphere potential (red line) does not fit the data. The difference between these potentials depends on resolving particle separations at the nanometer level. Previous centroid-based methods [40] (purple line) produce a Ps(δ) with nonsensical features, such as significant overlaps, that cannot be fit by a reasonable interaction potential. (d) From the best-fit simulation, we extract the interparticle potential V (δ). The shaded bands show the uncertainty in the potential, with the teal band describing uncertainty in the fit and the gray band the uncertainty due to systematic errors (see SI for further discussion).
to form the image, a line-scanning confocal images an entire line at once. An image of a line is focused onto the sample, and the sample fluorescence is detected on a line CCD. Rastering this line allows images to be collected extremely rapidly; the data in the text was taken at 108 in-plane frames per second. However, the different line-scanning optics worsen the point-spread function compared to a point-scanning confocal and cause illumination imperfections such as dirt to be smeared out over one direction in the image. Importantly, our confocal is outfitted with a hyper-fine piezo scanner which gives precise z-positioning of the lens. This precise zpositioning is important for accurate reconstruction of images -with the less-precise standard positioning our image reconstruction and results suffer considerably.
Our experimental images consist of ≈ 1.3 µm silica particles (MicroPearl) suspended in a mixture of glycerol and water. The glycerol/water mixture is tuned to match the refractive index of the particles by minimizing the sample scattering. For these particles we find the optimal refractive index is n ≈ 1.437 corresponding to ≈ 76% glycerol and 24% water. Since glycerol is hygroscopic, we controlled the concentration of glycerol and water by measuring the index of refraction rather than by measuring out the glycerol and water. We match the index of refraction of the spheres and the suspending fluid to within a few parts per thousand, resulting in practi-cally zero scattering by the spheres of either the laser or fluorescent light. The glycerol has the additional advantage of creating a very viscous suspension, slowing down the Brownian motion of the particles. We add fluorescein sodium salt to dye the suspending fluid, at a concentration of 0.4 mg/mL. The fluorescein diffuses rapidly compared to the particles, and is effectively uniformly distributed throughout the regions occupied by the fluid. By using a considerable amount of dye and a low laser power, we minimize photobleaching during our experiments. Fluorescein sodium salt (molar weight 376.27) consists of two sodium ions bound to a dye molecule. Thus, this dye concentration corresponds to ≈ 2 × 10 −3 moles/L of monovalent sodium ions and 10 −3 moles/L of divalent fluorescein ions. To this solution we added the 1.3 µm silica particles (MicroPearl) at a concentration of 6.8 mg particles per 1 mL of solution. These particles are placed in a 100 µm deep sample cell; since the particles sediment the experimental volume fraction is determined equally by settling and the sample cell height as opposed to simply the density of particles in the original suspension. We allow the suspension to sediment for several hours to achieve equilibrium before taking any measurements. The data is collected over the course of a 1-2 hours; we do not observe any change in the P s (δ) from the earlier samples to the later ones.

I. OVERVIEW
In this supplemental material we describe the details of our method for extracting parameters from experimental confocal images at the highest resolution possible without modifying the microscope itself. To achieve maximal resolution, we build a generative model which aims to describe the value of every pixel in the experimental image. That is, we create simulated images by explicitly modeling every relevant aspect of image formation including particle positions and sizes, the location of dirt in the optics, amount of spherical aberration in the lens, and the functional form of the point spread function. We describe each of these model components in detail in Section III and how we decided on these particular components in Section IV. In order to fit this model to the experiment, we adjust all model parameters until the features present in the true experimental image are duplicated in the simulated one. We decide when the fit is complete and extract errors of the underlying parameters by using a traditional Bayesian framework which is described in general terms in Section II. This high dimensional optimization is in general very difficult and so we describe our algorithmic improvements and particular techniques in Section V. Finally, we assess the accuracy of this method in extracting underlying parameters and compare its performance with traditional featuring methods in Section VI.
Overall, this document is meant to provide a roadmap for other researchers to follow when adapting this technique to other types of microscopy and other types of samples in order to extract the maximal amount of information from their experimental images.

II. BAYESIAN FRAMEWORK
When fitting a model to noisy data, it is useful to adopt a Bayesian framework in which we rigorously treat the noise as part of our model. In the case of our featuring method, we fit a model of each image pixel M i to experimental data d i , which can be described as a combination of signal and noise d i = S i + η i . This noise is present due to the detection of a finite number of photons by the microscope sensor, noise in the electronics, etc. and can be well described for our system by uncorrelated η i η j = 2σ 2 δ ij , Gaussian noise η i ∼ N (0, σ) (see Section III).
In a Bayesian framework, the likelihood that an individual pixel is correctly described by our model is given by the Gaussian likelihood, For uncorrelated pixel noise, the entire likelihood of the model given the image is given by the product over all pixels, We are ultimately interested in the probability of the underlying parameters given the image we record. According to Bayes' theorem, we can write this as where P (θ) are priors that allow us to incorporate extra information about the parameters θ. These priors can be as simple as the fact that the particle radius is positive definite or that a group of images share similar PSFs. For example, an overlap prior P overlap ( where H is the Heaviside step function, can be used to impose the physical constraint that particles cannot overlap. However, we found that the overlap prior only becomes relevant when the free volume of a particle is small compared to the average sampling error volume (when a particle is caged by ∼ 1 nm on all sides) and so we ignore it most of the time. We primarily work with the log-likelihood function log L because the number of pixels in the image can be very large, on the order 10 7 . For Gaussian noise, the log-likelihood is precisely the square of the L 2 norm between the model and the data. Therefore, we are able to maximize this log-likelihood using a variety of standard routines including linear least squares and a variety of Monte-Carlo sampling techniques. After optimizing, we use the covariance J T J to determine errors in the parameters or standard Monte-Carlo algorithms to sample from the posterior probability distribution to extract full distributions of the model parameters. In this way, any quantity of interest that is a function of particle distribution can be calculated using Monte-Carlo integration by Here, θ i is a parameter vector sampled fairly from the posterior probability distribution and O(θ i ) is an observable such as the pair correlation function, packing fraction, or mean squared displacement. Calculating higher-order moments provides estimated errors and error correlations on these observables. This is one of the more powerful aspects of this method -one can generate a probability distribution for each parameter and directly apply these distributions to any observable that can be inferred from the parameters.
Given this Bayesian framework, the main idea of this work is to create a full generative model for confocal images of spherical particles and provide algorithmic insights in order to implement the model on commodity computer hardware.

III. GENERATIVE MODEL
Most of the difficulty in our method lies in creating a generative model that accurately reproduces each pixel in an experimental image using the fewest number of parameters possible. Our model is a physical description of how light interacts with both the sample and the microscope optics to create the distribution of light intensity that is measured by the microscope sensor and rendered as an image on the computer. In this section, we describe the model which we use to generate images similar to those acquired by line-scanning confocal microscopy of spherical particles suspended in a fluorescent fluid.
Our generative model aims to be an accurate physical description of the microscope imaging; it is not a heuristic. Creating this model requires a detailed understanding of image formation of colloidal spheres in a confocal microscope. In the simplest view, our samples consist of a continuous distribution of dye distributed throughout the image. If the fluid is dyed (as for the images in this work), due to diffusion the dye is uniformly distributed through the fluid. The fluid-free regions, such as those occupied by the particles, are perfectly dye-free. The sample is illuminated with a laser focused through an objective lens. This focused laser excites the fluorescent dye only in the immediate vicinity of the lens's focus. An objective lens captures the dye's emitted light, focusing it through a pinhole to further reject out-of-focus light. The collected light passes through a long-pass or band-pass filter, which eliminates spurious reflected laser light before collection by a detector. This process produces an image of the sample at the focal point of the lens. Finally, rastering this focal region over the sample produces a three-dimensional image of the sample.
However, the actual image formation is more complex than the simple view outlined above. Excessive laser illumination can cause the dye to photobleach. Due to dirt and disorder in the optical train, the sample is not illuminated uniformly. Diffraction prevents the laser light from being focused to a perfect point and prevents the objective lens and pinhole from collecting light from a single point in the sample. Aberrations are present if the sample's refractive index is not matched to the design of the objective lens, broadening the diffractive blur deeper into the sample. Both the illuminating and fluorescing light can scatter off refractive index heterogeneities in the sample due to the particles.
Some of these complications can be eliminated by careful sample preparation. In practice, we eliminate photobleaching by using an excessive amount of dye in our samples and illuminating with a weak laser light. We eliminate scattering by matching the refractive index of the particles to the suspending fluid -it is fairly easy to match the refractive indices to a few parts in 10 3 . Since the scattering is quadratic in the index mismatch, the effect of turbidity due to multiple-scattering is very weak in our samples. However, the rest of these complications must be accurately described by the generative model.
Based on this physical setup, we can describe the confocal images through three main generative model components: • Platonic image Π(x) -the physical shape of the dye distribution in the sample (unmodified by perception of light).
• Illumination field I(x) -the light intensity as a function of position, including both laser intensity variation from disorder in the optics and intensity attenuation into the sample.
• Point spread function P (x; x ) -the image of a point particle due to diffraction of light, including effects from index mismatch and finite pinhole diameter.
plus three minor additional fit model components: • Image Background c, B(x) -the overall exposure of the image c and the background values corresponding to a blank image without dye, B.
• Rastering Step Size z scale -the displacement distance of the lens as it rasters along the optical axis.
• Sensor noise σ -the noise due to shot noise from finite light intensity reaching the sensor or electronic noise at the sensor.
These components are combined to form the image through convolution which is sampled at discrete pixel locations to give the final image M i = M(x i ).
Here, we describe each part of our model in detail along with our explanations and motivations behind any simplifications. In subsequent sections we will also discuss other aspects of image formation which may result in other model choices and why we omit them from the final form of the model.

A. Platonic image
The Platonic image must accurately represent the continuous distribution of fluorescent dye in the sample on the finite, pixelated image domain. The colloidal sample consists of a collection of spherical particles embedded in the solvent, with either only the particles or only the solvent dyed. Our Platonic image should then consist of the union of images of individual spherical particles, with their corresponding radii and positions. Thus, if we have a method to accurately represent one colloidal sphere, we can easily construct the Platonic image in our generative model.
A naïve way to generate the Platonic image of one sphere would be simply to sample the dye distributions at the different pixel locations, with each pixel being either 0 (if it is outside the sphere) or 1 (if it is inside the sphere) with no aliasing. This method will not work, since a pixel value in the Platonic image can only change when a sphere's position or radii has shifted by one pixel. This method of Platonic image formation would produce a generative model that does not adequately distinguish between particle locations separated by less than 1 pixel or 100 nm! Simply multiplying the resolution and corresponding coarse-graining of the boolean cut by a factor of N in each dimension increases the resolution of this method to 1/N pixels. However, calculating these high resolution platonic spheres is computationally expensive, requiring 10 9 operations to draw spheres capable of determining positions within 0.01 px.
To find the correct representation of a Platonic sphere, we examine the mechanism of image formation in Eq. S2. The final image results from a convolution of the Platonic image with the point-spread function P (x − x ; x). Thus, we need a representation of a sphere that will produce the correct image after being convolved with the point-spread function. To do this, we recall that a convolution is a multiplication in Fourier space. However, creating the image of the sphere in Fourier space is problematic since there will be undesirable ringing in the Platonic image due to the truncation from the finite number of pixels (i.e. Gibbs phenomenon). Moreover, each update of one particle requires updating all the pixels in the image, which is exceedingly slow for large images.
Instead, we look for a functional form in real space that approximates the numerically-exact truncated Fourier series, where the truncation arises due to a finite number of pixels. For a sphere with radius a at position p, this truncated Fourier series is given byΠ(q; p, a) = 4πa 3 (j 1 (q)/q)e iq·p , where q is sampled only at frequencies in the image. We can view the truncation operation as a multiplication in Fourier space by a boxcar where q is the variable inverse to position, measured in px −1 . By the convolution theorem, this truncation corresponds to a convolution in real space with sinc(x) sinc(y) sinc(z), using the inverse Fourier transform of the boxcar as the sinc function. Thus, the numerically exact image of a sphere would be the analytical convolution of sinc(x) sinc(y) sinc(z) with a sphere of radius a at position p, represented on a discrete grid. However, the convolution with the sinc function is analytically intractable. To circumvent this, we approximate the sinc function by a Gaussian. This gives a representation of the correctly-aliased Platonic image Π(x; a) of a sphere of radius a as is the Heaviside step function, which is either 0 or 1 depending on whether |x − p| > a or < a, and * denotes convolution. The Gaussian widths σ should be approximately 1 px; however, if the ratio of the z pixel size to the xy pixel size z scale = 1, then σ z will not be the same as σ x and σ y . While Eq. S3 does not generally admit a simple solution, there is a closed-form functional form for the symmetric case σ x = σ y = σ z . In the symmetric case (z scale = 1) Eq. S3 takes the form FIG. S1: Platonic sphere generation. A comparison of our approximate platonic sphere generation method to a sphere created by performing a boolean cut Π(x) = pixel dx H(|x−x −p|−a) on a lattice 100× higher in resolution in each dimension compared to the final image. On the left we show the super resolution sphere with fractional volume error δV /V = 10 −6 and an inset displaying the jagged edges caused by discrete jumps in distance. This is in contrast to the iterative approximate platonic sphere with volume error δV /V = 10 −16 drawn at an effective radius with change δa/a = 2 × 10 −4 . The differences between individual pixels along the center of the sphere (right panel) show a high frequency structure with a maximal relative value 0.08. These high frequency features are dramatically reduced later in the image formation process through the convolution with the point spread function.
where r is the distance from the particle's center. The first bracketed group of terms corresponds to treating the sphere as a flat surface, and the second bracketed group corresponds to the effects the sphere's curvature on the integral. In each sub-grouping, the first term that depends on r − a reflects the contribution due to the particle's nearer edge, and the second term that depends on r + a reflects the contribution due to the particle's farther edge. We then fit σ in Eq. S4 to best match the exact Fourier space image of a sphere, giving a value σ ≈ 0.276. Although Eq. S3 does not admit a simple solution for z scale = 1, we can use the exact form for z scale = 1 to construct an approximate solution. Since both erf(x) and e −x 2 approach their asymptotic values extremely rapidly, and since at the best fit σ ≈ 0.276 (a + r)/σ 1 for even moderately small radii, the terms erf((a + r)/σ √ 2) ≈ 0.5 and exp(−(r + a) 2 /2σ 2 ) ≈ 0 to an excellent accuracy. We then write the position vector in terms of its directionx and a vector δx as x ≡ ax + δx, and replace (a − r)/σ in Equation (S4) by (δx/σ x ) 2 + (δy/σ y ) 2 + (δz/σ z ) 2 . Note that this approximation is exact in the limit of infinite sphere radii. Empirically, we find that this approximation works quite well, giving differences in the Platonic image of a few percent from a numerical solution to Eq. S3 as well as high resolution boolean cut real-space spheres (see Fig. S1).
While this implementation of the Platonic image correctly captures most of the effects of finite-pixel size, there are still some minor details that need to be fixed to give unbiased images. By construction, Eq. (S4) conserves volume -its integral over all space is 4/3πa 3 since the Gaussian kernel is normalized. However, when Π(x) is sampled on a pixelated grid, its sum is not exactly 4/3πa 3 but is slightly different, depending on the position of the particle's center relative to a voxel's center. The slight change in volume is important for two reasons. First, the convolution with the PSF in our image generation (see next subsection) suppresses high-frequency portions of the image, but it does not affect the q = 0 component, i.e. the image sum or the particle volume. Since we aim to create a Platonic image that accurately represents the final image, we need the q = 0 component of the Platonic image to be correct. Secondly, as discussed in section IV the real microscope image is actually an integral over a finite pixel area. As such, the image recorded on the detector preserves the particle's volume or the q = 0 component of the image. To circumvent this issue of incorrect particle volume, instead of drawing the particle at its actual radius we draw it with a slightly different radius that preserves the particle's volume, which we accomplish with an iterative scheme. The results of this iterative scheme are shown in Fig. S1 along with the errors it introduces. Incidentally, the effects of image pixelation on image moments higher than 1 , e.g. x and its effects on the particle positions, are much smaller than the noise floor in our data at a moderate SNR (see section IV).
The representation in equation S4 is the best method for forming Platonic spheres on a pixelated grid that we have found. However, there are other, simpler methods which work almost as well as the Platonic sphere. Aside from the important curvature term, equation S4 is basically an erf() interpolation between particle and void at the particle's edge. Other interpolation schemes can provide similar results. For instance, the spheres could be constructed by ignoring the curvature term and replacing the erf with a logistic 1/(1 + exp((r − a)/α)), a linear interpolation between particle and void at the pixel edge, or a cubic interpolation at the pixel edge. We have also implemented these methods for generating Platonic images of spheres, fitting the parameters to match the exact Fourier representation. For the logistic we fit α, for the linear interpolation we fit the slope, and for the cubic we fit one parameter and constrain the other two such that the Platonic image and its derivative are continuous. While all of these methods are functional, they are not significantly faster than the exact Gaussian approximation in equation S4 and result in slightly worse featuring errors (see table I). As a result, we use the exact Gaussian approximation, but include these other options in our package for ease of use with more complicated shapes where the integral in equation S3 might not be analytically tractable.
The Platonic image needs to represent accurately all objects in the image, not just the spheres. In particular, when the solvent is dyed, the image usually contains a dark coverslip or its shadow from the point-spread function. We model this dark coverslip as a slab occupying a half-space. The slab is characterized by a z-position and by a unit normaln denoting the perpendicular to the plane. To capture accurately sub-pixel displacements of the slab, we use the image of a slab convolved with a Gaussian as above for a sphere; for the slab this gives a simple error (erf) function.

B. Illumination field
In order to illuminate the sample, confocal microscopes scan a laser over the field of view using several distinct patterns including point, line, and disc scanning. This illumination laser travels through the optics train and interacts with fluorescent dye in the suspension causing it to emit light in a second wavelength which is then detected. The intensity of this illumination pattern depends on the aberrations in the optics as well as dirt in the optical train which creates systematic fluctuations in illumination across the field of view. Accounting for these variations is important as they can account for most of the intensity variation in an image. In the case of our line scanning confocal microscope, these patterns manifest themselves as stripe patterns perpendicular to the scan direction, as the line-scan drags dirt across the field of view, overlaid on aberrations and optical misalignments which cause the corners of the image to dim.
FIG. S2: Illumination field residuals. A blank confocal image and its fit to the Barnes ILM in equation S7 over varying number of coefficients. Fitting the illumination with a low-order ILM of (3, 3) Barnes points removes the large fluctuations over the image but clearly shows stripes in the image. The notation (n0, n1, n2, ...) corresponds to a Barnes ILM with n0 coefficients in the expansion for P0(y), n1 coefficients for P1(y), etc. Increasing the number of points to (7,7,5,5,5) or (14,9,7,5,5,5) removes the overall modulation in y but leaves clear stripes in the image. Only at high orders of (50, 30,20,12,12,12,12) or (200,120,80,50,30,30,30,30,30,30,30) do these stripes disappear. The residuals shown in the figure are all at the same scale and are averaged over the image z for clarity.
Confocal microscopes image by rastering in z, illuminating each xy plane separately. Ideally, the microscope illuminates each plane identically. In practice, aberrations due to refractive index mismatches cause a dimming of the illumination with depth into the sample [S20]. Since this overall dimming only depends on the depth z from the interface and not on the xy position in the sample, it is natural to describe the illumination field as a product of an xy illumination and a z modulation: Empirically we find that illumination fields of this form can accurately describe our real confocal images, without incorporating any coupling between xy and z. We describe each of the separate functions I xy and I z by a series of basis functions. Since the modulation in z is fairly smooth [S20], we describe I z (z) by a polynomial P z (z) of moderate order ≈ 7-11 for 50-70 z-slices; typically we use a Legendre polynomial as the orthogonality accelerates the fitting process. The in-plane illumination of a confocal is determined by its method of creating images. Our confocal is a line-scanning confocal microscope, which operates by imaging a line illumination parallel to the x axis and simultaneously collecting the line's fluorescent image. This line is then scanned across the image in y. As a result of this scanning, any dirt in the optics is dragged across the field of view, creating the illumination with stripes along the x-direction visible in Fig. S2. To model these stripes, we treat the variation along x and y differently. We write the xy illumination field as where B k (x; c k ) is a Barnes interpolant in x and P k (y) a Legendre polynomial in y. Barnes interpolation is a method of interpolating between unstructured data using a given weight kernel [S41], similar to inverse distance weighting, using a truncated Gaussian kernel to allow for strictly local updates to the high frequency illumination structure.
We use an interpolant with equally spaced anchor points in x throughout the (padded, see section III C) image. The k th Barnes interpolant has a large number of free parameters, described by the vector c k ; the size of c k is equal to the number of anchoring points in the Barnes. To account for the fine stripes in the image, we use a large number of points for the Barnes associated with low-order polynomials, and decrease the number of points for higher-order polynomials. For a typical image of size (z, y, x) = (50, 256, 512) pixels, we use coefficient vectors of 200,120,80,50,30,30,30,30,30,30,30). While this is a large number of coefficients, there are orders of magnitude fewer coefficients than pixels in the image. As a result, all of the ILM parameters are highly constrained (on the order of a few parts in 10 5 , varying wildly with the parameter), and we do not overfit the image.
Putting this all together, we use an ILM given by: This ILM accurately describes measured confocal illuminations, as determined both from blank images and from images with colloidal particles in them. While the Barnes structure of this ILM is optimized for line-scanning microscopes, it can easily be changed. For ease of use for different microscopes or imaging modalities we have implemented various ILMs consisting of simple Legendre polynomial series, as functions P xy (x, y) × P z (z), P xy (x, y) + P z (z), and as P xyz (x, y, z). Other illumination structures -such as a radially or azimuthally striped ILM for spinningdisk confocals -could also easily be incorporated into PERI's framework. How well do these functional forms fit to experimental data for a line-scanning confocal microscope? We acquire blank images of a water-glycerol mixture as a function of depth and fit this data with Barnes illuminations of the form Eq. S7. As a function of the number of Barnes points in x and the polynomial degree in y, we look at the magnitude and patterns of the residuals. In Fig. S2, we see large scale structure in the ILM residuals, suggesting that high-order polynomials and Barnes interpolants with a large number of points are necessary. Fitting out the low-order background reveals the find stripes in x emerge due to the line-scan nature of our machine. Finally, at higher orders of interpolants and polynomials we are able to adequately capture all illumination variation independent of depth into the sample.
Fitting the ILM correctly is essential for finding the correct particle positions and radii. Fig. S3 demonstrates the effect of featuring a real confocal image with an illumination field of insufficient order. In the left panel is an image featured with a high-degree polynomial illumination of 9 th order in the x-direction and of 5 th order in the y-and zdirections. While these polynomials are high-order, they are not high enough to capture all of the structure in the light illumination. There is a clear bias in the featured radii, with particle radii being systematically larger on the edge of the image and smaller in the middle. These biases arise from large stripes in the confocal illumination due to the line-scanning nature of our confocal. Using a higher-order 25 th degree polynomial in the x-direction (upper right panel) eliminates the effect of these stripes, as visible in the featured particle radii plotted as a function of x in the bottom panel. Note that the particle radii may be biased by as much as 1 px or 100 nm due to effects of the spatially varying illumination field.  (9,5,5) order polynomial in (x, y, z). In the foreground are the featured particle radii, color-coded according to their difference from the mean. In the background is the residuals of the featured image. Clear stripes are visible in both the featured radii and the residuals. The particles are systematically much larger on the left side of the image, before decreasing in size in the middle and increasing again in a small stripe on the image's right side. In contrast, when the image is featured with a higher-order (25,5,5) degree polynomial, shown in the upper right, these systematic residuals disappear. The bottom panel shows the particle radii and image residuals for the two illumination fields as a function of the image x direction.

C. Point spread function
Due to diffraction, the illuminating laser light focused from the microscope's lens and the detected fluorescent light collected from the sample are not focused to a single point. Instead, the light is focused to finite-sized diffractionlimited blur. To reconstruct an image correctly we need to account for the effects of diffraction in image formation.
A confocal microscope first illuminates the sample with light focused through the microscope lens. The lens then collects the light emitted from fluorophores distributed in the sample. As a result, the final image of a point source on the detector results from two separate terms: an illumination point-spread function P ilm that describes the focusing of the incoming laser light, and a detection point spread function P det that describes the focused fluorescent light collected from the emitted fluorophores. Since a fluorophore is only imaged if it is both excited by the laser illumination and detected by the camera, the resulting point-spread function for a confocal with an infinitesimal pinhole is the product of the illumination and detection point-spread functions: P (x) = P ilm (x)P det (x). For a confocal with a finite-sized pinhole, this product becomes an convolution over the pinhole area. The two separate point-spread functions (PSFs) P ilm and P det can be calculated from solutions to Maxwell's equations in the lens train [S20-S23]. The PSFs can be written as integrals over wavefronts of the propagating light.
An additional complication arises from the presence of an optical interface. Most microscope lenses are essentially "perfect" lenses, creating a perfect focus in the geometric optics limit. However, refraction through the optical interface destroys this perfect focus and creates an image with spherical aberration. In addition, the refracted rays shift the point of least confusion of the lens from its original geometric focus. For a confocal geometry, this spherical aberration and focal shift depend on the distance of the nominal focal point from the optical interface z int .
All of these effects have been calculated in detail by many previous researchers [S20-S23]. The PSFs depend on several parameters: the wave vectors of the incoming and outgoing light k in and k out , the ratio of the indices of refraction n sample /n lens of the sample and the optical train design, the numerical aperture of the lens or its acceptance angle α, and the distance focused into the sample z int . For completeness, we repeat the key results here. In polar coordinates, the illumination PSF P ilm (ρ, φ, z) for illuminating light with wave vector k in traveling through a lens focused to a depth z int from the interface is [S20] Here τ s (θ ) and τ p (θ) are the Fresnel reflectivity coefficients for s and p polarized light, J n is the Bessel function of order n, and θ 2 is the angle of the refracted ray entering at an angle θ (n 2 sin θ 2 = n 1 sin θ ). To derive this equation from equation (12) in Ref. [S42], we used the additional assumption that all distance scales in the image (including z int ) are small compared to the focal length of the lens. The corresponding detection PSF P det is identical to P ilm except for the removal of the √ cos θ and the replacement of k in by the wave vector of the fluorescent light k out . For an infinitesimal pinhole, the complete PSF is the product of these two point spread functions: The expressions in equations S8-S9 are for a perfect pinhole confocal, whereas our confocal is a line-scanning confocal. While there have been several works describing line-scanning confocals [S25, S26], these authors have treated where the line is focused onto the sample by a cylindrical lens. In our confocal, however, an image of a line is focused onto the sample through the large-aperture objective lens. As such, the illumination PSF in equation S9 is replaced by the integral of the detection PSF over a line in the x direction. We use this model for a line-scanning point spread function with aberrations as our model for our exact PSF, fitting the paramters that enter into equations S8-S9. These parameters are the acceptance angle α of the objective lens, the wavelength of the laser, the ratio of energies of the fluorescent light to the excitation light, the index mismatch n 1 /n 2 of the sample to the optics, the position of the optical interface z int , and the amount that the lens is moved as the scan is rastered in z. In principle, other details could be included -polychromaticity and distribution of the fluorescent light, finite pinhole width of the illuminating line, etc. -but we find that these parameters are both relatively unconstrained by the fit and have little impact on the other reconstructed parameters, such as particle positions and radii.
In addition, for initial featuring we occasionally use a Gaussian approximation to the PSF. Based on calculations of the exact PSF, ≈ 90% of the function can be described by a Gaussian [S22]. We verified this for PSFs calculated from Eq. S8, and found that although the presence of aberrations from the interface worsens the Gaussian approximation, generally a Gaussian accounts for ≈ 90% of the PSF except for in the most aberrated cases (large index mismatch imaging deep into the sample). Our simplest approximation of the PSF is as an anisotropic Gaussian with different widths in x, y, and z, with the widths changing with distance from the interface. We therefore parameterize the Gaussian widths as a function of depth, where each width σ i (z) is described by a polynomial in z, typically a second order Legendre polynomial.
FIG. S5: PSF generated biases. Using an incorrect point-spread function results in significant biases, as PSF leakage affects neighboring particle fits. Moreover, since the PSF gets significantly broader with depth, using a spatially constant PSF, there are systematic biases with depth in both the z positions (left panel) and a characteristic drift in the fitted radii errors with depth (right panel), as shown for the delta-function (identity), an (x, y, z) anisotropic Gaussian, and a depth-varying Gaussian point-spread function. In contrast, using the correct Chebyshev PSF eliminates the errors in both the radii and z positions (data points forming thin orange line). Figure S5 shows the effects of ignoring these details about the point-spread function on the extracted positions. We generate confocal images using a simulated, exact PSF with random distribution of particles up to a depth of FIG. S6: Experimental background image. The measured background from our line-scan confocal microscope captured by adjusting the exposure to a full brightness image, removing the sample, and capturing a set of images with no illumination including room lights. Note that the range of values is from 1 to 7 out of a maximum 255 given by the 8-bit resolution of the CCD. While only a variation of 3%, we have seen in the illumination field section that this can create a bias that significantly alters our inference as a function of the position in the field of view. To remove this bias we fit the background field to a low order polynomial and add it to our model image.
30 µm. Featuring this data using a 3D anisotropic Gaussian, we find a strong depth-dependent bias in the featured z position and radii measurements. Using a low order z-dependent Gaussian PSF decreases this bias only slightly. Interestingly however, ignoring the effects of diffraction completely and replacing the PSF with a Dirac delta-function does not cause significantly worse results than treating the PSF as a spatially-varying Gaussian. As shown by Fig. S5, an exact PSF is required to locate particle's positions and radii to within 20 nm (0.2 px). Therefore, we employ the full line-scan PSF calculation into our model.
The point-spread function defined in equations S8-S9 decays extremely slowly with z and somewhat slowly in ρ. To accurately capture these long-tails of the PSF in our generative model, we calculate the PSF on a very large grid for convolutions, corresponding to ≈ 40×25×30 px or ≈ 6×3×4 µm in extent, which is considerably larger than the size of the 5 px radii particles. The long tails of the PSF bring information about structure far outside the image into the image region. As such, our generative model is defined not only in regions corresponding to the interior microscope image but also in an exterior padded region, which is cropped out when comparing to the model. For completeness, we still define the ILM and Platonic image (including exterior particles) in the exterior padded region; however parameters confined to this exterior region of the image are relatively unconstrained. We make up for this loss in speed due to the increased size by doing an extremely accurate but approximate convolution based on Chebyshev interpolation, as described in a future paper.

D. Background
Due to background, the detector CCD pixels always read a non-zero value even when there is no light incident on them. We incorporate this into our generative model by fitting a nonzero background level to the images. Ideally, this background would be constant at every pixel location. Empirically, however, we find from blank images that this background varies with pixel location in the detector (see Fig.S6). For our confocal microscope, we find the background is slowly-varying in the optical plane, perhaps due to different dwell times for different regions of the line scan and different sensitivies of different pixels; the background does not vary in z. As a result, the background is well-modeled by a low-order polynomial in x and y.
However, due to the long-tails of the PSF, the coverslip slab affects the image in a much larger z region than that of a typical particle. Rather than dealing with this by using an even larger point-spread function, we use the calculated point spread function to capture the effects of the PSF's moderate tails on the particles and slab, and fit a polynomial in z to capture the residual slab correction. This residual correction is mathematically the same as a background level in the detector. As a result, while the "true" background in the image is P (x, y), our model uses a background P (x, y) + P slab (z), as the coverslip is usually oriented along the z direction.

E. Sensor noise
The last feature of the generative model is our understanding of the unrecoverable parts of the image: noise. To study the intrinsic noise spectrum of the confocal microscope, we subtract the long wavelength behavior from the blank image of Fig. S2. After removing the background we find that the noise appears white and is well approximated by a Gaussian distribution (see Fig. S7). There are, however, some highly localized non-Gaussian parts to the noise spectrum, arising due to the specific nature of our confocal. For instance, at high scan speeds slight intensity fluctuations in the laser's power couple to the dwell time on each stripe of line-scanned pixels. This produces periodic stripes across the image with a wavevector mostly parallel to the scan direction, but with a random noisy phase. How can we handle these sources of correlated noise and do they affect the quality of our reconstruction?
In principle, these correlated noise sources can be represented in the Bayesian model by introducing a full noise covariance matrix. That is, instead of writing that log-likelihood as the product of all pixel values, we can write where Λ −1 ij is the covariance matrix between each pixel residual in the entire image. In our optimization, we would form a low dimensional representation for this covariance matrix and allow it to vary until we find a maximum. In doing so, we would reconstruct the image and the correlated noise simultaneously. In practice, this introduces a large computational overhead due to the need for a full image convolution during each update as well as many new free parameters that need to be optimized.
Therefore, when desired we address the effect of correlated noise by working in reverse -we identify the several intense Fourier peaks in the confocal noise spectrum and remove them from the raw data before the fitting process. An example of this noise pole removal is given in Fig. S7. There, we can see that removing only 5 distinct poles ( Fig. S7(d)) removes almost all visible correlated noise structure while changing the overall noise magnitude by a negligible amount. This small shift in estimated noise magnitude only affects the estimate of the errors associated with parameters such as positions and radii in a proportional way. Since these errors are very small and do not bias our inferred parameters, we often ignore the confocal's noise poles in our analysis entirely.

IV. MODEL CONSIDERATIONS
Here, we investigate several complexities of image formation in confocal microscopes and systematically analyze whether or not it is necessary to include them in our generative model. In particular, we will first analyze how much complexity we must introduce into the model elements listed in the previous section, including the platonic image, illumination field, and point spread function. We will also look at elements of image formation which we have not explicitly included in our model. First, confocal microscopes build a 3D image by rastering in 1, 2, or 3 dimensions (see section III). There is noise in this rastering procedure that affects the image formation process. Second, The final image that comes from this scan is a cropped view of a much larger sample; the edges of this cropped image are influenced by the excluded exterior particles. Third, while the actual distribution of light intensity is a continuous field, the detector only measures a pixelated representation of this field. Fourth, while the exposure is made by the camera, particles undergo diffusional motion, blurring their apparent location. In this section, we address each of these image formation complexities and their effects on the inferred parameters.
We would like to systematically investigate at what level omitting a detail of the image formation from the model affects the fitted parameters. We can understand this quantitatively by examining the optimization procedure. Let us assume that the true image formation is completely described by a set of N parameters Θ. Then, near its maximum, the log-likelihood is approximately quadratic: log L = 1 FIG. S7: Noise spectrum (a) Real-space plot of residuals representing the intrinsic noises generated in line-scanning confocal microscopy. This noise spectrum was generated by subtracting the background from a blank sample as in Fig.S2. Notice that while most of the signal appears to be white noise, there is a systematic modulation along the x coordinate and high frequency features in the y scan direction. (b) Fourier power of the noise spectrum given in (a). The high frequency modulation can now be seen as two small 'poles' in the Fourier power spectrum. Note the dark box in the center of the spectrum is created by subtracting the high order polynomial background from the blank image. In (c) and (d) we present the real and Fourier space noise after removing several discrete peaks in the Fourier intensity that represent correlated noise sources. The removed signal can be seen in (e) showing the stripes created by the scanning nature of the confocal microscope.
In (f) we show the histogram of residuals from (a) and (c). In solid red we plot the data and in dashed black lines we plot a Gaussian fit to the residuals with a width σ = 0.0398, showing that the noise spectrum is well approximated by a Gaussian distribution after taking into account long wavelength background features.
FIG. S8: Component complexity residuals. Here we visually demonstrate the results of choosing between different forms of model components as well as different parameterizations of a single component. We generate simulated microscope images using the model components which we employ when fitting experimental data (left column) and fit them with different choices of platonic image (top row), illumination field (middle row), and point spread function (bottom row). Each choice is labeled above its panel showing the residuals and each row is on a common color scale. In the case of the platonic forms, the boolean cut, linear interpolation, and constrained cubic display higher order multipole errors while the logistic function's first correction is to the monopole (volume) term (as shown by the presence of rings). In the illumination field, stripes are present in the residuals until we use a Barnes interpolant with 30 control points. Past that, the ability to capture intensity as a function of depth is the remaining term which we are able to fit with a single extra Legendre polynomial in the z-direction. Finally, in the case of the PSF, we see hard boundaries transitioning to softer boundaries using a Gaussian PSF in both 3D and 3+1D. The residuals all but disappear when the image is fit with our exact line-scan confocal PSF model (Eq. S8) approximated by a Chebyshev polynomial in 3+1D.
M parameters, which for convenience we denote as θ. Thus we can write the log-likelihood as three separate terms: The first term, containing only the parameters θ that we are fitting, is the quadratic in the reduced space, with a maximum at the true parameter values. The unimportant third term reflects the separate contribution to log L of the unknown or ignored portions of the model, and is constant in the θ space. However, the second term mixes both the fitted parameters θ and the unknown parameters Θ j . This mixing results in a linear shift of log L in the θ space away from the true parameters, and causes a systematic bias due to an incomplete model. Minimizing log L with respect to θ gives the fitted values of the parameters gives an equation for the best-fit incomplete model parameters θ: whereH −1 is the inverse of the sub-blockH of the Hessian matrix H that corresponds to the fitted parameters θ. We can use equation S13 to estimate the effect on one of the estimated parameters θ j , if we ignore one aspect of the generative model Θ k . Ignoring the off-diagonal terms in H −1 to capture the scaling gives θ j ≈ H kj Θ k /H jj . Thus, the error in the fitted parameter θ j is proportional to both the coupling H kj between that parameter and the ignored aspect of the generative model, and the magnitude of the error of the generative model Θ k .

A. Component complexities
There are several choices one can make concerning the form and complexity of each of the components of our model image. As discussed in the Section III, we have implemented many forms of the platonic image, illumination field, and point spread function and each one of these forms has a varying number of parameters with which to fit. How do we decide which form to use and at which complexity (number of parameters) to stop? To decide on a per-image basis, we could employ Occam's factor, which is a measure of the evidence that a model is correct given the data [S43]. In practice, however, we are mainly concerned with how these models influence the underlying observables which we are attempting to extract. That is, we wish to use knowledge of the physical system to check which model best predicts the particle locations and sizes. To do so (as mentioned in the main manuscript), we often turn to particle sizes versus time as well as particle overlaps, both physical statements that assert almost no assumptions on our system.
We can also get a sense of the magnitude of the effect these choices have on inferred positions and radii by creating synthetic data and fitting it using a simpler model. In Fig. S8 we show the residuals of such fits for various simplifications made to the platonic form, illumination field, and point spread function. In the left columns of the figure we see the reference image formed using the most complex image model available and in each row the residuals for each choice with a description of that choice above the panel. For all but the last column, in which we fit the image with the exact model once again, we can see systematic errors in the fit. We compute how much these residuals influence the extracted positions and radii and report these errors in Table I. In particular, most choices of platonic image aside from the naive boolean cut do not influence particle featuring below an SNR of 30. However, the complexity of the illumination field always matters until all long wavelength structure is removed from the image. Finally, the choice of PSF is crucial, requiring the use of a calculated confocal PSF to even approach the CRB.

B. Scan jitter
Confocal microscopes operate by taking an image with the lens at a fixed z position to create one layer of the three-dimensional image, then moving the lens up a fixed amount to take the next layer. In our generative model, we assume that these steps of the lens (and the resultant image slices) are perfectly equally spaced by an amount which is fitted internally. However, a real confocal microscope will have some error in the vertical positioning of the lens. As a result, the actual image taken will not be sampled at exactly evenly spaced slices in z, but at slices that are slightly shifted by a random amount.
To test the effect of this z-scan jitter on our parameter estimation, we simulate images taken by a confocal microscope with imperfect z-positioning. Instead of sampling the image at a deterministic z position, we instead sampled the   Fig. S8. Note that while the components with the largest impact on determining underlying parameters are the ILM and PSF, the choice of platonic image cannot be ignored in order to reach the theoretical maximum resolution. Interestingly, in the case of PSF selection, Gaussian(x, y, z, z ) (3+1D) is almost no better at extracting particle positions than Gaussian(x, y) (2D). However, its ability to extract particle sizes increases by 3 since it takes into account the variation of the PSF in space. Additionally, in the case of the ILM, capturing the stripes in the illumination using a 30 control point Barnes increases the resolution by 3 whereas capturing the illumination's dependence in depth causes the resolution to increase 10 fold.
image at a z position shifted from the ideal position by an uncorrelated Gaussian amount of varying standard deviation. A representative image of a 5 px radius particle with a step positioning error of 10% is shown in Fig. S9(a). There is very little difference between this image with z jitter and the perfectly-sampled image, as shown by the difference image in panel b. We then fit an ensemble of these images at varying image SNR levels, over a random sampling of image noise, z-jitter noise, and random shifts of particle positions by a fraction of a pixel. The results of these fits are shown in Fig. S9c, showing the actual error in the featured positions versus the size of the z-positioning noise. For our confocal which is equipped with a hyper-fine z-positioning piezo, we expect the z positioning error to be a few nm, or a few percent of a pixel. For a 3% error in positioning, the signal-to-noise ratio must be ≈ 100 for the effects of z-positioning jitter to be comparable to the theoretical minimum effect from the image noise. This small effect of the error is partially due to the large size of our particle. If each z slice of the image is randomly displaced with standard deviation σ, then we expect roughly a σ/ √ N scaling for the final error in the particle's z-position, where N is the number of z slices the particle appears in. A 5 px diameter particle with a 4 px axial point-spread function occupies ≈ 18 difference slices, decreasing the effect of scan noise by a factor of ≈ 4 and putting it below the CRB for our data.
As the error in z-positioning increases, however, the effect on the featured particle positions increases correspondingly. The error due to a ≈ 10% z jitter is comparable to the CRB for image noises of SN R = 20. For exceptionally large z-jitters of 40% the error due to the lens positioning dominates all other sources of error. However, even with this large error in lens positioning, the error in featured positions is still only 10% of a pixel, or about 10 nm in physical units.

C. Missing and Edge particles
The point spread function delocalizes the particle's image over a region larger than the particle's size. As a result, if two particles are close enough together, their images can overlap. This overlapping is a significant problem for heuristics such as centroid fitting, as the true particle centers do not coincide with the fitted centroid. In contrast, The difference between the image with positioning error and a reference image with zero positioning error. The differences between the images are both random and small, for this image no more than 7% of the perfect image intensity. (c) The effect of lens positioning error on featured particle positions, at signal-to-noise ratios of 20, 50, 200, and 500. The solid symbols and dashed lines show the position error for images with imperfect lens positioning, while the solid lines denote the Cramer-Rao bound for an image with no positioning error. At lens positioning errors of ≈ 10% or larger, the error in featured positions from the z-slice jitter dominates that from the simple image noise, even for an SNR of 20. However, the featuring error due to a z jitter of ≈ 1% is less than the error due to image noise, for any noise level than can be captured by an 8-bit camera.
PERI's accuracy is negligibly affected by the presence of a second, close particle, since PERI correctly incorporates close particles in its generative model. The CRB of two touching, 5 px diameter particles increases by only ≈ 3%, and PERI finds particles to this accuracy when close.
However, large systematic errors can affect PERI when one of these particles is missing in the generative model. This situation is illustrated in its simplest form in Fig. S10. If one of the two touching particles is missing from the generative model, then the second particle will be enlarged and drawn into the first particle's void to compensate, as shown in panel b. As a result, the missing second particle will severely bias the fitted positions and radii of the first particle. Figure S10c shows the magnitude of this effect. For particles separated by 1 px or less, significant biases on the order of 0.4 px appear in the identified particle's featured position. These biases matter at essentially all values of the SNR, only being comparable to the CRB for SNR < 1. As a result, it is essential for PERI to identify all the particles in the image to return accurate results. For this reason, we take extra precaution and thoroughly search the image for missing particles before fitting, as detailed in section V.
The biases caused by missing particles appear whether or not the missing particle is located inside or outside the image. As a result, accurately locating edge particles requires identifying all their nearby particles, even ones that are outside the image! We attempt to solve this problem by padding the Platonic and model images and the ILM by a significant portion, and including this padded extra-image region in both the add/remove and relaxation portions of the PERI algorithm. Nevertheless, it is extremely difficult to locate all the particles outside the image, for obvious reasons. As such, there is the possibility for moderate systematic errors to enter for particles located at or near the edge.
Nevertheless, if the exterior particle is identified, PERI correctly locates the interior particle, as shown in Fig. S11. To demonstrate this, we create simulated images of two particles near the boundary of an image. One particle is placed at z = a so that its edge just touches the boundary while the other is placed at z = −(a + δ) on the other side of the border. We plot the CRB of the interior particle and the measurement errors of both PERI and trackpy [S40] as a function of the exterior particle's coordinate in Fig. S11. While the CRB only changes by a factor of 2 as the particles come within contact, the featuring errors grow drastically for traditional featuring methods due to biases introduced by the exterior particle. For this same data set, PERI featuring errors follow the CRB allowing precise The difference image for a bad generative model that includes only the particle on the left. To minimize the effect of the missing right particle, the left particle is drawn to the right and expanded in radius. This effect is visible as the red and blue ring on the right border of the left particle. (c) The error in position along the separation axis, as a function of true surface-to-surface distance, for a model with a missing particle. When the particles are separated by ≈ 10 px the featured particle is located correctly. However, as the particles get closer than ≈ 2 px significant biases start to appear. These biases saturate at a separation of ≈ 0.1 px, corresponding to a featuring error of ≈ 0.4 px.
unbiased featuring of particles at the edge of images.
This apparent conundrum of edge particles presents an interesting positive side-effect. Missing edge particles affect the fits because they contribute a significant amount to the image. As such, we might expect that a particle outside the field of view can still be located very precisely. This prediction is borne out by a calculation of the Cramér-Rao bound, as shown in Fig. S12. Until the particle and PSF fall off the edge of the image (distance > 1R), the CRB remains constant for all particle parameters. When the particle is centered on the image edge (distance of 0), the CRB is twice that of the bulk, intuitively corresponding to a loss of half of the information about the particle. As the volume of the particle leaves the image, the CRB decreases as 1/δ 2 until the particle is no longer part of the image. Interestingly, Fig. S12 shows that the PSF constrains the particle position to within 0.1 px even when the particle is entirely out of the image! If correctly seeded with a moderate guess for the particle position outside the image, PERI will locate the particle to a precision of the Cramér-Rao bound. However, in practice it is very difficult to seed these particles into PERI, as a slight change of the intensity at the image edge could be either a missing particle outside the image or a slight variation in the ILM near the image edge. Nevertheless, PERI is very good at locating particles that are partially outside the image.

D. Pixel intensity integration
Our generative model considers the image formed on the camera as if the camera pixels had an infinitesimal size. In reality, the camera pixels have a finite extent. As a result, the image at each pixel on the camera is not a discrete sampling of the light intensity, as in our generative model, but is instead an integration in the detector plane over the pixel's size.
To check whether the effect of pixel integration matters, we generated images that were up-sampled by a factor of 8 in the xy-plane. We then numerically integrated these images over the size of each pixel. A representative image is shown in Fig. S13a. There is very little difference between the xy-integrated image and the generative model, as visible in panel b. We then fitted an ensemble of these xy-integrated pixel images, both over an ensemble of noise samples and over an ensemble of particle positions shifted by a random fraction of a pixel. The results are shown in Fig. S13c. We find that there is no discernible effect of pixel integration at a SNR of 200 or less. The error due FIG. S11: Influence of particles outside of the image. Here we place one particle at x = a and a second particle at x = −(a + δ) so that one is completely inside the image and the other outside. We plot the CRB for the x, y, and z positions and radius a of the interior particle as well as measured errors for PERI in triangles and a centroid algorithm (trackpy [S40]) in circles as a function of the position of the second particle. When the exterior particle is further than a pixel outside the image we see that the measurements of the interior particle are constant. However, as the PSF of the exterior particle begins to overlap the interior particle the CRB and all measured errors increase dramatically. While PERI's measured error continues to follow the CRB, trackpy's error increases beyond pixel resolution. Note that pixel separations at the edge are generic in colloidal images especially in dense suspensions.
to neglecting pixel integration becomes comparable to that due to noise only for SN R ≥ 400, which is significantly higher than the maximum allowed by an ordinary 8-bit camera. Thus, the effect of integrating over a pixel size for a colloidal particle essentially always has a negligible effect on the fitted parameters.

E. Diffusional motion
A typical colloidal particle is not fixed in its location, but diffuses about due to Brownian motion. For an isolated colloidal particle, this Brownian motion results in a random walk with mean displacement x = 0 and a mean-square displacement x 2 = 6Dt that is linear in time, with a diffusion constant D = kT /6πηR where η is the solvent viscosity and R the particle radius. As a result, the microscope takes an image not of a colloidal particle at a single position, but of an integrated image of the colloidal particle over the trajectory that it has diffused.
First, at what length-and time-scales is a colloidal particle de-localized due to Brownian motion by a scale that is larger than the resolution? For a 1 µm diameter particle in water to diffuse the 1 nm resolution provided by PERI takes a fantastically small time of t = 1 nm 2 /D ≈ 10µs. Even for our relatively viscous samples of ≈ 80% glycerol and 20% water this time slows down to only ≈ 600µs. These times are orders of magnitude faster than the ≈ 5ms required by our confocal to take a 3D image of the particle, corresponding to a 8 nm displacement. Thus, a freely diffusing particle has always diffused much more than the featuring errors than the uncertainty intrinsic to PERI.
However, this does not mean that the precision past 8 nm is empty. The particle's positions are Gaussian distributed about its mean value during the exposure time. While the extent of the distribution is much larger than the PERI featuring errors, the particle's mean position during the exposure time is well-defined. Moreover, the actual image on the camera from the diffusing particle is a convolution of the particle's trajectory with a single particle image. Since this convolution is like an averaging, we might expect that the small Brownian excursions are averaged out in the image formation, and that the image allows for accurate featuring of the particle's mean position.
We can use the formalism of Eq. S13 to show that Brownian motion does not affect our featuring accuracies. Let the particle's mean position bex 0 , and its Brownian trajectory be x 0 (t). Then the actual image I(x)on the detector FIG. S12: CRB of edge particles. Here we calculate the Cramér-Rao bound of the x, y, and z positions as well as radius (in red, blue, green, purple respectively) for an isolated particle as a function of its distance to the edge of the image. For positive displacement (inside the image) we see very little change with position as expected. As parts of the PSF leak out of the image (displacements close to zero, positive) we see that the expected error increases slightly since information is lost. Finally, as the particle itself leaves the image, information is lost more dramatically as indicated by a sharp rise in the CRB. However, note that even at a displacement of one radius a, the PSF allows us to locate the particle outside of the image to within a pixel. While in practice it is difficult to identify these particles systematically, their presence can greatly influence the measured positions of other edge particles. is where I 0 (x) is the image of one particle at position x and t exp is the camera exposure time. As before, we view the actual image as I(x) = I 0 (x 0 ; θ) + (1 − Θ)∆I, in terms of a group of fitted parameters θ and an additional parameter Θ describing the effects of Brownian motion ∆I. For the true image Θ = 0 but for our model image Θ = 1. Then equation S13 says the error will be θ j ≈ H kj /H jj , where H Θj = ∂ Θ ∂ θj I = ∂ θj ∆I. However, for small displacements the effect of Brownian motion on the image is since ∂I(x 0 )/∂x i does not depend on time. As a result, ∂ θ k ∂ Θ ∆I = 0 and there is no affect of Brownian motion on the image to first order in the displacements, i.e. when the particle displacement is moderately small compared to the radius. Finally, in Fig. S14 we show empirically that the effect of Brownian motion is negligible for our exposure times. To create an image of a diffusing particle captured by a slow camera, we simulated a 200 point Brownian trajectory of a R = 5 px radius particle, generating an image for each point in the particle's trajectory. We then took the average of these images as the noise-free image captured by the microscope. One such image is shown in Fig. S14a. Once again, there is a slight difference (10%, as shown in panel b) between the slow image of a diffusing particle and the reference image taken of a particle at a single location. We then fitted an ensemble of these images, over a variety of both Brownian trajectories and noise samples. Figure S14c shows the results of these fits as a function of the mean displacement during the collection τ exposure /(R 2 D), where τ exposure is the exposure time of the camera and D the particle's diffusion constant. Brownian motion has a negligible effect on the featured positions for our experimental images of freely-diffusing particles (camera exposure time of 100 ms and D = 0.007 µm 2 /s corresponding to a 1 µm particle in 80:20 glycerol:water, corresponding to τ exposure /(R 2 D) ≈ 10 −3 ). Interestingly, however, to achieve a FIG. S13: Pixel Integration (a) The xz cross-section of a simulated image of a 5 px radius colloidal particle, where each pixel contains the light intensity integrated over its area instead of sampled at its center. (b) The difference between the pixel-integrated image and a reference image sampled at the center of the pixels. The differences between the images are small (10%) and centered in a ring which has mean 0 and is positioned at the particle's edge. (c) The effect of pixel integration on featured particle positions as a function of particle radius, at signal-to-noise ratios of 20, 200, and 2000. The solid symbols and dashed lines show the position error for images generated with pixel integration and fit without, while the solid lines denote the Cramer-Rao bound for the images (without pixel integration). Integrating over a pixel area has no effect on the featured positions for any SNR compatible with an 8-bit depth camera. The effect of pixel integration only starts to matter for an SNR ≥ 400 (not shown).
higher localization accuracy at a higher SNR of ≈ 200, Brownian motion must be correctly taken into account in the image formation. Incorporating Brownian motion at these high signal-to-noise ratios would allow the teasing out of information about the particle's trajectory from a single image.

V. IMPLEMENTATION
A typical confocal image is roughly 512 x 512 x 100 pixels in size and contains 10 4 particles meaning that the number of degrees of freedom in our fit is roughly 10 7 described by 10 5 parameters, a daunting space to optimize. On modern hardware using the highly optimized FFTW, the typical time for an FFT the size of a single image is ∼ 1 sec. Given this time, a single sweep through all parameters would take an entire week while a full optimization would consume a year of computer time. However, since particles have finite size, we are able to optimize most of these parameters locally with a small coupling to global parameters (ILM, PSF). Additionally, the finite intensity resolution of microscope sensors, typically 8 or 16 bits, allows us to make further simplifications to our model. Here we describe the practical algorithmic optimizations that we have made as well as the optimization schedule that we have devised to quickly reach the best fit model.

A. Partial image updates
First and foremost, we optimize our fitting procedure by working in image updates and only updating parts of the image that are required at any one time. In order to modify the position of one particle by a small amount, the number of pixels that are affected is simply (2a + w) 3 where a is the particle radius and w is the PSF width, both in pixels. For a typical particle, the ratio of this volume to the entire image volume is typically 10 −2 which represents a speed up of the same factor due to the roughly linear scaling of FFT performance with problem size (N log N ). In addition, since the PSF decreases with distance from a particle's center, a localized object produces only a weak FIG. S14: Brownian Motion (a) The xz cross-section of a simulated image of a 5 px radius colloidal particle undergoing strong Brownian motion τexposure/(R 2 D) = 0.01 during the image formation. (b) The difference between the diffusing-particle image and a reference image without diffusion. The differences between the images are small (10%) and are mostly in a ring with mean 0 at the particle's edge. (c) The effect of Brownian motion on featured particle positions as a function of exposure time, at signal-to-noise ratios of 20, 50, 200, and 500. The image exposure time for our camera is located in the shaded grey band for 20/80 water/glycerol and blue band for pure water. The solid symbols and dashed lines show the error between the fitted positions and the mean position in the particle's trajectory, while the solid lines denote the Cramer-Rao bound for the generated images. At our exposure times and SNR of 20, the effects of Brownian motion are small compared to those from noise in the image. Interestingly, for higher SNR or slower exposure times, Brownian motion starts to have a noticeable effect and must be incorporated into the image generation model. signal in regions far away from it. For confocal microscope PSFs, the distance scale associated with this signal change is only a few tens of pixels. Therefore, we employ a technique common applied to inter-atomic potentials in molecular dynamic simulation -we simply cutoff the PSF at this distance scale allowing for exact partial updates. By cutting off the PSF, we are able to incrementally apply image updates in an exact procedure (up to floating point errors). For example, when moving a single particle from x 0 to x 1 , we must simply calculate the local image change given by cf. equation S2, then calculate M + ∆M only in a small local region around the particle being updated. We are able to use similar update rules for all variables except for those effecting the entire image such as the PSF, offset, z scale , and estimate of the SNR. Additionally, in our code, we generously employ the principle of "space-time trade-off" in which we cache intermediate results of all model components and reuse them later in the computation. In particular, we maintain a full platonic image and illumination field, which we update along with the model image. We also cache the calculated PSF so that we may utilize previous results until the PSF is sampled. In doing so, we are limited in our current implementation by the speed of the FFT, which takes 70% of the total runtime.

B. Optimization of parameters and sampling for error
Once an approximate initial guess is obtained by more traditional featuring methods [S44], we optimize the parameters by fitting using a modified Levenberg-Marquardt routine. Our Levenberg-Marquardt algorithm uses previouslyreported optimization strategies designed for large parameter spaces [S32]. However, a Levenberg-Marquardt minimization requires the matrix J iα ≡ ∂m(x i )/∂θ α , which is the gradient of each pixel in the model with respect to all the parameters. For the ≈ 10 5 parameters and 10 7 pixels in our image, this matrix would be many thousand times too large to store in memory. Instead, we construct a random approximation to J iα by using a random sub-section of pixels x i in the image to compute J. This approach works well for the global parameters (PSF, ILM, etc) but fails for the particles, which appear in a relatively small number of pixels. For the particles, we instead fit small groups of adjacent particles using the full J iα for the local region of affected pixels. As the global parameters and particle parameters are coupled, we iterate by optimizing first the globals, then the particles, and repeating until the optimization has converged.
Once the model is optimized, we can employ two different methods to extract the errors associated with each parameter. Since we calculated the gradients J during the optimization procedure, we can use this to find the covariance matrix (J T J) −1 which gives the correlated sensitivities of each parameter. In practice this is the faster method and yields accurate results and, as such, is our method of choice. However, additionally, we may use Monte Carlo sampling to estimate parameter errors. Our Monte Carlo sampler sweeps over each parameter and updates the particle position, accepting or rejecting based on the change in the log-likelihood of the model. We use slice sampling to produce highly uncorrelated samples, allowing an excellent error estimate from only a few sweeps. Our error sampling doubles as a check for convergence. If the log-likelihood increases after sampling, then the optimization has not converged and either more Monte Carlo sampling or more traditional optimization is needed. In practice, when desired we check with ≈ 5 − 10 Monte Carlo sweeps, and ensure that the log-likelihood remains the same or fluctuates by a few times √ N , where N is the number of parameters in the model.

C. Source code
A complete implementation of this method is provided in a Python package called peri, whose source can be found online 2 along with extensive documentation about the particulars of its implementation. Additionally, it is available at PyPI.org, the central repository for Python packages outside of the standard library.

A. Generated Data
We check our algorithm by benchmarking it against physically realistic image models, as shown in Fig. S15. For maximal realism, we generate these images with every model component in equation S2 as realistic as possible. We use our exact calculation for line-scanning confocal microscopes, with physical parameters expected from an experiment. From the structure of our fitted line-scan confocal images, we re-create a random illumination field that closely mimics the power spectrum of our actual confocal. We position the particles randomly, without placing them preferentially on the center or edge of a pixel. Since real images have particles that are also outside or partially inside the image, we generate the image on a large region before cropping to an internal region, resulting in edge particles and particles outside the field of view. 3 We then fit these algorithms both with PERI and with traditional centroid-based featuring algorithms. When we fit these images with PERI we start with initial guesses that are not near the correct parameter values, to ensure that our method is robust to realistic initial guesses. For the centroid featuring methods, there are several algorithms and variants that can be used. We use the most commonly used of these versions, as implemented by Crocker and Grier [S44] in the IDL language. All of these centroid algorithms require the user to select various parameters, such as a filter size for smoothing of the noisy image and a mask size for finding the centroid positions. As is well-known in the colloid community, using the incorrect parameters can result in significantly poorer results. To overcome any possible limitation from using the incorrect parameters, we fit all the possible parameters 4 in the Crocker-Grier (CG) algorithm and use only the ones that produces the best global featuring of the data, as compared to the correct particle positions. (Centroid methods do not accurately find particle radii). Needless to say, an actual experimenter does not have access to the ground truth or to the optimal parameters for the featuring. Moreover, even with these FIG. S15: Accuracy benchmark. We compare the featuring errors of PERI and a traditional centroid (Crocker Grier or CG) featuring method with the optimal featuring parameters. The panels show the featuring errors vs. particle separation (upper left panel), vs PSF aberration through the index mismatch n2/n1 (upper right panel), vs. particle radius (lower left panel), and vs. the suspension volume fraction (lower right panel).
optimal parameters, the centroid algorithm frequently misses a large fraction of particles, even in simple images. As such, we view the centroid featuring errors as unrealistically optimistic and probably not attainable with centroid methods even by experts. The results of these comparisons are shown in Fig. S15.
When two particles are close, their images overlap due to the breadth of the point-spread function. This overlap causes centroid methods considerable difficulty. To compare the effects of PSF overlap on both PERI and CG featured positions, we generate an ensemble of realistic images with isolated pairs of particles at random orientations and at a fixed particle edge-to-edge separations. The upper-left panel shows these results for edge-to-edge separations from 0.01 px to 2.0 px, with a fixed noise scale of about 0.05 of the illumination amount. As the randomly-generated illumination fields vary from image to image, and the illumination varies from region to region within an image, there is not truly a global SNR for all of the images; the fluctuations in this SNR from image to image are the origins of the fluctuations in featuring error throughout Fig. S15. PERI features particles at the Cramer-Rao bound regardless of their separation. In contrast, even at large separations of 2 px, CG has significant errors due to particle overlaps.
index of refraction of the optics n 1 and of the sample n 2 . Moving the ratio n 2 /n 1 away from 1 increases aberration in the image. While increasing the aberrations in the lens negatively affects PERI's ability to feature particles, the localization accuracy always remains excellent. In contrast, CG methods perform poorly throughout, with extremely poor performance as the aberrations increase.
Since the CRB decreases with particle radius, we expect that increasing the particle radius should result in an increase in localization accuracy. The lower-left panel of Fig. S15 shows that PERI's precision improves with increasing particle radius. In contrast, the Crocker-Grier precision worsens with increasing particle radius. We hypothesize this arises due to the flat intensity profile near the center of a large particle, whereas a centroid method assumes that the intensity is peaked at the particle center. As a result, slight noise can significantly worsen a large particle's localization with centroid methods. Conversely, centroid algorithms improve for small particles, performing only 3× worse than PERI's localization accuracy for particles with radius 2 px. For particles small to the PSF size, the image is essentially a single peak, which centroid methods work well for.
Realistic images taken with confocal microscopes consist of particles randomly distributed, occasionally close together and occasionally far apart. To examine the localization in these images, we use a Brownian dynamics simulation to create a random distribution of particles at volume fractions from φ = 0.1 to φ = 0.6. PERI localizes particle positions and radii excellently in all of these images, as visible in the lower-right panel. In contrast, centroid methods perform uniformly poorly, with localization accuracies of approximately half a pixel. Interestingly, these centroid algorithms do not localize significantly worse for dense suspensions despite the presence of more close particles, although they do frequently fail to identify particles.
Finally, we check how the complexity of our synthetic data affects the accuracy of standard featuring methods. In Table II we see, surprisingly, that there is a non-monotonic relationship between positional error and image complexity, becoming optimal when there is significant striping in the image but little variation in depth. However, the rate of missing particles decreases significantly with simpler models and rising to as much as 40% for our most complex model images. The effective resolution of CG is never much smaller than a single pixel in these synthetic tests, most likely due to pixel edge biases.

B. Fixed Particles
Next, we check PERI on a sample of fixed particles. The sample is prepared by first making a dyed solution of 2 µm silica particles in an index-matching mixture of glycerol and water and loading the sample into a sample cell. At the edge of the sample we then add an equal amount of water-glycerol mixture saturated with salt (NaCl) and allow it to diffuse into the bulk of the sample over the course of a two weeks. As the salt diffuses in, it locally reduces the screening length and causes particles to strongly bind together. By letting the salt diffuse into the sample rather than mixing it in directly, the particles are able to sediment first before becoming fixed to each other, creating the dense sediment shown in figure S16. We then image these particles with a five-second delay between images and analyze the resulting images using PERI. The particle positions fluctuate by 2.9 nm, 1.7 nm, and 1.2 nm (median value) for z, y, and x, respectively, bounding the errors from above. (It is possible that some of the particles are not fixed to less than 2 nm.) We find radii fluctuations of 0.8 nm.
FIG. S16: Featuring Stuck Particles. The raw image of the 2 µm sample of fixed particles (left), and the residuals to the fit (right), shown in xy, yz, and xz cross-sections. Not only is the sample is extremely dense, but as the image is quite deep the index mismatch between the sample and the confocal optics creates strong aberrations deep into the sample. Despite these complications, PERI is able to fit this complex image and to accurately locate particles in it.

VII. EXPERIMENTAL DETAILS
To extract the interparticle potential, we use Molecular Dynamics simulations to find P s (δ) and vary the parameters to find the best-fit P s (δ). Since we know the particles' positions and radii via PERI, we seed the simulation with the featured particle positions and radii and relax the particle positions thoroughly before sampling for P s (δ). Using the extracted particle parameters enforces both the correct amount of particle radii polydispersity and the number density of particles. In the simulation we use a standard DLVO potential, consisting of non-retarded van der Waals attractions and Debye-Huckel repulsion [S45], augmented by gravitational settling. The free parameters we fit are the strength of the attraction, the strength and screening length of the repulsion, and the gravitational settling strength; physically these correspond to the Hamaker constant, a combination of the particle zeta potential and salt concentration, and the average particle density.
Since the P s (δ) is measured from the simulation as a histogram with a finite number of samples, each simulated P s (δ) is somewhat noisy. In light of this noise, we use a Nelder-Mead algorithm to find a good initial estimate of the fit parameters. We then refine this estimate of the fit parameters. First, we fit the ensemble of simulations to an approximate model which is locally linear in the fit parameters. We then use this linear model to estimate a new set of best-fit interaction parameters and refine our estimate of the potential; the curve plotted in Fig. 3 of the main text is the P s (δ) generated from the linear model at the best fit parameters. To estimate uncertainties in the fit, we repeat this process by fitting a random subset of half of the simulations to a model and finding the new set of best-fit interactions. Repeating this 1000 times provides an estimate on the best-fit parameters as the mean of these best-fit parameters, and the uncertainty as the standard deviation of those parameters. Finally, we also obtain an estimate of systematic errors due to mis-featuring of particle positions and radii by fixing each particle's radius to be its mean value throughout all the images it is measured in. Surprisingly, fixing each particle's radius to a value that does not fluctuate in time worsens both the reconstrunction and the experimentally-measured P s (δ), producing about three times as many overlaps. This probably arises because in some sense PERI directly measures the particle separations from the microscope image -changing the separation of two particles slightly will considerably change the fraction of fluorescing dye separating them. Nevertheless, this fixed-radius data gives an order-of-magnitude estimate of any systematics in the experimentally-measured P s (δ). In addition, we fit the inteparticle interactions for several different forms of the potential: hard spheres, electrostatic repulsion only, electrostatic repulsion and van der Waals attraction (DLVO theory), and DLVO theory combined with a short-ranged hydrophilic repulsion. The fitted interaction parameters for 3 sets of interparticle potentials, for the positions and radii extracted from PERI ("fitted a") and for each particle's radius fixed to its mean value over the duration of many frames ("fixed a"). The interparticle potentials are: Pure electrostatic repulsion U (δ) = U el e −δ/λ el , full DLVO theory (i.e. electrostatic repulsion and van der Waals attraction) with Hamaker constant A, and full DLVO with an additional, short-ranged hydrophilic repulsion U hyd e −δ/λ hyd . The uncertainties are the uncertainties in the fit only. Table III shows the extracted potential parameters for all the interparticle interactions. Each interaction potential is fit two ways, by allowing the fitted particle's radius to vary with time and by fixing each individual particle's radius to its average value over the frames. With the exception of a pure hard-sphere potential, all of the various interaction potentials equally well-fit the data. In particular, fitting the data with just an exponentially-decaying electrostatic repulsion fits the data no better than including the van der Waals interaction. However, while our data does not exclude a nonzero Hamaker constant, the data is well-fit by Hamaker constants of a few kT. A hydrophilic repulsion is similarily not necessary to fit the data, but our data can accommodate hydrophilic repulsion of a reasonable strength and length scale. Since there are considerably more overlaps in the fixed radii data, we use the interaction potentials from the dataset with radii fitted by PERI as the best estimate of the fitting parameters, and the difference between the fits as an estimate of the systematic uncertainties from imperfect experimental data.