Direct Phasing of Finite Crystals Illuminated with a Free-Electron Laser

Richard A. Kirian, Richard J. Bean, Kenneth R. Beyerlein, Miriam Barthelmess, Chun Hong Yoon, Fenglin Wang, Flavio Capotondi, Emanuele Pedersoli, Anton Barty, and Henry N. Chapman
Center for Free-Electron Laser Science, DESY, Notkestrasse 85, 22607 Hamburg, Germany
Department of Physics, Arizona State University, Tempe, Arizona 85287, USA
European XFEL GmbH, Albert Einstein Ring 19, 22761 Hamburg, Germany
Fermi, Elettra Sincrotrone Trieste, SS 14 km 163.5, 34149 Basovizza, Trieste, Italy
University of Hamburg, Luruper Chaussee 149, Hamburg, 22761, Germany
(Received 31 July 2014; revised manuscript received 5 November 2014; published 12 February 2015)


I. INTRODUCTION
Serial femtosecond crystallography (SFX) [1][2][3] exploits coherent, femtosecond-duration hard x-ray pulses produced by an x-ray free-electron laser (XFEL) to record diffraction patterns from protein crystals on a time scale that is faster than atomic motion [4]. This method of "diffraction-before-destruction" allows for a delivered radiation dose (energy/mass) that is orders of magnitude larger than the tolerable dose using longer-duration exposures because the pulse terminates before the onset of significant atomic motion [5], thereby bypassing some of the known limitations of conventional synchrotron-based macromolecular crystallography. Such high-intensity snapshot diffraction patterns allow for room-temperature structure determination of radiation-sensitive biological macromolecules from crystals of just a few hundred nanometers in size. In this way, SFX has enabled the room-temperature determination of novel structures from natively inhibited in-vivo grown protein microcrystals [6] and G protein-coupled receptor microcrystals grown in lipidic cubic phase [7], and it permits atomic-resolution structure determination in static [8] and time-resolved studies.
SFX requires the averaging of many snapshot diffraction patterns from different crystals in order to measure reliable diffraction intensity to high resolution. These averaged data suffer from the same phase problem as encountered in conventional crystallography. Existing methods for determining diffraction phases require either a known structure of sufficient similarity to produce initial phase estimates [9] or high-resolution data in conjunction with either (1) small molecular structures (about 1000 atoms or less) [10], (2) isomorphous derivative crystals labeled with heavy atoms [11], (3) diffraction data recorded near resonant conditions [12], or (4) other physical modifications of the structure (e.g., by inducing radiation damage [13]). Most of the aforementioned phasing methods are likely applicable to SFX data [14], though some modifications may be necessary in cases of extremely high x-ray fluence [15]. The method we demonstrate here differs from all of them and may be applied to SFX data from submicron crystals without restrictions on resolution.
The genesis of SFX is, in part, due to the advances in x-ray coherent diffractive imaging (CDI) [16,17], which has previously shown that lensless imaging could be performed on single-shot XFEL images taken before the object was destroyed [18]. CDI works on the basis that a fine sampling of the continuous coherent diffraction pattern contains enough information to determine the phases of the recorded far-field diffraction amplitudes [19]. This information may be used to produce an image of the illuminated object without the use of a lens, at a resolution limited, in principle, only by the wavelength of the radiation and the maximum recorded diffraction angle.
As Sayre pointed out [20], traditional crystallographic diffraction data only measure the diffraction intensity at reciprocal-lattice points that satisfy the Bragg condition, which is a factor of 2 less than the critical sampling rate needed to recover an unaliased autocorrelation function of the crystal unit cell. For that reason, the common iterative phasing strategies associated with CDI have only rarely been applied directly to crystallographic data in situations of especially high solvent fraction [21,22]. Work by Perutz carried out decades ago [23] aimed to determine oversampled molecular transforms by physically modifying the unit-cell size of haemoglobin crystals, but this approach has not developed into a practical phasing strategy so far. However, early SFX experiments demonstrated strong diffraction from submicron protein crystals that exhibited measurable intensities sampled between Bragg spots without any need to modify the crystals [1]. More recent work has shown that merging tens of thousands of diffraction patterns, each from a different crystal, into a finely sampled three-dimensional average intensity map yields a recoverable signal between Bragg reflections at medium resolutions [24]. This is remarkable evidence that SFX data allow one to bridge the divide between the crystallographic phase problem and the continuous Fraunhofer diffraction phase problem. A variety of iterative phasing strategies that have been developed in the context of CDI [25][26][27] can be applied to this three-dimensional average intensity, as suggested previously and demonstrated in simulations [28]. Additional simulation studies have been performed that extend the original concept to include the effects of disorder [29][30][31] and have explored the sensitivity of the method to noise [32,33]. Furthermore, the effects of different crystal-edge truncations have been investigated and found to greatly influence the diffracted intensity between Bragg spots. This turns out to be an important consideration when attempting iterative phasing, and methods to account for it have been suggested [31,34].
The possibility of directly imaging finite crystals via oversampled diffraction was considered prior to the work of Spence et al. [35]. As shown by the simulations of Miao et al., iterative phase-retrieval methods may be applied directly to individual (appropriately sampled) finite-crystal diffraction patterns in the same way as they are applied to noncrystalline targets. In practice, the recovery of a three-dimensional structure by such means would require (1) a signal-to-noise ratio in single diffraction patterns that is sufficient to support image recovery, (2) a reciprocal-space sampling frequency corresponding to the size of the whole crystal, (3) a resolution limited by the projection approximation (for a given wavelength), and (4) tomographic assembly of many real-space images after iterative phase retrieval is applied to an equal number of diffraction patterns, each of which introduces errors into the final real-space structure. The method we (and colleagues) proposed previously [28] avoids all of the aforementioned challenges. As we discuss in the following section, the key difference in our proposed technique is that we decouple the underlying unit-cell transform from the average finite-lattice transform. By decoupling these terms, we open the possibility to average many patterns from size- and shape-varying crystals directly in the translationally insensitive three-dimensional reciprocal space. The phase problem may then be solved for the entire data set in a single phase-retrieval step performed directly on a reciprocal-space volume sampled according to the unit-cell size (not the whole crystal size), with signal-to-noise ratio limited, in principle, only by the number of available diffraction patterns. Recent experimental work has indicated the need to merge tens or hundreds of thousands of XFEL diffraction patterns in order to observe intensities between protein-crystal Bragg reflections at moderate resolutions [24]; the low signal detected in individual patterns from protein crystals continues to be the primary challenge to directly applying iterative phasing on a pattern-by-pattern basis. Radiation damage makes this approach particularly challenging at synchrotron sources [36].
In this paper, we present the first proof-of-principle experimental demonstration of the direct determination of a crystal unit-cell structure via the oversampled average over many coherently illuminated nonidentical crystal diffraction patterns, using soft x-ray FEL diffraction data collected from artificial two-dimensional crystal targets. A theoretical description of the method will be given, followed by a description of the samples, experimental apparatus, and a detailed description of the data reduction procedure and phasing algorithm employed. The paper concludes with an analysis of the recovered structures and a discussion of the results.

II. METHOD FOR PHASING SIZE- AND SHAPE-VARYING FINITE CRYSTALS
Consider a finite crystal with real-space unit-cell density $\rho(\mathbf{r})$ that repeats at the periodically positioned real-space lattice points $\mathbf{R}_j$, so that the whole-crystal density of the $n$th crystal may be written as

$$\rho_n(\mathbf{r}) = \sum_j \rho(\mathbf{r} - \mathbf{R}_j). \tag{1}$$

Under the far-field kinematic diffraction approximation, the complex-valued diffraction amplitudes $\tilde{\rho}_n(\mathbf{q})$ corresponding to the $n$th crystal are proportional to the Fourier transform of $\rho_n(\mathbf{r})$:

$$\tilde{\rho}_n(\mathbf{q}) = \sum_j \tilde{\rho}(\mathbf{q})\, e^{i \mathbf{q} \cdot \mathbf{R}_j}, \tag{2}$$

where the scattering vectors are defined as $\mathbf{q} = 2\pi(\mathbf{s} - \mathbf{s}_0)/\lambda$, with $\mathbf{s}$ and $\mathbf{s}_0$ the outgoing and incoming wave vectors and $\lambda$ the monochromatic illumination wavelength, and the tilde symbol denotes the $n$-dimensional Fourier transform of a function:

$$\tilde{f}(\mathbf{q}) = \int f(\mathbf{r})\, e^{i \mathbf{q} \cdot \mathbf{r}}\, d^n r. \tag{3}$$

Note that we have neglected multiplicative terms such as the beam polarization factor, detector solid angle, and incident fluence, since these parameters and functions are known and can be corrected for at the data reduction stage if the need arises.
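Under the assumption that the crystal is constructed entirely by repeating a common unit-cell density, the unit-cell transform in Eq. (2) can be moved out of the summation so that the diffraction amplitude can be written as the product of the unit-cell transform and the finite-lattice transform:

$$\tilde{\rho}_n(\mathbf{q}) = \tilde{\rho}(\mathbf{q}) \sum_j e^{i \mathbf{q} \cdot \mathbf{R}_j}. \tag{4}$$

The measured diffracted intensity is proportional to the modulus square of the diffracted amplitude, which we may write as

$$I_n(\mathbf{q}) = |\tilde{\rho}(\mathbf{q})|^2\, S_n(\mathbf{q}), \tag{5}$$

where

$$S_n(\mathbf{q}) = \Big| \sum_j e^{i \mathbf{q} \cdot \mathbf{R}_j} \Big|^2. \tag{6}$$

If we choose two points $\mathbf{q}$, $\mathbf{q}'$ in reciprocal space that are shifted with respect to each other by an integer combination of reciprocal-lattice vectors, $\mathbf{q}' = \mathbf{q} + \mathbf{g}_{hkl}$, then $S_n(\mathbf{q}') = S_n(\mathbf{q})$ because $\mathbf{g}_{hkl} \cdot \mathbf{R}_j = 2\pi m$, where $m$ is an integer. In other words, the squared modulus of the finite-lattice transform is periodic about the reciprocal-space lattice vectors. Moreover, since there is inversion symmetry about the origin, $S(\mathbf{q}) = S(-\mathbf{q})$, it follows from the periodicity that there is inversion symmetry about every lattice point: $S(\mathbf{g}_{hkl} + \mathbf{q}) = S(\mathbf{g}_{hkl} - \mathbf{q})$.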
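The periodicity and inversion symmetry of the squared finite-lattice transform can be checked numerically. The following is a minimal sketch for a hypothetical one-dimensional crystal of seven unit cells; the function name `lattice_transform_sq`, the lattice constant, and the crystal size are illustrative assumptions, not values from the experiment.

```python
import numpy as np

def lattice_transform_sq(q, n_cells, a=1.0):
    """Squared modulus of the finite-lattice transform of a 1D crystal.

    q       : array of scattering-vector values (radians per unit length)
    n_cells : number of unit cells in the crystal
    a       : lattice constant, so that R_j = j * a
    """
    R = a * np.arange(n_cells)                        # lattice points R_j
    amplitude = np.exp(1j * np.outer(q, R)).sum(axis=1)
    return np.abs(amplitude) ** 2

a = 1.0
g = 2 * np.pi / a                                     # reciprocal-lattice spacing
q = np.linspace(-0.5 * g, 0.5 * g, 201)
S = lattice_transform_sq(q, n_cells=7, a=a)

# Periodicity: S(q + m g) = S(q) for integer m (here m = 3).
periodic = np.allclose(S, lattice_transform_sq(q + 3 * g, 7, a))

# Inversion symmetry about a lattice point: S(g + q) = S(g - q).
symmetric = np.allclose(lattice_transform_sq(g + q, 7, a),
                        lattice_transform_sq(g - q, 7, a))

# At a Bragg condition all terms add in phase, giving n_cells squared.
peak = lattice_transform_sq(np.array([g]), 7, a)[0]
```

Both checks hold for any crystal size, which is why the average over crystals of different sizes retains the same symmetry.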
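We now consider the average over many diffraction patterns, which produces the intensity

$$\langle I_n(\mathbf{q}) \rangle_n = |\tilde{\rho}(\mathbf{q})|^2\, \langle S_n(\mathbf{q}) \rangle_n. \tag{7}$$

If a sufficient number of patterns contribute to this average, $\langle S_n(\mathbf{q}) \rangle_n$ will possess the same symmetry properties as $S_n(\mathbf{q})$. Spence et al. [28] suggested that the average squared finite-lattice transform can be determined from the average diffracted intensity directly by taking the average of $\langle I_n(\mathbf{q}) \rangle_n$ over all translations $\mathbf{q} \rightarrow \mathbf{q} - \mathbf{g}_{hkl}$:

$$\langle \langle I_n(\mathbf{q} - \mathbf{g}_{hkl}) \rangle_n \rangle_{hkl} = \langle |\tilde{\rho}(\mathbf{q} - \mathbf{g}_{hkl})|^2\, \langle S_n(\mathbf{q} - \mathbf{g}_{hkl}) \rangle_n \rangle_{hkl}. \tag{8}$$

If the terms $|\tilde{\rho}(\mathbf{q})|^2$ and $\langle S_n(\mathbf{q}) \rangle_n$ are uncorrelated with each other, then the average product is equal to the product of averages, and hence

$$\langle \langle I_n(\mathbf{q} - \mathbf{g}_{hkl}) \rangle_n \rangle_{hkl} = \langle |\tilde{\rho}(\mathbf{q} - \mathbf{g}_{hkl})|^2 \rangle_{hkl}\, \langle \langle S_n(\mathbf{q} - \mathbf{g}_{hkl}) \rangle_n \rangle_{hkl} = \langle |\tilde{\rho}(\mathbf{q} - \mathbf{g}_{hkl})|^2 \rangle_{hkl}\, \langle S_n(\mathbf{q}) \rangle_n, \tag{9}$$

where the last step uses the periodicity of the finite-lattice transform. Spence et al. further postulated that the periodic average over the isolated unit-cell transform would likely produce a flat function, so one might assume the approximation

$$\langle S_n(\mathbf{q}) \rangle_n \propto \langle \langle I_n(\mathbf{q} - \mathbf{g}_{hkl}) \rangle_n \rangle_{hkl}. \tag{10}$$

The operation described by Eq. (10) is the periodic average over reciprocal-space intensities, which can be accomplished by computing the average peak profile over all reciprocal-space Wigner-Seitz cells (i.e., the smallest primitive unit cell that can be constructed) and then replicating the average profile throughout reciprocal space.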
With the squared finite-lattice transform determined directly from the diffraction data in this way, the finely sampled squared unit-cell transform may be expressed as

$$|\tilde{\rho}(\mathbf{q})|^2 \propto \frac{\langle I_n(\mathbf{q}) \rangle_n}{\langle \langle I_n(\mathbf{q} - \mathbf{g}_{hkl}) \rangle_n \rangle_{hkl}}, \tag{11}$$

for which the phases may be determined via well-established iterative phasing algorithms [27]. Spence et al. [28] demonstrated this method through simulated diffraction data; here, we extend the proof of principle for this technique by using experimental data from artificial two-dimensional crystals under realistic conditions.

It is important to note that Eq. (11) takes a different form if the crystal density, defined in Eq. (4), does not have identical unit-cell content throughout. As discussed in previous work [30,31], for space groups other than P1, it is possible that incomplete unit cells exist at the boundaries of the crystal. If such defects are present, a reformulation of the reconstruction algorithm is required because such crystals do not have a well-defined physical unit cell. In this situation, a reconstruction algorithm might instead seek to determine the molecular asymmetric unit of the crystal, but conventional phasing algorithms developed in the context of CDI are not effective. While a general solution to this problem has not yet been proposed, an approximate solution that works in limited cases has been demonstrated through simulations of crystals bounded by randomized partial unit cells [31]. In this paper, crystals bounded by partial unit cells are not considered, although we demonstrate that it is the physical crystal boundary that determines the recovered unit cell, as previously postulated.
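The averaging and division described by Eqs. (10) and (11) can be illustrated with a noise-free one-dimensional simulation. This is a sketch under stated assumptions: the unit cell made of three Gaussian "atoms", the crystal sizes of 5 to 12 cells, and all variable names are hypothetical choices for illustration.

```python
import numpy as np

a = 1.0            # lattice constant (arbitrary units)
n_orders = 9       # Wigner-Seitz cells covered, centered on h = -4 ... 4
n_sub = 16         # samples per reciprocal-lattice period

# Fractional Miller index h; each block of n_sub samples is one Wigner-Seitz cell.
h = np.arange(n_orders * n_sub) / n_sub - n_orders / 2
q = 2 * np.pi * h / a

# Hypothetical unit cell: three Gaussian "atoms" with an analytic transform.
x_atoms = np.array([0.15, 0.40, 0.70]) * a
w_atoms = np.array([1.0, 0.6, 0.8])
sigma = 0.05 * a
rho_t = (w_atoms * np.exp(1j * np.outer(q, x_atoms))).sum(axis=1)
rho_t = rho_t * np.exp(-0.5 * (q * sigma) ** 2)
U_true = np.abs(rho_t) ** 2                  # squared unit-cell transform

def lattice_sq(q, n_cells):
    """Squared finite-lattice transform of a crystal with n_cells unit cells."""
    R = a * np.arange(n_cells)
    return np.abs(np.exp(1j * np.outer(q, R)).sum(axis=1)) ** 2

def periodic_average(f):
    """Mean over all Wigner-Seitz cells, replicated across the full range."""
    return np.tile(f.reshape(n_orders, n_sub).mean(axis=0), n_orders)

# Average intensity over crystals of different sizes.
I_mean = np.mean([U_true * lattice_sq(q, n) for n in range(5, 13)], axis=0)

S_est = periodic_average(I_mean)             # lattice-transform estimate, Eq. (10)
U_est = I_mean / S_est                       # unit-cell transform estimate, Eq. (11)

# U_est equals U_true divided by the periodic average of U_true, which is
# the quantity assumed to be approximately flat.
flat = periodic_average(U_true)
identity_holds = np.allclose(U_est * flat, U_true, rtol=1e-8, atol=1e-10)
```

In this idealized setting the only error in the recovered transform is the slowly varying factor `flat`, so the quality of the estimate depends on how uniform the periodic average of the unit-cell transform actually is.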

III. SAMPLE DESIGN AND FABRICATION
In order to perform a controlled experiment on well-characterized targets, we designed finite periodic patterns of Pt islands consisting of simple unit cells containing four symmetry-related elliptical objects, as shown in Fig. 1. The ellipses were positioned with sufficient space between them so as to avoid any complications during fabrication. We designed four different unit-cell configurations, in order to demonstrate the importance of the edge truncation of the crystals. Note that internal densities within each crystal are essentially identical and that the differing unit-cell definitions are distinguished only by the way the edges of the crystals terminate. As we will show, the edge termination determines which unit cell is reconstructed by the iterative phasing procedure described later. For each of the four unit-cell configurations, we designed 20 different crystals of rectangular shape with randomly chosen dimensions ranging from 5 to 12 unit cells along an edge.
We fabricated our two-dimensional crystals by deposition with a focused ion beam (FIB). Patterns were deposited in Pt onto 30-nm-thick Si3N4 membranes supported by a Si frame with 100 × 100 μm windows. Pt was deposited via a gas injection system [(CH3)3Pt(CpCH3)] with an electron beam current of 43 pA at an energy of 5 keV. The crystal unit-cell size was approximately 1.25 × 1.25 μm² and the deposition thickness (Pt content of approximately 16%) was about 20 nm. Eighty targets were fabricated in total, an example of which is shown in Fig. 1.

IV. INSTRUMENTATION AND DATA COLLECTION
Measurements were carried out at the coherent diffraction imaging (CDI) experimental station at the DiProI beamline [37] of the FERMI@Elettra FEL [38][39][40]. We chose to perform our investigations at an FEL source so that the relevant effects of shot-by-shot wavefront phase and intensity variations would be present in our data. Compared to FELs that operate through self-amplified spontaneous emission, the FERMI FEL x-ray pulses are seeded and thus nearly monochromatic. The narrow spectral width (Δλ/λ ≈ 5 × 10⁻⁴) [39] is ideal for our experiments since, strictly speaking, the lattice transform recovered by our method is periodic only for a monochromatic beam.
The experimental station hosts all necessary components and diagnostics for beam cleaning and sample positioning. The 32.5-nm wavelength FEL beam was focused with a Kirkpatrick-Baez (KB) mirror system [41] and entered the chamber through a circular aperture of 5 mm diameter in order to reduce the stray radiation coming from the beamline. The focus of the KB optical system was optimized to a spot size of about 25 μm full width at half maximum using a phosphor scintillator screen and indentations on a poly(methyl methacrylate)-coated silicon wafer placed at the sample plane. The average incoming FEL pulse energy was approximately 27 ± 5 μJ, with root-mean-square shot-to-shot fluctuations of about 20% during our measurements. The number of pulses incident on the sample was controlled by a fast shutter, and the pulse intensities were controlled with a gas cell and Al filters that allow beam attenuation of up to 4 orders of magnitude. For each exposure (both in single-shot mode and in multishot accumulative mode), the beam intensity and spectrum were acquired shot by shot.
Diffraction patterns were collected using a detection system in an indirect configuration where the scattered x rays were reflected onto a CCD (Princeton Instruments MTE2048B; 2048 × 2048 pixels, each 13.5 μm in size) by a 45° multilayer mirror, mounted on a motorized optical gimbal, with a central hole to allow the passage of the direct beam [42]. A schematic of this geometry is shown in Fig. 2. At our operational conditions, the 16-bit detector is saturated at approximately 20,000 photons per pixel. To improve the signal-to-noise ratio of the detected diffraction, a 200-nm Al filter (transmission of 30%, taking into account a 7-nm Al2O3 oxide layer on both sides) was placed upstream of the focusing optics to remove the contamination of the 260-nm seeding laser.
The samples were optically aligned onto the FEL beam path using a four-axis manipulator motor stage and a long-range optical microscope. Diffraction patterns were first obtained in accumulative mode, using an attenuated x-ray beam in order to center each target using low-intensity exposures of less than 5 mJ/cm². We verified that there were no signs of significant radiation damage at this intensity by comparing a series of successive diffraction patterns. Best centering was determined by inspecting the Bragg reflections in each exposure and seeking the position that produced the greatest degree of inversion symmetry. The breakdown of inversion symmetry about Bragg reflections indicates that the incident wavefront had a nonuniform phase, and we observed that nearly all patterns contained some degree of deviation from ideal inversion symmetry, even when centering was optimized. This is likely because the maximum size of the crystals was 15 μm × 15 μm, so even the well-centered 25-μm beam did not provide uniform illumination over the crystal. Once the targets were centered, we collected a diffraction pattern at full fluence, which destroyed the target. The final one third of the diffraction patterns were captured at full fluence without careful centering because of time constraints (we captured diffraction from all 80 targets over the course of about one day).
Figure 3 shows a typical raw diffraction pattern at full fluence, along with three additional shots at lower fluence. The typical number of pixels between Bragg reflections was approximately 105. This detector sampling was sufficient to reveal the interference fringes from our largest crystal targets, although, as we describe in the following sections, we did not require these fine features in our analysis. In addition to the effects of the nonuniform illuminating wavefront, apparent Bragg peak asymmetries are partly caused by the underlying unit-cell transform, which can be thought of as a modulation of the finite-lattice transform according to Eq. (5).

FIG. 3. [...] Panel (b) corresponds to that in the full pattern (a), while the additional three zoomed regions (c-e) were recorded from the same target but at lower fluence and with the target shifted by a few microns between shots. The breakdown of inversion symmetry, and its sensitivity to beam position, is evident. The prominent central streaks of intensity, which are due to edge scatter from the Si window, were ignored throughout our data analysis.
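As a rough consistency check on the quoted sampling, the spacing between neighboring Bragg reflections can be estimated from the wavelength, unit-cell size, detector distance, and pixel size given in the text. The sketch below assumes a flat detector normal to the beam and normal incidence on the crystal, and it ignores the 45° mirror geometry, so it is only an order-of-magnitude estimate.

```python
import math

wavelength = 32.5e-9    # m, FEL wavelength quoted in Sec. IV
unit_cell = 1.25e-6     # m, approximate unit-cell edge from Sec. III
distance = 56.7e-3      # m, detector distance quoted in Sec. V
pixel = 13.5e-6         # m, CCD pixel size

# First-order Bragg angle at normal incidence: sin(theta) = lambda / a.
theta = math.asin(wavelength / unit_cell)

# Position of the first-order peak on a flat detector, in pixels.
spacing_pixels = distance * math.tan(theta) / pixel
print(f"{spacing_pixels:.1f} pixels between neighboring Bragg reflections")
```

Under these simplifying assumptions the estimate comes out at roughly 109 pixels, within a few percent of the approximately 105 pixels reported, which is reasonable given that the unit-cell size is itself quoted as approximate.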

V. DATA ANALYSIS AND REDUCTION
Our data-processing pipeline began with a visual inspection of all patterns. We rejected frames that were clearly damaged or contaminated by foreign debris. We saw no obvious need to mask any malfunctioning pixels, but we ignored the central cross-shaped region of the detector shown in Fig. 3 throughout our analysis since it was contaminated by scatter from the sample Si frame. We also ignored the central region of the diffraction patterns since no intensities were observed because of the hole in the 45° mirror that the direct beam passed through. We assumed the beam center to be the point about which an image had maximum inversion symmetry, which we determined by computing the cross correlation between the image and a copy of itself inverted about its center. (The vector pointing to the peak of this cross-correlation function is twice the length of the vector pointing to the inversion center.) The detector distance of 56.7 mm was determined from the known unit-cell size of the targets.
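The beam-center search by inversion symmetry can be sketched as follows. This is a minimal illustration using a circular (FFT-based) cross correlation on a synthetic image; the function name `inversion_center` and the Gaussian test spot are assumptions, and a real pattern would also need masking of bad regions, which is not shown.

```python
import numpy as np

def inversion_center(image):
    """Estimate the point about which `image` is most inversion symmetric.

    Returns (row, col) in pixel coordinates, resolved to half a pixel.
    """
    ny, nx = image.shape
    inverted = image[::-1, ::-1]              # inversion about the array center
    # Circular cross correlation C[s] = sum_x image[x + s] * inverted[x].
    cc = np.fft.ifft2(np.fft.fft2(image) * np.conj(np.fft.fft2(inverted))).real
    iy, ix = np.unravel_index(np.argmax(cc), cc.shape)
    # Convert wrapped FFT indices to signed shifts.
    sy = iy - ny if iy > ny // 2 else iy
    sx = ix - nx if ix > nx // 2 else ix
    # The correlation peak sits at twice the offset of the inversion center
    # from the array center, so the shift is halved.
    return (ny - 1) / 2 + sy / 2, (nx - 1) / 2 + sx / 2

# Synthetic check: a Gaussian spot centered at (30, 37) on a 64 x 64 grid.
yy, xx = np.mgrid[0:64, 0:64]
test_image = np.exp(-((yy - 30) ** 2 + (xx - 37) ** 2) / (2 * 3.0 ** 2))
center = inversion_center(test_image)
```

The factor of one half in the return statement is the numerical counterpart of the remark that the correlation peak lies at twice the vector pointing to the inversion center.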
Since the orientation of each target varied slightly from one to the next, we determined the crystal reciprocal-lattice vectors (relative to the laboratory reference frame) for each pattern via an "auto-indexing" algorithm similar to that described by Steller et al. [43] (details can be found in Appendixes A and B). Our auto-indexing routine was not vulnerable to indexing ambiguities because we knew a priori the approximate orientation of each target; we would have otherwise needed to include the peak intensity information in the reciprocal-lattice determination [44].
Using the crystal reciprocal-lattice vectors, we remapped the raw intensity data onto a symmetric orthogonal grid by averaging pixel intensities according to their nearest fractional Miller indices h, k. By remapping intensities according to only two Miller indices, we effectively projected the intensities onto a plane in reciprocal space, which is justified for our targets because they are known a priori to be thin; we expect no significant reciprocal-space intensity variation along the normal to the crystal plane. We found that it was also necessary to correct for slight distortions in the 45° mirror that reflected diffraction patterns to the CCD. Details of the remapping procedure are discussed in Appendix C, and a typical resulting pattern along with predicted and found peaks is displayed in Fig. 4.
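The remapping onto fractional Miller indices can be sketched with a simple binning step. The example below is a simplified illustration, not the procedure of Appendix C: it assumes a flat, small-angle geometry in which the reciprocal-lattice vectors are expressed directly in detector pixel units, and it omits the mirror-distortion correction. The function name and all parameters are hypothetical.

```python
import numpy as np

def remap_to_miller_grid(image, center, astar, bstar, h_max, n_sub):
    """Average detector pixels onto a regular grid of fractional Miller indices.

    image        : 2D detector frame
    center       : (row, col) of the beam center in pixels
    astar, bstar : reciprocal-lattice vectors in detector pixel units, (row, col)
    h_max        : grid spans -h_max ... +h_max in both h and k
    n_sub        : grid samples per unit Miller index
    """
    rows, cols = np.indices(image.shape)
    xy = np.stack([rows.ravel() - center[0], cols.ravel() - center[1]])
    # Solve xy = h * astar + k * bstar for the fractional indices (h, k).
    basis = np.column_stack([astar, bstar])
    h, k = np.linalg.solve(basis, xy)
    # Nearest grid node for every pixel.
    size = 2 * h_max * n_sub + 1
    ih = np.rint(h * n_sub).astype(int) + h_max * n_sub
    ik = np.rint(k * n_sub).astype(int) + h_max * n_sub
    inside = (ih >= 0) & (ih < size) & (ik >= 0) & (ik < size)
    node = ih[inside] * size + ik[inside]
    total = np.bincount(node, weights=image.ravel()[inside], minlength=size * size)
    count = np.bincount(node, minlength=size * size)
    # Nodes that receive no pixels are marked as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        grid = np.where(count > 0, total / count, np.nan)
    return grid.reshape(size, size)

# Synthetic frame: 30 pixels between Bragg positions, 10 grid samples per index.
frame = np.ones((61, 61))
frame[30, 30] = 10.0
grid = remap_to_miller_grid(frame, (30, 30), (30.0, 0.0), (0.0, 30.0),
                            h_max=1, n_sub=10)
```

In this synthetic case each grid node collects a block of about 3 × 3 detector pixels, so the central node holds the mean of one bright pixel and eight background pixels.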
The intensity division step in Eq. (11) was found to be highly sensitive to the background signal that was present in our data. Moreover, the subtraction of a background estimated from blank Si3N4 targets was not sufficiently accurate for our purposes because the targets had varying amounts of debris that scattered onto the detector, and because the background arising from both optical and x-ray photons was slightly dependent on the position of the sample stage. In order to estimate backgrounds on a frame-by-frame basis, we fit a polynomial function to the intensities averaged within small regions centered at the diagonal midpoints between neighboring Bragg reflections, where we expect little diffraction signal from the crystals. The polynomial functions were used to generate an estimate for the background throughout each entire pattern, which was subsequently subtracted.
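The frame-by-frame background estimate can be sketched as a least-squares polynomial fit to intensities sampled at the diagonal midpoints, that is, at half-integer h and k. The polynomial degree, the size of the averaging regions, and the function names below are assumptions chosen for illustration; the text does not specify them.

```python
import numpy as np

def poly_terms(h, k, degree):
    """Design matrix with all monomials h**i * k**j, i + j <= degree."""
    return np.stack([h ** i * k ** j
                     for i in range(degree + 1)
                     for j in range(degree + 1 - i)], axis=-1)

def fit_background(grid, h, k, degree=2, half_width=1):
    """Fit a 2D polynomial to intensities sampled at diagonal midpoints.

    grid : 2D intensity map on a regular (h, k) grid
    h, k : 1D arrays of fractional Miller indices for the rows and columns
    """
    # Diagonal midpoints have half-integer h and k (fractional part 0.5).
    rows = np.flatnonzero(np.isclose(np.mod(h, 1.0), 0.5))
    cols = np.flatnonzero(np.isclose(np.mod(k, 1.0), 0.5))
    samples, hs, ks = [], [], []
    for r in rows:
        for c in cols:
            r0, r1 = max(r - half_width, 0), r + half_width + 1
            c0, c1 = max(c - half_width, 0), c + half_width + 1
            samples.append(grid[r0:r1, c0:c1].mean())   # small-region average
            hs.append(h[r])
            ks.append(k[c])
    A = poly_terms(np.array(hs), np.array(ks), degree)
    coeffs, *_ = np.linalg.lstsq(A, np.array(samples), rcond=None)
    H, K = np.meshgrid(h, k, indexing="ij")
    return poly_terms(H, K, degree) @ coeffs

# Synthetic pattern: sharp Bragg peaks on top of a tilted planar background.
h = np.arange(-8, 9) / 4.0
k = np.arange(-8, 9) / 4.0
H, K = np.meshgrid(h, k, indexing="ij")
peaks = 100.0 * ((np.mod(H, 1.0) == 0) & (np.mod(K, 1.0) == 0))
true_background = 3.0 + 0.5 * H - 0.2 * K
corrected = (peaks + true_background) - fit_background(peaks + true_background, h, k)
```

Because the sampled regions sit away from the Bragg positions, the fit recovers the planar background of the synthetic pattern and the subtraction leaves only the peaks.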
For each of the crystal types 1-4 shown in Fig. 1, we averaged a total of 12, 11, 6, and 16 full-fluence background-subtracted diffraction patterns, respectively. These counts were lower than the total of 20 patterns that we recorded from each type because many full-fluence patterns had a large number of saturated peaks, many of them contained diffraction from debris on the Si3N4 membranes, and a few of them were damaged. Each of the four unit-cell transforms was extracted from its averaged crystal diffraction pattern according to Eq. (11), as shown in Figs. 5 and 6. Average lattice transforms were constructed by first identifying the reciprocal-space coordinates of each pixel relative to the center of the nearest Wigner-Seitz cell (specifically, these coordinates are h − h0, where h0 is h rounded to the nearest integer), averaging the peak profile contained within all Wigner-Seitz cells, and then distributing the resulting intensity profile periodically about the full diffraction field (the result of this procedure is shown in Fig. 5). When calculating the average reciprocal-lattice transform, we ignored pixels falling near the direct beam and the streaks caused by the edges of the Si sample support frame (the central cross shown in Fig. 3). We found that the inclusion of patterns with saturated peaks degraded the resulting unit-cell transforms because the resulting truncated Bragg peaks biased the extracted reciprocal-space lattice transforms.
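The construction of the average lattice transform from Wigner-Seitz cells can be sketched in two dimensions as follows. The function name, the grid layout, and the optional mask argument are assumptions made for illustration; the mask stands in for the excluded direct-beam and streak regions.

```python
import numpy as np

def periodic_average(intensity, h, k, n_sub, mask=None):
    """Average the peak profile over all Wigner-Seitz cells and replicate it.

    intensity : 2D averaged pattern on a regular (h, k) grid
    h, k      : 2D arrays of fractional Miller indices for every grid node
    n_sub     : number of grid samples per unit Miller index
    mask      : boolean array, True where pixels may be used
    """
    if mask is None:
        mask = np.ones(intensity.shape, dtype=bool)
    # Offset from the nearest lattice point, h - h0, expressed in grid steps.
    dh = np.rint((h - np.rint(h)) * n_sub).astype(int) % n_sub
    dk = np.rint((k - np.rint(k)) * n_sub).astype(int) % n_sub
    node = dh * n_sub + dk
    total = np.bincount(node[mask], weights=intensity[mask], minlength=n_sub ** 2)
    count = np.bincount(node[mask], minlength=n_sub ** 2)
    profile = total / np.maximum(count, 1)     # mean profile of one cell
    return profile[node]                       # replicate over the full field

# Synthetic check on a grid with 4 samples per Miller index.
n_sub = 4
axis = np.arange(-8, 8) / n_sub
H, K = np.meshgrid(axis, axis, indexing="ij")
lattice = 1.0 + np.cos(np.pi * H) ** 2 * np.cos(np.pi * K) ** 2   # periodic in h, k
envelope = np.exp(-(H ** 2 + K ** 2) / 8.0)                       # non-periodic
averaged_periodic = periodic_average(lattice, H, K, n_sub)
averaged_modulated = periodic_average(lattice * envelope, H, K, n_sub)
```

A function that is already periodic is returned unchanged, while a modulated pattern is replaced by a strictly periodic profile, which is the behavior needed for the denominator of Eq. (11).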

VI. RECONSTRUCTION OF THE UNIT CELLS
Defining $U(\mathbf{q})$ as the squared unit-cell transform $|\tilde{\rho}(\mathbf{q})|^2$ that we extracted from the data as described in the previous section (we have one for each of the four crystal types), we employed two projection operators in the iterative phase-retrieval process. The intensity projection $\hat{P}_I$ has the action of bringing the magnitudes of the $i$th estimate of the unit-cell transform, $\tilde{\rho}_i(\mathbf{q})$, into correspondence with the measured intensities:

$$\hat{P}_I \tilde{\rho}_i(\mathbf{q}) = \begin{cases} \sqrt{U(\mathbf{q})}\, \tilde{\rho}_i(\mathbf{q}) / |\tilde{\rho}_i(\mathbf{q})| & \text{if } \mathbf{q} \in M \\ \tilde{\rho}_i(\mathbf{q}) & \text{otherwise,} \end{cases} \tag{12}$$

where $M$ is the set of constrained intensities (those assumed to be reasonably accurate). The support projection $\hat{P}_S$ sets the real-space densities to zero in the regions outside of the support $S$ and enforces real and positive values in real space:

$$\hat{P}_S \tilde{\rho}_i(\mathbf{q}) = \mathcal{F} P_S \mathcal{F}^{-1} \tilde{\rho}_i(\mathbf{q}), \tag{13}$$

where $\mathcal{F}$ is the Fourier transform operator and $\mathcal{F}^{-1}$ its inverse. The real-space component of the projection operator is

$$P_S \rho_i(\mathbf{r}) = \begin{cases} \max(\mathrm{Re}\{\rho_i(\mathbf{r})\}, 0) & \text{if } \mathbf{r} \in S \\ 0 & \text{otherwise,} \end{cases} \tag{14}$$

where $\mathrm{Re}\{\}$ is the function that returns the real part of a complex number, and the max function returns the maximum of its two input values. With the two projection operations $\hat{P}_I$ and $\hat{P}_S$, we utilized the update rule of the hybrid input-output (HIO) algorithm [26],

$$\tilde{\rho}_{i+1}(\mathbf{q}) = [1 + (1 + \beta) \hat{P}_S \hat{P}_I - \hat{P}_S - \beta \hat{P}_I]\, \tilde{\rho}_i(\mathbf{q}), \tag{15}$$

along with the update rule of the error-reduction (ER) algorithm [25],

$$\tilde{\rho}_{i+1}(\mathbf{q}) = \hat{P}_S \hat{P}_I \tilde{\rho}_i(\mathbf{q}). \tag{16}$$

Prior to phasing, we rebinned our unit-cell-transform intensities with a coarse sampling corresponding to four points between each pair of Bragg reflections, producing 5 × 5-pixel Wigner-Seitz cells. This down-sampling step, which reduced noise levels as well as computing time, is justified because the unit-cell transform varies over longer distance scales than the finite-lattice transform. We ignored noisy intensities by forming a meshlike mask $M$ that rejected pixels with noninteger values for either the h or k Miller indices. (We found that a thicker mesh, as well as circular regions surrounding Bragg reflections, reduced success rates in our phasing trials, presumably because the thin mesh provided the best compromise between number and accuracy of intensities.)

We used a value $\beta = 0.9$ and alternated between HIO and ER, beginning with HIO. Each cycle ran for 20 iterations, with 15 HIO steps followed by 5 ER steps, with the support $S$ updated using the Shrinkwrap algorithm [45] at the end of each cycle. The updated support was generated by applying a threshold to the real-space image estimate $\rho_i(\mathbf{r})$ after convolution with a Gaussian kernel, where the width of the Gaussian smoothing kernel was gradually reduced after each cycle. The initial support estimate was generated from a threshold applied to the autocorrelation function $\mathcal{F}^{-1} I(\mathbf{q})$, and the initial estimate $\tilde{\rho}_1(\mathbf{q})$ was taken to be the square root of the full set of measured intensities $U(\mathbf{q})$ with uniformly random phases. Each phasing trial ran for 1000 iterations.
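The phasing cycle described above can be sketched in a few functions. This is an illustrative outline under stated assumptions rather than the authors' implementation: the HIO step is written in the standard projection-operator form, intensities are assumed to be stored in FFT order (zero frequency at index 0), and the threshold values and the Gaussian-width schedule are hypothetical.

```python
import numpy as np

def project_intensity(rho_t, sqrt_U, mask):
    """Impose measured magnitudes where mask is True; keep the current phases."""
    phase = np.exp(1j * np.angle(rho_t))
    return np.where(mask, sqrt_U * phase, rho_t)

def project_support(rho_t, support):
    """Zero outside the support; real and non-negative inside."""
    rho = np.fft.ifft2(rho_t)
    rho = np.where(support, np.maximum(rho.real, 0.0), 0.0)
    return np.fft.fft2(rho)

def hio_step(rho_t, sqrt_U, mask, support, beta):
    """Hybrid input-output update in projection-operator form."""
    p_i = project_intensity(rho_t, sqrt_U, mask)
    return (rho_t + (1 + beta) * project_support(p_i, support)
            - project_support(rho_t, support) - beta * p_i)

def gaussian_blur(image, sigma):
    """Gaussian smoothing by multiplication in Fourier space (periodic edges)."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    kernel_t = np.exp(-2 * np.pi ** 2 * sigma ** 2 * (fy ** 2 + fx ** 2))
    return np.fft.ifft2(np.fft.fft2(image) * kernel_t).real

def support_from_autocorrelation(U, threshold=0.04):
    """Initial support: thresholded autocorrelation (threshold is hypothetical)."""
    ac = np.abs(np.fft.ifft2(U))
    return ac > threshold * ac.max()

def shrinkwrap(rho_t, sigma, threshold=0.1):
    """Updated support: thresholded, Gaussian-smoothed image estimate."""
    smooth = gaussian_blur(np.abs(np.fft.ifft2(rho_t)), sigma)
    return smooth > threshold * smooth.max()

def phase_retrieve(U, mask, support, n_cycles=50, n_hio=15, n_er=5,
                   beta=0.9, sigma0=3.0, sigma_min=1.0, seed=0):
    """Alternate HIO and ER, updating the support after every cycle."""
    rng = np.random.default_rng(seed)
    sqrt_U = np.sqrt(np.clip(U, 0, None))
    rho_t = sqrt_U * np.exp(2j * np.pi * rng.random(U.shape))  # random phases
    sigma = sigma0
    for _ in range(n_cycles):
        for _ in range(n_hio):
            rho_t = hio_step(rho_t, sqrt_U, mask, support, beta)
        for _ in range(n_er):
            rho_t = project_support(project_intensity(rho_t, sqrt_U, mask), support)
        support = shrinkwrap(rho_t, sigma)
        sigma = max(sigma_min, 0.99 * sigma)   # gradually narrow the kernel
    return np.fft.ifft2(rho_t).real, support
```

With the defaults, 50 cycles of 15 HIO and 5 ER steps give the 1000 iterations per trial quoted in the text. A call such as `phase_retrieve(U, mask, support_from_autocorrelation(U))` returns the real-space estimate together with the final support; repeating it with different `seed` values gives independent trials.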
Typical reconstructions are shown in Fig. 6. An example of the mesh of constrained intensities along with the final retrieved floating intensities is shown in Fig. 7. Depending on the cell type, approximately 50%-100% of all phasing trials produced reconstructions that appeared to be accurate. Cell types 1 and 4 produced the best results (nearly 100% for both), perhaps because these two cells were determined from the greatest number of averaged patterns. For cell types 1, 2, and 4, we also found reasonable convergence without the assertion of a real, positive-valued object, though the apparent quality and fraction of accurate reconstructions were reduced in the absence of these assertions.
We first quantified the errors in our phasing trials using an R factor computed over the indices i that correspond to the unmasked intensities. The resulting histograms are shown in Fig. 8 and were formed from 500 independent phasing trials. We note that the highest and lowest mean R factors correspond to the cell types with the fewest and greatest number of contributing patterns, respectively, as one might expect. We further quantified the quality of our reconstructions with the phase-retrieval transfer function (PRTF), which effectively measures the consistency of the recovered phases. The PRTF is sensitive to phase ramps that result from the relative shift of each reconstruction in the image plane. Therefore, prior to computing the PRTF, we shifted each reconstruction to best match a template (the first reconstruction) via the upsampled cross-correlation procedure described in Ref. [46]. Since as many as 50% of the reconstructions were visibly inconsistent with the others, we calculated the PRTFs only for the 50% of patterns that correlated best on average with all of the other patterns, as determined by the Pearson correlations computed from all pairs of reconstructions for a given unit-cell type. The PRTFs, shown in Fig. 9, maintain values greater than 0.5 to resolutions better than 150 nm.
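The selection of mutually consistent reconstructions and a phase-consistency measure can be sketched as below. The selection follows the description in the text (mean Pearson correlation over all pairs, keep the best half). The `prtf` function uses one commonly used form, the modulus of the averaged transform divided by the averaged modulus, which is an assumption and may differ from the definition used by the authors. The reconstructions are assumed to have been aligned already; the subpixel registration step is not shown.

```python
import numpy as np

def select_consistent(recons, fraction=0.5):
    """Indices of reconstructions that correlate best, on average, with the rest."""
    flat = np.array([r.ravel() for r in recons])
    corr = np.corrcoef(flat)                  # Pearson correlation of all pairs
    np.fill_diagonal(corr, np.nan)            # ignore self-correlation
    mean_corr = np.nanmean(corr, axis=1)
    n_keep = max(1, int(round(fraction * len(recons))))
    return np.argsort(mean_corr)[::-1][:n_keep]

def prtf(recons):
    """|<F_k>| / <|F_k|> over aligned reconstructions (assumed PRTF form)."""
    F = np.array([np.fft.fft2(r) for r in recons])
    denom = np.abs(F).mean(axis=0)
    return np.abs(F.mean(axis=0)) / np.where(denom > 0, denom, 1.0)

# Synthetic example: three nearly identical images and one unrelated image.
rng = np.random.default_rng(0)
base = rng.random((16, 16))
trial_set = [base + 0.01 * rng.random((16, 16)) for _ in range(3)]
trial_set.append(rng.random((16, 16)))
kept = select_consistent(trial_set, fraction=0.5)
consistency = prtf([trial_set[i] for i in kept])
```

Values of this ratio close to 1 indicate that the retained reconstructions agree in phase at that spatial frequency, while values near 0 indicate that the phases are effectively random from trial to trial.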

VII. DISCUSSION AND CONCLUSIONS
The results presented here provide the first experimental proof-of-principle demonstration of the method for phasing coherently illuminated crystals originally proposed by Spence et al. [28]. We have demonstrated that this method for decoupling the average crystal-lattice transform from the underlying unit-cell transform is effective when the phase and intensity of the illuminating wavefront are noticeably nonuniform and vary from one crystal to the next. We have also demonstrated that the method works in the presence of noise, to the extent that many of the intensity data are completely unreliable and must be ignored. The subset of data at high signal-to-noise ratio nonetheless provided sufficient information for accurate object reconstructions. A data-processing strategy that copes with nonuniform background signal was described.
Despite the fact that all four crystal types that we fabricated were identical in their internal structures, the significance of the crystal boundaries was clearly demonstrated by the four different unit-cell configurations that we reconstructed from each type.As noted in previous work [31,47], the situation becomes more complicated if the finite crystals are not composed of identical real-space unit cells.While we did not consider crystals bounded by partial unit cells here, the present results suggest that methods for coping with partial unit cells such as the special case presented by Kirian et al. [31] will also work with experimental data of quality similar to those considered here.
It is important to note that our crystal targets have a high "solvent" fraction (regions of uniform density) that generously exceeds 50%, in which case there is sufficient information to solve the phase problem using only the integrated Bragg peak intensities. We indeed found this to be true for simulations of our data, as we were able to reconstruct a unit cell from the Bragg intensities alone using a combination of the HIO [26] and Shrinkwrap [45] algorithms. Likely, we could also obtain reasonable results with real data, though our attempts thus far have not produced accurate reconstructions. However, as discussed elsewhere [31], a high solvent fraction is not a requirement of the presented phasing method; the principal advantage of having access to a continuous, oversampled intensity function is that it overdetermines the phase problem in general, without restrictions on resolution or solvent fraction. The approach presented also allows for a good initial estimate of the object support through the autocorrelation function, whereas previous work has required prior knowledge of a molecular envelope when applying the HIO algorithm to Bragg intensity data from protein crystals with a high solvent fraction [21].
Improvements beyond our present results can be expected if we include knowledge of noise levels in our phasing, as discussed in Refs. [48][49][50]. Experimental errors are an important challenge in SFX data analysis in general, and it will likely be necessary to address them in future work aimed at phasing protein-crystal targets that have more complex electron densities and where the signal-to-noise ratios are far lower than in the data considered here. An alternative approach to phasing has been suggested by Elser [51], in which only the strong intensities located at the Bragg conditions and the associated intensity gradients are utilized for phase determination. In simulations, this algorithm was shown to be effective at high resolution and at signal-to-noise ratios of 2 in the intensity gradients.
It should also be noted that the interactive target positioning that we carried out at low doses for the majority of our data collection is not possible using existing technology for delivering protein nanocrystals to the XFEL beam [52,53]. However, in a protein SFX experiment, a small-angle detector may be used to inspect finely sampled shape transforms, as demonstrated by Chapman et al. [1]. If necessary, poorly positioned targets that are intercepted at severely distorted regions of the illuminating wavefront can be rejected prior to merging the diffraction data. As we have demonstrated here, minor (but clearly observable) phase and amplitude distortions can likely be neglected in most situations where the focal spot size spans several unit cells. This can be understood from the fact that the intensities recovered by averaging many patterns with a randomly positioned, nonuniform wavefront are equal to the convolution between the modulus-squared wavefront transform and the modulus-squared object transform. The procedure we have applied here would then produce a convolved unit-cell transform. Real-space reconstructions are possible without the need to incorporate knowledge of the wavefront, provided that this convolution is not severe (i.e., that the beam is not severely distorted and that it is large enough to span many unit cells).
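The convolution statement above can be checked numerically in one dimension. The following is a minimal sketch, not the analysis code used in this work: it assumes an idealized case in which the wavefront visits every cyclic shift with equal weight, and the Gaussian wavefront with quadratic phase and the random object are hypothetical test inputs. The factor `1 / n**2` only accounts for NumPy's unnormalized forward FFT.

```python
import numpy as np

def averaged_intensity(wavefront, obj):
    """Average |FT(shifted wavefront * object)|^2 over every cyclic shift."""
    n = wavefront.size
    total = np.zeros(n)
    for shift in range(n):
        exit_wave = np.roll(wavefront, shift) * obj
        total += np.abs(np.fft.fft(exit_wave)) ** 2
    return total / n

def convolved_intensity(wavefront, obj):
    """Circular convolution of |W|^2 with |O|^2 (W, O are the FFTs)."""
    n = wavefront.size
    w2 = np.abs(np.fft.fft(wavefront)) ** 2
    o2 = np.abs(np.fft.fft(obj)) ** 2
    conv = np.fft.ifft(np.fft.fft(w2) * np.fft.fft(o2)).real
    return conv / n**2  # normalization for NumPy's FFT convention

rng = np.random.default_rng(0)
n = 64
x = np.arange(n) - n / 2
# Hypothetical nonuniform wavefront: Gaussian amplitude, quadratic phase.
wavefront = np.exp(-x**2 / (2 * 10.0**2)) * np.exp(1j * 0.01 * x**2)
# Hypothetical finite object occupying the central part of the array.
obj = np.zeros(n, dtype=complex)
obj[20:44] = rng.random(24)

lhs = averaged_intensity(wavefront, obj)
rhs = convolved_intensity(wavefront, obj)
print(np.allclose(lhs, rhs))
```

With a finite number of randomly placed shots, the average only approaches this limit statistically; the sketch sums over all shifts so that the identity holds to numerical precision.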
and thus the predicted peak location is where the min function selects the vector with minimum length (this is appropriate for our data, where the vector normal to the sample plane was nearly parallel to the incident beam direction). The coordinates $i_{hk}, j_{hk}$ of the predicted peaks in the detector plane are equal to

We remapped the raw intensity data $I_{ij}$ onto a symmetric orthogonal grid $I'_{hk}$ by averaging values in $I_{ij}$ according to their nearest fractional Miller indices $h, k$. The mapping from the native detector indices to the Miller indices is given by the expression $\mathbf{h} = U^{-1}\mathbf{q}_{ij}$, where $U$ is the $3 \times 3$ reciprocal-lattice matrix determined by the auto-indexing routine and $\mathbf{h}$ is the column vector $[h\ k\ l]$.
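The remapping step can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code used here: the function name `remap_to_miller_grid`, the parameters `h_max` and `samples_per_index`, and the choice of a grid centered on $h = k = 0$ are all hypothetical. The $l$ component is discarded, which is only reasonable for two-dimensional targets such as ours.

```python
import numpy as np

def remap_to_miller_grid(intensity, q_vectors, U, h_max, samples_per_index):
    """Average pixel intensities onto a regular grid of fractional (h, k).

    intensity: (N,) pixel values; q_vectors: (N, 3) scattering vectors;
    U: 3x3 reciprocal-lattice matrix, so that h = U^{-1} q.
    Returns the averaged grid (NaN where empty) and the per-bin counts.
    """
    hkl = np.linalg.solve(U, q_vectors.T).T       # fractional Miller indices
    size = 2 * h_max * samples_per_index + 1      # symmetric about the origin
    offset = h_max * samples_per_index
    # Nearest grid node for each pixel (the l component is ignored).
    idx = np.rint(hkl[:, :2] * samples_per_index).astype(int) + offset
    inside = np.all((idx >= 0) & (idx < size), axis=1)
    flat = idx[inside, 0] * size + idx[inside, 1]
    sums = np.bincount(flat, weights=intensity[inside], minlength=size * size)
    counts = np.bincount(flat, minlength=size * size)
    mean = np.full(size * size, np.nan)
    filled = counts > 0
    mean[filled] = sums[filled] / counts[filled]
    return mean.reshape(size, size), counts.reshape(size, size)

# Toy example with a hypothetical diagonal lattice matrix.
U = np.diag([2.0, 2.0, 1.0])
hkl_true = np.array([[1.0, 0.0, 0.0], [1.02, 0.01, 0.0], [-0.5, 0.25, 0.0]])
q_vectors = hkl_true @ U.T
intensity = np.array([2.0, 4.0, 5.0])
grid, counts = remap_to_miller_grid(intensity, q_vectors, U,
                                    h_max=2, samples_per_index=4)
```

In the toy example, the first two pixels fall in the same bin and are averaged, while bins that receive no pixels are marked with NaN so that they can be excluded from later steps.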
As shown in Fig. 4, there are systematic errors between the predicted and found Bragg peaks in our patterns, which were nearly identical for all diffraction patterns. We concluded that the observed systematic errors were likely caused by small slope errors in the 45° mirror that reflects the diffracted x rays to the CCD because we found no experimental geometry that could reproduce the patterns with reasonable accuracy throughout the detector plane. In order to rectify the discrepancies between predicted and found peak locations, we noted that the offsets between peaks could be reasonably well described with smoothly varying functions. We assumed that the correct indices $i', j'$ (those corresponding to measurements made in the absence of mirror distortions) could be related to the native detector coordinates with the formula $(i', j') = [i + f_1(i, j),\ j + f_2(i, j)]$, where $f_n(i, j)$ are two-dimensional fifth-order polynomial functions. The polynomial coefficients were determined via linear least-squares minimization using the observed residuals $\Delta i_m = i'_m - i_m$, where $i'_m$ was taken to be the prediction corresponding to the found peak $m$. The rectified reciprocal-space vectors are equal to $\mathbf{q}'_{ij} = \mathbf{q}_{i'j'}$. This method of determining mirror slope errors, based on diffraction from a periodic calibration object, would also be useful for other coherent diffractive imaging experiments that utilize a similar experimental geometry.
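A fit of this kind can be sketched as below. This is a minimal illustration under assumptions that go beyond the text: "fifth-order" is interpreted as all monomials $i^a j^b$ with $a + b \le 5$, the coordinates are divided by a hypothetical `scale` to keep the least-squares problem well conditioned, and the function names and the synthetic distortion are invented for the example.

```python
import numpy as np

def poly_terms(i, j, order=5):
    """Design matrix of monomials i^a * j^b with a + b <= order."""
    cols = [i**a * j**b
            for a in range(order + 1)
            for b in range(order + 1 - a)]
    return np.stack(cols, axis=-1)

def fit_distortion(i_found, j_found, i_pred, j_pred, order=5, scale=1.0):
    """Least-squares fit of f1, f2 to the residuals (predicted - found)."""
    A = poly_terms(i_found / scale, j_found / scale, order)
    c1, *_ = np.linalg.lstsq(A, i_pred - i_found, rcond=None)
    c2, *_ = np.linalg.lstsq(A, j_pred - j_found, rcond=None)
    return c1, c2

def rectify(i, j, c1, c2, order=5, scale=1.0):
    """Apply (i', j') = [i + f1(i, j), j + f2(i, j)]."""
    A = poly_terms(i / scale, j / scale, order)
    return i + A @ c1, j + A @ c2

# Synthetic check: found peaks on a grid, with a smooth, made-up distortion.
scale = 500.0
gi, gj = np.meshgrid(np.linspace(-500, 500, 10), np.linspace(-500, 500, 10))
i_found, j_found = gi.ravel(), gj.ravel()
u, v = i_found / scale, j_found / scale
i_pred = i_found + 3.0 * u**2 - 2.0 * u * v
j_pred = j_found + 1.5 * v**3 + 0.5

c1, c2 = fit_distortion(i_found, j_found, i_pred, j_pred, scale=scale)
i_rect, j_rect = rectify(i_found, j_found, c1, c2, scale=scale)
```

Because the synthetic distortion lies within the polynomial basis, the rectified coordinates reproduce the predictions closely; with real peaks the fit would only capture the smooth part of the residuals.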

FIG. 1. Top figures: Illustration of the four different crystal types that were designed for the diffraction experiment. The unit-cell contents for each type are indicated in red. The full-size crystals used in the experiment had randomly chosen edge lengths ranging from 5 to 12 unit cells, with the two dimensions varying independently of each other (allowing for rectangular crystals). Bottom figure: SEM image of an actual sample, with distance scale indicated in red.

FIG. 2. Schematic illustration of the experiment. After passing through the optics and beam-cleaning aperture, the ∼25-μm focused FEL beam illuminates the SiN sample window. The direct FEL beam passes through a hole in the 45° multilayer mirror that reflects the diffracted x rays to the CCD camera. The inset at the top left shows the approximate illumination profile of the FEL beam on a typical microcrystal target.

FIG. 3. (a) Single full-fluence shot of a 2D crystal target on a logarithmic intensity scale. The red rectangular box within the full pattern indicates the zoomed regions shown in panels (b-e). Panel (b) corresponds to the boxed region in the full pattern (a), while the additional three zoomed regions (c-e) were recorded from the same target but at lower fluence and with the target shifted by a few microns between shots. The breakdown of inversion symmetry, and its sensitivity to beam position, is evident. The prominent central streaks of intensity, which are due to edge scatter from the Si window, were ignored throughout our data analysis.

FIG. 4. Comparison between found and predicted Bragg peaks (a) and remapped diffraction intensities projected onto a plane in reciprocal space, after correcting minor distortions caused by the 45° mirror (b). The majority of remapped patterns appeared qualitatively similar to this one, with peaks well centered within the predicted locations.

FIG. 5. Extraction of the unit-cell transform. (a) The average over eight diffraction patterns with unit-cell type 4. The intensities in the red box show the extracted average reciprocal-lattice transform, generated by averaging over all Wigner-Seitz cells of the average pattern. A single Wigner-Seitz cell that contributed to this average is indicated by the smaller blue box. (b) The unit-cell transform that results from dividing the average pattern by the average reciprocal-lattice transform. Note that the gridlike patches of noisy pixels are due to the relatively low diffraction signal in those regions.

FIG. 7. The down-sampled mesh of intensities used as intensity constraints during iterative phase retrieval (a), and the combined intensity constraints and retrieved floating values (b). Intensities are displayed on a logarithmic scale.

FIG. 9. Phase-retrieval transfer functions as a function of inverse resolution $d^{-1}$, formed from the 250 patterns that correlated best, on average, with the other 500 phasing trials.