Two-Dimensional Partial-Covariance Mass Spectrometry of Large Molecules Based on Fragment Correlations

Taran Driver, Bridgette Cooper, Ruth Ayers, Rüdiger Pipkorn, Serguei Patchkovskii, Vitali Averbukh, David R. Klug, Jon P. Marangos, Leszek J. Frasinski, and Marina Edelson-Averbukh

The Blackett Laboratory Laser Consortium, Department of Physics, Imperial College London, London, SW7 2AZ, United Kingdom
German Cancer Research Centre, Department of Translational Immunology, INF 580, 69120 Heidelberg, Germany
Max Born Institute for Nonlinear Optics and Short Pulse Spectroscopy, Max-Born-Straße 2A, 12489 Berlin, Germany
Department of Chemistry, Imperial College London, London, SW7 2AZ, United Kingdom


I. INTRODUCTION
The spectroscopic study of the structure and dynamics of large polyatomic molecules poses a formidable experimental challenge, owing to spectral complexity and spectral congestion. A solution to this challenge comes from multidimensional spectroscopic techniques, which reveal correlations between spectral signals, decongest the spectra, and provide supreme capabilities both for studying the mechanisms of molecular processes and for deducing molecular structure. In our work, we develop a new type of two-dimensional (2D) partial-covariance spectroscopy applicable to fragmentations of large molecules, including polypeptide and full protein molecules well over 10 kilodaltons (kDa), thereby increasing the size of the systems amenable to covariance-mapping spectroscopy by 2 orders of magnitude. Contrary to the current state of the art, the new partial-covariance mapping extracts the spectral correlations using only the measured spectrum itself, without relying on any additional measurement channels.

A. Covariance-mapping spectroscopy
Covariance-mapping spectroscopy was developed by Frasinski et al. as a tool for the study of mechanisms of radiation-induced fragmentation of di- and triatomic molecules [1]. The technique is based on measuring the covariance $\mathrm{Cov}(X, Y)$ between the intensities of every pair of spectral signals X and Y across a series of ionic time-of-flight spectra,

$$\mathrm{Cov}(X, Y) = \langle XY \rangle - \langle X \rangle \langle Y \rangle, \tag{1}$$

where angular brackets denote averaging over multiple spectra. If a pair of spectral signals X and Y is characterized by positive covariance, their intensities fluctuate synchronously across different spectra. This correlated fluctuation indicates that the fragments X and Y originate from the same decomposition process of the parent ion Z: $Z \to X + Y$, e.g., $\mathrm{CO}^{2+} \to \mathrm{C}^{+} + \mathrm{O}^{+}$. More recently, covariance mapping has found application in x-ray free-electron-laser (XFEL) experiments and has been shown to be effective, for example, in unraveling the decomposition mechanisms of "hollow atoms" (unstable states of matter formed by inner-shell ionization by intense x-ray irradiation [2]), studying double-core-hole states in molecules [3], correlating photoelectron emission with the fragmentation of hydrocarbons in intense infrared laser fields [4], and characterizing the degree of alignment of molecules in the gas phase [5]. In many covariance-mapping experiments, the physically informative fragment-fragment correlations are overwhelmed by spurious correlations of no physical significance ("extrinsic correlations"). The extrinsic correlations stem from fluctuations of experimental parameters, such as the laser-pulse intensity in laser-induced decomposition experiments. These fluctuations lead to the simultaneous increase or decrease of all measured signals, resulting in extrinsic correlation of every spectral signal with every other one [6].
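As a minimal illustration of the covariance definition, $\mathrm{Cov}(X, Y) = \langle XY \rangle - \langle X \rangle \langle Y \rangle$, the sketch below computes a simple covariance map from a stack of spectra with NumPy. The array layout (one row per spectrum, one column per spectral channel) and the function name `covariance_map` are assumptions made for this example and are not taken from the original work.

```python
import numpy as np

def covariance_map(spectra):
    """Simple covariance Cov(X, Y) = <XY> - <X><Y> for every pair of channels.

    spectra: array of shape (n_scans, n_channels), one row per spectrum
             (assumed layout for this sketch).
    """
    spectra = np.asarray(spectra, dtype=float)
    n_scans = spectra.shape[0]
    mean = spectra.mean(axis=0)                    # <X> for every channel
    mean_product = spectra.T @ spectra / n_scans   # <XY> for every channel pair
    return mean_product - np.outer(mean, mean)
```

The result is a symmetric matrix whose diagonal holds the variance of each channel; it agrees with `np.cov(spectra, rowvar=False, bias=True)`.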
A remedy in such situations is partial-covariance (pCov) mapping, where these extrinsic correlation signals can be mathematically removed if the fluctuations of the parameters causing them are experimentally recorded. For example, if one is able to continuously monitor a series of independently fluctuating parameters I, J, … affecting the correlations, the partial covariance is given by

$$\mathrm{pCov}(X, Y; I, J, \ldots) = \mathrm{Cov}(X, Y) - \frac{\mathrm{Cov}(X, I)\,\mathrm{Cov}(I, Y)}{\mathrm{Cov}(I, I)} - \frac{\mathrm{Cov}(X, J)\,\mathrm{Cov}(J, Y)}{\mathrm{Cov}(J, J)} - \cdots, \tag{2}$$

where $\mathrm{Cov}(I, I)$, $\mathrm{Cov}(J, J)$, … are the variances of the fluctuating parameters [7]. If the fluctuating parameters I, J, … are not fully independent, the partial-covariance formula involves the inverse of their variance-covariance matrix [7]. Since in many laser-induced ionization and fragmentation experiments the overwhelmingly dominant fluctuating parameter is the laser-pulse intensity, Eq. (2) can often be employed with the laser intensity as a single parameter. Indeed, laser-intensity-based partial covariance has enabled the study of x-ray-induced fragmentation at XFELs under significant intensity fluctuations of the ionizing pulse [5,6]. However, application of partial-covariance spectroscopy to laser-induced molecular fragmentation has been restricted to relatively small systems, well within the 100-Da limit.
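For parameters that are not independent, the standard matrix form of the correction is $\mathrm{Cov}(X, Y) - \mathrm{Cov}(X, P)\,\mathrm{Cov}(P, P)^{-1}\,\mathrm{Cov}(P, Y)$, where $P$ collects the monitored parameters. The sketch below is one way to evaluate this with NumPy; the function name `partial_covariance_map` and the array shapes are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def partial_covariance_map(spectra, params):
    """Partial covariance of all channel pairs given monitored parameters.

    spectra: (n_scans, n_channels) array of signal intensities.
    params:  (n_scans,) or (n_scans, n_params) array of parameters recorded
             with each scan (hypothetical inputs for this sketch).
    """
    spectra = np.asarray(spectra, dtype=float)
    params = np.asarray(params, dtype=float).reshape(len(spectra), -1)
    n_scans = spectra.shape[0]
    x = spectra - spectra.mean(axis=0)
    p = params - params.mean(axis=0)
    cov_xx = x.T @ x / n_scans   # Cov(X, Y) for all channel pairs
    cov_xp = x.T @ p / n_scans   # Cov(X, I), Cov(X, J), ...
    cov_pp = p.T @ p / n_scans   # variance-covariance matrix of the parameters
    # Solve the linear system instead of forming the inverse explicitly.
    return cov_xx - cov_xp @ np.linalg.solve(cov_pp, cov_xp.T)
```

With a single parameter this reduces to the one-term form of the correction; with a diagonal parameter covariance matrix it reduces to the sum over independent parameters.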

B. Bioanalytical mass spectrometry
Mass spectrometry (MS) is the method of choice for the structural analysis of biomolecules, such as proteins [8,9], nucleic acids, lipids, and metabolites. The most common bioanalytical application of MS analysis is identification of protein primary structure, i.e., the sequence of the protein's molecular building blocks (amino acids) and the chemical groups attached to them, e.g., post-translational modifications (PTMs) [10]. To identify the primary structure of a protein, it is typically first cut into smaller fragments, e.g., using enzymes, to obtain peptides, which are subsequently sent to a mass spectrometer via soft ionization techniques [electrospray ionization (ESI) [11] or matrix-assisted laser desorption ionization [12]]. The resulting ionized peptides can be further fragmented to produce detailed structural information by fragmentation methods such as collision-induced dissociation (CID) [13], electron-transfer dissociation (ETD) [14], electron-capture dissociation (ECD) [15], as well as various laser-based techniques [16–18]. The mass-to-charge ratios (m/z) of the intact peptides and the fragment ions are most commonly determined using ion-trap or time-of-flight techniques [19]. Finally, all the spectral information (mass spectra of both peptide ions and their fragments) is pieced together to deduce the sequence of the original protein, using a range of structure-to-spectrum matching methods [20–24].
The success of the analytical MS experiment is determined by the certainty with which it attributes the experimental data to the correct primary structure. Mainstream one-dimensional (1D) MS is striving to increase the fidelity of this structure-to-spectrum assignment through two main avenues: (i) the development of new fragmentation methods [14–18] in order to maximize the variety of structure-specific molecular fragments generated in an experiment and (ii) improving the accuracy and resolution of the mass-to-charge measurement [25,26] in order to distinguish between peptides and fragments of very similar m/z values. Typically, however, a significant fraction of the measured fragment mass spectra fails to be successfully interpreted and matched to the correct peptide and protein sequences [25].
A two-dimensional MS technique, 2D Fourier-transform ion cyclotron resonance (2D FT ICR), which is based on a physical principle identical to that of 2D exchange NMR spectroscopy, was previously introduced [27,28]. The method provides an analytical advantage through correlating intact peptides with their fragments to avoid the losses associated with isolation of individual unfragmented peptide ions. However, 2D FT ICR does not generate any new physical information that is not available through the 1D MS analysis. Here we report 2D mass spectrometry based on a different physical principle, partial-covariance mapping, which generates an entirely new channel of physical information: fragment-fragment correlations.

II. SELF-CORRECTING PARTIAL COVARIANCE: THE PHYSICAL CONCEPT
Large polyatomic molecules pose a particular challenge for the state-of-the-art covariance-mapping techniques. The complex methods of delivery of large molecules to the gas phase [11,12], as well as the ion-transfer and -fragmentation methods [13–18], involve a rich manifold of fluctuating experimental parameters (e.g., electrospray intensity, ion-focusing voltages, ion-trap voltages, gas pressure) producing numerous sources of physically meaningless extrinsic correlations. Identifying and continuously monitoring fluctuations of such experimental parameters, as required by the state-of-the-art partial-covariance spectroscopy, becomes unfeasible.
Here we present a conceptually new type of partial-covariance mapping, self-correcting partial covariance, which in contrast to any available technique does not require any separate detection channels. Instead, as we demonstrate theoretically and experimentally, the self-correcting partial covariance removes extrinsic correlations using the information which is already present in the measured fragment spectrum itself. This is done using a single parameter derived from the fragment intensities, $f(\vec{X})$, where $\vec{X}$ is the vector of the measured intensities of individual fragment ions and $f$ is a finite, nonvanishing function. In this work, we choose the single parameter to be the total integrated intensity of an individual spectrum, $f(\vec{X}) = \sum_i X_i$, which in the case of an MS measurement translates into the universally available total ion count (TIC). We call the resulting new type of mass spectrometric measurement two-dimensional partial-covariance mass spectrometry (2D-PC-MS).
The basic principle of 2D-PC-MS is illustrated in Fig. 1. The inherent spectrum-to-spectrum fluctuations in the intensities of fragments born in the same [pair of blue signals in Fig. 1(a)] and consecutive [pair of red signals in Fig. 1(a)] reactions follow each other (correlate positively) in a series of individual spectra [see Fig. 1(b)]. 2D-PC-MS identifies such fragment pairs by calculating the TIC partial covariance between the intensities of each pair of fragments (X, Y) across multiple fragment spectra. The identified correlating fragment pairs (X, Y) are connected by positive islands on the 2D-PC-MS map, while the extrinsic correlations are suppressed; see Fig. 1(c). The map is symmetric with respect to the $x = y$ autocorrelation diagonal [gray dashed line in Fig. 1(c)], since if ion X correlates with ion Y, then ion Y necessarily correlates with ion X. Peaks falling on the autocorrelation diagonal represent the trivial correlations of each spectral signal with itself. These peaks are removed for clarity from the experimental plots below.
The physical idea behind the TIC-based self-correcting partial-covariance mapping is that the compound effect of all fluctuations in the experimental conditions is reflected in the total number of fragment ions in an individual fragment spectrum. Since all the fragment signals correlate with the intense fluctuations of the TIC, the resulting strong extrinsic correlations overwhelm the intrinsic fragment-fragment correlations. Once the correlations of fragment intensities with TIC are removed, the physical intrinsic correlations of the fragments with each other are recovered.

III. TOTAL ION COUNT FOR SELF-CORRECTING PARTIAL-COVARIANCE MAPPING
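The TIC-based self-correcting partial covariance between each pair of fragment signals X and Y is given by Eq. (2) with the TIC as the single fluctuating parameter:

$$\mathrm{pCov}(X, Y; \mathrm{TIC}) = \mathrm{Cov}(X, Y) - \frac{\mathrm{Cov}(X, \mathrm{TIC})\,\mathrm{Cov}(\mathrm{TIC}, Y)}{\mathrm{Cov}(\mathrm{TIC}, \mathrm{TIC})}. \tag{3}$$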
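The TIC-based correction, $\mathrm{Cov}(X, Y) - \mathrm{Cov}(X, \mathrm{TIC})\,\mathrm{Cov}(\mathrm{TIC}, Y)/\mathrm{Cov}(\mathrm{TIC}, \mathrm{TIC})$, can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function name `tic_partial_covariance` and the input layout (one row per scan, one column per m/z channel) are assumptions, and the TIC is taken here as the sum over exactly the channels supplied.

```python
import numpy as np

def tic_partial_covariance(spectra):
    """TIC partial covariance between every pair of channels.

    spectra: (n_scans, n_channels) array; the TIC of each scan is assumed
             to be the sum over the supplied channels.
    """
    spectra = np.asarray(spectra, dtype=float)
    n_scans = spectra.shape[0]
    centered = spectra - spectra.mean(axis=0)
    tic = centered.sum(axis=1)                   # mean-subtracted TIC per scan
    cov = centered.T @ centered / n_scans        # Cov(X, Y)
    cov_tic = centered.T @ tic / n_scans         # Cov(X, TIC) for every channel
    var_tic = tic @ tic / n_scans                # Cov(TIC, TIC), assumed nonzero
    return cov - np.outer(cov_tic, cov_tic) / var_tic
```

Because the TIC in this sketch is the sum of all supplied channels, every row of the returned matrix (diagonal element included) sums to zero, which offers a quick sanity check on an implementation.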
TWO-DIMENSIONAL PARTIAL-COVARIANCE MASS … PHYS. REV. X 10, 041004 (2020)

In order to provide a theoretical basis for the use of TIC as the single parameter of self-correcting partial-covariance mapping, we extend the statistical theory of covariance for noisy Poissonian processes [29]. The theoretical framework of Ref. [29] is valid when the number of ions trapped in each scan follows a noise-augmented Poisson distribution, which we verify for our ESI experiments; see Fig. S1 in the Supplemental Material [30]. We find theoretically (see Appendix A for details) that the value of the TIC-based self-correcting partial covariance between two fragment ions X and Y is

$$\mathrm{pCov}(X, Y; \mathrm{TIC}) = \beta^2 \nu_0 \alpha_1 - \omega \nu_0, \tag{4}$$

where $\nu_0$ is the average number of parent ions fragmented at each scan, $\alpha_1$ is the probability that a single parent ion will dissociate to produce fragments X and Y, and $\beta$ is the instrumental fragmentation-to-detection efficiency (e.g., in a mass spectrometer, the product of fragment ion-transfer and -detection efficiencies), typically required to be approximately 30% or higher for a successful covariance analysis [34]. $\omega$ is a small positive constant dependent on the instrumental fragmentation-to-detection efficiency and the branching ratios of all fragmentation pathways producing two products (see Appendix A). Therefore, the small additional term $-\omega\nu_0$, which arises due to the imperfection of the TIC-based partial-covariance correction, is guaranteed to be negative and cannot result in positive covariance between two unrelated signals. The important consequence of this is that TIC-based self-correcting partial covariance is false-positive-free: It may artificially suppress a very weak true correlation, but it can never result in a positive correlation between two fragments which did not originate from the same or consecutive decompositions.

In Fig. 2, we demonstrate experimentally the suppression of extrinsic correlations by TIC-based self-correcting partial covariance. The figure displays a representative region of the simple covariance [panel (a), Eq. (1)] and the self-correcting partial-covariance [panel (b), Eq. (3)] maps of CID fragmentation of the triply protonated peptide P1 (VTIMPK$_{\mathrm{Ac}}$DIQLAR$^{3+}$; each letter designates one amino acid [35], K$_{\mathrm{Ac}}$ = acetyllysine). In the simple covariance map obtained using Eq. (1), every fragment of P1 is correlated with every other one [see red islands in Fig. 2(a)]. As we discuss in Sec. I A, this is due to the extrinsic correlations arising from the simultaneous fluctuations of all the fragment intensities caused by multiple fluctuating experimental parameters. On the other hand, the self-correcting partial covariance [Fig. 2(b)] obtained using Eq. (3) suppresses these extrinsic correlations [see blue islands in Fig. 2(b)], exposing the physical correlations between ions born in the same or consecutive dissociation processes. For example, the fragments -PK$_{\mathrm{Ac}}$DIQLAR$^{2+}$ and -QLAR$^{+}$ correlated at the top of Figs. 2(a) and 2(b) contain the same part of the original peptide sequence (QLAR) and therefore cannot be intrinsically correlated products in a decomposition of P1; see Eq. (5a). The suppression of extrinsic correlations by TIC partial covariance reveals true physical (intrinsic) correlations between fragments that are indeed formed in the same (VTIM-$^{+}$ and -PK$_{\mathrm{Ac}}$DIQLAR$^{2+}$) or in consecutive (e.g., [VTIM- $-$ CO]$^{+}$ and -PK$_{\mathrm{Ac}}$DIQLAR$^{2+}$) decompositions [see Eqs. (5b) and (5c)].

FIG. 1. The principle of 2D-PC-MS. (a) A computer records a series of fragment mass spectra ("scans") of a biomolecular ion "PEPTIDE$^{2+}$," which dissociates along two pathways: blue and red. (b) Because of the statistical nature of the dissociation processes, the signals of each kind of fragment (red and blue) fluctuate randomly from scan to scan. These "intrinsic" fluctuations are superimposed with "extrinsic" ones caused by fluctuations in experimental parameters, which affect all the fragments simultaneously and are reflected in the TIC fluctuations (green). Signals formed in the same (blue) or consecutive (red) dissociation processes follow each other and are positively correlated. However, blue fragments are also correlated with red fragments via the TIC fluctuations. (c) Calculating the TIC-based self-correcting partial covariance [Eq. (3)] between each pair of fragment signals produces a map of physical fragment correlations: intrinsic correlations between pairs of blue or red fragments are revealed (black islands), while extrinsic correlations are suppressed (green circles).
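The suppression of extrinsic correlations can be reproduced qualitatively in a toy Monte Carlo model. The sketch below is not the authors' simulation: it assumes a parent ion with five equally probable two-fragment pathways, a scan-to-scan rate $\gamma\nu_0$ with normally distributed $\gamma$, independent detection of each fragment with efficiency $\beta$, and hypothetical parameter values chosen only for illustration. Poisson splitting is used so that the number of events in each pathway can be drawn independently.

```python
import numpy as np

def simulate_scans(n_scans=4000, nu0=200.0, sigma=0.2, beta=0.5,
                   alpha=(0.2, 0.2, 0.2, 0.2, 0.2), seed=0):
    """Toy noise-augmented Poisson model; all parameter values are assumptions."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    # Scan-to-scan fluctuation of the mean event rate (clipped at zero).
    gamma = np.clip(rng.normal(1.0, sigma, size=n_scans), 0.0, None)
    # Number of decompositions per pathway in each scan (Poisson splitting).
    events = rng.poisson(gamma[:, None] * nu0 * alpha[None, :])
    spectra = np.empty((n_scans, 2 * alpha.size))
    # Each pathway yields two fragments, each detected independently.
    spectra[:, 0::2] = rng.binomial(events, beta)
    spectra[:, 1::2] = rng.binomial(events, beta)
    return spectra

def covariance_maps(spectra):
    """Return the simple covariance map and the TIC partial-covariance map."""
    n_scans = spectra.shape[0]
    centered = spectra - spectra.mean(axis=0)
    tic = centered.sum(axis=1)
    cov = centered.T @ centered / n_scans
    cov_tic = centered.T @ tic / n_scans
    pcov = cov - np.outer(cov_tic, cov_tic) / (tic @ tic / n_scans)
    return cov, pcov

spectra = simulate_scans()
cov, pcov = covariance_maps(spectra)
# Channels 0 and 1 share a pathway; channels 0 and 2 do not.
print("simple  Cov: same pathway %.1f, different pathways %.1f" % (cov[0, 1], cov[0, 2]))
print("partial Cov: same pathway %.1f, different pathways %.1f" % (pcov[0, 1], pcov[0, 2]))
```

In this toy model the simple covariance between fragments of different pathways is clearly positive, whereas the TIC partial covariance of the same pair is slightly negative and that of a same-pathway pair stays positive, in line with the sign structure of Eq. (4).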

IV. 2D-PC-MS: DEMONSTRATION OF PRINCIPLE
We now illustrate a full 2D-PC-MS measurement using the triply protonated peptide P1. Fragmentation of P1 under the collision-induced dissociation employed in this work generates a wide range of fragments of diverse and well-understood types ("terminal," containing one of the two ends of the peptide sequence; "internal," not containing either of the two ends of the peptide sequence; "neutral loss," resulting from elimination of small neutral molecules, e.g., H$_2$O or CO, from a fragment, etc.), providing ample opportunity to examine fragment-fragment correlations of the triply charged peptide. Figure 3 presents the 2D-PC-MS map of P1 plotted as a 3D view of the relative TIC partial covariance between every pair (X, Y) of the peptide fragment signals. The signals of related fragments generated in the same or consecutive decompositions of the peptide appear as off-diagonal peaks. Such are, for example, correlations between fragments VTIMPK$_{\mathrm{Ac}}$DIQ-$^{2+}$ ($b_9^{2+}$) and -LAR$^{+}$ ($y_3^{+}$) produced by a cleavage of a single peptide bond of P1 or those involving product(s) of a loss of a small neutral molecule, such as between … and -MPK$_{\mathrm{Ac}}$DIQLAR$^{2+}$ ($y_9^{2+}$), as well as correlations involving internal fragments, i.e., products of consecutive breaking of two peptide bonds [36], e.g., -PK$_{\mathrm{Ac}}$DIQ-$^{+}$ ($b_{i(5-9)}^{+}$) and -LAR$^{+}$ ($y_3^{+}$). The standard nomenclature [37] used for peptide fragments is given in the inset of Fig. 3. See the classification of 2D-PC-MS correlation patterns in Appendix B.
FIG. 2. Suppression of extrinsic correlations in TIC-based self-correcting partial-covariance mapping. In the CID fragmentation of triply protonated peptide P1 (VTIMPK$_{\mathrm{Ac}}$DIQLAR$^{3+}$), a particular fragment labeled as PK$_{\mathrm{Ac}}$DIQLAR$^{2+}$ (shown on the 1D spectrum at the bottom) correlates with several other fragments (shown by blue and red dashed arrows). In simple covariance mapping [panel (a)], fluctuations in experimental conditions induce extrinsic correlations between fragments originating from unrelated decomposition processes of the ionized peptide (the fragment correlates with every other one). Subtracting from the simple map a correction that involves covariance of the ion spectra with the TIC [see Eq. (3)] suppresses these extrinsic correlations (blue arrows). The TIC self-correcting partial-covariance map [panel (b)] correctly identifies fragments that are born in the same or consecutive decomposition processes (red arrows) of the given peptide ion.

In order to rank the strength of the observed correlations between the pairs of fragments, as well as distinguish low-intensity intrinsic correlations from statistical noise due to finite summations in Eq. (3), we introduce a correlation score based on correlation stability toward resampling for each feature on a 2D-PC-MS map; see Appendix C. We find that approximately $10^3$ individual fragment spectra are required to identify the intrinsic correlations by their correlation score; see Fig. S3 in the Supplemental Material [30]. We verify that 2D-PC-MS is capable of correlating fragments spanning more than 3 orders of magnitude of intensity, including signals of relative abundances as low as 0.1% and below (see the Supplemental Material [30]). The ranked 2D-PC-MS data of P1 reveal that all 50 top-scoring correlations correspond to physical decomposition channels of the peptide, i.e., the pathways producing fragments generated in the same or consecutive standard CID decompositions (30 of the 50 top-scoring correlations are labeled in Fig. 3).
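The exact correlation score is defined in Appendix C, which is not reproduced here. Purely as an illustration of what a resampling-based stability score can look like, the sketch below divides each TIC partial-covariance value by its delete-a-block jackknife standard error. The choice of jackknife, the number of blocks, and the function names are assumptions made for this example and should not be read as the authors' definition.

```python
import numpy as np

def tic_partial_covariance(spectra):
    """TIC partial covariance, with the TIC taken as the sum of all channels."""
    n_scans = spectra.shape[0]
    centered = spectra - spectra.mean(axis=0)
    tic = centered.sum(axis=1)
    cov = centered.T @ centered / n_scans
    cov_tic = centered.T @ tic / n_scans
    return cov - np.outer(cov_tic, cov_tic) / (tic @ tic / n_scans)

def stability_score(spectra, n_blocks=20):
    """Hypothetical score: partial covariance over its jackknife standard error."""
    spectra = np.asarray(spectra, dtype=float)
    full_map = tic_partial_covariance(spectra)
    blocks = np.array_split(np.arange(spectra.shape[0]), n_blocks)
    # Recompute the map with each block of scans left out in turn.
    maps = np.stack([tic_partial_covariance(np.delete(spectra, block, axis=0))
                     for block in blocks])
    deviation = maps - maps.mean(axis=0)
    # Delete-a-block jackknife estimate of the standard error of each map element.
    spread = np.sqrt((n_blocks - 1) / n_blocks * (deviation ** 2).sum(axis=0))
    return full_map / spread
```

Features whose partial covariance is large compared with its scatter under resampling receive a high score; features dominated by statistical noise score near zero, and suppressed extrinsic correlations score negative.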
The presence of the predicted intrinsic correlations according to the known amino acid sequence and the peptide fragmentation mechanisms, alongside the observed suppression of spurious signals corresponding to unphysical fragmentation pathways, experimentally validates the TIC-based self-correcting partial covariance. We identify the extrinsic correlations with low correlation scores as violating mass conservation, charge conservation, or covariance-mapping correlation patterns (see Appendix B).
The 2D-PC-MS map of P1 (Fig. 3) enables one to detect the origin of the correlation signals simply on the basis of their geometric position. There are three regions in the map, each containing peaks of different origin.
First are the so-called mass conservation lines (plotted as dashed and dotted lines in Fig. 3) containing correlation peaks of pairs of complementary terminal fragments generated by fission of a single peptide bond and making up the whole sequence once combined (see signals shown in blue). Second are the horizontal and vertical peak series relative to the complementary fragment correlations containing the correlation signals of fragments originating from consecutive decompositions through loss of neutral molecules (e.g., NH$_3$, H$_2$O, CO) from the complementary ions; see green signals in Fig. 3. Third is the manifold of scattered signals revealing correlations involving fragments originating from breakage of more than one bond between the peptide amino acids; these signals are marked in red in Fig. 3. Breaking two or more bonds between amino acids can also lead to noncomplementary terminal fragment correlations excluding the internal parts of the sequence, such as VTIMPK$_{\mathrm{Ac}}$D$^{+}$ ($b_7^{+}$) and QLAR$^{+}$ ($y_4^{+}$); these signals are marked in magenta in Fig. 3.

FIG. 3. Fragment-fragment correlations of the peptide ion P1 (VTIMPK$_{\mathrm{Ac}}$DIQLAR$^{3+}$) revealed on the TIC partial-covariance map. The correlation signals are labeled as (X and Y) pairs of fragments corresponding to the 1D spectra plotted along the back edges of the map. The standard peptide fragment nomenclature used [37] is given in the inset. A single peak on the 1D spectrum belonging to the $y_6^{+}$ (-DIQLAR$^{+}$) fragment is resolved into four fragment-fragment correlation peaks along the black dotted line on the map, showing four different fragmentation processes by which it is produced. Apart from the $y_6^{+}$ and $b_6^{2+}$ signal (marked by ⊛), the 2D-PC-MS peaks are annotated only on one side of the autocorrelation diagonal for clarity. The blue dotted and dashed mass conservation lines follow correlations (blue peaks) where a single bond breaks in the chain of 12 amino acids of the parent ion. Green correlations mark a loss of small neutral molecules (e.g., water) in the fragmentation process. The red and magenta correlations reveal processes where more than one peptide bond is broken. The map provides much more information on the fragmentation mechanisms and for identifying the parent peptide than the 1D spectrum. Selected correlation signal annotations using single-letter amino acid code [35] are shown in Fig. S2 in the Supplemental Material [30].
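The mass conservation lines follow from simple bookkeeping: if a parent ion of charge $z$ splits into two complementary fragments of charges $z_X$ and $z_Y$ with $z_X + z_Y = z$ and nothing is lost, then $z_X (m/z)_X + z_Y (m/z)_Y = z\,(m/z)_{\mathrm{parent}}$. The sketch below tests whether a pair of peaks lies on such a line. The function name, the default tolerance, and the numerical values in the usage note are hypothetical and chosen only for illustration.

```python
def complementary_charge_partitions(mz_x, mz_y, parent_mz, parent_charge, tol=0.5):
    """Return the (z_x, z_y) charge splits for which two peaks add up to the parent.

    tol is an assumed tolerance, in Da, on the summed mass of the two fragments.
    """
    parent_total = parent_mz * parent_charge
    matches = []
    for z_x in range(1, parent_charge):
        z_y = parent_charge - z_x
        # Complementary fragments share the parent's mass and charge between them.
        if abs(mz_x * z_x + mz_y * z_y - parent_total) <= tol:
            matches.append((z_x, z_y))
    return matches
```

For a hypothetical triply charged parent at m/z 500.0, peaks at m/z 400.0 and 550.0 satisfy 1 × 400.0 + 2 × 550.0 = 3 × 500.0, so the pair lies on the (1+, 2+) line; a triply charged parent therefore gives two such lines, one for each way of sharing the charge.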
We further test 2D-PC-MS on a series of protonated and deprotonated peptide ions with unmodified and PTM-containing sequences of molecular masses ranging from 989.5 to 1603.0 Da (38 in total) listed in Table I. For each examined peptide ion, the 2D-PC-MS map is generated (see examples in Fig. 4), the scored list of correlating fragments is produced, and the highest-scoring correlations (typically numbering at least 40) are successfully confirmed to be free of any spurious signals. Moreover, we test 2D-PC-MS on ribonucleic acid and DNA oligonucleotides as well as on small proteins in the $10^4$-Da mass range (see Figs. S4 and S5 in the Supplemental Material [30]). These data further validate the TIC-based self-correcting partial covariance and demonstrate that it allows one to apply covariance-mapping spectroscopy to molecules of mass greater than $10^4$ Da, without the need for continuous monitoring of the multiple experimental parameters in a complex multistage ESI-CID measurement. The revealed fragment ion correlations provide direct experimental evidence for decomposition pathways of the multiply charged biomolecular ions.

V. 2D-PC-MS: STRUCTURAL SPECIFICITY
2D-PC-MS provides a new kind of structural information which is missing in standard 1D mass spectrometry for sequence analysis: the fragment-fragment connectivity. This connectivity can be used to solve the peptide sequence puzzle by piecing the fragments together, as is the primary aim of analytical mass spectrometry. Here we investigate and quantify the advantage presented by fragment connectivity for deducing a peptide sequence by numerical simulation of the number of possibilities to assign fragment-fragment correlations of a given peptide to the amino acid sequence of a different ("erroneous") peptide of the same or nearly the same m/z. This number, expressed as a fragment ambiguity rate (FAR) (see Appendix D), is compared to the number of possible false-positive assignments based on individual fragment ions measured by standard 1D MS. The lower the FAR of a spectral signal (fragment-fragment correlation for 2D-PC-MS or individual fragment for 1D MS), the lower the number of different erroneous peptides that can produce this signal, and thus, the more reliable the signal is for the purpose of identifying the correct peptide sequence. Reducing this FAR is the fundamental driver for the development of instrumentation with higher mass resolution [25,26].
We calculate the relative FARs of fragment-fragment correlations for a sample of 50 000 peptide sequences from the UniProt/Swiss-Prot protein database [38] based on typical fragmentation of protonated peptides under CID [39]. These fragment ambiguity rates are calculated by matching the theoretical correlations of each peptide from the randomly selected sample pool against the theoretical correlations of the full set of isobaric (within 7-ppm mass tolerance) peptides of the database. The resulting relative FAR values are compared to the relative FARs calculated for matching individual (uncorrelated) fragments produced by the same sample of 50 000 peptides to simulate 1D MS analysis. Further details of the calculations are given in Appendix D.
The results of the FAR comparison between 2D-PC-MS and 1D MS are presented in Fig. 5. The FAR data reveal that even at the relatively large 0.8-Da mass tolerance for fragment ion matching, the 2D fragment-fragment correlations involving internal fragments are significantly more sequence specific than the absolute lower limit of what 1D MS can achieve in principle at the theoretical infinite mass accuracy and resolution, for any fragment ion type. This superior sequence specificity is shown by the red bar at mass tolerance 0.8 Da in Fig. 5(b) vs the black bar at zero mass tolerance in Fig. 5(a), whose height is marked by a dashed line to guide the eye. The abundance of internal fragments in peptide mass spectra relative to terminal ones is determined by two counteracting trends: (i) the number of theoretically possible internal fragments scales as the square of the sequence length (i.e., much more strongly than the number of complementary terminal fragments, which scales linearly), and (ii) their production requires more bond cleavages, i.e., is less probable at lower activation energies. As a result, the actual fraction of the internal fragments in a spectrum depends only weakly on the peptide sequence length and can reach about 50% of TIC depending on the activation mode [40]. Thus, our simulations of fragment ambiguity rate indicate that for up to half of the fragments, the fragment-fragment correlations provide sequence specificity that cannot be matched by 1D MS even theoretically in an "ideal" (infinite mass-resolution and accuracy) measurement. Notably, the internal-ion correlations can sometimes be the only available key to discriminating between similar isomeric structures [41].
The data presented in Fig. 5 show a large difference between the FARs for correlations between complementary terminal fragments vs those involving internal ions. While the complementary terminal-ion relative FARs in 2D-PC-MS [green bars in Fig. 5(a)] are found to be similar in magnitude to those of the individual terminal ions in 1D MS [black bars in Fig. 5(a)], at all mass tolerances, the relative FARs of 2D-PC-MS correlations involving an internal ion [red bars in Fig. 5(b)] are more than an order of magnitude below the 1D MS values [blue bars in Fig. 5(b)]. This profound difference is explained by the following mass conservation argument. In the case of complementary fragments, the masses of which add up to the total peptide mass, a chance matching of one of the fragments ensures the mass matching of its complementary counterpart and therefore also the random matching of the full complementary ion pair. On the contrary, for correlations involving noncomplementary fragment ions, e.g., an internal fragment with a terminal one, the above mass conservation argument does not apply, because a part of the peptide sequence is missing, and matching a random fragment at m/z close (within a given tolerance) to the m/z of one of the true correlating ions does not guarantee matching another fragment at m/z similar to its correlating 2D-PC-MS partner. In other words, the random matching of a terminal-ion-internal-ion correlation requires the simultaneous occurrence of two approximately independent random events, making the random match of a correlation much less probable than a match of a single ion. This reduced probability is reflected in the low 2D-PC-MS FARs for internal-ion correlations [see Fig. 5(b)].

FIG. 5. … In fact, the ambiguity rate for 2D signals involving internal fragments, even at a modest 0.8-Da tolerance, drops well below the theoretical lowest limit (0-Da tolerance) of the 1D technique for any fragment type (horizontal dashed lines).
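The mass conservation argument can be condensed into a back-of-the-envelope estimate. It rests on simplifying assumptions introduced here for illustration only: a random fragment of an erroneous peptide matches the m/z of a given ion within tolerance with some probability ($p_X$ or $p_Y$, symbols not used in the original text), the erroneous peptide has the same precursor mass, and chance matches of noncomplementary ions are independent.

```latex
% p_X, p_Y: assumed chance-match probabilities of the two ions of a correlation
\begin{align}
  P_{\text{complementary}}(X, Y)    &\approx p_X
      && \text{(the match of $Y$ is implied by mass conservation)}, \\
  P_{\text{noncomplementary}}(X, Y) &\approx p_X \, p_Y
      && \text{(two approximately independent chance matches)}.
\end{align}
```

With hypothetical values $p_X = p_Y = 0.05$, a complementary pair would match by chance about as often as a single ion (0.05), whereas a noncomplementary pair would match with probability of about 0.0025, a 20-fold reduction.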

VI. CONCLUSIONS AND OUTLOOK
In conclusion, we introduce a new method for the analysis of decompositions of large ionized molecules: two-dimensional partial-covariance mass spectrometry. 2D-PC-MS is based on a conceptually new idea of removing the nonphysical extrinsic correlations using the information contained in the acquired spectrum itself. We experimentally and theoretically validate the use of the total ion count as a single spectrum-derived parameter for self-correcting partial-covariance mapping. The TIC parameter derived from the measured spectrum itself embodies the combined effect of all the known and unknown experimental parameters on each spectral signal simultaneously and serves as a key to unlocking the physical correlations between the biomolecular fragments within the self-correcting partial-covariance paradigm. Our method not only enables but in fact makes it straightforward to obtain 2D maps of fragment-fragment correlations of species 2 orders of magnitude larger than has been possible so far: peptides, oligonucleotides, and proteins in the mass region of $10^3$ to $10^4$ Da.
The question as to the dynamical mechanisms of radiation damage to biomolecules is only now becoming possible to answer with high-brightness ultrafast x-ray sources. This question is important to our understanding of life on Earth as well as on other planets, medical physics, and potential new routes to structure determination. Large-molecule fragmentation experiments have already been conducted [42] and are planned at x-ray free-electron lasers, with the objective of furthering the time-resolved understanding of x-ray fragmentation pathways that may be important in radiation damage mechanisms in biology and the prospects for single-molecule imaging via the "diffract-before-destroy" concept. While coupling of the linear ion-trap mass spectrometer to an x-ray beam line has recently been achieved [43], the present development paves the way for detailed fragmentation studies within an x-ray facility setting by demonstrating the full compatibility of the self-correcting partial-covariance mapping with electrospray ionization [11] and ion-trapping [19] technologies that are core to biomolecular fragmentation research. Therefore, we expect TIC-based self-correcting partial-covariance mapping to provide a direct route to unraveling the complex biomolecular XFEL-induced fragmentation pathways.
In addition to their mechanistic significance, the fragment-fragment correlations revealed by the self-correcting partial-covariance mapping open the way for applying covariance-mapping spectroscopy to solving the inverse problem, namely, the reconstruction of molecular structure from the spectrum. This inverse problem is central to analytical mass spectrometry. Our numerical simulations show that fragment-fragment correlations involving internal fragments of the biomolecular sequence provide a structural fingerprint of unparalleled specificity. The proposed method is developed and demonstrated using a commercial benchtop mass spectrometer that requires no hardware modifications, which we believe will facilitate the wide utilization of 2D-PC-MS as a practical tool. In general, the applicability of 2D-PC-MS depends on sufficiently high and uniform fragmentation-to-detection efficiency. Once this is ensured, 2D-PC-MS can be applied to any chemical species, that is, not only to the peptides, proteins, and oligonucleotides considered in this work but also to oligosaccharides, lipids, and metabolites. For the same reason, 2D-PC-MS can be applied in combination with any ionization and activation method, including ultraviolet photodissociation, infrared multiphoton dissociation, femtosecond-laser ionization and dissociation, as well as ETD and ECD.

The authors of Ref. [29] developed a formal theory of covariance for noisy Poisson processes. For such processes, the event rate follows a so-called "augmented Poisson distribution," which is a Poisson distribution for which the mean event rate $\nu = \gamma\nu_0$ is itself sampled from a normal distribution, with the constant $\nu_0$ being the central event rate and $\gamma$ normally distributed and centered on $\gamma = 1$. The statistics are analyzed in terms of the "compound outcomes" of the decomposition of a molecular ensemble [29].
This framework can be used to analyze a covariance measurement on a system with known outcome probabilities for the mechanism being probed and known finite fragmentation-to-detection efficiency. Systematically considering the probability of each relevant spectral measurement yields an analytical expression for the simple covariance between any number of spectral channels.
Here we extend the theoretical approach of Ref. [29] to derive an expression for the partial covariance between two spectral signals where the total number of measured ions (TIC) is used as a single partial-covariance parameter [Eq. (3)]. We are interested in the partial covariance between two fragment ions X and Y for the case of the induced fragmentation of a biomolecular ion. The total number of parent ions subjected to analysis at each scan is assumed to follow the noise-augmented Poisson distribution, which we confirm experimentally for our measurements in Fig. S1 in the Supplemental Material [30]. We consider the following probabilities for the decomposition of a single parent ion:
(1) X and Y are created in the parent ion decomposition, with probability $P = \alpha_1$.
(2) X and another molecular fragment Z that is not Y are created in the parent ion decomposition (e.g., Z is the product of a neutral loss from Y), $P = \alpha_2$.
(3) Y and another molecular fragment Z that is not X are created in the parent ion decomposition (e.g., Z is the product of a neutral loss from X), $P = \alpha_3$.
(4) Any two detectable molecular fragments, neither of which is X or Y, are created in the parent ion decomposition (e.g., the parent ion dissociates to two other complementary fragment ions A + B, or to a terminal fragment ion A, an internal fragment ion B, and a neutral fragment C), $P = \alpha_4$.
(5) The parent ion does not fragment, $P = \alpha_5$.
We then account for the finite fragmentation-to-detection efficiency $\beta$ by considering the possible measurement outcomes following an elementary fragmentation event.
The following elementary measurement events are relevant for the TIC-based self-correcting partial covariance:
(1) X and Y are both measured at the detector, $P_1 = \alpha_1 \beta^2$.
(2) Only X is detected (i.e., Y or Z is also produced but is not observed because of the finite fragmentation-to-detection efficiency), $P_2 = (\alpha_1 + \alpha_2)\beta(1 - \beta)$.
(3) Only Y is detected (i.e., X or Z is also produced but is not observed), $P_3 = (\alpha_1 + \alpha_3)\beta(1 - \beta)$.
(4) X and another ion Z that is not Y (e.g., Z is the product of a neutral loss from Y) are both measured at the detector, $P_4 = \alpha_2 \beta^2$.
(5) Y and another ion Z that is not X (e.g., Z is the product of a neutral loss from X) are both measured at the detector, $P_5 = \alpha_3 \beta^2$.
(6) Only one ion that is not X or Y is detected (i.e., the parent ion dissociates to two different fragment ions A + B and only A or B is observed, a nonfragmented parent ion is detected, or fragmentation leads to X/Y + Z but X/Y is not detected).
(7) Any two ions, neither of which is X or Y, are measured at the detector (i.e., the parent ion dissociates to two different fragment ions A + B, both of which are measured at the detector), $P_7 = \alpha_4 \beta^2$.
Following the procedure outlined in Ref. [29], we obtain the final expression for the TIC-based self-correcting partial covariance between X and Y, written in terms of $\alpha_s = \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4$. We observe complete parent ion fragmentation in the experiments performed in this work, meaning $\alpha_5 \approx 0$, and the expression for the TIC-based self-correcting partial covariance between two fragment ions X and Y simplifies to
$\mathrm{pCov}(X, Y; \mathrm{TIC}) = \beta^2 \nu_0 \alpha_1 - \omega \nu_0$, (A2)
where $\nu_0$ is the average number of trapped ions and $\omega$ collects the contribution of the competing processes. The fragmentation-to-detection efficiency $\beta$ is typically required to be approximately 30% or higher for successful covariance analysis [34].
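As an illustration, the measurement-event probabilities listed above can be tabulated for given $\alpha_1, \ldots, \alpha_5$ and $\beta$. This is a minimal sketch, not the authors' code: the function name and dictionary keys are hypothetical, and any expression not legible in the text (in particular $P_6$) is inferred from the event descriptions under the assumptions that every fragmenting parent yields two detectable ions, each detected independently with probability $\beta$, and that a nonfragmented parent ion is also detected with probability $\beta$.

```python
def event_probabilities(a1, a2, a3, a4, a5, beta):
    """Tabulate the elementary measurement-event probabilities P1..P7.

    a1..a5 are the single-parent decomposition probabilities and beta is
    the fragmentation-to-detection efficiency. P6 and the 'none' entry are
    inferred from the event descriptions (assumption, see lead-in).
    """
    if abs(a1 + a2 + a3 + a4 + a5 - 1.0) > 1e-9:
        raise ValueError("alpha_1..alpha_5 must sum to 1")
    both = beta ** 2           # both ions of a pair are detected
    one = beta * (1.0 - beta)  # one specific ion detected, its partner lost
    probs = {
        "P1": a1 * both,
        "P2": (a1 + a2) * one,
        "P3": (a1 + a3) * one,
        "P4": a2 * both,
        "P5": a3 * both,
        # Assumed form: A or B alone, an intact parent, or Z alone.
        "P6": 2.0 * a4 * one + a5 * beta + (a2 + a3) * one,
        "P7": a4 * both,
    }
    # Probability that no ion at all is detected from this parent.
    probs["none"] = (a1 + a2 + a3 + a4) * (1.0 - beta) ** 2 + a5 * (1.0 - beta)
    return probs
```

Under these assumptions the eight entries form a complete partition of the outcomes, so they sum to 1 for any valid input, which is a quick consistency check on the event list.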
Equation (A2) shows that the TIC-based self-correcting partial covariance between X and Y has a positive contribution that is directly proportional to the probability of a parent ion producing the fragments X and Y, and a negative contribution stemming from competing processes.
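The structure of Eq. (A2) can be checked with a short calculation. The following is a sketch under stated assumptions rather than a reproduction of the source derivation: the parent-ion number is augmented-Poisson distributed with central rate $\nu_0$ and relative noise variance $\sigma^2$ (a symbol introduced here), $\alpha_5 = 0$, every parent yields two detectable ions that are each detected independently with probability $\beta$, the TIC is the total number of detected ions, and the partial covariance has the standard single-parameter form. The explicit expression for $\omega$ in the last line is the outcome of this sketch only.

```latex
\begin{align}
\mathrm{Cov}(X,Y) &= \nu_0\alpha_1\beta^2
   + \sigma^2\nu_0^2\beta^2(\alpha_1+\alpha_2)(\alpha_1+\alpha_3), \\
\mathrm{Cov}(X,\mathrm{TIC}) &= (\alpha_1+\alpha_2)\,\beta K, \qquad
\mathrm{Cov}(Y,\mathrm{TIC}) = (\alpha_1+\alpha_3)\,\beta K, \\
\mathrm{Var}(\mathrm{TIC}) &= 2\beta K, \qquad
K = \nu_0(1+\beta) + 2\beta\sigma^2\nu_0^2, \\
\mathrm{pCov}(X,Y;\mathrm{TIC}) &= \mathrm{Cov}(X,Y)
   - \frac{\mathrm{Cov}(X,\mathrm{TIC})\,\mathrm{Cov}(\mathrm{TIC},Y)}{\mathrm{Var}(\mathrm{TIC})} \\
 &= \beta^2\nu_0\alpha_1
   - \tfrac{1}{2}\beta(1+\beta)(\alpha_1+\alpha_2)(\alpha_1+\alpha_3)\,\nu_0 .
\end{align}
```

In this sketch the terms proportional to $\sigma^2$ cancel exactly, which reflects the self-correcting property, and the negative term is second order in the probabilities $\alpha_{1,2,3}$, so it becomes negligible when these are small.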
Given high enough fragmentation-to-detection efficiency and provided the number of dissociation pathways available to the fragmenting molecule is large enough such that $\alpha_4 \gg \alpha_{1,2,3}$ (practically, this is the case for all fragmenting biological polymers), the first term of Eq. (A2) dominates, and the TIC partial covariance is linearly dependent on both parent ion abundance and the branching ratio of the process that produces X and Y together, meaning 2D-PC-MS is suitable for quantitative analysis. The formal analysis of the TIC-based self-correcting partial covariance also shows that it is false-positive-free: If the TIC partial covariance between a fragment pair is positive, to a desired degree of statistical significance, this pair is known to have been produced in the same fragmentation process, to at least the same degree of statistical significance.
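For concreteness, a TIC-based partial covariance can be estimated from scan-by-scan intensities as sketched below. The sketch assumes that Eq. (3) has the standard single-parameter form pCov(X, Y; I) = Cov(X, Y) - Cov(X, I) Cov(I, Y) / Var(I); the function names and the list-based data layout are hypothetical.

```python
def covariance(u, v):
    """Population covariance of two equal-length sequences of scan values."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n


def partial_covariance(x, y, tic):
    """Covariance of x and y with the part linearly explained by tic removed."""
    var_tic = covariance(tic, tic)
    if var_tic == 0.0:
        raise ValueError("TIC does not fluctuate; partial covariance undefined")
    return covariance(x, y) - covariance(x, tic) * covariance(tic, y) / var_tic
```

If two signals fluctuate only because both scale with the TIC, their simple covariance is positive but their partial covariance vanishes, whereas a shared fluctuation that is uncorrelated with the TIC survives the correction.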

APPENDIX B: 2D-PC-MS FRAGMENT CORRELATION PATTERNS
The fragment-fragment correlation pairs (X, Y) revealed by 2D-PC-MS can be classified as siblings ($Z \to X + Y$, correlations between products of the same decomposition reaction), niblings ($Z \to X + Y_1$, $Y_1 \to X_1 + Y$, correlations involving a product of further decomposition), cousins ($Z \to X_1 + Y_1$, $X_1 \to X + X_2$, $Y_1 \to Y + Y_2$, correlations between products of further decomposition), etc. In the peptide fragmentations considered in this work, siblings are the complementary terminal ions formed by breaking of a single peptide bond, such as VTIM-$^{+}$ and -PK(Ac)DIQLAR$^{2+}$ in Fig. 3; niblings include products of a neutral loss of a small molecule (e.g., H$_2$O, NH$_3$) from one of the fragment ions, such as [VTI- $-$ H$_2$O]$^{+}$ and -MPK(Ac)DIQLAR$^{2+}$ in Fig. 3, and pairs of correlated fragments including an internal ion, such as -PK(Ac)DIQ-$^{+}$ and -LAR$^{+}$ in Fig. 3; cousins are, for example, products of neutral losses from both of the fragment ions formed by peptide bond dissociation, such as [VTI- $-$ H$_2$O]$^{+}$ and [-MPK(Ac)DIQLAR $-$ H$_2$O]$^{+}$ in Fig. 3.
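The classification above amounts to comparing how many decomposition steps separate each fragment from their most recent common precursor. A minimal sketch, assuming the fragmentation history is available as a mapping from each product to its immediate precursor (the data structure, labels, and function names are hypothetical):

```python
def lineage(node, parent_of):
    """Return [node, its precursor, that precursor's precursor, ...]."""
    chain = [node]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


def classify_pair(x, y, parent_of):
    """Name the relation of fragments x and y in a fragmentation tree."""
    line_x, line_y = lineage(x, parent_of), lineage(y, parent_of)
    common = next((n for n in line_x if n in line_y), None)
    if common is None:
        return "unrelated"
    if common in (x, y):
        return "same lineage"  # one fragment is a precursor of the other
    # Number of decomposition steps from the common precursor to each fragment.
    steps = tuple(sorted((line_x.index(common), line_y.index(common))))
    names = {(1, 1): "siblings", (1, 2): "niblings", (2, 2): "cousins"}
    return names.get(steps, "more distant")


# Example tree: Z -> X1 + Y1, X1 -> X + X2, Y1 -> Y + Y2
example_tree = {"X1": "Z", "Y1": "Z", "X": "X1", "X2": "X1", "Y": "Y1", "Y2": "Y1"}
```

With this example tree, `classify_pair("X1", "Y1", example_tree)` gives siblings, `classify_pair("X1", "Y", example_tree)` gives niblings, and `classify_pair("X", "Y", example_tree)` gives cousins.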

APPENDIX C: CORRELATION SCORE
It is useful to distinguish low-intensity intrinsic correlations from the statistical noise present because of the finite summations in Eq. (3). We achieve this discrimination by introducing a correlation score $S(X, Y)$, calculated by normalizing the volume $V$ of each 2D-PC-MS signal (X, Y) to the standard deviation $\sigma(V)$ of that volume upon jackknife resampling [44]: $S(X, Y) = V(X, Y)/\sigma(V)$. We rank all the 2D-PC-MS signals according to their correlation scores, taken as a percentage of the score of the highest-scoring off-diagonal peak on the map. The $S(X, Y)$ score reflects the stability of the 2D-PC-MS correlation signals and is superior to 2D-PC-MS peak height or volume as a measure of true structural signals; see Fig. S7 in the Supplemental Material [30]. The correlation score can be used as a single universal parameter for 2D-PC-MS peak selection across the full m/z range.
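A jackknife-normalized score of this kind can be sketched as follows. Several choices here are assumptions made for illustration: the peak volume V is taken to be the TIC partial covariance of two already-integrated peak intensities, the resampling is leave-one-scan-out, and $\sigma(V)$ is the standard jackknife estimate of the standard deviation. All function names are hypothetical.

```python
import math


def _cov(u, v):
    """Population covariance of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n


def _pcov(x, y, tic):
    """Single-parameter partial covariance with the TIC as the parameter."""
    return _cov(x, y) - _cov(x, tic) * _cov(tic, y) / _cov(tic, tic)


def correlation_score(x, y, tic):
    """Volume of the (x, y) signal divided by its jackknife deviation.

    x, y, tic are lists with one integrated intensity per scan.
    """
    n = len(tic)
    volume = _pcov(x, y, tic)
    replicates = []
    for i in range(n):
        # Recompute the volume with scan i left out.
        xs, ys, ts = (s[:i] + s[i + 1:] for s in (x, y, tic))
        replicates.append(_pcov(xs, ys, ts))
    mean_rep = sum(replicates) / n
    sigma = math.sqrt((n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates))
    if sigma == 0.0:
        raise ValueError("jackknife replicates are identical; score undefined")
    return volume / sigma
```

This direct form recomputes the statistic n times and therefore scales quadratically with the number of scans; it is meant to show the normalization, not to be efficient.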

APPENDIX D: DETAILS OF THE FRAGMENT AMBIGUITY RATE CALCULATION
For every theoretical fragment ion (1D) or fragment-ion correlation (2D) x, the relative FAR value is calculated by summation over all the $N_P$ database peptide ions P lying within 7 ppm of the m/z of the parent ion of the fragment ion (or fragment-fragment correlation) x. Here $N_m(P, T, x)$ is the number of m/z matches, within the fragment m/z tolerance $T$, between x and all the possible fragment ions (or fragment-ion correlations) of the fragmenting peptide P. The resulting relative FARs are averaged over all x of chain lengths between 2 and 15 residues within a given category, e.g., individual y-type terminal ions or correlations between a b ion and an internal ion. The comparison is performed at $T$ values of 0.8, 0.02, and 0 Da (the last matching isomeric fragments only, i.e., "ideal" m/z measurement) for both 1D fragment-ion and 2D fragment-fragment correlation matching.
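The two tolerance tests involved, a relative (ppm) window on the parent ion and an absolute (Da) window on the fragments, can be sketched as below. The helper names are hypothetical, and treating a 2D match as "both members of the pair within tolerance, in either order" is an assumption about how m/z pairs are compared.

```python
def within_ppm(mz, reference_mz, ppm):
    """True if mz lies within +/- ppm parts per million of reference_mz."""
    return abs(mz - reference_mz) <= reference_mz * ppm * 1e-6


def count_matches_1d(x_mz, fragment_mzs, tol_da):
    """Number of fragment m/z values within tol_da of the query x_mz."""
    return sum(1 for m in fragment_mzs if abs(x_mz - m) <= tol_da)


def count_matches_2d(x_pair, fragment_pairs, tol_da):
    """Number of m/z pairs whose two members both match the query pair."""
    a, b = x_pair
    count = 0
    for c, d in fragment_pairs:
        direct = abs(a - c) <= tol_da and abs(b - d) <= tol_da
        swapped = abs(a - d) <= tol_da and abs(b - c) <= tol_da
        if direct or swapped:
            count += 1
    return count
```

Because a 2D match requires two simultaneous coincidences, the same candidate list generally produces fewer matches in 2D than in 1D at a given tolerance. A tolerance of 0 Da requires exact equality, which is only meaningful for theoretical masses computed identically on both sides.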
To perform the FAR analysis, we subject the protein sequences of the UniProt/Swiss-Prot database [38] to in silico tryptic digestion (trypsin being the most widely used protease; no missed cleavages are considered). We ignore the 0.47% of UniProt/Swiss-Prot protein sequences with ambiguous amino acid residue coding and limit the digestion products to a minimum length of five residues and a maximum molecular weight of 2000 Da. From the resulting tryptic digest, we select five test sets of 10 000 peptides at random. For each such set, we fragment in silico the 3+ parent ion of each peptide within the set to produce every possible triplet of a $b^{+}$ ion, an internal $b^{+}$ ion, and a $y^{+}$ ion from the parent ion. First, we extract from these triplets all ion-trap-measurable correlations between a $b^{+}$ ion and an internal $b^{+}$ ion and between an internal $b^{+}$ ion and a $y^{+}$ ion (ion-trap measurable meaning that the m/z value of each of the correlated fragments is greater than or equal to 1/3 of the parent ion m/z [45]). Each of these 2D-PC-MS fragment correlations is next compared (m/z pair to m/z pair, at a series of fragment m/z tolerances) to every $b^{+}$/internal $b^{+}$ and internal $b^{+}$/$y^{+}$ ion correlation from all triply charged peptide sequences of the full database tryptic digest within 7-ppm m/z tolerance of the parent peptide ion.
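The digestion and triplet-enumeration steps can be sketched at the sequence level as follows. This is a hypothetical helper pair under stated assumptions: cleavage is taken to occur C-terminal to K or R with no missed cleavages, the optional proline rule is included only as a switch because the text does not say whether it was applied, and the 2000-Da upper mass limit is omitted.

```python
def tryptic_digest(protein, min_len=5, proline_rule=False):
    """Split a protein sequence after every K or R (no missed cleavages).

    With proline_rule=True, a K/R followed by P is not cleaved (assumption:
    the source does not state whether this convention was used).
    """
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        blocked = proline_rule and not is_last and protein[i + 1] == "P"
        if is_last or (residue in "KR" and not blocked):
            peptide = protein[start:i + 1]
            if len(peptide) >= min_len:
                peptides.append(peptide)
            start = i + 1
    return peptides


def fragment_triplets(peptide):
    """Yield every (b, internal, y) split produced by two backbone cleavages."""
    n = len(peptide)
    for i in range(1, n - 1):
        for j in range(i + 1, n):
            yield peptide[:i], peptide[i:j], peptide[j:]
```

A peptide of n residues yields (n - 1)(n - 2)/2 triplets, one for each choice of two distinct cleavage sites, which is why internal-ion correlations multiply so quickly with chain length.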
For the analogous 1D relative FARs, the m/z value of each ion-trap-measurable terminal $b^{+}$ and $y^{+}$ fragment ion extracted from each triplet is compared individually to that of every possible terminal $b^{+}$, $b^{2+}$, $y^{+}$, and $y^{2+}$ ion from all triply charged peptide sequences of the full tryptic digest with parent peptide ion m/z within 7 ppm, to find all potential 1D fragment-ion matches. For the 1D internal $b^{+}$ fragment ions, the m/z value of each ion-trap-measurable fragment ion is additionally compared to every possible internal $b^{+}$ and $b^{2+}$ ion from the same set of peptide sequences from the tryptic digest, now modeling a 1D matching scheme that also considers internal ions. For comparison, we also calculate the relative FAR for matching each 2D correlation of every $b^{+}$ and $y^{+}$ terminal ion with its complementary terminal ion ($y^{2+}$ and $b^{2+}$, respectively) against every $b^{+}$/$y^{2+}$ and $b^{2+}$/$y^{+}$ ion correlation, for all triply charged peptide sequences of the tryptic digest within 7-ppm m/z tolerance of the parent peptide ion.
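The fragment m/z values entering these comparisons follow from standard monoisotopic residue masses. The sketch below uses unmodified residues only; the function names are hypothetical, and the use of monoisotopic rather than average masses is an assumption.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def b_mz(sequence, charge=1):
    """m/z of a terminal or internal b-type ion (residues plus protons)."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence)
    return (mass + charge * PROTON) / charge


def y_mz(sequence, charge=1):
    """m/z of a y-type ion (residues plus water plus protons)."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (mass + charge * PROTON) / charge


def parent_mz(sequence, charge=3):
    """m/z of the intact protonated peptide, same composition as a full y ion."""
    return y_mz(sequence, charge)


def ion_trap_measurable(fragment_mz, precursor_mz):
    """Low-mass cutoff: the fragment m/z must be at least 1/3 of the parent m/z."""
    return fragment_mz >= precursor_mz / 3.0
```

A useful sanity check is mass and charge conservation for complementary terminal ions: the $b^{+}$ ion of a prefix and the $y^{2+}$ ion of the remaining suffix together carry the mass and the three protons of the 3+ parent ion.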