Conceptual understanding through efficient inverse-design of quantum optical experiments

One crucial question within artificial intelligence research is how this technology can be used to discover new scientific concepts and ideas. We present Theseus, an explainable AI algorithm that can contribute to science at a conceptual level. This work entails four significant contributions. (i) We introduce an interpretable representation of quantum optical experiments amenable to algorithmic use. (ii) We develop an inverse-design approach for new quantum experiments, which is orders of magnitude faster than the best previous methods. (iii) We solve several crucial open questions in quantum optics, which is expected to advance photonic technology. Finally, and most importantly, (iv) the interpretable representation and drastic speedup produce solutions that a human scientist can interpret outright to discover new scientific concepts. We anticipate that Theseus will become an essential tool in quantum optics and photonic hardware, with potential applicability to other quantum physical disciplines.


Introduction
Photons are at the core of many quantum technologies that promise advances for imaging applications [1], efficient metrological schemes [2], fundamentally secure communication protocols [3] as well as simulation [4] and computation techniques [5][6][7] that are beyond the capabilities of their classical counterparts. Besides, photons are also among the core players in the experimental investigation of fundamental questions about the local and realistic nature of our universe.
To advance technological and fundamental progress further and to enable the exploration of numerous proposed ideas in the laboratories, new experimental concepts and ideas are instrumental. Frequently, however, the design of experimental setups even for well-defined targets is challenging for the intuitions of human experts, and existing systematic schemes (e.g. [18]) to date only provide solutions for specific experimental scenarios. For that reason, computational design methods for quantum optical experiments have been introduced [19], in the form of topological search augmented with machine learning [20,21], genetic algorithms [22,23], active learning approaches [24], and optimization of parametrized setups [25]. Unfortunately, due to the complexity and size of the Hilbert space as well as the breadth of quantum optical applications, those algorithms may have severe drawbacks, such as inefficient discovery rates, requirements of a huge amount of training data or specialization on narrow sets of problems. Most importantly, no method so far has shown how to systematically extract scientific ideas, concepts and understanding from the solutions of the computer algorithm.
* mario.krenn@utoronto.ca † alan@aspuru.com
Here we demonstrate THESEUS, an inverse-design algorithm for quantum optics with a highly interpretable representation that allows scientists to rationalize the solutions quickly. THESEUS is generally applicable to discrete-variable quantum optics problems (including post-selected and heralded states, probabilistic and deterministic photon sources), does not need training data, and is orders of magnitude faster than previous comparable approaches. The speed-up allows for the application of topological optimization, which uncovers the conceptual cores underlying the solution. Physicists can then interpret, understand and generalize the underlying ideas and concepts. These advances allow us to apply THESEUS to solve several previously open questions about quantum experiments. Concretely, we investigate complex multiphotonic entanglement, the generation of heralded entanglement and complex photonic quantum transformations. In all of these cases, we uncover previously unknown generalizable patterns and new experimental ideas and interpretations.
Our approach differs significantly from others that try to employ machine learning to extract scientific concepts. The main difference is that these applications so far have been applied to rediscover previously known concepts [26]. Examples involve the identification of astronomical concepts such as the heliocentric worldview, which had already been considered by Copernicus [27], the arrow of time and related thermodynamical concepts that were discovered in the 20th century [28], or the identification of certain interferometric devices that have been used by optical physicists for many years [24]. Those are significant works that indicate great future possibility. However, they come with a grain of salt: It is not clear how much prior knowledge the scientists implicitly use to identify those concepts from the computer algorithms. Therefore, it is unclear how to extend these methods to actual open questions.
arXiv:2005.06443v3 [quant-ph] 15 Nov 2020
In quantum optics, new concepts have been identified in two works [29,30] using a brute-force computational search algorithm [20]. There, tens of thousands of CPU-hours were necessary to arrive at a useful design. Those solutions were represented directly as a sequence of optical elements, which is very unintuitive to interpret conceptually. Moreover, the sequences were highly non-optimized because they emerged through random processes. As a consequence, scientists needed weeks or even months to rationalize the underlying principles (see [19] for more details).
In contrast to those previous approaches, we introduce for the first time an algorithm that produces highly interpretable solutions, which we apply to unsolved problems in science. The discovered solutions allow human scientists to rationalize the new, underlying concepts in quasi-real-time. We explicitly demonstrate this by solving several previously unresolved questions. In all of those cases, we can interpret and understand the underlying design concepts outright. To the best of our knowledge, THESEUS is the first algorithm that can provide targeted and systematically new conceptual understanding in a scientific domain. We believe, therefore, that THESEUS is an important step towards the goal of interpretable and explainable AI (XAI) in science that will assist human researchers at a conceptual level.

Results
Graph Theory-Quantum Experiment Mapping - Weighted coloured graphs (explained in Fig.1) can encode the information produced by a photonic quantum experiment involving probabilistic photon-pair sources [31] and linear optical components [32]. The vertices correspond to spatial photon paths, and an edge between vertices $v_1$ and $v_2$ stands for a probabilistic photon pair in paths $v_1$ and $v_2$. The edge colour represents the internal mode number of the photons, and the edge weights ω stand for amplitudes. Previously, the description was only applicable to post-selected states.

Figure 1. A weighted edge-coloured graph as an abstract and efficient representation of the quantum information of a large variety of quantum optics experiments. A: As a specific example, we show a graph with four vertices and four coloured and weighted edges. The vertices a-d correspond to photonic paths, the edges correspond to correlated photon pairs, the edge colours stand for mode numbers, and weights $\omega \in \mathbb{C}$ stand for complex coefficients. Probabilistic sources create the photon pairs (edges). Thus the entire information about the quantum state is represented by Φ(ω), with $x_k^\dagger$ being a creation operator of a photon in path x with mode number k. The information carried in the graph can be translated to different schemes of quantum optical experiments, such as (B) standard bulk optics, for example with path encoding, or (C) polarisation encodings as the carrier of quantum information or (D) Entanglement by Path Identity. The results of the quantum experiments can directly be calculated from the information of the graph. For example, a prominent technique is to condition the state on detecting a photon in each of the four detectors (post-selection). The equivalent formulation in terms of graphs is the sum of all subsets of edges that contain every vertex exactly once. It reduces the example quantum state to two terms. If all weights are equal, the resulting quantum state is a four-qubit GHZ state. Access to Φ(ω) allows for the optimisation of non-postselected, heralded and triggered quantum states too, as we show in examples within the manuscript. (Panel labels in the figure: "Integrated Photonics"; "Conditioning on one photon in each path".)
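The post-selection rule described for Fig.1 (sum over all edge subsets that contain every vertex exactly once) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout `{(path_u, path_v, mode_u, mode_v): weight}`, the function names, and the specific four-edge example graph are hypothetical choices and are not taken from the paper's implementation.

```python
from collections import defaultdict


def perfect_matchings(vertices, edges):
    """Yield each tuple of edges that touches every vertex exactly once."""
    if not vertices:
        yield ()
        return
    first = vertices[0]
    for edge in edges:
        u, v = edge[0], edge[1]
        if u == v or first not in (u, v):
            continue
        partner = v if u == first else u
        if partner not in vertices:
            continue
        remaining = tuple(x for x in vertices if x not in (first, partner))
        for rest in perfect_matchings(remaining, edges):
            yield (edge,) + rest


def postselected_state(vertices, weights):
    """Map edge weights to unnormalised amplitudes {mode tuple: amplitude}."""
    amplitudes = defaultdict(complex)
    for matching in perfect_matchings(tuple(vertices), tuple(weights)):
        modes = {}
        amplitude = 1.0
        for (u, v, mode_u, mode_v) in matching:
            modes[u], modes[v] = mode_u, mode_v
            amplitude *= weights[(u, v, mode_u, mode_v)]  # product of edge weights
        amplitudes[tuple(modes[x] for x in vertices)] += amplitude  # sum over subgraphs
    return dict(amplitudes)


# Assumed example graph: two mode-0 edges and two mode-1 edges, all with weight 1.
graph = {
    ("a", "b", 0, 0): 1.0,
    ("c", "d", 0, 0): 1.0,
    ("a", "c", 1, 1): 1.0,
    ("b", "d", 1, 1): 1.0,
}
state = postselected_state("abcd", graph)
print(state)
```

With equal weights, only the terms (0, 0, 0, 0) and (1, 1, 1, 1) survive, which is the unnormalised four-qubit GHZ state mentioned in the caption.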
We significantly extend the abstract description of quantum optics experiments as coloured weighted graphs, demonstrating how general quantum optics technology and questions can be raised and solved using the new framework. The extensions allow us for the first time to use the framework of weighted colored graphs for computational design of quantum optical experiments and hardware.
Specifically, here we introduce a weight function Φ(ω) that gives access to the complete information of quantum optical experiments (rather than only post-selected states, as in [31][32][33]), and allows us to generalize the scope of the method significantly. It allows for the description of non-postselected states, triggered and heralded photonic states, states with multiple excitations per mode (such as NOON states) and general quantum transformations. Furthermore, it enables the description of photon-number sensitive and insensitive detectors (which correspond to different projections of Φ(ω)) and deterministic photon sources such as quantum dots (see SI for details). In addition, we show how graphs can be directly translated to several different schemes of photonic quantum optics, such as standard bulk optics, integrated photonics or entanglement by path identity [29]. A given graph can be translated in multiple ways to quantum experimental setups, while each setup corresponds to precisely one graph (more details in SI). These extensions were necessary to use the graph-theoretical description as a tool for the inverse-design of quantum experiments that are feasible in state-of-the-art quantum photonics laboratories.
Graph-based inverse-design of Quantum Experiments - The abstract and general representation of quantum experiments as graphs allows us to find a new method for inverse-designing quantum experiments. The idea is to write an optimisation objective function in terms of the weights ω of the graph. For example, if we aim to find an experimental setup that produces a specific quantum state, the objective function is the fidelity of the state encoded by the graph. If we aim to find a transformation, then the objective function is the gate fidelity. Importantly, one can use the same technique for more general optimization targets, where neither the quantum state nor the quantum experiment is known beforehand. Examples are quantum metrology, where the objective function would be the Fisher information (written in terms of weights ω), or quantum-enhanced imaging technologies, where the objective function could be a signal-to-noise ratio (again, in terms of weights ω).
The most general quantum state corresponds to a complete graph with all possible multi-coloured weighted edges between each vertex (see Fig.2). As an essential step, we need to construct the objective (e.g. state fidelity) in terms of weights, F (ω). While the entire quantum state Φ(ω) is directly defined by the edge weights, conditioning measurements are commonly used to obtain more intricate states and to overcome the lack of single-photon nonlinearities. Prominent examples for such measurements are conditioning on the simultaneous detection of one photon in each path (I) or conditioning on the detection of ancilla photons (II).
As an example in Fig.1, we show the construction of the fidelity for a 4-photon GHZ state $|GHZ\rangle = \frac{1}{\sqrt{2}}\left(|0,0,0,0\rangle + |1,1,1,1\rangle\right)_{a-d}$, where $|0\rangle$ and $|1\rangle$ stand for one photon in the internal mode 0 and 1 (such as horizontal or vertical polarisation), respectively. The subscript a-d means one photon is in each of the four paths a, b, c and d. Under the condition of simultaneous detection (I), the term $|0,0,0,0\rangle$ can be generated by three different subgraphs: two blue horizontal edges, vertical edges or crossed edges. The weight of a subgraph is the product of all its edge weights. The weight of the whole term is the sum of all weights of the subgraphs. Therefore, writing $\omega^{i,j}_{x,y}$ for the weight of the edge between paths x and y with mode numbers i and j, the weight of $|0,0,0,0\rangle$ is

$$\omega_{|0,0,0,0\rangle} = \omega^{0,0}_{a,b}\,\omega^{0,0}_{c,d} + \omega^{0,0}_{a,c}\,\omega^{0,0}_{b,d} + \omega^{0,0}_{a,d}\,\omega^{0,0}_{b,c}. \quad (1)$$

In an equivalent way, the amplitude of $|1,1,1,1\rangle$ can be written in terms of ω. As a result, we have

$$F(\omega) = \left|\langle GHZ|\psi(\omega)\rangle\right|^2 = \frac{\left|\omega_{|0,0,0,0\rangle} + \omega_{|1,1,1,1\rangle}\right|^2}{2\,N(\omega)^2}, \quad (2)$$

where N(ω) is a normalisation constant of the state emerging from the graph (more details in SI). The weights of the graph are optimised by minimising a loss function constructed from the fidelity and an additional L1 regularisation term with positive real coefficient α < 1. Inclusion of the L1 regularisation term can drive the optimisation towards a solution with smaller amplitudes, thereby opening ways to further reduce the edges of the graph by removing edges with small weights. For optimisation, we use the Broyden-Fletcher-Goldfarb-Shanno algorithm, an iterative quasi-Newton method that approximates the Hessian from gradient information to solve non-linear optimization problems. If we identify a solution with F(ω) above a limit (we use $F_{limit} = 0.95$) and small weights ω (we use $\omega_{limit} = 1$), we have found a suitable experimental setup candidate. At this point, as the loss minimization is fast, we can perform a topological optimisation. We reduce the size of the graph by iteratively removing individual edges.
We can choose the edge from a distribution that depends on the magnitude of the weights of the previous solution (with two special cases: choosing entirely randomly, and always choosing the edge with the smallest weight magnitude).
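To make the objective concrete, the following sketch evaluates the GHZ fidelity and a regularised loss for a four-path, two-mode graph. It is a minimal illustration, not the authors' code: the loss form `1 - F + alpha * sum(|w|)` is an assumption (the text only states that the loss is built from the fidelity and an L1 term), as are the value `alpha=0.1`, the dictionary layout, and all function names.

```python
import itertools

PATHS = "abcd"
# The three ways to pair up four paths (perfect matchings of the complete graph).
PAIRINGS = [
    (("a", "b"), ("c", "d")),
    (("a", "c"), ("b", "d")),
    (("a", "d"), ("b", "c")),
]


def term_weight(w, modes):
    """Weight of |i,j,k,l>: sum over pairings of the product of edge weights."""
    mode = dict(zip(PATHS, modes))
    total = 0.0
    for (x1, y1), (x2, y2) in PAIRINGS:
        total += (w.get((x1, y1, mode[x1], mode[y1]), 0.0)
                  * w.get((x2, y2, mode[x2], mode[y2]), 0.0))
    return total


def ghz_fidelity(w):
    """|<GHZ|psi>|^2 for the state post-selected on one photon per path."""
    terms = {m: term_weight(w, m) for m in itertools.product((0, 1), repeat=4)}
    norm_sq = sum(abs(t) ** 2 for t in terms.values())
    if norm_sq == 0.0:
        return 0.0
    overlap = terms[(0, 0, 0, 0)] + terms[(1, 1, 1, 1)]
    return abs(overlap) ** 2 / (2.0 * norm_sq)


def loss(w, alpha=0.1):
    """Assumed loss: infidelity plus an L1 penalty on the edge weights."""
    return 1.0 - ghz_fidelity(w) + alpha * sum(abs(x) for x in w.values())


balanced = {("a", "b", 0, 0): 1.0, ("c", "d", 0, 0): 1.0,
            ("a", "c", 1, 1): 1.0, ("b", "d", 1, 1): 1.0}
unbalanced = dict(balanced)
unbalanced[("a", "c", 1, 1)] = 0.5

print(ghz_fidelity(balanced), ghz_fidelity(unbalanced), loss(balanced))
```

A function of this shape, with the weights flattened into a real vector, could be handed to an off-the-shelf BFGS routine such as `scipy.optimize.minimize(..., method="BFGS")`; that wiring is omitted here to keep the sketch dependency-free.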
The new, smaller graph will be used to minimize the loss function in eq.(3). The topological optimisation reduces the size of the graph iteratively. It distills small structures such that human scientists can interpret and understand the underlying physical principles, and use the new knowledge to solve other cases. In many instances, we used these insights to find straightforward generalizations to infinitely large classes of situations. This is in stark contrast to typical artificial intelligence applications in the natural sciences [34], where the solution of a parameter optimisation is the final product, without the intention of discovering new scientific ideas (with few recent exceptions that re-discover previously known physical ideas [27,28]).

Figure 2. The main step is a minimization of the loss function, which contains the quantum fidelity in terms of weights of the graph. Additionally, an L1 regularization term controls the magnitude of the weights. If the weights identified by the optimisation, $\omega_S$, lead to fidelities larger than $F_{limit}$, and the magnitude of the weights is smaller than $\omega_{limit}$, then one edge of the graph is removed, and the optimisation continues with the smaller graph. On the other hand, if the criteria are not fulfilled, the same graph is optimised (with different starting conditions) until the discovery of a suitable solution, or the number of iterations exceeds $c_{limit}$. The result of THESEUS is a weighted graph that leads to sufficiently large fidelities, with a small number of edges. This topological optimisation enables the scientific interpretation and understanding of results. (Flowchart labels in the figure: "remove edge"; "Fin.")
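The edge-removal loop described above can be sketched as follows. This is a schematic reading of the workflow, not the published implementation: `prune`, `toy_optimise`, the `REQUIRED` edge set and the chosen limits are hypothetical, and the toy optimiser merely stands in for the BFGS minimisation (it reports success whenever an assumed GHZ core is still present). The sketch uses the special case of always removing the edge with the smallest weight magnitude.

```python
def prune(edges, optimise, f_limit=0.95, w_limit=1.0, c_limit=5):
    """Remove edges one by one while `optimise` still finds a good solution."""
    edges = list(edges)
    best = None
    failures = 0
    while edges and failures < c_limit:
        weights, fidelity = optimise(edges)
        small = all(abs(weights[e]) < w_limit for e in edges)
        if fidelity > f_limit and small:
            best = {e: weights[e] for e in edges}  # last graph that passed
            failures = 0
            # Special case from the text: drop the smallest-magnitude edge.
            edges.remove(min(edges, key=lambda e: abs(weights[e])))
        else:
            failures += 1  # retry the same graph; a real optimiser would restart
    return best


# Hypothetical conceptual core of a four-photon GHZ solution.
REQUIRED = {("a", "b", 0, 0), ("c", "d", 0, 0), ("a", "c", 1, 1), ("b", "d", 1, 1)}


def toy_optimise(edges):
    """Stand-in for the loss minimisation: succeeds iff the core is present."""
    weights = {e: (0.9 if e in REQUIRED else 0.05) for e in edges}
    fidelity = 1.0 if REQUIRED.issubset(edges) else 0.0
    return weights, fidelity


start = sorted(REQUIRED) + [("a", "d", 0, 0), ("b", "c", 0, 0), ("a", "d", 0, 1)]
solution = prune(start, toy_optimise)
print(sorted(solution))
```

In this toy run the three superfluous edges are removed first; once a core edge is removed the optimiser fails `c_limit` times and the loop returns the last graph that passed both criteria.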
Benchmarking - We compare the speed of THESEUS with previous approaches, using classes of high-dimensional multipartite states characterised by their Schmidt-Rank Vectors (SRV) as a benchmark [35]. In particular, we aim to discover maximally entangled three-party quantum states of up to ten local dimensions. This task is well understood theoretically, thus it represents a good benchmark. There are 81 unique entangled structures that could be generated using linear optics [33]. A pure topological circuit search using 150 CPU-hours has discovered 51 out of 81 different states in the set [20]. A reinforcement learning algorithm has identified 17 out of 81 different states, with speed comparable to the topological search algorithm [24]. THESEUS discovers 76 different states within 2 hours, where the first 17 are identified within two seconds, and the first 51 states in less than 15 minutes. This results in a speed-up of a factor 600.
We turn to a second benchmarking task: the identification of high-dimensional CNOT gates. A recent study has shown that the identification of the first photonic high-dimensional controlled operation took 150,000 CPU-hours [30]. Our algorithm finds a solution that is experimentally quantitatively simpler, within 1 CPU-second. This results in a speedup of a factor $10^8$. We come back to this example in Fig.5.
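The two quoted speed-up factors follow from simple ratios of the run times given above. The short check below assumes that the 15 minutes for the first 51 states may be compared directly with the 150 CPU-hours of the topological search; the variable names are illustrative.

```python
HOUR = 3600.0  # seconds

# First 51 states: 150 CPU-hours (topological search) vs. under 15 minutes.
srv_speedup = (150 * HOUR) / (15 * 60)

# High-dimensional CNOT: 150,000 CPU-hours vs. 1 CPU-second.
cnot_speedup = (150_000 * HOUR) / 1.0

print(srv_speedup)            # 600.0
print(f"{cnot_speedup:.1e}")  # 5.4e+08
```

The second ratio is about 5 x 10^8, consistent with the order of magnitude stated in the text.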
Scientific Discovery and Understanding - The improvement in speed shows that THESEUS is ready to go beyond benchmarks, and be applied to the discovery of new scientific targets and to the development of new scientific insights and understanding. Scientific understanding is essential to the epistemic aims of science [36], but rarely addressed in applications of artificial intelligence to the natural sciences. In the philosophy of science, pragmatic criteria have been found for scientific understanding, in particular by de Regt's award-winning work [36,37]. He describes that scientists can understand a phenomenon if they can recognise qualitatively characteristic consequences without performing exact calculations. We connect this criterion to our discoveries: We discover the first high-dimensional six-photonic GHZ states, which have been conjectured to be not constructible with linear optics. We can understand the underlying concept and use it to construct a simple method to generate high-dimensional GHZ states with an arbitrary number of photons. Furthermore, we discover the first solutions of heralded three-dimensional Bell states. We also understand the underlying concept, which, among others, contains an idea to destructively interfere vacuum terms. We generalise the concept to arbitrary-dimensional Bell states without additional calculations. Similarly, we discover setups for heralded GHZ states that need fewer resources than methods proposed in the literature, which could form the building blocks for photonic quantum computation [7]. We furthermore apply THESEUS to multiphotonic transformations. We find a new way to interpret and construct photonic qubit operations such as CNOT gates. Similarly, we discover high-dimensional CNOT operations that need fewer resources than methods proposed in the literature. Connecting to de Regt's criterion, our algorithm has been the source of scientific understanding in multiple instances.
High-dimensional GHZ states - A d-dimensional n-partite GHZ quantum state is written as

$$|GHZ_n^d\rangle = \frac{1}{\sqrt{d}} \sum_{i=0}^{d-1} |i\rangle^{\otimes n}. \quad (4)$$

These states are studied in the interplay between quantum and local-realistic theories [38], and have recently found potential applications in quantum communication tasks [39]. Graph-theoretical arguments show that perfect high-dimensional GHZ states can be generated only for 4-photon states [31], because terms in addition to those in eq.(4) necessarily emerge. Using THESEUS, we discover the first example that circumvents the no-go theorem, see Fig.3. The algorithm identifies solutions with fidelities arbitrarily close to one, by adjusting the edge weights such that unwanted terms have arbitrarily small weights (albeit at the expense of lower count rates). The solution can immediately be generalized to GHZ states (and other states) with higher dimensions and a larger number of particles, by identifying subgraphs of additional terms whose edges are multiplied with ω < 1.
No further computations or optimisations are necessary, demonstrating that we achieved scientific understanding based on a computational optimisation in the appropriate representation of the problem at hand.

Figure 3. A: The initial state is a complete graph of six vertices and three colours. Each pair of vertices is connected by nine edges, which stand for all nine possibilities (blue, red, green stands for modes 0, 1, 2, respectively). A bicoloured edge stands for a photon pair with different mode numbers. For example, a blue-red edge between vertex a and b stands for a photon pair with one photon in path a with mode number 0, and one photon in b with mode number 1, i.e. $a_0^\dagger b_1^\dagger$. In total, this corresponds to 135 edges. B: The optimized graph for a 6-photonic 3-dimensional GHZ state. While it has been shown that such a state cannot be created with perfect fidelity [31] with linear optics and probabilistic photon-pair sources (without additional photons), THESEUS found a solution where the fidelity scales with $F \approx 1 - O(\omega^4)$ with the overall counts C scaling as $C \approx O(\omega^2)$, which is experimentally feasible. C: The concept of the solution can be interpreted in the context of graph-theoretical results and can be immediately generalized by human scientists.
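The size of the initial search space for the six-photon, three-mode case can be checked with a few lines of counting. This is only a bookkeeping sketch; the variable names are illustrative, and the count of pairings uses the standard double-factorial formula for perfect matchings of a complete graph.

```python
import math

n_vertices, n_modes = 6, 3

vertex_pairs = math.comb(n_vertices, 2)             # unordered pairs of paths
edges = vertex_pairs * n_modes ** 2                 # nine mode combinations per pair
pairings = math.prod(range(n_vertices - 1, 0, -2))  # 5 * 3 * 1 perfect matchings
basis_terms = n_modes ** n_vertices                 # possible six-photon mode strings

print(vertex_pairs, edges, pairings, basis_terms)
```

The 135 edges quoted for the complete graph correspond to 15 vertex pairs with nine mode combinations each; every term of the post-selected state collects contributions from the 15 ways of pairing up the six paths.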
Heralded photonic entangled states - The next targets we address are heralded entangled photonic states. Standard sources of photonic entanglement, such as spontaneous parametric down-conversion or spontaneous four-wave mixing, are entirely probabilistic [9]. That means photons are produced at random times, and only after the detection of the photon state does one know that they have been created. The generation of heralded states would allow for event-ready schemes, which are essential in photonic quantum computation [7,40]. Experimentally, two-dimensional Bell states have been generated conditioned on the detection of photons in four trigger detectors [41,42]. However, implementations of higher-dimensional generalizations are missing. A major challenge in creating heralded states is posed by cases where all trigger detectors see a photon, but no photons emerge from the setup. Those cases, where the triggers herald a vacuum term, usually have a significantly higher probability of happening than the correct heralded Bell state because fewer pair creation events need to occur simultaneously.
THESEUS identifies experiments for heralded 3-dimensional Bell states, see Fig.4A. The setup requires four photon-pair events simultaneously, which is well within today's experimental capabilities [43]. The solution contains a remarkable idea that has not been explored before: The destructive interference of the triggered vacuum term, see Fig.4B. Creating the possibility of two heralded vacuum outputs and assigning their amplitudes opposite signs leads to their cancellation.
Furthermore, each of the two subgraphs that lead to a vacuum term in Fig.4B forms the basis of a 3-dimensional Bell state; these two contributions interfere constructively, while all cross-correlations interfere destructively. More information is provided in the SI. Higher-order events and cases where multiple photons are detected in the trigger detectors can be reduced to arbitrarily low probabilities by adjusting the weights of the edges. Assuming a standard pump laser with 80 MHz repetition rate, the expected count rate to reach a fidelity that guarantees genuine 3-dimensional entanglement, i.e. F > 2/3, is on the order of ten per second (for details see SI). The concepts used by THESEUS, in particular the cancellation of vacuum, can be immediately generalized to other cases, for example, to arbitrary high-dimensional Bell states, see Fig.4E-F.
Next, we use THESEUS to find heralded multiphotonic states which were proposed a decade ago, but never experimentally implemented due to their experimental requirements [44,45]. Heralded GHZ states provide the resources for definite demonstration of deterministic violations of local-realistic worldviews [46] and are among the most promising building blocks for photonic quantum computation [7,40]. We find an experimental configuration, which  Fig.4G. The solution is highly symmetric, and uses a very similar concept to avoid lower-order contributions as the solution of the Bell state. In this case, however, the problematic lower-order events create single-photon outputs. The strategy, again, is to generate two subgraphs for each single-photon output with opposite phase, which destructively interfere (see SI).
Photonic Controlled-Gates - Finally, we demonstrate the application of THESEUS to photonic quantum transformations, which are essential elements for photonic quantum simulation [4] and computation schemes [3,17,40]. In Fig.5 we introduce virtual vertices that represent input photons, and simultaneously optimize multiple dependent graphs that represent different states of the transformation. Interestingly, exactly this concept has been the core of one of the first photonic CNOT experiments [47], which gives a new interpretation for a 16-year-old experiment (see SI for details).
We apply THESEUS to find high-dimensional quantum transformations, which have been discussed in the context of resource-efficient quantum computation algorithms [48,49]. The solution follows similar concepts as the two-dimensional case, and requires fewer experimental resources than [30]; for details, see SI.

Discussion
We presented the algorithm THESEUS for the inverse-design of quantum optical experiments, which is based on an abstract physics-inspired representation. We use it to discover several previously unknown experimental configurations of quantum states and transformations in the challenging high-dimensional and multi-photonic regime, such as the generation of high-dimensional GHZ states, heralded entangled quantum states, and high-dimensional controlled operations. Those experimental setups are within reach of modern photonic technology and could lead to fascinating experimental investigations of fundamental questions and technological advances. THESEUS can immediately be applied to discover a multitude of other targets in experimental quantum optics, such as tools to enable silicon-photonics quantum computation [40] or highly efficient, low-noise quantum entanglement sources [43]. It can also directly be applied to situations where the target state is not known beforehand, such as for applications in quantum metrology [2] or in quantum-enhanced microscopes and telescopes [50,51]. In general, the internal representation is directly connected to creation and annihilation operators, which are universally used in quantum physics, thus THESEUS can further be generalized to a much larger scope.
One of the main features is the possibility to extract scientific understanding from the computer-inspired designs. That was made possible by a topological optimisation that reduces the solutions to conceptual cores. Those minimal topologies allow for the interpretation and generalization of the discovered solution, without performing additional calculations. This is in accord with criteria from the philosophy of science that argue that scientific understanding is connected with the skill to use concepts fruitfully, without exact calculations. Hence, in a broader sense, we argue that the ability of our algorithm goes beyond simple optimisation, and enters the realm of providing scientific insights and allowing for scientific understanding. Thereby, it directly contributes to scientific, explainable AI (XAI), and in general, to the essential aim of science.
Designing quantum optical experiments using the abstract notation of graphs is possible because we found translations of graphs into several different experimental schemes. Edges between vertices a and b are translated to probabilistic photon sources, see FigS.6A-F. Edge colours correspond to mode numbers. Multi-edges correspond to superposition or entanglement, and can be created with standard photonic technologies, for example, cross-crystal sources [52,53]. A deterministic single-photon source emitting in path b can be understood as an edge between a vertex b and a virtual vertex Va, FigS.6G. For each term in the resulting quantum state, every virtual vertex must have exactly one incoming edge. This is conceptually equivalent to the situation of a probabilistic photon-pair source, where the whole state is conditioned on the detection of one photon using a photon-number-sensitive detector in path Va.
Edges can be merged at one vertex in several different ways, see FigS.7. If the edges have the same colour (FigS.7A), the corresponding photons have the same mode number. In that case, the edges can be merged with probabilistic beam splitters (green squares, FigS.7B-D) or by creating them directly with path identified photon-pair sources (for instance, SPDC crystals, FigS.7E).
If the edges have different colours (FigS.7F), the corresponding photons have different mode numbers. In that case, the edges can be merged losslessly with mode-dependent beam splitters (so-called multiplexing or de-multiplexing; white squares), for example, polarizing beam splitters if the degree of freedom is photonic polarisation (FigS.7G-I). The edges could also be created by path-identified photon-pair sources (for instance, SPDC crystals, FigS.7J). Other probabilistic photon sources, such as lasers as probabilistic single-photon sources, can be added by exploiting hypergraph structures [54]. With the ability to create independent edges, and merge edges, all types of graphs can be translated to experimental setups. Appropriate phase shifters can manipulate the phases of edge weights. Additionally, amplitudes can be manipulated by pump power for SPDC crystals, splitting ratios that are set by half-wave plates, or absorptive filters. Collinear photon-pair sources, which produce two photons in the same path, can be described with loops (an edge that connects a vertex to itself).
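If we are conditioning the state on one photon in each detector, it reduces to

$$|\psi\rangle = \frac{1}{N(\omega)} \sum_{i,j,k,l\in\{0,1\}} \omega_{|i,j,k,l\rangle}\, |i,j,k,l\rangle$$

with the weights, written in terms of the edge weights,

$$\omega_{|i,j,k,l\rangle} = \omega^{i,j}_{a,b}\,\omega^{k,l}_{c,d} + \omega^{i,k}_{a,c}\,\omega^{j,l}_{b,d} + \omega^{i,l}_{a,d}\,\omega^{j,k}_{b,c}$$

and the normalization constant

$$N(\omega) = \sqrt{\sum_{i,j,k,l\in\{0,1\}} \left|\omega_{|i,j,k,l\rangle}\right|^2}.$$

The objective of the optimization is to find $\omega^{i,j}_{x,y} \in \mathbb{C}$ that minimize the loss function, and subsequently to find solutions with a large number of edge weights being zero. The information about higher-order contributions to the state, which results in experimentally reduced quantum fidelities, is encoded within the weight function Φ(ω). Therefore, higher-order contributions could be directly accounted for within the optimization procedure. More details about the approximations in eq.(5) can be found in [55,56].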

III. HERALDED BELL STATE
In FigS.9, we show that the solution for the 3-dimensional Bell state contains two subgraphs, each of which creates a 3-dimensional Bell state individually. The vacuum terms of the two subgraphs cancel (as described in the main text). Each of the two subgraphs can be understood individually, and follows the same concept: Every edge from the output modes a and b is connected to one individual ancilla vertex c-h. Three edges furthermore connect the ancilla vertices. Each of those edges connects vertices with the same colour of the incoming edge from a and b. For example, the left subgraph in FigS.9B has an edge which connects d and g, as both of them have an incoming edge with the same colour, red. In that way, if four photon pairs are created, only photon pairs with the same edge colour, i.e. mode number, can be created, as seen in FigS.9C. Cross-correlations, which can occur by combining the two subgraphs in FigS.9B, are destructively interfered in the same way as the vacuum with the appropriate setting of the phases of weights, as explained in the main text.
The fidelity can be arbitrarily close to one, by adjusting the weights of the edges. In the most straightforward setting, all edges that are connected to a or b have the same weight v, while all edges connecting ancilla vertices c-h have weight w (with phases as shown in the main text). In this way, the heralded state can be written as

$$|\psi\rangle = 2v^2w^2\left(|0,0\rangle - |1,1\rangle - |2,2\rangle\right)_{a,b} + vw^3\,|\phi_{\text{one photon}}\rangle + w^4\,|\phi_{\text{zero photons}}\rangle + O(\text{higher orders}) \quad (9)$$

where $|\phi_{\text{one photon}}\rangle$ stands for combinations where three ancilla photon pairs and one pair containing an ancilla photon and an output photon are produced. The state $|\phi_{\text{zero photons}}\rangle$ covers cases where four ancillary photon pairs are created. Both of those terms can be reduced by making w smaller than v. The term O(higher orders) corresponds to cases with five or more photon pairs produced, which can be reduced by having v and w smaller than one.
We calculate the fidelity and expected count rates for various settings of weights v and w in Table I, calculated up to sixth order of SPDC, and not taking into account any losses or detector inefficiency.
The concepts used in the 3-dimensional case can be immediately generalized to higher-dimensional Bell states. In FigS.10, we show the solutions for 2-dimensional to 5-dimensional Bell states with their corresponding phase settings.
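The statement that both unwanted terms shrink when w is made smaller than v can be read off by dividing their prefactors in eq.(9) by the prefactor of the Bell-state term. The ratios below are a sketch of that scaling only; they ignore the (unspecified) norms of the unnormalised states $|\phi\rangle$ and the higher-order terms.

```latex
\frac{v w^{3}}{2 v^{2} w^{2}} = \frac{w}{2v},
\qquad
\frac{w^{4}}{2 v^{2} w^{2}} = \frac{w^{2}}{2 v^{2}}
```

Both ratios vanish as w/v goes to zero, while the Bell-state amplitude itself scales as $v^2 w^2$, which is the trade-off between fidelity and count rate described in the text.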

IV. HERALDED GHZ STATE
More than ten years ago, schemes for heralded GHZ states were proposed [44,45], which experimentally require significantly more resources and have therefore not yet become practical. In particular, the 3-photon GHZ proposal by Walther et al. [44,45] requires 12 photons (nine ancillary photons that herald a GHZ state). The proposal by Niu et al. [45] requires ten photons (seven ancillary photons), but further requires close to perfectly efficient, photon-number-sensitive detectors for heralding paths, as they need to distinguish between the arrival of one and two photons. In contrast, our proposal requires only ten photons and non-photon-number-resolving detectors, which is feasible in state-of-the-art photonic laboratories.

V. EXPERIMENTAL 2-QUBIT CNOT
A photonic CNOT transformation was performed by Gasparoni et al. [47] in 2004, which can be seen in FigS.11. An ancillary state $|\psi^+\rangle = \frac{1}{\sqrt{2}}\left(|0,1\rangle + |1,0\rangle\right)$ in paths b and c is combined with the incoming control and target photons. A simultaneous detection event in detectors a and d heralds a successful realization of a CNOT.
The corresponding graphs for the four different cases are seen below. The resulting states correspond to all subgraphs with one incoming edge in vertex a and one in vertex d (those are heralding detectors), and one edge from each vertex Va and Vd (those represent the incoming photons). It can be seen that Vd (which corresponds to the incoming photon from path d, i.e. the target photon) is responsible for the phase of the quantum states. In that way, it is responsible for the term that is destructively interfered -this is analogous to the situation presented in the main text.

VI. CNOT BEYOND QUBITS
A control operation in a 2 × 3 dimensional space is shown in FigS.12. The subgraph a-f remains constant, while the edges containing Va and Vb change depending on the input control/target photons. The correct transformation is heralded by simultaneous detection of a photon in each of the detectors c-f. The structure of the subgraph a-f is very reminiscent of the solution of heralded Bell states in Fig.4 of the main text. Here, each internal mode (represented as edge colour) from a and b is connected to one individual heralding detector.
Furthermore, the solution uses destructive interference for producing the correct output states, as in Fig.4 of the main text. Some of the resulting subgraphs (those with one incoming edge to each of the vertices c-f) do not vanish. Still, they are reduced in magnitude by adapting the edge weights appropriately. Thereby, an experimentally feasible method of performing CNOT transformations beyond qubits is constructed.
FigS.12. High-dimensional CNOT gate, with a qubit control photon and a qutrit target photon.