Classical simulation of Gaussian quantum circuits with non-Gaussian input states

We consider Gaussian quantum circuits supplemented with non-Gaussian input states and derive sufficient conditions for efficient classical strong simulation of these circuits. In particular, we generalise the stellar representation of continuous-variable quantum states to the multimode setting and relate the stellar rank of the input non-Gaussian states, a recently introduced measure of non-Gaussianity, to the cost of evaluating classically the output probability densities of these circuits. Our results have consequences for the strong simulability of a large class of near-term continuous-variable quantum circuits.


I. INTRODUCTION
Understanding the origin of quantum advantage is of paramount importance, both at the fundamental and technological level. Continuous-variable (CV) systems are being recognized as a promising alternative to the use of qubits. On the one hand, unprecedented large CV entangled quantum states, of up to one-million elementary systems, can be deterministically generated [1,2]. On the other hand, they offer the potential of increased robustness with respect to noise [3].
Wigner function negativity has been shown to be a necessary resource for quantum advantage with CV quantum computing architectures [4,5]. Since Gaussian states and processes have positive Wigner functions, this necessarily corresponds to the use of non-Gaussian resources. However, establishing under which conditions non-Gaussianity is also sufficient for quantum advantage [6], and when instead non-Gaussian circuits are classically efficiently simulable [7], is still an open question.
In what follows, we analyse the computational power of non-Gaussian states and thus focus on the case where Gaussian circuits and measurements are supplemented with non-Gaussian input states as a computational resource. We obtain a classical strong simulation algorithm in the case where the non-Gaussian input state has a bounded support over the Fock basis, which runs in time polynomial in the support size and exponential in the total photon number of the input state. Note that any normalised state can be approximated arbitrarily well using states of bounded support simply by considering a renormalised truncation of the state.
This choice of input non-Gaussian states with bounded support is motivated by the recent characterisation of the structure of non-Gaussian quantum states in the singlemode case using the so-called stellar representation [8]. This characterisation establishes an operational hierar- * Electronic address: ulysse.chabaud@gmail.com chy of non-Gaussian states: the states of finite stellar rank, i.e., in a finite level of the stellar hierarchy, are the states that can be obtained from the vacuum with a given number of photon additions, together with Gaussian unitary operations. Alternatively, such states may also be obtained from a state with finite support over the Fock basis-a core state-with a Gaussian unitary operation. In this work, we generalise the stellar representation to the multimode case and relate the cost of classically simulating Gaussian circuits with non-Gaussian input states to the stellar rank of these states. Additionnally, we show that the equivalence between photon addition and core state does not hold in the multimode setting, that is, there exist multimode states with bounded support over the Fock basis that cannot be obtained from the vacuum using a finite number of photon additions and Gaussian unitary operations.
The classical simulation results obtained have consequences for more general CV circuits, since non-Gaussian gates and non-Gaussian measurements can be implemented by Gaussian operations together with non-Gaussian ancillary states [9][10][11]. In particular, we retrieve the fact that Boson Sampling circuits with a logarithmic number of input photons are strongly simulable [12], and we show that Gaussian circuits interleaved with a constant number of photon additions or subtractions can be simulated efficiently classically. These results are reminiscent of similar works in discrete-variable quantum architectures, where Clifford circuits supplemented by few magic states have shown to be classically efficiently simulable [13,14]. Additionally, our results allow us to compute the output probability distributions of a wide variety of CV circuits and imply their efficient classical strong simulation when the non-Gaussianity of these circuits is small enough, such as Boson Sampling with unbalanced heterodyne detection [15,16], measurement-based CV circuits [17,18] or approximate CVIQP circuits [19,20], among others.
The rest of the paper is structured as follows. In section II, we recall notions of classical simulation of quantum computations. In section III, we define classes of Gaussian circuits with non-Gaussian input states and derive explicit expressions for their output probability densities. In section IV, we obtain a classical algorithm for strong simulation of these circuits, with explicit complexity depending on non-Gaussian parameters of the input state-its support size over the Fock basis and its stellar rank. This allows us to give efficient simulability results when these parameters are small enough with respect to the size of the computation, for various interesting subclasses of Gaussian circuits with non-Gaussian input states. We conclude in section V.

II. CLASSICAL SIMULATION OF QUANTUM COMPUTATIONS
Depending on the approach used for simulating classically the functioning of quantum devices, several notions of simulability are commonly used. Hereafter, we recall the notions of strong and weak simulation.

A. Strong simulation
To each quantum computation is associated a probability distribution from which classical outcomes are sampled.
In the case of continuous-variable quantum computations with continuous outcomes, the output probability distribution is replaced by an output probability density. This motivates the following (informal) definition [21,22]: Definition 1 (Strong simulation). A quantum computation is strongly simulable if there exists a classical algorithm which evaluates its output probability distribution (density) or any of its marginals for any outcome in time polynomial in the size of the quantum computation.
This notion of simulability is referred to as strong because it asks more from the classical simulation algorithm than from the quantum computation: the quantum computation is merely sampling from a probability distribution (density), while the classical algorithm has to compute efficiently the exact probabilities. Various relaxations of this definition are possible, allowing the classical evaluation to be approximate rather than exact, or to abort with a small probability.

B. Weak simulation
A sampling counterpart to the notion of strong simulation is to ask the classical simulation algorithm to mimic the output of the quantum computation [21,22]. Informally: Definition 2 (Weak simulation). A quantum computation is weakly simulable if there exists a classical algorithm which outputs samples from its output probability distribution (density) in time polynomial in the size of the quantum computation.
Akin to strong simulation, various relaxations of this definition are possible, allowing the classical sampling to be approximate rather than exact, or to abort with a small probability. Hereafter we only consider the definition above.
In the case of continuous-variable quantum computations with continuous outcomes, a weaker requirement is to ask the classical simulation not to sample from the output probability density, but rather from a discretised probability distribution obtained from the probability density by performing an efficient binning of the sample space. Indeed, samples from the output probability density yield samples of such a discretised probability distribution with efficient classical post-processing.
As it turns out, weak simulation is indeed weaker than strong simulation, as was shown in Refs. [21,22]: an efficient classical algorithm for strong simulation provides an efficient classical algorithm for weak simulation (assuming one can efficiently sample from efficiently computable univariate probability distributions over a polynomial number of samples). For quantum computations yielding continuous classical outcomes, the result still holds with a similar proof for binned discretised probability distributions rather than the corresponding probability density, as long as the discretised probabilities can be computed efficiently from the probability density and have support on a polynomial number of bins for each mode.
In what follows, we consider strong simulation of a large class of CV quantum circuits: Gaussian circuits with non-Gaussian input states.

III. GAUSSIAN CIRCUITS WITH NON-GAUSSIAN INPUT STATES
A. The stellar representation The stellar representation of single-mode continuousvariable quantum states has been introduced in [8]. It establishes a hierarchy among single-mode non-Gaussian pure states based on the number of zeros of their Husimi Q function [23]. It shows that any state of stellar rank N , that is, whose Husimi Q function has exactly 2N zeros (counted with multiplicity), may be written in the following form:D whereD(α) is a displacement of amplitude α ∈ C,Ŝ(ξ) is a squeezing of parameter ξ ∈ C and |C is a core state of rank N , i.e., a state which has bounded support over the Fock basis with highest Fock state |N . Additionnally, the stellar rank of a pure quantum state corresponds to the minimal number of photon additions that are necessary to engineer the state from the vacuum, together with Gaussian unitary operations.
Hereafter, we extend a few definitions from [8] to the multimode case, using bold math for multi-index notations (see Appendix A). First, the stellar function, which provides a representation of multimode pure states as multivariate holomorphic functions: Definition 3 (Multimode stellar function). Let m ∈ N * and let |ψ = n≥0 ψ n |n ∈ H ⊗m be a normalised pure state over m modes. The stellar function of the state |ψ is defined as for all z ∈ C m , where |z = e − 1 2 z 2 n≥0 z n √ n! |n ∈ H ⊗m is the coherent state of complex amplitude z.
The following definition also extends naturally from the single-mode case: Definition 4 (Multimode core state). Multimode core states are defined as the normalised pure quantum states which have a (multivariate) polynomial stellar function.
Like in the single-mode case, these are the states with a finite support over the (multimode) Fock basis. For any m ∈ N * , the set of multimode core states over m modes is dense in the set of normalised states for the trace norm (by considering renormalised cutoff states). We also introduce the following definitions: Definition 5 (Degree of a multimode core state). The degree of a multimode core state is defined as the degreesum of its stellar function.
This definition generalises to the multimode case the notion of stellar rank for core states in the single-mode case.
Definition 6 (Support of a multimode core state). The support of a multimode core state is the set of Fock basis states which have nonzero overlap with the core state.
For example, the 4-mode core state 1 √ 2 (|1230 + |0001 ) has degree 6, support size 2, and its stellar function is given by 1 2 √ 6 z 1 z 2 2 z 3 3 + 1 √ 2 z 4 , for all z 1 , z 2 , z 3 , z 4 ∈ C 4 . Single-mode states of finite stellar rank-which are the states whose stellar function has a finite number of zeros-have two equivalent representations: they are either obtained from an underlying core state by a Gaussian unitary operation, or they can be obtained from the vacuum by photon additions and Gaussian unitary operations. In the multimode setting however, the stellar function has either no zeros or an uncountable infinite number of zeros [24]. Moreover, we show in Lemma 1 that the two representations are no longer equivalent in the multimode setting: the class of multimode states that are obtained from a multimode core state by a multimode Gaussian unitary operation is striclty larger than the class of multimode states that can be obtained from the vacuum by photon additions and multimode Gaussian unitary operations. Hence, we generalise the notion of finite stellar rank for multimode states as follows: Definition 7 (Multimode stellar rank). Let |ψ =Ĝ |C , whereĜ is a multimode Gaussian unitary and |C is a multimode core state. The stellar rank of the multimode state |ψ is defined as the degree of |C .
Additionally, for multimode quantum states which do not admit a decomposition of the formĜ |C , we define their stellar rank to be +∞.
As in the single-mode case, it is easily seen that the stellar rank is invariant under Gaussian unitary operations (we refer to [8] for the proofs in the single-mode case). Similarly, the notion of multimode stellar rank induces a multimode stellar hierachy which is robust with respect to the trace norm. In what follows, we only consider multimode states of finite stellar rank, which form a dense subset of the multimode Hilbert space (since these include the set of multimode core states).
We now turn to the analysis of the computational power of multimode non-Gaussian states. We consider Gaussian unitary circuits with multimode states of finite stellar rank in input. Since these states are of the formĜ |C , whereĜ is a given multimode Gaussian unitary and |C is a given multimode core state, this is equivalent to consider Gaussian unitary circuits with input core states.
In what follows, we show that the degree and support size of a multimode core state are sufficient to quantify the hardness of strongly simulating Gaussian circuits with input core states.

B. Gcore circuits
We define G core circuits as the family of Gaussian circuits with Gaussian measurements, supplemented by non-Gaussian multimode core states in the input.
Measuring a state with unbalanced heterodyne detection effectively amounts to squeezing the state and then sampling from its Q function [16], with a squeezing parameter ξ ∈ C depending on the unbalancing of the detection. Setting ξ = 0 yields balanced heterodyne detection, while sending |ξ| = r to infinity yields homodyne detection. Any Gaussian measurement can thus be implemented by Gaussian unitary operations and heterodyne detection only, since it can be implemented by Gaussian unitary operations and homodyne detection only [25,26]. Without loss of generality, a Gaussian measurement may thus be written as a tensor product of single-mode balanced heterodyne detections preceded by a Gaussian unitary. G core circuits are then described by two (multidimensional) parameters: a multimode core state |C in the input and a Gaussian unitary evolutionĜ (Fig. 1).
In what follows, we derive a general expression for the output probability density functions of these circuits. Then, we study the classical simulability of G core circuits and of various notable subclasses of these circuits.
. . . het . . . n < l a t e x i t s h a 1 _ b a s e 6 4 = " p t X V N P S 7 Z l k / K 3 6 N Y R u T S 9 w w a + 4 = "

t S 9 S a L I w 8 n c A r n 4 M E V V O E W 6 t A A B i E 8 w y u 8 O c p 5 c d 6 d j 0 V r z s l m j u E P n M 8 f 0 8 i Q b A = = < / l a t e x i t >Ĝ
< l a t e x i t s h a 1 _ b a s e 6 4 = " u X X k N 3 F k B i y J Y 7 p V n k d 5 J m g s 0 6 c = " We first recall a few combinatorial functions related to the permanent, which appear in the expressions of the output probability densities. The hafnian of a square matrix A = (a ij ) 1≤i,j≤2m of size 2m is defined as [27] Haf

y 8 T L S R l y 1 H u l r 2 5 f s T T i M T J J j e l 4 b o J + R j U K J v m k 2 E 0 N T y g b 0 Q H v W B r T i B s / m 1 0 7 I a d W 6 Z N Q a V s x k p n 6 e y K j k T H j K L C d E c W h W f S m 4 n 9 e J 8 X w y s 9 E n K T I Y z Z f F K a S o C L T 1 0 l f a M 5 Q j i 2 h T A t 7 K 2 F D q i l D G 1 D R h u A t v r x M m t W K d 1 6 p 3 l + U a 9 d 5 H A U 4 h h M 4 A w 8 u o Q Z 3 U I c G M H i E Z 3 i F N
where the sum is over the perfect matchings of the set {1, . . . , 2m}, i.e., the partitions of {1, . . . , 2m} in subsets of size 2. The hafnian of a matrix of odd size is 0. The hafnian is related to the permanent by for any m × m matrix B. By convention we set Haf (∅) = 1, where ∅ is a square matrix of size 0. The loop hafnian of a square matrix R = (r ij ) 1≤i,j≤r of size r is defined as [28] lHaf (R) := where the sum is over the single pair matchings of the set {1, . . . , r}, defined as the set of perfect matchings of a complete graph with loops with r vertices. This set is isomorphic to the set Π 1,2 ({1, . . . , r}) of partitions of {1, . . . , r} in subsets of size 1 and 2 (by mapping a block {k} of size 1 of a partition to the matching {k, k} and a block {i, j} of size 2 to the matching {i, j}). In particular, when R is a matrix whose diagonal entries are all 0, we have lHaf (R) = Haf (R). The loop hafnian of a matrix of size r may be computed in time O(r 3 2 r/2 ) [28]. We obtain a closed expression for the output probability density of Gaussian circuits with multimode core states input in Theorem 1, by adapting proof techniques from [29][30][31] (see Appendix A for multi-index notations): Theorem 1. Let m, n ∈ N and let . . . het . . . G < l a t e x i t s h a 1 _ b a s e 6 4 = " u X X k N 3 F k B i y J Y 7 p V n k d 5 J m g s 0 6 c = " be an m-mode core state of degree n. LetĜ be a Gaussian unitary over m modes. For all α ∈ C m , let us write V and d = (d, d * ) the covariance matrix and the displacement vector of the Gaussian stateĜ † |α . Then, the output probability density for the G core circuitĜ with input |C and heterodyne detection, evaluated at α, is given by is a Gaussian prefactor and where A p,q is the square matrix of size |p| + |q| obtained from and by replacing the diagonal of V by the elements of D, then by repeating p k times the k th row and column and q k times the (m + k) th row and column of the obtained matrix for all k ∈ {1, . . . , m}.
We give a proof in Appendix B. When the input core state is a multimode Fock state, we refer to the corresponding subclass of G core circuits as G Fock circuits (see Fig. 2). In that case, the sum in Eq. (7) reduces to a single term and we obtain the following result: Corollary 1. Let m, n ∈ N and let n = (n 1 , . . . , n m ) with |n| = n. LetĜ be a Gaussian unitary over m modes.
For α ∈ C m , let us write V andd = (d, d * ) the covariance matrix and the displacement vector of the Gaussian stateĜ † |α . Then, the output probability density for the G Fock circuitĜ with Fock state input |n and heterodyne detection, evaluated at α, is given by n!π m Det (V + 1 2m /2) lHaf (A n,n ), (11) where A n,n is the square matrix of size 2n obtained from and by replacing the diagonal of V by the elements of D, then by repeating n k times the k th and the (m + k) th rows and columns of the obtained matrix for all k ∈ {1, . . . , m}.
Note that the expressions obtained in Theorem 1 and Corollary 1 may be used to retrieve the expressions of the output probability distributions for a large class of CV circuits that do not necessarily have all their non-Gaussian elements in the input. To see this, consider a G core circuit of size 2m with input |C ⊗ |C , where |C and |C are m-mode core states, with Gaussian evolutionĜ⊗1, wherê G is an m-mode Gaussian evolution, and projecting onto tensor products of displaced two-mode squeezed states between mode k and m + k, for all k ∈ {1, . . . , m}. Its output probability density evaluated at 0, in the limit of infinite squeezing for the two-mode squeezed states, is given by | C|Ĝ|C | 2 , up to a normalisation factor. This encompasses the expressions of output probabilities of Boson Sampling circuits [12], whenĜ is a passive linear evolution and |C and |C are multimode Fock states, or else of Gaussian Boson Sampling circuits [29], when |C = |0 ⊗m .

IV. STRONG SIMULATION OF WEAKLY NON-GAUSSIAN QUANTUM CIRCUITS
In this section, we use the expression obtained in Theorem 1 in order to study strong simulation of CV quantum circuits with few non-Gaussian elements. The first general result deals with G core circuits, i.e., Gaussian circuits with multimode core state input.  the input core state. The Gaussian prefactor may be computed efficiently in m the number of modes. Thus, to compute the output probability density, one may compute s 2 loop hafnians of matrices of size at most 2n, which can be done in time O(s 2 n 3 2 n ) [28]. In order to obtain an algorithm for strong simulation, one also needs to compute marginals. We show in Appendix C that these may also be computed in time O(s 2 n 3 2 n + poly m). Theorem 2 implies that strong simulation of G core circuits is efficient (polynomial in m) when the input core state has a logarithmic degree n = O(log m) and polynomial support size s = O(poly m). This result may be understood as a generalisation of the efficient classical simulability of Gaussian computations [32], and has consequences for the simulability of various continuous-variable quantum computing models, in particular those based on Gaussian operations and photon additions or subtractions. We define and consider three interesting examples in what follows: Interleaved Photon-Added Gaussian circuits (IPAG), Interleaved Photon-Subtracted Gaussian circuits (IPSG) and Gaussian circuits with input Fock states (G Fock ).

|0i < l a t e x i t s h a 1 _ b a s e 6 4 = " m F b y g 4 g J W 0 Z X A v v u 8 2 L j F X T 3 N A g = " > A A A B 8 H i c b V B N S 8 N A E J 3 4 W e t X 1 a O X x S J 4 K k k V 9 F j 0 4 r G C / Z A 2 l M 1 2 0 i 7 d b M L u R i i x v 8 K L B 0 W 8 + n O 8 + W / c t j l o 6 4 O B x 3 s z z M w L E s G 1 c d 1 v Z 2 V 1 b X 1 j s 7 B V 3 N 7 Z 3 d s v H R w 2 d Z w q h g 0 W i 1 i 1 A 6 p R c I k N w 4 3 A d q K Q R o H A V j C 6 m f q t R 1 S a x / L e j B P 0 I z q Q P O S M G i s 9 P L l d R e V A Y K 9 U d i v u D G S Z e D k p Q 4 5 6 r / T V 7 c c s j V A a J q j W H c 9 N j J 9 R Z T g T O C l 2 U 4 0 J Z S M 6 w I 6 l k k a o / W x 2 8 I S c W q V P w l j Z k o b M 1 N 8 T G Y 2 0 H k e B 7 Y y o G e p F b y r + 5 3 V S E 1 7 5 G Z d J a l C y + a I w F c T E Z P o 9 6 X O F z I i x J Z Q p b m 8 l b E g V Z c Z m V L Q h e I s v L 5 N m t e K d V 6 p 3 F + X a d R 5 H A Y 7 h B M 7 A g 0 u o w S 3 U o Q E M I n i G V 3 h z l P P i v D s f 8 9 Y V J 5 8 5 g j 9 w P n 8 A t n e Q W Q = = < / l a t e x i t >â † < l a t e x i t s h a 1 _ b a s e 6 4 = " A Y j L f g 7 j o V q V S 4 2 J f T n V / k 9 w H y A = " > A A A B 8 n i c b V D L S g N B E J y N r x h f U Y 9 e B o P g K e x G Q Y 9 B L x 4 j m A f s r q F 3 d j Y Z M v t g p l c I I Z / h x Y M i X v 0 a b / 6 N k 2 Q P m l j Q U F R 1 0 9 0 V Z F J o t O 1 v q 7 S 2 v r G 5 V d 6 u 7 O z u 7 R 9 U D 4 8 6 O s 0 V 4 2 2 W y l T 1 A t B c i o S 3 U a D k v U x x i A P J u 8 H o d u Z 3 n 7 j S I k 0 e c J x x P 4 Z B I i L B A I 3 k e k N A C o 9 e C I N + t W b X 7 T n o K n E K U i M F W v 3 q l x e m L I 9 5 g k y C 1 q 5 j Z + h P Q K F g k k 8 r X q 5 5 B m w E A + 4 a m k D M t T + Z n z y l Z 0 Y J a Z Q q U w n S u f p 7 Y g K x 1 u M 4 M J 0 x 4 F A v e z P x P 8 / N M b r 2 J y L J c u Q J W y y K c k k x p b P / a S g U Z y j H h g B T w t x K 2 R A U M D Q p V U w I z v L L q 6 T T q D s X 9 c b 9 Z a 1 5 U 8 R R J i f k l J w T h 1 y R J r k j L d I m j K T k m b y S N w u t F + v d + l i 0 l q x i 5 p j 8 g f X 5 A 8 c g k O w = < / l a t e x i t > a † < l a t e x i t s h a 1 _ b a s e 6 4 = " A Y j L f g 7 j o V q V S 4 2 J f T n V / k 9 w H y A = " > A A A B 8 n i c b V D L S g N B E J y N r x h f U Y 9 e B o P g K e x G Q Y 9 B L x 4 j m A f s r q F 3 d j Y Z M v t g p l c I I Z / h x Y M i X v 0 a b / 6 N k 2 Q P m l j Q U F R 1 0 9 0 V Z F J o t O 1 v q 7 S 2 v r G 5 V d 6 u 7 O z u 7 R 9 U D 4 8 6 O s 0 V 4 2 2 W y l T 1 A t B c i o S 3 U a D k v U x x i A P J u 8 H o d u Z 3 n 7 j S I k 0 e c J x x P 4 Z B I i L B A I 3 k e k N A C o 9 e C I N + t W b X 7 T n o K n E K U i M F W v 3 q l x e m L I 9 5 g k y C 1 q 5 j Z + h P Q K F g k k 8 r X q 5 5 B m w E A + 4 a m k D M t T + Z n z y l Z 0 Y J a Z Q q U w n S u f p 7 Y g K x 1 u M 4 M J 0 x 4 F A v e z P x P 8 / N M b r 2 J y L J c u Q J W y y K c k k x p b P / a S g U Z y j H h g B T w t x K 2 R A U M D Q p V U w I z v L L q 6 T T q D s X 9 c b 9 Z a 1 5 U 8 R R J i f k l J w T h 1 y R J r k j L d I m j K T k m b y S N w u t F + v d
Firstly, we define IPAG circuits with m modes and n photon additions as: (i) product vacuum state over m modes in input, (ii) an evolution composed of interleaved multimode Gaussian unitariesĜ (0) , . . . ,Ĝ (n) and n singlemode photon additions, and (iii) a Gaussian measurement. Without loss of generality, all the photon additions act on the first mode, since swapping two modes is a Gaussian operation. Moreover, up to an added Gaussian unitary to the final Gaussian unitaryĜ (n) , the measurement may be written as a tensor product of balanced heterodyne detections (Fig. 3).
The stellar hierarchy of single-mode pure quantum states derived in [8] details the engineering of a singlemode quantum state from the vacuum using unitary Gaussian operations and single photon addition as a non-Gaussian operation. In particular, the states of finite stellar rank, which correspond to the states that can be obtained from the vacuum using a finite number of single photon additions or subtractions, are shown to be exactly the states that are obtained by applying a Gaussian unitary operation to a single-mode core state (see Eq. (1)).
The situation is different in the multimode case: we show that the set of states that can be obtained from a multimode core state with a multimode Gaussian unitary operation is strictly larger than the set of states that can be obtained from the vacuum using a finite number of single photon additions and Gaussian unitary operations. We do so by providing explicitly an example of a state that is a multimode core state, and that is not obtainable with IPAG circuits (Lemma 1). We also deduce strong simulability results for IPAG circuits.
We first establish a reduction to an equivalent model where the evolution and measurement are Gaussian and only the input state is non-Gaussian. This is done by commuting the photon additions to the input of the circuit. The output state of an IPAG circuit with m modes, n photon additions and Gaussian unitariesĜ (0) , . . . ,Ĝ (n) is given byĜ Gaussian operations act on annihilation and creation operators through their symplectic representation, inducing affine transformations of the vector of annihilation and creation operators [33]. Let us define the column vector of ladder operators and letĜ be an m-mode Gaussian operation. Then, there exists a 2m × 2m symplectic matrix S = (s ij ) 1≤i,j≤2m and a complex vector d = (d 1 , . . . , d m ), such that for all k ∈ {1, . . . , m}, where (Sλ) k indicates the k th element of the column vector Sλ and the identity operator is omitted for brevity. Hence, commuting to the right the creation operators in Eq. (14), starting by the rightmost one, yieldŝ where S (k) and d (k) = (d m ) implement the affine transformation corresponding to the action of (Ĝ (k)Ĝ( k − 1) . . .Ĝ (0) ) † , for all k ∈ {0, . . . , n − 1}. Writ-ingĜ :=Ĝ (n)Ĝ(n−1) . . .Ĝ (0) and S (k) = (s (k) i,j ) 1≤i,j≤2m for k ∈ {0, . . . , n − 1}, we obtain the output statê where the state is a multimode core state of degree equal to n, by property of symplectic matrices. Using this characterisation, we obtain the following result: Lemma 1. The set of output states of IPAG circuits is strictly included in the set of output states of G core circuits.
We prove this Lemma in Appendix D by showing that the core state 1 √ 2 (|20 + |01 ) cannot be generated by an IPAG circuit. In other words, the set of states that can be obtained from a multimode core state with a multimode Gaussian unitary operation is strictly larger than the set of states that can be obtained from the vacuum using a finite number of single photon additions and Gaussian unitary operations, unlike in the single mode case, where the two sets coincide.
When n = O(1), the support size of the core state |C IPAG in Eq. (19) is O(poly m) and its degree is O(1). Then, from a direct application of Theorem 2 we obtain: Lemma 2. IPAG circuits over m modes with n = O(1) photon additions can be strongly simulated efficiently classically.
When n = O(log m) however, the support size of the core state is superpolynomial, so the classical simulation is no longer efficient. Note that when photon additions are implemented using Gaussian operations and threshold detection, Ref. [34] gives a classical simulation algorithm which has exponential space complexity in the number of photon additions.
Similarly, we define Interleaved Photon-Subtracted Gaussian circuits (IPSG) by replacing photon additions by subtractions in the definition of IPAG circuits. With a similar proof we obtain the following result: Corollary 2. IPSG circuits over m modes with n = O(1) photon subtractions can be strongly simulated efficiently classically.
Note that the same reasoning holds for Gaussian circuits interleaved with both photon additions and subtractions.
A particular subclass of IPAG circuits, where all the photon additions act at the beginning of the circuit, is the class of G Fock circuits, i.e., Gaussian circuits with Fock state input. In that case, the input is a multimode core state of support size 1. With Corollary 1, we obtain the following result as an immediate consequence of Theorem 2: Lemma 3. Let m ∈ N * and let n ∈ N m , such that |n| = O(log m). Then, G Fock circuits over m modes with Fock state input |n and heterodyne detection can be strongly simulated efficiently classically.
In other words, sampling with Gaussian measurements over m modes from n = O(log m) indistinguishable photons is strongly simulable classically. This contrasts with the case where m = Ω(poly n): in that case, strong simulation and even weak simulation is classically hard [16].
Note that, when restricting G Fock circuits to Boson Sampling circuits [12] using projection onto two-mode squeezed states as described in the previous section, we retrieve the fact that computing classically the output probabilities is efficient for a logarithmic number of input photons.

V. DISCUSSION AND CONCLUSIONS
In this work, we generalised the notion of stellar rank to the multimode setting and we studied the simulatability of Gaussian circuits with multimode non-Gaussian input states of finite stellar rank, based on the properties of their underlying core states. In particular, we have shown that G core circuits over m modes, with input states possessing a support of size n = O(poly m) over the Fock basis and which stellar function has degree n = O(log m) can be strongly simulated efficiently classically.
Note that this result, formalised in Theorem 2, outperforms existing previous results available in the litterature. In particular, in Ref. [35] it is shown that the cost for classically estimating the probability of a specific outcome of a quantum circuit-an easier task than sampling or strong simulatability-scales polynomially with the Wigner negativity of the circuit. In terms of the more commonly used Wigner logarithmic negativity [36], that result could be reformulated by saying that the cost for classically estimating the probability of a specific outcome of a quantum circuit scales exponentially with the Wigner logarithmic negativity of the circuit [41]. For some core states of degree O(log m), such as the Fock state with log m photons in log m modes and vacuum in the other m − log m modes, the Wigner logarithmic negativity is Ω(log m log(log(m))). Therefore, the classical cost for estimating outcome probabilities with the simulation algorithm from Ref. [35] becomes superpolynomial, i.e., is no longer classically efficient. In contrast, the results on the efficient classical simulatability of Theorem 2 show that we can simulate efficiently classically these circuits. Also note that our results deal with a stronger notion of simulatability than outcome probability estimation.
Our results are complementary to those in a recently appeared work [7], where non-Gaussian states with unbounded Wigner negativity supplemented to Gaussian circuits are also shown to be classically efficiently simulable. In that work, the simulatability with input unbounded non-Gaussianity, namely with states characterized by infinite stellar rank, is possible due to the fact that the input states are discrete-variable stabiliser states encoded in CV by means of some bosonic encoding, such as for instance the Gottesman-Kitaev and Preskill one [9].
We have also identified various subclasses of G core circuits to which our simulation algorithm applies. In particular, we have shown that Gaussian circuits interleaved with a constant number of photon additions or subtractions can be efficiently strongly simulated classically. However, the classical algorithm is no longer efficient when the number of photon additions is logarithmic in the number of modes. This contrasts with the-intuitively equivalent-discrete variable case where strong simulation of Clifford circuit supplemented with a logarithmic number of T gates is classically efficient [37,38]. It would be interesting to investigate whether this is a fundamental difference between the discrete-and continuous-variable cases, or else if more efficient classical simulation algorithms may be derived in the case of CV circuits based on photon addition and subtraction.
Since any quantum state may be approximated up to arbitrary precision using core states, we expect to obtain approximate simulation results for circuits with specific input non-Gaussian states, such as cat states or GKP states [9]. It would also be interesting to investigate further weaker notions of classical simulability for these circuits, such as weak simulation. Finally, it is an open question whether the measure of non-Gaussianity based on the stellar rank can be related to other operational tasks, analogously to [39]. We leave these questions for future work.
We make use of Faà di Bruno's formula [40] in order to expand the product of partial derivatives and we obtain where Π(E p,q ) denotes the set of all partitions of the multiset E p,q , and where the product runs over the blocks B of the partition π ∈ Π(E p,q ), with |B| the size of the block. The functionβ † Vβ + D †β is a sum of a quadratic and a linear functions, so all derivatives of order greater than 2 in the sum vanish. We thus have where Π 1,2 (E p,q ) denotes the set of all partitions of the multiset E p,q in subsets of size 1 and 2. All derivatives of order 2 of the linear term vanish, and all derivatives of order 1 of the quadratic term vanish when evaluated atβ = 0. We thus obtain We now show that this expression may be rewritten as the loop hafnian of a matrix of size |p| + |q|. Define V p,q the (|p| + |q|) × (|p| + |q|) matrix obtained from V by repeating p k times its k th rows and columns and q k times its (m + k) th rows and columns, for k ∈ {1, . . . , m}. Similarly, define D p,q the column vector of size |p| + |q| obtained from D by repeating p k times its k th element and q k times its (m + k) th element, for k ∈ {1, . . . , m}. Finally, let A p,q (V, D) = (a ij ) 1≤i,j≤|p|+|q| be the (|p| + |q|) × (|p| + |q|) matrix obtained from V p,q by replacing its diagonal with the vector D p,q . Then, Eq. (B6) rewrites Let us illustrate with an example how the matrix A p,q (V, D) appearing in Lemma 4 is constructed from the matrix V and the vector D. Let us set m = 2, p = (2, 0) and q = (1, 0). We write We first build the matrix V p,q by repeating p k times the k th row and column of V and q k times the (m + k) th row and column. In that case, p = (p 1 , p 2 ) = (2, 0), so we repeat 2 times the first row and column and discard the second row and column, and q = (q 1 , q 2 ) = (1, 0), so we keep the third row and column and discard the fourth row and column, obtaining the 3 × 3 matrix Similarly, we obtain the vector D p,q by repeating p k times the k th element of D and q k times the (m + k) th element, as Finally, we replace the diagonal of V p,q by D p,q : In this construction by repeating rows and columns, the first index denotes which rows and columns are repeated for indices in {1, . . . , m}, while the second index denotes which rows and columns are repeated for indices in {m+1, . . . , 2m}. Combining Lemma 4 with phase space formalism and properties of Gaussian states, we are now ready to prove Theorem 1: Proof. The Gaussian circuit is composed of a Gaussian unitaryĜ and balanced heterodyne detection. The output probability density reads, for all α = (α 1 , . . . , α m ) ∈ C m , where Π α = 1 π m |α α| is the POVM element corresponding to the heterodyne detection of α = (α 1 , . . . , α m ). The stateĜ † |α is a Gaussian state: let V be its covariance matrix and d its displacement vector. For all γ ∈ C m , we writeγ = (γ 1 , . . . , γ m , γ * 1 , . . . , γ * m ). Then, for all β ∈ C m , i.e., it is a Gaussian function which can be computed efficiently. On the other hand, we have so that for all β ∈ C m . Moreover we have, for all p, q ∈ N m and all β ∈ C m , where δ 2m (β, β * ) = δ(β 1 ) · · · δ(β m ) δ(β * 1 ) · · · δ(β * m where we have set for all β ∈ C m , the integral terms in Eq. (B17) rewrite as for |p| ≤ n and |q| ≤ n, where is a 2m × 2m symmetric matrix, due to the initial structure of the covariance matrix, and where is a column vector of size 2m. By Lemma 4, the terms in Eq. (B20) are equal to where the square matrices A p,q of size |p| + |q| are obtained from V by repeating its entries according to p and q and replacing the diagonal by the corresponding elements of D (see the example following Lemma 4 for a detailed description of the construction). With Eq. (B17) we finally obtain Pr core [α] = κ(α,Ĝ) p,q∈N m |p|≤n,|q|≤n where κ(α,Ĝ) = exp − 1 2d † (V + 1 2m /2) −1d π m Det (V + 1 2m /2) , where V and d are the covariance matrix and the diplacement vector of the Gaussian stateĜ † |α , respectively.
Appendix C: Proof of Theorem 2 Proof. By Theorem 1, up to an efficiently computable prefactor, the output probability density is a sum of s 2 loop hafnians, where s is the support size of the input core state. The loop hafnian of a matrix of size r may be computed in time O(r 3 2 r/2 ) [28]. For |p| ≤ n and |q| ≤ n, the matrices A p,q appearing in Eq. (7) are efficiently computable square matrices of size |p| + |q| ≤ 2n, so all the loop hafnians may be computed in time O(n 3 2 n ). Hence, the output probability density can be evaluated in time O(s 2 n 3 2 n + poly m). We now consider the evaluations of the marginal probability densities. Let k ∈ {1, . . . , m − 1}, for all α = (α 1 , . . . , α k ) ∈ C k we have Pr core [α] = Tr Ĝ |C C|Ĝ † (Π α ⊗ 1 m−k ) where Π α = 1 π k |α 1 , . . . , α k α 1 , . . . , α k | is the POVM element corresponding to the heterodyne detection of (α 1 , . . . , α k ) over the first k modes. With Lemma 4 and the proof of Theorem 1, it is sufficient to show that QĜ † (|α α|⊗1 m−k )Ĝ is an efficiently computable Gaussian function in order to prove that the marginal probability density can be evaluated in time O(s 2 n 3 2 n + poly m).