Testing the nature of gravitational-wave polarizations using strongly lensed signals

Gravitational-wave (GW) observations by a network of ground-based laser interferometric detectors allow us to probe the nature of GW polarizations. This would be an interesting test of general relativity (GR), since GR predicts only two polarization modes while there are theories of gravity that predict up to six polarization modes. The ability of GW observations to probe the nature of polarizations is limited by the available number of linearly independent detectors in the network. (To extract all polarization modes, there should be at least as many detectors as the polarization modes.) Strong gravitational lensing of GWs offers a possibility to significantly increase the effective number of detectors in the network. Due to strong lensing (e.g., by galaxies), multiple copies of the same signal can be observed with time delays of several minutes to weeks. Owing to the rotation of the earth, observation of the multiple copies of the same GW signal would allow the network to measure different combinations of the same polarizations. This effectively multiplies the number of detectors in the network. Focusing on strongly lensed signals from binary black hole mergers that produce two observable"images", using Bayesian model selection and assuming simple polarization models, we show that our ability to distinguish between polarization models is significantly improved.


I. INTRODUCTION
Recent observations of gravitational waves (GW) [1][2][3][4][5] have offered new tests of general relativity (GR) in a regime inaccessible by other astronomical observations and laboratory tests [6,7]. One set of interesting probes includes the nature of GWs themselves [8]. For example, the near-simultaneous observations of the GW and gamma-ray signals from the binary neutron star merger GW170817 [9] have provided a stringent constraint on the speed of GWs, which, in turn, has ruled out several alternative theories to GR invoked to explain the accelerated expansion of the universe [10]; the amount of dispersion in the observed GW signals is constrained by bounding their deviation from GR-based templates, which in turn has provided an interesting upper bound on the mass of the graviton [6,7,11]. Similarly, tests that constrain the values of various post-Newtonian parameters [6,7,[12][13][14][15] describing the GW signal also have constrained parameters of alternative theories [16].
Accurate measurement of the polarizations of GWs provide yet another opportunity to test the predictions of GR. According to GR, GWs have only two independent polarization states -two transverse quadrupole (or, tensor) modes, while a general metric theory of gravity can admit up to six polarization modes (Fig. 1). For example, scalar-tensor theories admit two monopole (or, scalar) modes in addition to the tensor modes -massless scalar-tensor theories admit a transverse scalar (or, breathing) mode, while massive scalar-tensor theories admit both transverse and longitudinal scalar modes [8]. More general theories, such as bimetric theories [17], also admit two dipole (or, vector) modes.
GW polarizations can be constrained from the observation of long-lived signals from spinning neutron stars [18] and stochastic sources [19][20][21] as well as from the observation of transient sources such as compact binary mergers [22,23]. While the detection probabilities of spinning neutron stars and stochastic background are uncertain, we are expecting the detection of hundreds to thousands of compact binary signals in the next few years, using ground-based GW detectors such as LIGO [24], Virgo [25], KAGRA [26] and LIGO-India [27]. Note that each of these quadrupole detectors observes only one linear combination of these polarizations. The relative strength of each polarization mode in the observed signal in each detector depends on the response of the detector to the specific polarization. It turns out that the detector response to both the scalar modes (breathing and longitudinal modes) are identical, making them completely degenerate [28]. In summary, even in the ideal case, if we want to disentangle all the five non-degenerate polarization modes from the GW data, we need at least five detectors having different orientations. This would be challenging even when upcoming detectors such as KAGRA and LIGO-India join the international GW network, since the two LIGO detectors in Hanford and Livingston are co-aligned with each other and hence measure the same linear combination of the polarizations.
Given the data from a network of GW detectors, we can compare the posterior probabilities of different hypotheses, for example, one hypothesis stating that the polarizations are exactly as predicted by GR, while the alternative hypothesis accommodating the presence of additional modes [22]. Motivated by the limited number of linearly independent detectors to observe the polarization modes, the current probes of the nature of GW polarizations have employed highly simplified hypotheses as alternatives to GR. That is, the alternative hypothesis assumes that the polarizations contain only scalar modes or only vector modes (no tensor modes). The analyses of some of the compact binary merger observations by LIGO and Virgo have concluded that the tensor-only hypothesis is preferred over scalar-only or vector-only hypotheses [7,29].
GWs are gravitationally lensed by intervening matter distributions, such as galaxies and galaxy clusters. Although the gravitational lensing is best understood in GR, alternative theories also predict this effect (see, e.g., [30]). Recent estimates suggest that a small fraction (∼ 0.1% − 0.5%) of the binary black hole (BBH) signals that we expect to detect using the LIGO-Virgo network will be strongly lensed by intervening galaxies [31,32], producing multiple "images" of the signals 1 . These signals arrive at the detector with relative time delays ranging from several minutes to several weeks [31]. Lensing by galaxies or clusters is very well approximated by geometric optics since the mass scale of the lens is significantly larger than the wavelength of GWs (GM lens /c 2 λ GW ). Thus, multiple images would correspond to copies of the same signal with a relative magnification and time delay.
Due to the rotation of the earth, observed signals from multiple images will involve different combinations of the same polarizations. As far as the polarization content is concerned, this is equivalent to observing the same signal with a multiplied number of detectors. For example, if two images of the merger are observed using a three-detector network, this is equivalent to observing the one merger signal with a six-detector network. In this paper, we explore the possibility of constraining the polarization content of GWs using BBH mergers that are strongly lensed by galaxies. We use the Bayesian model selection method proposed by [22] to identify the polarization content of simulated GW signals from binary black holes. We show that strongly lensed GW signals will enable us to constrain the polarization content significantly better than their unlensed counterparts.
The rest of the paper is organized as follows: Section II summarizes our methodology, providing a brief introduction to the relevant theory, model selection formalism, as well as the details of the numerical simulations. Section III presents the results, while some concluding remarks and future work is discussed in Sec. IV. 1 Galaxy clusters will also cause strong lensing of the GW signals. However, the lensing probability due to clusters is likely to be significantly smaller than that by galaxies [33]. However, cluster lensing is expected to produce longer time delays between images, which will further improve the our ability for polarization model selection from lensed GW signals.

A. GW polarizations
In the local Lorentz gauge, the spatial components of the metric perturbation h i j at a given space-time point x can be written in terms of six linearly independent polarization tensors, where, the index A stands for different polarizations: tensor "plus" (+) and "cross" (×) modes, vector "x" and "y" modes and scalar "breathing" (b) and "longitudinal" (l) modes; and h A is the amplitude for polarization A. The existence of six independent polarization modes (or, six linearly independent components of the metric perturbation) can be understood in the following way: the full metric perturbation h µν in four dimensions is symmetric and therefore has ten independent components. However, because of the Lorentz gauge condition, four degrees of freedom are taken away, leaving only six. (GR, in addition, satisfies the transverse-traceless gauge condition which takes away additional 4 degrees of freedom and hence allowing only two tensor polarization modes). Further, the polarization tensors can be written in terms of the orthogonal basis vectors w x , w y , w z ≡ w x × w y , where w z is the GW propagation direction.
A ground-based laser interferometric detector measures a combination of these polarizations by the change in lengths of its perpendicular arms. This response is encoded in the detector tensor D, whose components are given by where d x and d y are unit vectors along the detector arms, with a common origin. The strain, h I measured by detector I, is then given as, where, F A I ≡ D i j I e A i j , are called the detector antenna pattern functions, which encode the response of the detector I to polarization A. Therefore, GW strain measured at the detector I can be written as the linear combination of polarization amplitudes multiplied with the corresponding antenna pattern functions. Expanding, F A using Eqs. (2.2) and (2.3): In this Section, we denote four-vectors by the use of arrows (e.g., x) and three-vectors by boldface (e.g., x) and tensors by sans serif fonts (e.g., e). Repeated indices are assumed to be summed over.
Evaluating these antenna pattern functions at a particular detector involves projecting the polarization tensors into the detector frame. This projection depends on the direction from which GW arrives for a particular detector and hence the sky-location of the GW source. We choose to describe the source using the equatorial coordinate system, in terms of right ascension α and declination δ. Additionally, antenna pattern functions also depend on polarization angle, ψ, which is due to the rotational freedom of the orthonormal vectors (w x , w y ) about the propagation direction w z . Further, the detector response and hence the antenna pattern functions also depend on time due to the rotation of the earth. Thus, the antenna pattern functions of the detector I for the polarization mode A can, in general, be written as F A I (α, δ, ψ, t). The antenna pattern functions for tensor polarizations are computed using the PyCBC software package [34]. We added in it the antenna response to the scalar and vector polarizations using Eq.(2.5).
For three detector network, we have three strain measurements giving three different linear combinations of polarizations, as given by Eq.(2.4). However, due to strong lensing, multiple copies of the same GW signal would arrive at each detector with time delay ∆t of minutes to weeks. Due to the rotation of the earth, the antenna patterns during the arrival of, say two images, F A I (α, δ, ψ, t) and F A I (α, δ, ψ, t + ∆t) can be considerably different from each other. This is equivalent to observing one signal with a six-detector network.
According to GR, in the geometrical optics limit, polarization tensors are parallelly propagated along the null geodesics, implying that lensing does not change the polarization content of a GW. We assume this to be true in alternative theories also. As long as the metric perturbation follow a source-free wave equation, the polarization tensor should be conserved along the GW propagation.

B. Model selection of polarizations
Bayesian model selection allows us to assign posterior probabilities for various hypotheses pertaining to the observed data. We formulate the polarization content of GWs as different Bayesian hypotheses. For e.g., GR is denoted as H t , as the theory only predicts tensor modes. The hypothesis that GWs contain only scalar (vector) modes is denoted as H s (H v ). Following [18], we assume that the waveforms in H s and H v are the same as in H t ; the only change is in the antenna pattern functions. We do this as the BBH waveforms for alternative theories are presently not known. If available in the future they can be included in the same formalism.
Given the set of data {d} from a network of detectors, the marginalized likelihood (or, Bayesian evidence) of the hypothesis H p can be computed by where θ is a set of parameters that describe the signal under hypothesis H p (including the masses and spins of the compact objects in the binary, location and orientation of the binary and the arrival time and phase of the signal), P(θ) is the prior distribution of θ (which we take to be independent of H p ), and P({d}|θ, H p ) is the likelihood of the data {d}, given the parameter vector θ and hypothesis H p . Given the hypothesis H p and data {d}, we can sample and marginalize the likelihood over the parameter space using an appropriate stochastic sampling technique such as Nested Sampling [35]. Bayesian model selection allows us to compare multiple hypotheses. For e.g., the odds ratio O t s is the ratio of the posterior probabilities of the two hypotheses H t and H s . When O t s is greater than one then hypothesis H t is preferred over H s and vice versa. Using Bayes theorem, the odds ratio can also be written as the product of the ratio of the prior odds P t s of the hypotheses and the likelihood ratio, or Bayes factor B t s : Since GR has been tested well in a variety of settings, our prior odds are going to be highly biased towards tensor-only modes, i.e., P t s 1. Hence, in order to claim evidence of non-tensor modes the corresponding Bayes factor supporting the alternative hypothesis has to be very large. Since the Bayes factor is the only quantity that is derived from data, for the rest of the paper, we focus on the Bayes factor. The Bayes factor from multiple, uncorrelated events d (i) can be combined as is the Bayes factor obtained from the ith event.
C. Model selection of polarizations using lensed GW events When multiple GW events are produced by the strong lensing of a signal BBH merger, these events cannot be treated as uncorrelated events. Here we derive the Bayes factor between different polarization hypotheses H p using multiple lensed images of the same merger. For simplicity, we will consider only two lensed images. However, the same formalism can be extended to more than two images also. Strong lensing of GWs from BBHs is expected to be dominated by galaxy lenses. Lensing by galaxies and galaxy clusters can be treated in geometric optics regime (wavelength of GWs significantly smaller than the mass scale of the lens). In this regime, lensing does not change the frequency profile of the waveform, and hence multiple images, arriving at the detector at different times, differ from each other only by a relative magnification and a constant phase shift 3 [36,37]. Hence the parameters describing the waveform, except for the luminosity distance (which is degenerate with the lensing magnification) and the time and phase at coalescence will be common between the two images. Now consider the GW signals d (1) and d (2) produced by the strong lensing of a BBH merger. The Bayes factor between two polarization hypotheses H t and H s can be written as where θ c is the vector of common parameters as the signals come from the same merger. Note that the probability distributions are marginalized over all the parameters except θ c . Using the Bayes theorem, the likelihoods P({d (i) }|θ c , H p ) can be written in terms of the posteriors P(θ c |{d (i) }, H p ) as where B t s (1) and B t s (2) are the Bayes factors of the polarization hypotheses obtained from event 1 and 2, respectively [see Eq.(2.9)], while B l u is the lensing Bayes factor defined in [31]. That is, This is the Bayes factor between a different set of two hypotheses H l and H u , which are different from the hypotheses on polarization content. The lensing hypothesis H l states that this pair of events are lensed images of the same merger, while the unlensed hypothesis H u states that these are unrelated events. It can be seen from Eq.(2.14) that B l u |H p is the prior-weighted inner product of the posteriors of the common parameters θ c obtained from the two images. These posteriors are computed assuming the polarization hypothesis H p . Note that these posteriors, and hence the lensing Bayes factor can be computed assuming different hypotheses for the polarization content From Eq.(2.13) it is evident that, given a pair of lensed events, the combined Bayes factor between the two polarization hypotheses is the product of the Bayes factors computed from the individual events multiplied by an extra term B l u |H t B l u |H s , which we call the posterior overlap ratio. We can do parameter estimation for the individual events assuming different polarization hypotheses H p to get the posteriors and marginal likelihoods (evidence) from each event. Later, from these posteriors, we compute the overlap factors B l u of the two events. If the posterior overlaps using the correct polarization hypothesis, say H t , is larger than the same using the wrong hypothesis, say H s , (that is, if B l u |H t > B l u |H s ), then the combined Bayes factor B t s of the two lensed events will be larger than the same computed from two unlensed events with the same individual Bayes factors B t s (1) and B t s (2) . This suggests that lensed events can improve our ability to identify the right polarization hypothesis, with an improvement factor given by the posterior overlap ratio the current detectors [38,39] 4 , we use a simulated catalog of lensed BBH events presented in [31] to study the efficacy of polarization recovery. Simulations in [31] generated the observed parameters for different pairs of lensed BBH merger signals, assuming a well-motivated distribution of lens and source properties [31]. In the geometric optics regime, for a particular event, intrinsic parameters like the black holes' redshifted masses (m 1 and m 2 ) and spins as well as the extrinsic parameters like the sky-location (α and δ) and orientation (ι and ψ) of the binary remain the same for the multiple images. Similarly, the coalescence (orbital) phase φ 0 estimated from the two images should be consistent, apart from a possible constant shift of n π/4, where n is an integer [36] 5 . However, the strain amplitude, magnified due to strong lensing and is degenerate with the luminosity distance (d L ) is different. Apart from this, the observed time of coalescence (t 0 ) of the two signals is different due to the time delay between the arrival of the two signals (ranging from several minutes to several weeks). Therefore, except d L and t 0 , all the other parameters can contribute to our common parameters vector θ c . We limit ourselves to non-spinning binaries and perform Bayesian parameter estimation of the following parameters: m 1 , m 2 , α, δ, d L , t 0 , φ 0 , ι, ψ. However, in order to compute the posterior overlap ratio, we consider only the following parameters: m 1 , m 2 , α, δ, ι. That is, the posteriors are marginalized over all other parameters. This is based on our empirical observation that this choice of 4 We note that, during the late states of the preparation of this paper, [40] has identified an unusual lensing candidate in the data of the second observing run of LIGO and Virgo. 5 Since we assume that the hypothesis that the given pair of events are lensed copies of the same merger has been established using prior analysis, we would know the value of this constant phase shift between the two signals.
parameters typically provide the largest values of the posterior overlap ratios. From the simulated parameters of the lensed events, we generate GW signals at different detectors for each polarization hypothesis -tensor H t , vector H v , and scalar H s , using the corresponding antenna pattern functions (see Sec. II A). The model signals are generated using the antenna patterns of the corresponding polarizations, but always assuming that the time evolution of the simulated waveform always follow the GR waveform. That is, h s (t) = h x (t) = h + (t) and h l (t) = h y (t) = h × (t).
For these simulations, we consider a three detector network consisting of two US-based Advanced LIGO detectors located in Hanford, WA and Livingston, LA and the Advanced Virgo detector located in Pisa, Italy. The LIGO detectors were assumed to have their design sensitivity with the power spectral density (PSD) given in [41] while the Virgo detector was assumed to have the PSD given in [42]. GW signals were simulated using the IMRPhenomPv2 waveform approximant [43][44][45]  We use the standard Gaussian likelihood model for estimating the posteriors of the parameters under different polarization hypotheses (see, e.g., [49]). For simplicity, no noise is added to the simulated signals. Further, we considered only non-spinning binaries. Thus, the likelihood is computed over the following parameters m 1 , m 2 , α, δ, d L , ι, ψ, φ 0 , t 0 . We use uniform priors in component masses of the binary (m 1 , m 2 ∈ [3, 500]M ), isotropic sky location (uniform in α, sin δ) and orientation (uniform in cos ι, φ 0 ), uniform in polarization angle ψ, and a volumetric prior ∝ d 2 L on luminosity distance. Finally, the posteriors are marginalized over all the parameters except the ones that we consider for calculating the posterior overlaps, i.e., m 1 , m 2 , α, δ, ι.
As one would anticipate, the true (injected) parameters are recovered when the injection and recovery model are the same. As an example, Fig. 2 shows the estimated posterior distributions when the injections and recovery are performed using the same tensor hypothesis (i.e., the H t [I] − H t [R] combination). In contrast, when parameter estimation on the tensor injection is performed using vector and scalar hypotheses, the intrinsic parameters (primary and secondary masses) are still recovered well, whereas extrinsic parameters such as the sky location and orientation are not recovered well (see Figs. 3 and 4). This is due to the fact that the recovery of the extrinsic parameters heavily depends on the antenna pattern functions, which are different for the injection and recovery models.
Further, note that sky location posteriors (α, sin δ) of the lensed pairs overlap well with the tensor model (Fig. 2) and not so well with vector and scalar models (Figs. 3 and 4  combinations, respectively). As a result, in general, we would expect that the lensing Bayes factor B l u will be larger for the tensor model. This is quantified in the next section.

III. RESULTS
Our aim is to quantify how well pairs of GW signals produced by the strong lensing of BBH mergers improve our ability to distinguish between polarization models, as compared to pairs of unrelated signals with similar strengths. If the two signals are unrelated (i.e., unlensed events) then the combined Bayes factor will just be the product of individual Bayes factors [Eq. (2.8)]. On the other hand, if the two events are lensed images of the same merger, then the combined Bayes factor is the product of the individual Bayes factors and additional factor, B l u |H right B l u |H wrong , which we call the posterior overlap ratio [Eq. (2.13)]. If the posterior overlaps ratio is greater than one then for the correct polarization hypothesis, this would show that lensing improves our ability to identify the right polarization hypothesis. Figure 5 (top panels) shows the distribution of the polarization Bayes factors from the simulated GW signals. The fact that Bayes factors are almost always greater than 1 (log Bayes factors > 0) suggests that the right polarization hypothesis is almost always preferred. Note that, overall, the lensed Bayes factors are greater than unlensed ones, showing that the strong lensing improves the polarization models selection. This is also evident from the distribution of the posterior overlap ratios (bottom panel of Fig. 5). Note that the overlap ratios are greater than 1 (log overlap ratio > 0) for ∼ 85 − 95% of the simulated events. The median value of the overlap ratio is ∼ 2 − 3, which means that for more than 50% of the events lensing improves the polarization Bayes factor by a factor of e 2 or more.
The lower panel of Fig. 5 shows that, for a small fraction (∼ 5 − 15%) of the simulated events, the posterior overlap ratio is less than (although very close to) one. That is, the lensing Bayes factor [Eq. (2.14)] assuming the right polarization hypothesis is slightly smaller than the same assuming the wrong polarization hypothesis. These unusual event pairs do not show any significant correlations with the intrinsic or extrinsic parameters of the simulated BBHs. However, these event pairs have small lensing time delays (∼ less than half an hour). This is evident from Fig. 6, which plots the log overlap ratios for all the lensed event pairs against the lensing time delay between these images. This observation is broadly consistent with our expectation: during such short time delays (∼ less than half an hour) the change in the antenna patterns of the detectors due to the rotation of the earth is negligible. Hence the antenna patterns at the times of the two images will be very similar to each other. Thus, the linear combination of the polarizations measured from the two events will be practically the same. In other words, the posteriors estimated from the two events will have high overlaps, irrespective of the polarization model used. For such events, lensing is not expected to bring significant additional improvements. This is clear from Fig. 6, which shows that the overlap ratios from the small-time-delay events are modest (less than ∼ e 4 ).
However, we would normally expect that the posterior overlaps of the right polarization model to be at least as large as the same using the wrong polarization models (in other words, the posterior overlap ratio should be greater than or equal to one). The reason for a small fraction of events to have overlap ratios slightly less than one is not well understood. It is likely that, when posteriors (using different polarizations) have very similar overlaps, the final results could be dominated by the numerical errors in our computations, such as the convergence of the parameter estimation and the inaccuracies in estimating the posterior distributions. We leave the detailed investigations on this as future work.

IV. SUMMARY AND FUTURE WORK
Probing the polarization content of the GWs observed by a network of ground-based detectors offers an interesting probe on the nature of gravity. While GR predicts only two (tensor) polarization modes, there are alternative theories that predict up to six polarizations (including scalar and vector modes, apart from the tensor modes predicted by GR). Each ground-based interferometer measures one particular linear combination of all these polarizations. Thus, if there are as many linearly independent detectors in the network as the number of independent polarizations, these polarizations modes can be extracted from the data, in principle. It turns out that the two scalar modes are degenerate as far as observations of ground-based detectors (which are quadrupolar antennas) are concerned. Thus five linearly independent polarization modes can be, in principle, extracted from the data of five linearly independent detectors. In practice, our ability to do this is limited by the presence of noise. In addition, the similar orientation of the two LIGO detectors in the USA makes this job difficult even with the upcoming network of five detectors including LIGO, Virgo, KAGRA and LIGO-India.
Strong lensing of GWs can significantly improve our ability to constrain GW polarizations. Recent estimates suggest that ∼ 0.1 − 0.5% of the GW signals from BBHs that the Advanced GW detectors will observe in the next few years will be strongly lensed by intervening galaxies, producing multiple "images" (copies) of the same signal that arrive at the detector with relative time delays of several minutes to weeks. Since several hundred BBH detections are expected in the next few years, the first observation of lensed GW signals is likely to happen soon. Since the wavelength of the GWs is significantly smaller than the mass scale of these lenses, lensing effects can be calculated using geometric optics. In this limit, lensing does not affect the frequency profile of the GW signals. Thus, the multiple images of a single merger will be comprised of the same GW polarizations (albeit with a relative magnification). Due to the rotation of the earth, each detected image will allow the GW detector network to measure different linear combinations of the same polarizations. This is effectively equivalent to multiplying the number of detectors to observe a single GW signal.
We study the expected improvement, due to lensing, in our ability to probe the nature of GW polarizations making use of the Bayesian model selection formalism that was originally proposed by [22]. This uses a simplified model for the GW polarizations that are not present in GR: We assume that the time evolution of the additional polarizations (scalar and vector modes) follow that of the tensor modes. (Hence our ability to distinguish the polarization models depend greatly on the response of the GW detector network to different polarizations.) Additionally, we make a simplistic assumption that the GW polarizations consist of pure tensor, vector, or scalar modes. . Note that the Bayes factors for the lensed pairs are almost always greater than the same computed from unlensed events with the same parameters. Bottom panels: Corresponding distribution of the ratios of the overlap factors B l u assuming the "right" and "wrong" polarization hypotheses. Note that the overlap ratio is greater than 1 for 85 − 95% of events, suggesting that lensing improves the Bayes factors of the right hypothesis.
We show that strong lensing greatly improves our ability to distinguish the "right" and "wrong" polarization models for the GW signal.
The joint Bayes factors (likelihood ratio between two polarization models) for multiple, unrelated events can be obtained as the product of Bayes factors computed for individual events as the noise and the signal in the individual data segments are unrelated. However, for a pair of strongly lensed events, though the noise is uncorrelated the GW signals present in the data segments are related. We show that the combined Bayes factor from such a lensed event is equal to the product of the individual Bayes factors and an additional factor, namely, posterior overlap ratio, which is the ratio of the prior weighted overlaps of the posterior distributions of the GW parameters 6 that are computed assuming the two polarization models under consideration. From simulated BBH events in the three detector network consisting of Advanced LIGO and Virgo detectors in design sensitivity, we show that the overlap ratio for the majority of lensed events (> 50%) is greater than e 2 − e 3 . This means that the Bayes factor supporting the right polarization hypothesis is improved by a factor of ∼ 7 − 20 for most of the lensed events (as compared to pairs of unlensed signals with similar strengths). The improvement can be as large as several thousands for about 10% of the events. Note that, in this paper, we only consider lensing by galaxies, under the assumption that lensing probability of galaxy clusters is negligible. Lensing by clusters will introduce much larger time delays between images, thus significantly improving our ability to distinguish between polarization models (see, e.g., Fig. 6). The simplistic polarization models that we use in this paper can be extended to more realistic models, where the alternative model to GR would include scalar/vector modes in addition to the tensor modes. Even if we assume that the scalar/vector modes follow the same phase evolution as the tensor modes, this will require us to model the effect of the binary's additional loss of energy and angular momentum (due to additional polarizations) on the orbital evolution itself. Additionally, the polarization model that is used in the model selection will require additional parameters that describe the relative strengths of the scalar, vector, and tensor modes (which will need to be marginalized away). Even then, the model selection described in Sec. II C can be used to characterize the expected improvement due to lensing. Since the "right" polarization model is  6. Correlation of the overlap ratios with the lensing time delay ∆t between the images for each set of injections, created using the tensor (left), vector (middle) and scalar (right) polarization models. Different color markers show the overlap ratios between the "right" and "wrong" polarization models (for e.g., T-V denotes the overlap ratio between posteriors computed using the tensor and vector models). For the events below the black dashed lines, the posterior overlap ratio is less than one; hence lensing does not improve the polarization model selection.
expected to produce larger overlaps between the posteriors estimated from multiple lensed images, we expect that strong lensing will provide similar improvements in our ability to do model selection. Note that, in this paper we consider only double images produced by lensing, while ∼ 30% − 40% of the lensed events will also produce triple or quadruple images [31], potentially providing further improvements in the polarization model selection. Such improvements in the effective number of detectors in the network might also enable us to perform polarization reconstruction in a model agnostic way. We plan to explore these aspects as follow-up projects.