First Identification of a CMB Lensing Signal Produced by $1.5$ Million Galaxies at $z\sim$4: Constraints on Matter Density Fluctuations at High Redshift

We report the first detection of the dark matter distribution around Lyman break galaxies (LBGs) at high redshift through the Cosmic Microwave Background (CMB) lensing measurements with the public {\it Planck} PR3 $\kappa$ map. The LBG sample consists of 1,473,106 objects with the median redshift of $z \sim 4$ that are identified in a total area of 305 deg$^2$ observed by the Hyper Suprime-Cam (HSC) Strategic Survey Program (SSP) survey. After careful investigations of systematic uncertainties, such as contamination from foreground galaxies and Cosmic Infrared Background (CIB), we obtain the significant detection of the CMB lensing signal at $5.1\sigma$ that is dominated by 2-halo term signals of the LBGs. Fitting a simple model consisting of the Navarro-Frenk-White (NFW) profile and the linear-bias model, we obtain the typical halo mass of $2.9^{+9.5}_{-2.5} \times 10^{11} h^{-1} M_\odot$. Combining the CMB lensing and galaxy-galaxy clustering signals on the large scales, we demonstrate the first cosmological analysis at $z\sim4$ that constrains $(\Omega_{{\rm m}0}$, $\sigma_8)$. We find that our constraint on $\sigma_8$ is roughly consistent with the {\it Planck} best-fit cosmology, while this $\sigma_8$ constraint is lower than the {\it Planck} cosmology over the $1\sigma$ level. This study opens up a new window for constraining cosmological parameters at high redshift by the combination of CMB and high-$z$ galaxies as well as studying the interplay between galaxy evolution and larges-scale structure at such high redshift, by upcoming CMB and optical and near-infrared imaging surveys.


INTRODUCTION
Understanding the interplay between galaxy evolution and large-scale structure is key to understanding cosmic evolution. Since galaxies are formed in dark matter halos through gas cooling, such an interplay can be studied by measuring the connection between dark matter halos and galaxies [see reviews by 1,2]. Galaxy-galaxy clustering, the auto-correlation of galaxy positions, is one of the statistical probes that has been widely used to investigate the galaxy-halo connection [e.g., [3][4][5][6][7]. Galaxygalaxy lensing, the cross-correlation between galaxy positions and weak lensing shear of background galaxies, is rapidly emerging as another powerful probe because it enables the direct measurement of dark matter distribution around galaxies [e.g., [8][9][10][11].
The combined measurement of galaxy-galaxy clustering and galaxy-galaxy lensing allows us to constrain cosmological parameters through growth of structure, and thus it is now the standard technique in cosmological galaxy surveys. Such studies can be divided into those using both small and large scales which enable us to in-vestigate the galaxy-halo connection simultaneously [e.g., 12,13], and those focusing on large scales to avoid possible systematics due to our incomplete understanding of galaxy physics [e.g., [13][14][15][16]. These measurements advanced our understanding of cosmic evolution of largescale structure at z < 1 through the constraint on the amplitude of matter power spectrum σ 8 , which are consistent with the Planck cosmology so far. As an independent test, it is of great importance to add such a measurement at higher redshift. Having structure growth measurements with a wide redshift may also allow us to address the H 0 tension between measurements of late time probes such as the type Ia supernova distance and early time probes such as cosmic microwave background (CMB) [e.g., 17,18], through the sensitivity to early dark energy [19].
Recently, detailed study results of galaxy properties and galaxy-halo connection through clustering measurement of Lyman break galaxies (LBGs) at z ∼ 4 − 7 are reported [7,20], where they used wide-field galaxy imaging survey data such as the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) and the Subaru Hyper Suprime-Cam (HSC) Subaru Strategic Program (SSP) Survey [21]. Although one of the next steps is measuring weak lensing on background galaxies around the LBGs, it is practically impossible due to the lack of data for shear measurements of galaxies behind LBGs. Instead, we can use weak lensing on the CMB anisotropies, which is sensitive to the large-scale structure at z > 1. In this paper, using CMB lensing measurements in the Planck satellite data, we present the first detection of dark-matter distribution around LBGs at z ∼ 4. We also show cosmological constraints from clustering and lensing measurements.

DATA HSC GOLDRUSH Catalog
We use the second public data release (PDR2) product taken by the HSC SSP from 2014 March to 2018 January [22]. We use the Wide layer data whose total full color full depth area is 305 deg 2 . The HSC data are reduced by the HSC SSP collaboration with hscPipe version 6.7 [23,24]. The typical 5σ limiting magnitudes in a 1.5 diameter aperture are 26.4, 25.9, 25.8, 25.1, and 24.2 mag in g, r, i, z, and y-bands, respectively.
To construct a z ∼ 4 LBG catalog of Great Optically Luminous Dropout Research Using Subaru HSC (GOL-DRUSH) [25], we use the dropout selection method. Here we briefly describe the selection method; full details of the catalog construction is presented in Harikane et al. [26]. We use the following color criteria to select z ∼ 4 galaxies: which are the same as the GOLDRUSH catalog construction with the first HSC data release [7,25,27]. In addition to these criteria, we apply a detection limit of a > 5σ level in the i-band. We evaluate source detections with aperture magnitudes, and measure total fluxes and colors of sources with convolvedflux magnitudes, which show good agreement with the cModel magnitude in the first public data release [28]. We mask out regions affected by bright object halos or unreliable background subtractions. Finally, we select a total of ∼2,000,000 LBGs. Ono et al. [25] used spectroscopically identified high-z galaxies in the dropout sample to estimate the average redshift of the sample z = 3.8 (median redshift of z ∼ 4), which we adopt for our analysis. The surface number density and angular correlation functions of the LBGs agree well with previous studies. In our analysis, we use 1, 643, 863 LBGs whose total i-band magnitudes are brighter than 25.5 mag, which is reduced to 1,473,106 LBGs after applying the Planck mask described below.
Some foreground objects such as red galaxies at intermediate redshift can satisfy our selection criteria by photometric errors. A contamination rate is estimated to be f cont ∼ 0.25 based on HSC WIDE layer results obtained by [25] who derive a fraction of contaminating galaxies as a function of i-band magnitude by calculating the number of foreground galaxies that satisfy the color criteria.

Planck κ map
We construct the convergence map using lensing products in the third public data release from the Planck satellite mission (PR3) [29] [30]. We use the minimum variance (MV) estimator and masks in the baseline lensing potential estimates. The MV estimators are provided as harmonic coefficients with the maximum multipole L max = 4096, while the mask is in real space. To construct the convergence map, we subtract the MV mean field from the MV convergence, and transform the harmonic coefficients to a real-space map applying a Gaussian filter with σ = 2 . We apply the mask to remove the galactic plane, point sources, and Sunyaev-Zel'dovich (SZ) clusters, which leaves a total unmasked sky fraction f sky = 0.671. This results in the healpix map with NSIDE=2048 where the area of each healpixel is 2.95 arcsec 2 . Note that the HSC-SSP survey does not include the galactic plane, and that we have a full overlap between our LBG sample and the Planck convergence map. After removing regions of point sources and SZ clusters masked in the Planck data, the total overlap area is 270 deg 2 .

MEASUREMENT
To measure the stacked convergence profile, we first create galaxy-centric angular bins for each galaxy by linearly dividing 0 < θ < 20 into 14 bins, and identify healpixels that belong to each bin. Note that we validate the binning scheme as follows. First, we generate the 10000 realizations of mock signals by fluctuating the convergence signal following the jackknife covariance described below. We then compute for each mock realization and confirmed that the χ 2 distribution follows the predicted χ 2 distribution with degree of freedom of 14.
We then collect the healpixels over all galaxies, and compute the average convergence in each bin. To subtract the residual mean field, we also perform the same measurement around random points, and subtract the random signal from the observed convergence profile. The number of random points is 40 times larger than that of the LBGs.
We estimate the jackknife covariance dividing the PDR2 area into 262 subregions such that each subregion has a similar number of galaxies, and measure the meansubtracted signal for each jackknife subregion. Note that the covariance is highly correlated due to the low-pass filter applied in the process of MV convergence estimation.
To test the contamination from cosmic infrared background (CIB), we stack Planck SZ-nulled temperature map (SMICA noSZ) around the LBGs. We do not find a gradient in the signal, and thus conclude that there is no significant CIB contamination. In addition we confirm that there is no significant B-mode signal.

RESULTS
In Fig. 1, we show the measured convergence profile together with the mean-field profile and the convergence profile before the mean-field subtraction. We confirm that the observed amplitude of mean-field signal is typical by measuring the convergence around random points using the publicly-available simulated lens map in PR3. Defining the significance of the convergence profile after the mean-field subtraction as [S/N] κ = ij κ i Cov −1 ij κ j , we obtain the significance level of [S/N] κ = 5.1 with all radial bins. Note that the convergence of each sub-field of the PDR2 data [22] tends to be positive but we do not find a clear signal. This is because these sub-fields have a much smaller number of LBGs compared to our full sample.
The convergence profile is contaminated by foreground galaxies as described in the previous section. We thus model our signal as κ obs tot = (1 − f cont )κ obs LBG + f cont κ obs cont . We use the halo models for the convergence profiles of real LBGs and contaminating galaxies with different halo masses and redshifts.
Let us first consider the convergence profile without the Planck beam. In the halo model approach, the convergence profile is composed of 1-halo and 2-halo terms; κ(θ) = κ 1h (θ) + κ 2h (θ), where κ 1h (θ) and κ 2h (θ) are the contribution from dark matter halos around galaxies and neighboring halos, respectively. We model the 1-halo term as κ 1h is the redshift of lens (CMB),ρ m is the mean matter density at present,ũ m (k; z l ) is the Fourier-space Navarro-Frenk-White (NFW) [31,32] profile truncated at the radius where the enclosed mass is 200 times the mean density R 200 , J 0 (x) is the zeroth-order Bessel function of the first kind. We adopt the analytical form ofũ m (k; z l ) provided by Takada and Jain [33], and assume the concentrationmass relation derived by Duffy et al. [34].
The critical surface density is defined as Σ cr (z l , z s ) = LBGs. The red circles denote the measured lensing signal after the mean-field subtraction, where the filled (open) circles denote data points used for (excluded from) our model fit. The black crosses (triangles) denote the lensing signal before the mean-field subtraction (mean-field signal measured by stacking the Planck κ map around random points). The red solid (dashed) line represents our best-fit model of the lensing signal at the scales used for (excluded from) our model fit. The red dotted line shows the lensing signal of low-redshift contaminating galaxies in the LBG sample constrained by optical lensing measurements with the HSC shapes (for details, see text). The significance of the signal is 5.1σ (3.5σ) against null (the contamination signal).
is the speed of light, G is the gravitational constant, d A (z s ), d A (z l ), and d A (z l , z s ) are the angular diameter distances for the lens-source system. To model the 2-halo term, we adopt the linear bias model, i.e., κ 2h (R) = Σ −1 cr (z l , z CMB )b(M h , z l )ρ m (kdk/2π)P m (k, z l )J 0 (kR), where b(M h , z l ) is the linear bias parameter, and P m (k, z l ) is the linear matter power spectrum at z l . We adopt the fitting function for the linear bias derived by Tinker et al. [35], and compute the matter power spectrum using CAMB [36]. Note that in our model the 1-halo and 2-halo terms are connected through halo mass. Finally, we convolve the Planck beam by applying the low pass filter L ≤ 4096 and the Gaussian filter with σ = 2 to obtain a model for the observed signal κ obs (θ).
To estimate the halo mass of contaminating galaxies, we measure the weak lensing signal around the LBGs with the HSC first-year shear catalog [37,38]. This measurement enables us to pick up the lensing signal from contaminating galaxies, since most of the HSC source galaxies reside at z < 4 and thus are insensitive to the real LBGs at z = 3.8. We first measure the excess surface mass density ∆Σ(R), following the procedure described in Mandelbaum et al. [37]. Note that we include the random subtraction in the estimator, and we do not apply the boost factor correction since the boost factor is less than 1% [39]. Since we use the full probability density functions of photometric redshift (photo-z), the photo-z correction needed for the use of point estimates is not applied [40]. The covariance is estimated using the jackknife method using the same jackknife subsamples as the CMB lensing measurement. We fit the signal with a model based on halo occupation distribution (HOD) with off-centering (5 parameters for HOD and 2 parameters for off-centering), which is similar to the one described in More et al. [13], using the publicly-available Markov Chain Mote Carlo (MCMC) sampler called emcee [41]. We translate the best-fit model in ∆Σ into convergence κ to obtain the contamination signal [see Eq.(85) in 42]. We estimate the uncertainty in the contamination signal by generating the convergence profile using the MCMC chains. The 1-sigma uncertainty is at most 25% of the contamination signal itself, which is much smaller than the statistical uncertainty of our CMB lensing measurements, as can be seen in Fig 1. We thus conclude that the uncertainty of contamination signal does not significantly impact our analysis.
We then fit the convergence model to the signal varying the halo mass of LBGs.We use the signals at 6 < θ < 20 to remove small scales that can possibly suffer from residual signals at L > 2048 [29]. We show the best-fit model in Fig. 1 together with the expected contamination signal based on the halo mass derived above. Note that the contamination level is much lower than the observed signal because of the insensitivity of CMB lensing at low redshift. In fact, the significance of the observed signal against the contamination signal is 3.5σ. The LBG halo mass is constrained as M h = 2.9 +9.5 −2.5 × 10 11 h −1 M (68% credible level) [43], which is consistent with previous studies [7,20] where they derived the halo mass through clustering measurements. Note that the signal is dominated by the 2-halo term, as expected from the fact that R 200 corresponding to this halo mass is at θ ∼ 0.02 and the 1-halo term is almost completely smeared out by the Planck beam, and thus the halo mass constraint comes from the mass-bias relation by Tinker et al. [35].

DISCUSSIONS
A measurement of the matter correlation function can constrain cosmological parameters such as the matter energy density at present Ω m0 and the amplitude of the matter power spectrum σ 8 . Since galaxies are a biased tracer of the underlying matter distributions, we cannot directly measure the matter correlation function from the spatial distributions of galaxies (galaxy clustering). The matter correlation function, however, can be recovered by combining the galaxy clustering and galaxy lensing signal [44], because the clustering and lensing signal depend on the bias in a different manner. In the limit of large scales where the galaxy density fluctuation follows the linear bias model δ g = bδ m , the clustering signal provides the galaxy-galaxy correlation function ξ gg = b 2 ξ mm , while the lensing signal indicates the galaxy-matter correlation function ξ gm = bξ mm .
Adding a projected clustering measurement of the LBG sample [26] to our lensing signal measurement, we demonstrate the first cosmological analysis through the growth of large-scale structure at z ∼ 4. We measure the projected clustering signal and its jackknife covariance of the LBGs in the same manner as the previous measurements [7]. To ensure large-scale signals are used for cosmological constraints, we use the clustering signal at 1.5 < θ < 6.5 and lensing signal at 6.0 < θ < 20 , which corresponds to the comoving scale ranges of 2.4 < R/[h −1 Mpc] < 9.4 and 8.7 < R/[h −1 Mpc] < 30, respectively, at z = 3.8.
We estimate the impact of the uncertainty associated with LBG redshifts by using a subsample of LBGs which have spectroscopic redshifts. Using the spectroscopic redshifts, we find the model varies by 4% and 17% for lensing and clustering, respectively. These deviations are much smaller than the statistical uncertainties of the measurements which are typically 70% and 40% for lensing and clustering, respectively.
We constrain the cosmological parameters Ω m0 and σ 8 by simultaneously fitting the large scale clustering and lensing signal, while the other cosmological parameters are fixed to the Planck cosmology [45]. We model the lensing signal in the same manner as the modeling procedures described in the previous section with the following modifications. For the convergence profile of LBGs, we use only the 2-halo term and vary b as a fitting parameter. We also vary Ω m0 and A s on which the linear power spectrum depends. Both 1-halo and 2-halo term in the lensing signal from contaminating galaxies are fixed. We model the observed clustering signal as ω tot (θ) = (1 − f cont ) 2 ω LBG (θ), where ω LBG (θ) is the clustering signal from LBGs. Note that we ignore the contamination signal for clustering signal because it is suppressed by f 2 cont ∼ 0.06. The projected clustering signal pf LBGs is modeled as ω LBG (θ) = dzN 2 (z)(dr/dz) −1 dk(k/2π)b 2 P mm (k, z; Ω m0 , A s ) J 0 (r(z)θk), where N (z) is the normalized redshift distribution of galaxies taken from [25].
We perform an MCMC analysis with flat priors, Ω m,0 ∈ [0.01, 0.95], ln 10 10 A s ∈ [0.1, 5.0], and b ∈ [0.1, 30]. Fig. 2 shows constraints on these parameters for the combined lensing and clustering results, together with constraints based on the lensing or clustering results alone. Since σ 8 and b degenerate in a different manner for lensing and clustering at small b, the combination of lensing and clustering probes improves the constraint on σ 8 , yielding σ 8 = 0.46 +0. 25 −0.20 and b = 6.5 +5.2 −4.2 (68% credible level). Although it is known that there is a strong degeneracy between Ω m0 and σ 8 in a combined analysis of lensing and clustering [e.g., 13,15,16], the degeneracy is not observed in our analysis, and there is almost no constraint on Ω m0 . This is due to the fact that the sensitivity of matter power spectrum to Ω m0 significantly decreases at high redshifts. Fig. 3 compares σ 8 constraints from our high-z study with those from low-z studies in the literature. Our constraint on f σ 8 (z) is derived by converting our constraints on (Ω m0 , σ 8 ) into f σ 8 (z), where f = −d ln D(z)/d(1 + z) is the logarithmic growth rate and σ 8 (z) = σ 8 D(z)/D(0) is the linear matter fluctuation at redshift z. Our constraints on both σ 8 and f σ 8 (z) are roughly consistent with constraints computed from the Planck 2018 TT,TE,EE+lowE+lensing result [45], while our constraints are lower than the Planck cosmology at the 1.4σ level for both σ 8 and f σ 8 (z). Motivated by the potential deviations from the Planck constraints on ΛCDM cosmology we repeat the same analysis with the assumption that the time dependent dark energy is characterized by the equation-of-state parameter w(a) = w 0 + (1 − a)w a . We find that again our constraint is consistent with the Planck cosmology.
The HSC SSP survey will be completed by the end of 2021, which will cover ∼1,400 deg 2 of the sky and provide the GOLDRUSH sample three times as large as the one used in this paper. The Advanced Atacama Cosmology Telescope Polarimeter (AdvACT) recently released the temperature and polarization maps much deeper than Planck, and will increase sensitivity in coming years [46,47]. Combining these data sets, we can significantly improve the signal-to-noise ratios of the lensing and clustering signals. In addition, the beam size of AdvACT is much smaller than Planck, which will improve systematics in CMB lensing measurements at small scales. This will enables to perform better measurements of 1-halo term. Moreover, in the near future, optical and near-infrared (O-NIR) programs of the Rubin Observatory Legacy Survey of Space and Time (LSST) [48] and the Nancy Grace Roman Space Telescope [49] will provide much larger and cleaner LBG samples by their deeper and redder imaging. Spectroscopic observations planned with Roman will enable us to evaluate completeness and purity of the LBG samples. On the CMB side, Simons Observatory [50] and CMB-S4 [51], which will provide much deeper data compared to Ad-vACT while keeping the beam size, are scheduled to be deployed at around the same time as the O-NIR surveys start. The combination of these data sets will allow us to extend precision cosmology studies via the structure growth from high (z ∼ 4 − 6) to low redshift (z < 1) [52]. The structure growth measurement at such a high redshift may enable us to explore a non-standard cosmological model that can resolve the H 0 tension such as an early dark energy model [18]. In fact, Klypin et al. [19] showed that halo clustering at z ∼ 4 (z ∼ 6) can be suppressed by ∼ 20% (∼ 30%) by early dark energy. Thus measurements of the high-redshift large scale structure can be one of the important probes for cosmology beyond the concordance model. This paper is based [in part] on data collected at the Subaru Telescope and retrieved from the HSC data archive system, which is operated by Subaru Telescope and Astronomy Data Center (ADC) at NAOJ. Data analysis was in part carried out with the cooperation of Center for Computational Astrophysics (CfCA), NAOJ.
The Pan-STARRS1 Surveys (PS1) and the PS1 public science archive have been made possible through contributions by the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg, and the Max Planck Institute for Extraterrestrial Physics, Garching, The Johns Hopkins University, Durham University, the University of Edinburgh, the Queen's University Belfast, the Harvard-Smithsonian Center for As- Constraints on σ8 based on 2-point statistics from our CMB lensing and galaxy-galaxy clustering measurements at z ∼ 4 (red circle) along with low-z measurements from galaxy surveys. The blue cross denotes the HSC cosmic shear measurement [53], while the green triangle (purple square) represents the combined analysis of cosmic shear, galaxy-galaxy lensing, and projected (redshift-space) galaxygalaxy clustering conducted with DES (KiDS, 2DFLenS, and BOSS) data [15,54]. The gray band shows constraints from Planck PR3 [45]. bottom: Our constraint on f σ8(z) (red circle) that is obtained by converting the Ωm0 and σ8 constraints into f σ8(z). The constraints at low-z from redshiftspace distortion measurements are indicated with the blue cross, green triangle, purple square, orange diamonds, yellow pentagons, brown hexagon, and magenta octagon that correspond to 6dFGRS [55], 2dFGRS [56], GAMA [57], WiggleZ [58], BOSS [59], VIPERS [60], and FastSound [61], respectively. We thank Mathew Madhavacheril, Shuichiro Yokoyama, and Kiyotomo Ichiki for useful discus-sions. This work was supported in part by World Premier In-ternational Research Center Initiative (WPI Initiative), MEXT, Japan, and JSPS KAKENHI Grant Numbers JP15H02064, JP19H05100, JP20H01932, and JP21H00070.