BICEP2 I: Detection Of B-mode Polarization at Degree Angular Scales

(abridged for arXiv) We report results from the BICEP2 experiment, a cosmic microwave background (CMB) polarimeter specifically designed to search for the signal of inflationary gravitational waves in the B-mode power spectrum around $\ell\sim80$. The telescope comprised a 26 cm aperture all-cold refracting optical system equipped with a focal plane of 512 antenna coupled transition edge sensor 150 GHz bolometers each with temperature sensitivity of $\approx300\mu\mathrm{K}_\mathrm{CMB}\sqrt{s}$. BICEP2 observed from the South Pole for three seasons from 2010 to 2012. A low-foreground region of sky with an effective area of 380 square deg was observed to a depth of 87 nK deg in Stokes $Q$ and $U$. We find an excess of $B$-mode power over the base lensed-LCDM expectation in the range $30<\ell<150$, inconsistent with the null hypothesis at a significance of $>5\sigma$. Through jackknife tests and simulations we show that systematic contamination is much smaller than the observed excess. We also examine a number of available models of polarized dust emission and find that at their default parameter values they predict power $\sim(5-10)\times$ smaller than the observed excess signal. However, these models are not sufficiently constrained to exclude the possibility of dust emission bright enough to explain the entire excess signal. Cross correlating BICEP2 against 100 GHz maps from the BICEP1 experiment, the excess signal is confirmed and its spectral index is found to be consistent with that of the CMB, disfavoring dust at $1.7\sigma$. The observed $B$-mode power spectrum is well fit by a lensed-LCDM + tensor theoretical model with tensor-to-scalar ratio $r=0.20^{+0.07}_{-0.05}$, with $r=0$ disfavored at $7.0\sigma$. Accounting for the contribution of foreground dust will shift this value downward by an amount which will be better constrained with upcoming data sets.

The discovery of the Cosmic Microwave Background (CMB) by Penzias & Wilson (1965) confirmed the hot big bang paradigm and established the CMB as a central tool for the study of cosmology.In recent years, observations of its temperature anisotropies have helped establish and refine the "standard" cosmological model now known as ΛCDM, under which our universe is understood to be spatially flat, dominated by cold dark matter, and with a cosmological constant (Λ) driving accelerated expansion at late times.CMB temperature measurements have now reached remarkable precision over angular scales ranging from the whole sky to arcminute resolution, producing results in striking concordance with predictions of ΛCDM and constraining its key parameters to sub-percent precision (e.g.Bennett et al. 2013;Hinshaw et al. 2013;Story et al. 2013;Hou et al. 2014;Sievers et al. 2013;Das et al. 2013;Planck Collaboration XV 2013;Planck Collaboration XVI 2013).
Inflationary cosmology extends the standard model by postulating an early period of nearly exponential expansion which sets the initial conditions for the subsequent hot big bang.It was proposed and developed in the early 1980s to resolve mysteries for which the standard model offered no solution, including the flatness, horizon, smoothness, entropy, and monopole problems (Brout et al. 1978;Starobinsky 1980;Kazanas 1980;Sato 1981;Guth 1981;Linde 1982Linde , 1983;;Albrecht & Steinhardt 1982; see Planck Collaboration XXII 2013 for a review).Inflation also explains the universe's primordial perturbations as originating in quantum fluctuations stretched by this exponential expansion (Mukhanov & Chibisov 1981;Hawking 1982;Guth & Pi 1982;Starobinsky 1982;Bardeen et al. 1983;Mukhanov 1985), and thus to be correlated on superhorizon scales.The simplest models further predict these perturbations to be highly adiabatic and Gaussian and nearly scale-invariant (though typically slightly stronger on larger scales).These qualities, though not necessarily unique to the inflationary paradigm, have all been confirmed by subsequent observations (e.g., Spergel & Zaldarriaga 1997;Peiris et al. 2003, and references above).Although highly successful, the inflationary paradigm represents a vast extrapolation from well-tested regimes in physics.It invokes quantum effects in highly curved spacetime at energies near 10 16 GeV and timescales less than 10 −32 s.A definitive test of this paradigm would be of fundamental importance.
Gravitational waves generated by inflation have the potential to provide such a definitive test.Inflation predicts that the quantization of the gravitational field coupled to exponential expansion produces a primordial background of stochastic gravitational waves with a characteristic spectral shape (Grishchuk 1975;Starobinsky 1979;Rubakov et al. 1982;Fabbri & Pollock 1983;Abbott & Wise 1984; also see Krauss & Wilczek 2013).Though unlikely to be directly detectable in modern instruments, these gravitational waves would have imprinted a unique signature upon the CMB.Gravitational waves induce local quadrupole anisotropies in the radiation field within the last-scattering surface, inducing polarization in the scattered light (Polnarev 1985).This polarization pattern will include a "curl" or B-mode component at degree angular scales that cannot be generated primordially by density perturbations.The amplitude of this signal depends upon the tensor-to-scalar ratio, r, which itself is a function of the energy scale of inflation.The detection of B-mode polarization of the CMB at large angular scales would provide a unique confirmation of inflation and a probe of its energy scale (Seljak 1997;Kamionkowski et al. 1997;Seljak & Zaldarriaga 1997).
Gravitational lensing of the CMB's light by large scale structure at relatively late times produces small deflections of the primordial pattern, converting a small portion of E-mode power into B-modes.The lensing B-mode spectrum is similar to a smoothed version of the E-mode spectrum but a factor ∼ 100 lower in power, and hence also rises toward sub-degree scales and peaks around ≈ 1000.The inflationary gravitational wave (IGW) B-mode, however, is predicted to peak at multipole ≈ 80 and this creates an opportunity to search for it around this scale where it is quite distinct from the lensing effect. 18 large number of current CMB experimental efforts now target B-mode polarization.Evidence for lensing B-mode polarization at sub-degree scales has already been detected by two experiments in the past year, first by the SPT telescope (Hanson et al. 2013) and more recently by POLAR-BEAR (POLARBEAR Collaboration et al. 2013a,b, 2014).The search for inflationary B-modes at larger scales will also be a goal of these experiments, as well as other ongoing experimental efforts in the US that include the ABS (Hileman et al. 2009), ACTPOL (Niemack et al. 2010), and CLASS (Eimer et al. 2012) ground-based telescopes and the EBEX (Reichborn-Kjennerud et al. 2010), SPIDER (Fraisse et al. 2013), and PIPER (Kogut et al. 2012) balloon experiments, each employing a variety of complementary strategies.It is also a major science goal of the ESA Planck satellite mission.
The BICEP/Keck Array series of experiments have been specifically designed to search for primordial B-mode polarization on degree angular scales by making very deep maps of relatively small patches of sky from the South Pole.The BICEP1 instrument initiated this series (Keating et al. 2003), observing from 2006 to 2008.Its initial results were described in Takahashi et al. (2010) and Chiang et al. (2010) (hereafter T10 and C10), and final results were recently reported in Barkats et al. (2014) (hereafter B14) yielding a limit of r < 0.70 at 95% confidence.
In this paper we report results from BICEP2 -a successor to BICEP1 which differed principally in the focal plane where a very large increase in the detector count resulted in more than an order of magnitude improvement in mapping speed.The observation field and strategy were largely unchanged, as were the telescope mount, observation site etc.Using all three seasons of data taken with BICEP2 (2010-2012) we detect Bmode power in the multipole range 30 < < 150, finding this power to have a strong excess inconsistent with lensed ΛCDM at > 5σ significance.
The structure of this paper is as follows.In §2 and §3 we briefly review the BICEP2 instrument, observations and low-level data reduction deferring details to a companion paper BICEP2 Collaboration II (2014) (hereafter the Instrument Paper).In §4 we describe our map making procedure and present signal and signal-differenced "jackknife" T , Q and U maps which have unprecedented sensitivity.This section introduces "deprojection" of modes potentially contaminated through beam systematics which is an important new technique.In §5 we describe our detailed timestream-level simulations of signal and pseudo-simulations of noise.In §6 we describe calculation of the power spectra, including matrixbased B-mode purification.In §7 we present the signal and jackknife power spectrum results for T E, EE, BB, T B and EB.In §8 we discuss and summarize the many studies we have conducted probing for actual and potential sources of systematic contamination, and argue that residual contamination is much smaller than the detected B-mode signal.Full details are deferred to a companion paper BICEP2 Collaboration III (2014) (hereafter the Systematics Paper).In §9 we investigate foreground projections based on external data and conclude that it is implausible that the B-mode signal which we see is dominated by dust, synchrotron or any other known foreground source.In §10 we take cross spectra of the BI-CEP2 maps with those from BICEP1 (as presented in B14) and find that the spectral signature of the signal is consistent with the CMB.Finally in §11 we calculate some simple, largely phenomenological, parameter constraints, and conclude in §12.

THE BICEP2 INSTRUMENT
BICEP2 was similar to BICEP1 (see T10) reusing the same telescope mount and installation at the South Pole.Like BI-CEP1 the optical system was a simple 26 cm aperture allcold refractor housed entirely in a liquid helium cooled cryostat.The main differences from BICEP1 were the use of a focal plane array of planar antenna-coupled devices (Bock 2003) with voltage-biased transition-edge sensor (TES) detectors (Irwin 1995) and a multiplexed superconducting quantum interference device (SQUID) readout.BICEP2 observed at 150 GHz only.A very brief review of the instrument follows -for more details please refer to the Instrument Paper.

Optics
The optics were adapted from the original BICEP1 design (Keating et al. 2003).Light entered the cryostat through a polypropylene foam window, passed through PTFE filters cooled to 100 K and 40 K, and then through polyethylene objective and eyepiece lenses cooled to 4 K.A 26.4 cm diameter aperture stop was placed at the objective lens and an additional nylon filter was placed on the sky side of the eyepiece lens.All the lenses and filters were antireflection coated and the interior of the optics tube was lined with microwave absorber.The optics were designed to be telecentric (flat focal plane) and the resulting beams had full-width-half-max of ≈ 0.5 • .An absorptive forebaffle was mounted on the front of the telescope which was designed to prevent radiation from boresight angles greater than ∼ 20 • entering the telescope.The telescope was located inside a large stationary reflective groundscreen.

Focal Plane
The BICEP2 focal plane employed monolithic arrays of antenna coupled TES detectors designed and fabricated at Caltech/JPL.Each pixel was composed of two interleaved 12×12 arrays of orthogonal slot antennas feeding beam-forming (phased-array) summing trees.The output of each summing tree was a microstrip which passed through a band-defining filter and deposited its power on a thermally isolated island.Changes in the power incident on this island were detected using a transition edge sensor (TES).There was an 8×8 array of pixels on each tile, and four such tiles were combined to form the complete focal plane unit.There were thus, in principle, 256 dual-polarization pixels in the focal plane for a total of 512 detectors, each with temperature sensitivity of ≈ 300 µK CMB √ s. (Six pixels were deliberately disconnected between antenna and TES sensor to provide diagnostic "dark" channels.)The focal plane was cooled to 270 mK by a closed cycle three-stage sorption refrigerator.

Detector Readout and DAQ System
The TES detectors were read out through time-division SQUID multiplexing chips provided by NIST.A single readout channel was connected in rapid succession to 32 detectors, reducing wiring and heat load requirements.These SQUID systems were biased and read out by a Multi-Channel Electronics (MCE) crate external to the cryostat (provided by UBC).The housekeeping and readout electronics were connected to a set of Linux-based computers running a control system called gcp, originally developed for the CBI experiment and developed and reused by many experiments since.

Telescope Mount
The receiver cryostat was mounted on a three-axis mount able to move in azimuth and elevation and to rotate the entire telescope about its boresight.Hereafter we refer to the lineof-sight rotation angle as the "deck" angle.The window of the telescope looked out through an opening in a flexible environmental seal such that the cryostat, mount and electronics were all located in a room temperature laboratory environment.

OBSERVATIONS AND LOW LEVEL DATA REDUCTION
3.1.Observations BICEP2 observed on a three day schedule locked to sidereal time.As in BICEP1, the basic unit of observation was a ≈ 50 minute "scanset" during which the telescope scanned back-and-forth 53 times at 2.8 • s −1 in azimuth in a smooth turnaround triangle wave pattern, with a throw of ≈ 60 • , at fixed elevation.We refer to each of the 106 motions across the field (either left-or right-going) as a "half-scan".We do not use the turnaround portions of the scans in this analysis.
BICEP2 observed the same CMB field as BICEP1 -a low foreground region centered at RA 0h, Dec. −57.5 • .At the South Pole, the elevation angle is simply the negative of declination and azimuth maps to RA shifting by 15 • per hour.The scan speed on the sky was thus ≈ 1.5 • s −1 mapping multipole = 100 into the timestream at ≈ 0.4 Hz.At the end of each scanset the elevation was stepped by 0.25 • and the azimuth angle updated to recenter on RA 0h.The scans thus "slide" with respect to the sky during each scanset by ≈ 12.5 • allowing us to subtract a scan fixed "template" from the timestream while leaving degree-scale sky structure only slightly attenuated (see Sections 4.1 and 6.3).
A total of 21 elevation offsets were used between Dec. of −55 • and −60 • .Note that since the field of view of the focal plane -∼ 20 • -is much larger in Dec., and somewhat larger in RA, than the region scanned by the boresight the final coadded map is naturally apodized.After a complete threeday schedule the instrument was rotated to a new deck angle, the refrigerator was re-cycled, and the process repeated.See the Instrument Paper for more details of the observation strategy.
The control system ran CMB observation schedules relentlessly between early 2010 and late 2012 collecting over 17,000 scansets of data (≈ 590 days).(There were some breaks for beam mapping and other calibrations during the austral summers.)The raw data were transferred off-site daily via satellite, allowing rigorous quality monitoring and ongoing analysis development.The analysis presented in this paper uses all of the CMB data taken by BICEP2.

Analysis Pipeline
The analysis pipeline used in this paper is written in the MATLAB language and was originally developed for the QUAD experiment (Pryke et al. 2009).It was then adapted to BICEP1 data and became the secondary, and then primary, analysis pipeline for the C10 and B14 papers, respectively.For BICEP2 it has seen substantial further development including the addition of a sophisticated automatic data selection framework, full deprojection of beam systematics, and a map-based B-mode purification operation; these enhancements are detailed below.

Transfer Function Correction and Deglitching
Starting from the raw timestreams, the first step of the pipeline is to deconvolve the temporal response of the instrument.The TES detectors themselves have a very fast and uniform response at all frequencies of interest.To correct for the effect of the digital low-pass filtering, which was applied to the data before it was downsampled for recording to disk, we apply an FIR deconvolution operation in the time domain (which also re-applies a zero-delay low pass filter).Glitches and flux jumps in the SQUID readout are also corrected and/or flagged at this point -they are relatively rare in these data.See the Instrument Paper for more details.

Relative Gain Calibration
At the beginning and end of each scanset an elevation nod or "elnod" was performed.The telescope was moved up/down/up or vice versa in a roughly sinusoidal excursion in time, injecting a signal proportional to the atmospheric opacity gradient into the detector timestreams.In analysis, each elnod is regressed against the airmass profile through which it was looking to derive a relative gain coefficient in SQUID feedback units per airmass.The timestream for each scanset is then divided by its own elnod coefficient and multiplied by the median over all good detectors.This roughly equalizes the gain of each channel and results in considerable cancellation of atmospheric fluctuations when taking the difference of detector pairs, thus making the data considerably easier to work with.The relative gain as determined using the atmospheric gradient is not necessarily the relative gain which minimizes leakage of CMB temperature anisotropy to polarization -see §8.Absolute calibration is deferred until after the final coadded map is made -see §4.8.

First Round Data Cuts
At this point in the data reduction individual channels are cut at per half-scan granularity.Reasons for removal include glitches and flux jumps in the channel in question, or its multiplex neighbors, and synchronization problems in the DAQ system.BICEP2 data are very well behaved and over 90% of the data pass this stage.

Timestream Filtering
In the next step the sum and difference of each detector pair is taken, the pair-sum being ultimately used to form maps of temperature anisotropy, and the pair-difference to measure polarization.Each half-scan is then subjected to a third order polynomial filtering.
Each half-scan is constrained to have the same number of time samples.In addition to the polynomial filtering we also perform a "template" subtraction of any scan-synchronous component by averaging together the corresponding points over a scanset and removing the result from each half-scan.Forward and backward half-scans are treated separately.
Within our simulation-based analysis framework we are free to perform any arbitrary filtering of the data which we choose.Although any given filtering implies some loss of sensitivity due to the removal and mixing of modes within the map these effects are corrected as described in §6.2 and §6.3.We defer discussion of the particular filtering choices made in this analysis to §8.

Pointing Reconstruction
The pointing trajectory of the telescope boresight (i.e. the line of sight axis of rotation of the mount) is determined using a mount pointing model calibrated using a star camera as described in the Instrument Paper.To convert timestream into maps it is then necessary to know the pointing offset of each detector from this direction.To measure these we first make per channel maps assuming approximate offsets, and then regress these against the WMAP5 temperature map to determine corrections.Comparing maps made from left-going and right-going scans at each of the four deck angles, we estimate that this procedure is accurate to better than 0.05 • absolute pointing uncertainty.The beam positions relative to the boresight are averaged over the scan directions and deck angles to produce a single reconstruction for each detector used in mapmaking.

Construction of Deprojection Timestream
The two halves of each detector pair would ideally have identical angular response patterns (beams) on the sky.If this is not the case then leakage of temperature anisotropy (pair-sum) to polarization (pair-difference) will occur (Shimon et al. 2008).One can re-sample an external map of the temperature sky and its derivatives to generate templates of the leakage resulting from specific differential beam effects.In this analysis we smooth the Planck 143 GHz map 19 using the average measured beam function and resample following the procedure described in Sheehy (2013) and the Systematics Paper.Our standard procedure is to calculate templates for the six modes which correspond to differences of elliptical Gaussian beams.In practice we do not normally use all six -see §4.6 and §8.

Binning into Pairmaps
At this point we bin the pair-sum and pair-difference signals into per-scanset, per-pair RA/Dec.pixel grids which we refer to as "pairmaps".The pixels are 0.25 • square at declination −57.5 • .The data from each scanset are weighted by their inverse variance over the complete scanset (with separate weights for pair-sum and pair-difference).We note that while the pair-sum weights vary widely due to variation in atmospheric 1/ f noise, the pair-difference weights are extremely stable over time -i.e.atmospheric fluctuations are empirically shown to be highly unpolarized.For pair-difference a number of products of the timestream and the sine and cosine of the polarization angle are recorded to allow construction of Q and U maps as described in §4.7 below.The deprojection templates are also binned into pairmaps in parallel with the pair-difference data.
We use per-pair detector polarization angles derived from a dielectric sheet calibrator (as described in the Instrument Paper) 20 .The measured polarization efficiency of our detectors is very high (≈ 99%, see the Instrument Paper) -we perform a small correction to convert temperature-based gains to polarization gains.4.5.Second Round Data Cuts http://irsa.ipac.caltech.edu/data/Planck/release_1/all-sky-maps/previews/HFI_SkyMap_143_ 2048_R1.10_nominal/index.html 20These derived angles are within 0.2 • rms of their design values, well within the required accuracy.However note that we later apply an overall rotation to minimize the high T B and EB spectra -see §8.2.
The per-scanset, per-pair maps are recorded on disk to allow rapid recalculation of the coadded map while varying the socalled "second round" cut parameters.These include a variety of cuts on the bracketing elnods, including goodness-of-fit to the atmospheric cosecant model and stability in both absolute and pair-relative senses.We also make some cuts based on the behavior of the data itself including tests for skewness and noise stationarity.Many of these cuts identify periods of exceptionally bad weather and are redundant with one another.We also apply "channel cuts" to remove a small fraction of pairs -principally those with anomolous measured differential beam shapes.In general BICEP2 data are very wellbehaved and the final fraction of data retained is 63%.See the Instrument Paper for more details.
4.6.Accumulation of Pairmaps to Phase, and Template Regression Once the second round cuts have been made we accumulate the pairmaps over each set of ten elevation steps (hereafter referred to as a "phase").The deprojection templates are also accumulated.We then regress some of these binned templates against the data -i.e.we effectively find the best fit value for each non-ideality, for each pair, within each phase.The templates scaled by the regression coefficients are then subtracted from the data, entirely removing that imperfection mode if present.This operation also filters real signal and noise due to chance correlation (and real T E induced correlation in the case of signal).This filtering is effectively just additional timestream filtering like that already mentioned in §4.1 and we calibrate and correct for its effect in the same way (see §6.2 and §6.3).
The choice of deprojection timescale is a compromisereducing it guards against systematic modes which vary over short timescales (as relative gain errors might), while covering more sky before regressing reduces the filtering of real signal (the coefficient is fit to a greater number of pixels).In practice reduction of the filtering going from ten elevation steps to twenty is found to be modest and for this analysis we have deprojected modes on a per-phase basis.
We also have the option to fix the coefficients of any given mode at externally measured values, corresponding to a subtraction of the systematic with no additional filtering of signal.In this analysis we have deprojected differential gain and pointing, and have subtracted the effects of differential ellipticity -we defer discussion of these particular choices to §8.

Accumulation over Phases and Pairs
We next proceed to coadded maps accumulating over phases and pairs.Full coadds are produced as well as many "jackknife" splits -pairs of maps made from two subsets of the data which might be hypothesized to contain different systematic contamination.Some splits are strictly temporal (e.g.first half vs. second half of the observations), some are strictly pair selections (e.g.inner vs. outer part of each detector tile), and some are both temporal and pair-wise (e.g. the so-called tile/deck jackknife) -see §7.3 for details.
Once the accumulation over all 590 days and ≈ 200 detector pairs is done the accumulated quantities must be converted to T , Q and U maps.For T this is as simple as dividing by the sum of the weights.For Q and U we must perform a simple 2 × 2 matrix inversion for each pixel.This matrix is singular if a given pixel has been observed at only a single value of the deck angle modulo 90 • .In general for BICEP2 data we have angles 68 • , 113 • , 248 • and 293 • as measured relative to the celestial meridian.
We perform absolute calibration by taking the cross spectrum of the T map with either the Planck 143 GHz map or the WMAP9 W-band T map as described in the Instrument Paper.We adopt an absolute gain value intermediate to these two measurements and assign calibration uncertainty of 1.3% in the map to account for the difference.

Maps
Figure 1 shows the BICEP2 T , Q and U signal maps along with a sample set of difference (jackknife) maps.The "vertical-stripe-Q, diagonal-stripe-U" pattern indicative of an E-mode dominated sky is visible.Note that these maps are filtered by the relatively large beam of BICEP2 (≈ 0.5 • FWHM).Comparison of the signal and jackknife maps shows that the former are signal dominated -they are the deepest maps of CMB polarization ever made at degree angular scales with an rms noise level of 87 nK in (nominal) 1 • × 1 • pixels.

Signal Simulations
As is common practice in this type of analysis we account for the filtering which our instrument and data reduction impose on the underlying sky pattern through simulations (Hivon et al. 2002).Starting with input T , Q, and U sky maps we smooth using the average measured beam function and then re-sample along the pointing trajectory of each detector at each timestream sample.We have the option of perturbing to per-channel elliptical Gaussian beam shapes using the derivatives of the map (in a similar manner to the construction of the deprojection templates described in §4.3 above).However for our standard simulations we include only differential pointing as this is our leading order beam imperfection (see §8).
We perform three sets of signal-only simulations: i) simulations generated from unlensed ΛCDM input spectra (hereafter "unlensed-ΛCDM"), ii) simulations generated from those same input skies, explicitly lensed in map space as described below (hereafter "lensed-ΛCDM"), and iii) simulations containing only tensor B-modes with r = 0.2 (and n t = 0).

Constrained input maps
The observing matrix and purification operator described in §6.2 are constructed for a specific assumed T sky map.Since its construction is computationally very expensive it is preferable to constrain the input T skies used for the simulations to be the same rather than to recalculate the operator for each simulation.
To construct constrained Q and U sky maps which respect the known ΛCDM T E correlation we start from a map of the well measured temperature anisotropy, specifically the Planck Needlet Internal Linear Combination (NILC) T map21 .We calculate the a T m using the synfast software from the Healpix22 package (Górski et al. 2005), and then calculate sets of a E m using where the C 's are ΛCDM spectra from CAMB23 with cosmological parameters taken from Planck Collaboration XVI (2013), and the n m are normally distributed complex random numbers.For C T T we use a lensed-ΛCDM spectrum since the a T m from Planck NILC inherently contain lensing.We have found the noise level in the Planck NILC maps for our region of observation and multipole range to be low enough that it can be ignored.
Using the a E m we generate Nside=2048 maps using synfast.We substitute in the a T m from Planck 143 GHz so that the T map more closely resembles the T sky we expect to see with BICEP2.(This is also the map that is used in §4.6 to construct deprojection templates.)Additionally, we add in noise to the T map at the level predicted by the noise covariance in the Planck 143 GHz map, which allows us to simulate any deprojection residual due to noise in the Planck 143 GHz map.

Lensing of input maps
Lensing is added to the unlensed-ΛCDM maps using the LensPix24 software (Lewis 2011).We use this software to generate lensed versions of the constrained CMB input a m 's described in §5.1.1.Input to the lensing operation are deflection angle spectra that are generated with CAMB as part of the standard computation of ΛCDM spectra.The lensing operation is performed before the beam smoothing is applied to form the final map products.We do not apply lensing to the 143 GHz temperature a T m from Planck since these inherently contain lensing.Our simulations hence approximate lensed CMB maps ignoring the lensing correlation between T and E.

Noise Pseudo Simulations
The previous BICEP1 and QUAD pipelines used a Fourier based procedure to make simulated noise timestreams, maintaining correlations between all channels (Pryke et al. 2009).For the increased channel count in BICEP2 this is computationally very expensive, so we have switched to an alternate procedure adapted from van Engelen et al. (2012).We perform additional coadds of the real pairmaps randomly flipping the sign of each scanset.The sign-flip sequences are constructed such that the total weight of positively and negatively weighted maps is equal.We have checked this technique against the older technique, and against another technique which constructs map noise covariance matrices, and have found them all to be equivalent to the relevant level of accuracy.By default we use the sign flipping technique and refer to these realizations as "noise pseudo simulations." We add the noise maps to the lensed-ΛCDM realizations to form signal plus noise simulations -hereafter referred to as lensed-ΛCDM+noise.
6. FROM MAPS TO POWER SPECTRA 6.1.Inversion to Spectra The most basic power spectrum estimation procedure which one can employ is to apply an apodization windowing, Fourier transform, construct E and B from Q and U, square, and take the means in annuli as estimates of the CMB bandpowers.A good choice for the window may be the inverse of the noise variance map (or a smoothed version thereof).Employing this simple procedure on the unlensed-ΛCDM simulations we find an unacceptable degree of E to B mixing.While such mixing can be corrected for in the mean using simulations, its fluctuation leads to a significant loss of sensitivity.
There are several things which can cause E to B mixing: i) the "sky-cut" implied by the apodization window (the transformation from Q and U to E and B is non-local so some of the modes around the edge of the map are ambiguous), ii) the timestream (and therefore map) filtering which we have imposed in §4.1 and §4.6, and iii) the simple RA/Dec.map projection which we have chosen.
To correct for sky-cut induced mixing improved estimators have been suggested.We first tried implementing the estimator suggested by Smith (2006) which takes Fourier transforms of products of the map with various derivatives of the apodization window.However, testing on the unlensed-ΛCDM simulations revealed only a modest improvement in performance since this estimator does not correct mixing caused by filtering of the map.

Matrix Based Map Purification
To overcome the E to B mixing described in the previous subsection we have introduced an additional purification step after the Q and U maps are formed.This step has to be performed in pixel space where the filtering takes place.In parallel with the construction of the pairmaps and their accumulation we construct pixel-pixel matrices which track how every true sky pixel maps into the pixels of our final coadded map due to the various filtering operations.We take "true sky pixel maps" to be Nside=512 Healpix maps, whose pixel size (∼ 0.1 • on a side) is smaller than our observed map pixels (0.25 • ).The act of simulating our various filtering operations becomes a simple matrix multiplication: where m is a vector consisting of [Q,U] values for each Healpix pixel and m is a [Q,U] vector as observed by BI-CEP2 in the absence of noise.Next, we "observe" an Nside=512 Healpix theoretical covariance matrix 25 , C, with R: We form C for both E-mode and B-mode covariances.These matrices provide the pixel-pixel covariance for Emodes and B-modes in the same observed space as the real data.However, the matrix R has made the two spaces nonorthogonal and introduced ambiguous modes, i.e. modes in the observed space which are superpositions of either Emodes or B-modes on the sky.
To isolate the pure B-modes we adapt the method described in Bunn et al. (2003).We solve a generalized eigenvalue 25 Constructed following Appendix A of Tegmark & de Oliveira-Costa (2001). problem: where b is a [Q,U] eigenmode and σ 2 is a small number introduced to regularize the problem.By selecting modes corresponding to the largest eigenvalues λ b 1, we can find the B-mode subspace of the observed sky which is orthogonal to E-modes and ambiguous modes.The covariance matrices are calculated using steeply reddened input spectra (∼ 1/ 2 ) so that the eigenmodes are separated in angular scale, making it easy to select modes up to a cutoff set by the instrument resolution.
The matrix purification operator is a sum of outer products of the selected eigenmodes; it projects an input map onto this space of pure B-modes: It can be applied to any simulated map vector ( m) and returns a purified vector which contains only signal coming from Bmodes on the true sky: This method is superior to the other methods discussed above because it removes the E-to-B leakage resulting from the filterings and the sky-cut, because R contains all of these steps.After the purification, in the present analysis we use the simple power spectrum estimation described in the previous subsection, although in the future we may switch to a fully matrix-based approach.
Testing this operator on the standard unlensed-ΛCDM simulations (which are constructed entirely independently) we empirically determine that it is extremely effective, with residual false B-modes corresponding to r < 10 −4 .Testing the operator on the r = 0.2 simulations we find that it produces only a very modest increase in the sample variance -i.e. the fraction of mixed (ambiguous) modes is found to be small.

Noise Subtraction and Filter/Beam Correction
As is standard procedure in the MASTER technique (Hivon et al. 2002), we noise debias the spectra by subtracting the mean of the noise realizations (see §5.2).The noise in our maps is so low that this is a relevant correction only for BB, although we do it for all spectra.
To determine the response of each observed bandpower to each multipole on the sky we run special simulations with delta function spectra input to synfast multiplied by the average measured beam function.Taking the mean over many realizations 26 we determine the "bandpower window functions" (BPWF) (Knox 1999).The integral of these functions is the factor by which each bandpower has been suppressed by the instrument beam and all filterings (including the matrix purification).We therefore divide by these factors and renormalize the BPWF to unit sum.This is a variant on the standard MASTER technique.(We choose to plot the bandpower values at the weighted mean of the corresponding BPWF instead of at the nominal band center, although this has a significant effect only on the lowest one.) One point worth emphasizing is that when comparing the real data to our simulations (or jackknife differences thereof) the noise subtraction and filter/beam corrections have no effect since they are applied equally to the real data and simulations.The BPWFs are required to compare the final bandpowers to an arbitrary external theoretical model and are provided with the data release.
The same average measured beam function is used in the signal simulations and in the BPWF calculation.In as much as this function does not reflect reality the real bandpowers will be under or over corrected at high .We estimate the beam function uncertainty to be equivalent to a 1.1% width error on a 31 arcminute FWHM Gaussian.

Power Spectra
Following the convention of C10 and B14 we report nine bandpowers, each ≈ 35 multipoles wide and spanning the range 20 < < 340. Figure 2 shows the BICEP2 power spectra 27 .With the exception of BB all spectra are consistent with their lensed-ΛCDM expectation values -the probability to exceed (PTE) the observed value of a simple χ 2 statistic is given on the plot (as evaluated against simulations -see §7.3).BB appears consistent with the lensing expectation at higher , but at lower multipoles there is an excess which is detected with high signal-to-noise.The χ 2 of the data is much too high to allow us to directly evaluate the PTE of the observed value under lensed-ΛCDM using the simulations.We therefore "amplify" the Monte Carlo statistics by resampling bandpower values from distributions fit to the simulated ones.For the full set of nine bandpowers shown in the figure we obtain a PTE of 1.3 × 10 −7 equivalent to a significance of 5.3σ.Restricting to the first five bandpowers ( 200) this changes to 5.2σ.We caution against over interpretation of the two high bandpowers at ≈ 220 -their joint significance is < 3σ (also see Figure 9).
Figure 2 also shows the temporal-split jackknife -the spectrum produced when differencing maps made from the first and second halves of the data.The BB excess is not seen in the jackknife, which rules out misestimation of the noise debias as the cause (the noise debias being equally large in jackknife spectra).

E and B Maps
Once we have the sets of E and B Fourier modes, instead of collapsing within annuli to form power spectra, we can instead reinvert to make apodized E and B maps.In Figure 3 we show such maps prepared using exactly the same Fourier modes as were used to construct the spectra shown in Figure 2 filtering to the range 50 < < 120.In comparison to the simulated maps we see i) BICEP2 has detected B-modes with high signal-to-noise ratio in the map, and ii) this signal appears to be evenly distributed over the field, as is the expectation for a cosmological signal, but generally will not be for a Galactic foreground.

Internal Consistency Tests
We evaluate the consistency of the jackknife spectra with their ΛCDM expectations by using a simple χ 2 statistic, 27 These bandpowers along with all the ancillary data required to use them are available for download at http://bicepkeck.org/bicep2_ 2014_release.
where d is the vector of observed bandpower values, m is the mean of the lensed-ΛCDM+noise simulations, and D is the bandpower covariance matrix as evaluated from those simulations 28 .We also compute χ 2 for each of the simulations 29 and take the probability to exceed (PTE) the observed value versus the simulated distribution.In addition to χ 2 we compute the sum of normalized deviations, where the d i are the observed bandpower values and m i and σ mi are the mean and standard deviation of the lensed-ΛCDM+noise simulations.This statistic tests for sets of bandpowers which are consistently all above or below the expectation.Again we evaluate the PTE of the observed value against the distribution of the simulations.
We evaluate these statistics both for the full set of nine bandpowers (as in C10 and B14), and also for the lower five of these corresponding to the multipole range of greatest interest (20 < < 200).Numerical values are given in Table 1 and the distributions are plotted in Figure 4. Since we have 500 simulations the minimum observable non zero value is 0.002.Most of the T T , T E and T B jackknifes pass, but following C10 and B14 we omit them from formal consideration (and they are not included in the table and figure).The signal-tonoise in T T is ∼ 10 4 so tiny differences in absolute calibration between the data subsets can cause jackknife failure, and the same is true to a lesser extent for T E and T B.Even in EE the signal-to-noise is approaching ∼ 10 3 (500 in the ≈ 110 bin) and careful examination of the table shows marginal values for the tile jackknife in this spectrum.However with a maximum signal-to-noise ratio of 10 in BB such calibration differences are not a concern and all the BB (and EB) jackknifes are seen to pass.
To form the jackknife spectra we difference the maps made from the two halves of the data split, divide by two, and take the power spectrum.This holds the power spectrum amplitude of a contribution which is uncorrelated in the two halves (such as noise) constant, while a fully correlated component (such as sky signal) cancels perfectly.The amplitude of a component which appears only in one half will stay the same under this operation as it is in the fully coadded map and the apparent signal-to-noise will also stay the same.For a noise-dominated experiment this means that jackknife tests can only limit potential contamination to a level comparable to the noise uncertainty.However the BB bandpowers shown in Figure 2 have signal-to-noise as high as 10.This means that jackknife tests are extremely powerful in our case -the reductions in power which occur in the jackknife spectra are empirical proof that the B-mode pattern on the sky is highly correlated between all data splits considered.
28 Due to differences in sky coverage between the two halves of a jacksplit, in conjunction with filtering, the expectation value of some of the jackknifes is not quite zero -hence we always evaluate χ 2 versus the mean of the simulations.Because the BPWF overlap slightly adjacent bandpowers are 10% correlated.We zero all but the main and first off-diagonal elements of D as the other elements are not measured above noise given the limited simulation statistics. 29Recomputing D each time, excluding that simulation.    1.These distributions are consistent with uniform.
We have therefore conducted an unusually large number of jackknife tests trying to imagine data splits which might conceivably contain differing contamination.Here we briefly describe each of these: BICEP2 observed at deck angles of 68 • , 113 • , 248 • and 293 • .We can split these in two ways without losing the ability to make Q and U maps (see §4.7).The deck jack is defined as 68 • and 113 • vs. 248 • and 293 • while the alt.deck jack is 68 • and 293 • vs. 113 • and 248 • .Uniform differential pointing averages down in a coaddition of data including an equal mix of 180 • complement angles, but it will be amplified in either of these jackknifes (as we see in our simulations).The fact that we are passing these jackknifes indicates that residual beam systematics of this type are subdominant after deprojection.
The temporal-split simply divides the data into two equal weight parts sequentially.Similarly, but at the opposite end of the time scale range, we have the scan direction jackknife, which differences maps made from the right and left going half-scans, and is sensitive to errors in the detector transfer function.
The azimuth jackknife differences data taken over different ranges of telescope azimuth angle -i.e. with different potential contamination from fixed structures or emitters on the ground.A related category is the moon jackknife, which differences data taken when the moon is above and below the horizon.
A series of jackknifes tests if the signal originates in some subset of the detector pairs.The tile jackknife tests tiles 1 and 3 vs. 2 and 4 (this combination being necessary to get reasonable coverage in the Q and U maps).Similarly the tile inner/outer and tile top/bottom jackknifes are straightforward.The focal plane inner/outer does as stated for the entire focal plane and is a potentially powerful test for imperfections which increase radially.The mux row and mux column jackknifes test for systematics originating in the readout system.
The tile/deck jackknife tests for a possible effect coming from always observing a given area of sky with detectors the "same way up", although due to the small range of the elevation steps it is limited to a small sky area.
Finally we have performed one test based on beam nonideality as observed in external beam map runs.The A/B offset best/worst jackknife differences the best and worst halves of the detector pairs as selected by that metric.
See the Systematics Paper for a full description of the jackknife studies.

SYSTEMATIC UNCERTAINTIES
Within the simulation-calibrated analysis framework described above we are free to perform any arbitrary filtering of the data which may be necessary to render the results insensitive to particular systematics.However, as such mode removal increases the uncertainty of the final bandpowers, we clearly wish to filter only systematics which might induce false Bmode at relevant levels.Moreover it may not be computationally feasible to construct simple timestream templates for some potential systematics.Therefore once we have made our selection as to which filterings to perform we must then estimate the residual contamination and either subtract it or show it to be negligible.
To guide our selection of mode removal we have two main considerations.First, we can examine jackknifes of the type described in §7.3 above -reduction in failures with increasing mode removal may imply that a real systematic effect is present.Second, and as we will see below more powerfully, we can examine external calibration data (principally beam maps) to directly calculate the false B-mode expected from specific effects.

Simulations Using Observed Per Channel Beam Shapes
As described in the Instrument Paper we have made extremely high signal-to-noise in situ measurements of the farfield beam shape of each channel.Fitting these beams to elliptical Gaussians we obtain differential parameters that correlate well with the mean value of the deprojection coefficients from §4.6.One may then ask whether it would be better to subtract rather than deproject.In general it is more conservative to deproject as this i) allows for the possibility that the coefficients are changing with time, and ii) is guaranteed to completely eliminate the effect in the mean, rather than leaving a residual bias due to noise on the subtraction coefficients.
We use the per-channel beam maps as inputs to special Tonly input simulations and measure the level of T to B mixing while varying the set of beam-modes being deprojected.The beam maps do not provide a good estimate of differential gain so we substitute estimates which come from a perchannel variant of the absolute calibration procedure mentioned in §4.7 above.The left panel of Figure 5 shows B-mode power spectra from these simulations under the following deprojections i) none, ii) differential pointing only, iii) differential pointing and differential gain, iv) differential pointing, differential gain and differential beam width, and v) differential pointing, differential gain and differential ellipticity.
We see that differential pointing has the largest effect and so to be conservative we choose to deproject it.Differential gain is also seen to be a significant effect and we again deproject it -we lack independent subtraction coefficients, and it might plausibly be time variable.Differential beam width is a negligible effect and we do not deproject it.Differential ellipticity is also a small effect.We find in the simulations that deprojection of differential ellipticity interacts with real T E correlation in a complex manner slightly distorting the T E spectrum.We therefore choose to subtract this effect by fixing the coefficients to their beam map derived values in §4.6.Whether differential ellipticity is deprojected or subtracted makes no significant difference to any of the spectra other than T E. Finally we make a small correction for the undeprojected residual by subtracting the final curve in the left panel of Figure 5 from the results presented in §7.(The correction is equivalent to r = 0.001.)We also increase the bandpower fluctuation to reflect the post-correction upper limit on extended beam mismatch shown in the right panel of Figure 5. See the Systematics Paper for details.

Overall Polarization Rotation
Once differential ellipticity has been corrected we notice that an excess of T B and EB power remains at > 200 versus the ΛCDM expectation.The spectral form of this power is consistent with an overall rotation of the polarization angle of the experiment.While the detector-to-detector relative angles have been measured to differ from the design values by < 0.2 • we currently do not have an accurate external measurement of the overall polarization angle.We therefore apply a rotation of ∼ 1 • to the final Q/U maps to minimize the T B and EB power (Keating et al. 2013;Kaufman et al. 2013).We emphasize that this has a negligible effect on the BB bandpowers at < 200.

Other Possible Systematics
Many other systematics can be proposed as possibly leading to false B-modes at a relevant level.Some possible effects will produce jackknife failure before contributing to the nonjackknife B-mode power at a relevant level.Limits on others must be set by external data or other considerations.Any azimuth fixed effect, such as magnetic pickup, is removed by the scan-synchronous template removal mentioned in §3.1 and §4.
We have attempted an exhaustive consideration of all possible effects -a brief summary will be given here with the details deferred to the Systematics Paper.The right panel of Figure 5 shows estimated levels of, or upper limits on, contamination from extended beam mismatch after the undeprojected residual correction, thermal drift in the focal plane, systematic polarization angle miscalibration, randomized polarization angle miscalibration, ghost beams, detector transfer function mismatch, and crosstalk.The upper limit for extended beam mismatch is the 1σ uncertainty on contamination predicted from beam map simulations identical to those described in §8.1 but using a larger region of the beam.(Note that this will include beam or beam-like effects which are present in the beam mapping runs, including crosstalk and sidelobes at 4 • .)For systematic polarization angle miscalibration it is the level at which such an error would produce a detectable T B signal with 95% confidence.For randomized polarization angle miscalibration, it is the leakage we would incur from assuming nominal polarization angles, i.e. no ability to measure per-pair relative polarization angles.For thermal drift, it is the noise floor set by the sensitivity of the thermistors that monitor focal plane temperature.Having established that the detected B-mode signal is not an instrumental artifact, we now consider whether it might be due to a Galactic or extragalactic foreground.At low or high frequencies Galactic synchrotron and polarized-dust emission, respectively, are the dominant foregrounds.The intensity of both falls rapidly with increasing Galactic latitude but dust emission falls faster.The equal amplitude cross-over frequency therefore rises to 100 GHz in the cleanest regions (Dunkley et al. 2009, Fig. 10).The BICEP2 field is centered on Galactic coordinates (l, b) = (316 • , −59 • ) and was originally selected on the basis of exceptionally low contrast in the FDS dust maps (Finkbeiner et al. 1999).It must be emphasized that these ultra clean regions are very special -at least an order of magnitude cleaner than the average b > 50 • level.

FOREGROUND PROJECTIONS
Foreground modeling involves extrapolating high signalto-noise ratio maps taken at lower/higher frequencies to the CMB observation band, and there are inevitably uncertainties.Many previous studies have been conducted and projections made -see for instance Dunkley et al. (2009) and references therein.We note that such studies generically predict levels of foreground B-mode contamination in clean high latitude regions equivalent to r 0.01 -well below that which we observe.

Polarized Dust Projections
The main uncertainty in foreground modeling is currently the lack of a polarized dust map.(This will be alleviated soon by the next Planck data release.)In the meantime we have therefore investigated a number of existing models and have formulated two new ones.A brief description of each model is as follows: FDS: Model 8 (Finkbeiner et al. 1999) FIG.6.-Polarized dust foreground projections for our field using various models available in the literature, and two new ones formulated using publically available information from Planck.Dashed lines show autospectra of the models, while solid lines show cross spectra between the models and the BICEP2 maps.The cross spectra are consistent with zero, and the DDM2 auto spectrum (at least) is noise biased high (and is hence truncated to < 200).The BICEP2 auto spectrum from Figure 2 is also shown with the lensed-ΛCDM+r = 0.2 spectrum.stant emissivity value of 1.6 and a constant temperature of 19.6 K.In our field these values agree both with the mean values shown by the Planck Collaboration in dust polarization 32 , and with the median values of the recently delivered Planck dust model (Planck Collaboration et al. 2013a).A uniform 5% sky polarization fraction is assumed in agreement with the first all-sky images of dust polarization shown by the Planck Collaboration 33 .The polarization angles are taken from the PSM.DDM2: "Data Driven Model 2" (DDM2) constructed using all publicly available information from Planck.Uses the same dust model temperature map as DDM1, with polarization fractions and angles matching those shown by the Planck Collaboration 33 .
All of the the models except FDS make explicit predictions of the actual polarized dust pattern in our field -presumably with varying probabilities of success.We can therefore search for a correlation between the models and our signal by taking cross spectra against the BICEP2 maps. Figure 6 shows the resulting BB auto and cross spectra -note that the autospectra are all well below the level of our observed signal and that the cross spectra are consistent with zero 34 .We also note that the DDM2 model auto spectrum (which is the highest) contains uncorrected noise bias from the polarization fraction and angle maps (which is why this curve in Figure 6 is truncated to < 200).9.2.Synchrotron In our field and at angular scales of > 30 the WMAP Kband (23 GHz) maps are noise dominated.Extrapolating them 32 http://www.rssd.esa.int/SA/PLANCK/docs/eslab47/Session09_Data_Processing/47ESLAB_April_04_10_30_ Aumont.pdf 33http://www.rssd.esa.int/SA/PLANCK/docs/eslab47/Session07_Galactic_Science/47ESLAB_April_04_11_25_ Bernard.pdf 34The cross spectra between each model and real data are consistent with the cross spectra between that model and (uncorrelated) lensed-LCDM+noise simulations.
to our observing frequency using a spectral index of β = −3.3derived from WMAP foreground products results in an upper limit to synchrotron contamination equivalent to r = 0.003.Taking the cross spectrum against our observed map indicates that the true value is lower.9.3.Point Sources Extragalactic point sources might also potentially be a concern.Using the 143 GHz fluxes for the sources in our field from the Planck catalog (Planck Collaboration et al. 2013b), together with polarization information from ATCA (Massardi et al. 2011) we find that the contribution to the BB spectrum is equivalent to r ≈ 0.001.This is consistent with the projections of Battye et al. (2011).
10. CROSS SPECTRA 10.1.Cross Spectra with BICEP1 BICEP1 observed essentially the same field as BICEP2 from 2006 to 2008.While a very similar instrument in many ways the focal plane technology of BICEP1 was entirely different, employing horn fed PSBs read out via neutron transmutationdoped (NTD) germanium thermistors (see T10 for details).The high-impedance NTD devices and readouts have different susceptibility to microphonic pickup and magnetic fields, and the shielding of unwanted RFI/EMI was significantly different from that of BICEP2.The beam systematics were also quite different with a more conservative edge taper and a more complex pattern of observed pair centroid offsets.BICEP1 had detectors at both 100 and 150 GHz.
Figure 7 compares the BICEP2 EE and BB auto spectra with cross spectra taken against the 100 and 150 GHz maps from BICEP1.For EE the correlation is extremely strong, which simply confirms that the mechanics of the process are working as expected.For BB the signal-to-noise is of course much lower, but there appear to be positive correlations.To test the compatibility of the BB auto and cross spectra we take the differences and compare to the differences of lensed-ΛCDM+noise+r = 0.2 simulations (which share common input skies) 35 .Using bandpowers 1-5 the χ 2 and χ PTEs are mid-range indicating that the spectra are compatible to within the noise.(This is also true for EE.) Calculating the BB χ 2 and χ statistics against the lensed-ΛCDM model the BICEP2×BICEP1 150 spectrum has PTEs of 0.37 and 0.05 respectively.However, BICEP2×BICEP1 100 has PTEs of 0.005 and 0.001 corresponding to ≈ 3σ detection of power in the cross spectrum.While it may seem surprising that one cross spectrum shows a stronger detection than the other, it turns out not to be unusual in lensed-ΛCDM+noise+r = 0.2 simulations.

Spectral Index Constraint
We can use the BICEP2 auto and BICEP2×BICEP1 100 spectra shown in Figure 7 to constrain the frequency dependence of the nominal signal.If the signal at 150 GHz were due to synchrotron we would expect the frequency cross spectrum to be much larger in amplitude than the BICEP2 auto spectrum.Conversely if the 150 GHz power were due to polarized dust emission we would not expect to see a significant correlation with the 100 GHz sky pattern.Pursuing this formally, we use simulations of both experiments observing a common sky to construct a combined likelihood function for bandpowers 1-5 of the BI-CEP2 auto, BICEP1 100 auto, and their cross spectrum using the Hamimeche & Lewis (2008) approximation; see B14 for implementation details.We use this likelihood function to fit a six-parameter model parametrized by the five 150 GHz bandpowers and a single common spectral index, β.Without loss of generality we take this spectral index to be the power law exponent of the signal's antenna temperature as a function of frequency.We marginalize this six-parameter model over the bandpowers to obtain a one-parameter likelihood function over the spectral index.
Figure 8 shows the resulting estimate of the spectral index, with approximate 1σ uncertainty range.We evaluate the consistency with specific values of β using a likelihood ratio test.The data are consistent with the spectrum of the CMB (β = −0.7 for these bands and conventions), with a CMB-topeak likelihood ratio of L = 0.68.Following Wilks (1938) we take χ 2 ≈ −2 log L and evaluate the probability to exceed this value of χ 2 (for a single degree of freedom).A synchrotron spectrum with β = −3 is disfavored (L = 0.076, PTE 0.023, 2.3σ), as is the preferred whole-sky dust spectrum from Planck Collaboration et al. (2013a), which corresponds under these conventions to β ≈ +1.75 (L = 0.091, PTE 0.028, 2.2σ).We have also conducted a series of simulations applying this procedure to simulated data sets with CMB and dust spectral indexes.These simulations indicate that the observed likelihood ratios are typical of a CMB spectral index but atypical of dust.

Additional Cross Spectra
Having seen that the BICEP2 auto spectrum is compatible with both the BICEP2×BICEP1 100 and the BICEP2×BICEP1 150 cross spectra we proceed to com- bine the latter36 .Figure 9 compares the result to the BICEP2 auto spectrum from Figure 2. Taking the difference of these spectra and comparing to the differences of the lensed-ΛCDM+noise+r = 0.2 simulations the bandpower 1-5 χ 2 and χ PTEs are mid-range indicating compatibility.
Comparing the BICEP2×BICEP1 comb spectrum to the lensed-ΛCDM expectation the χ 2 and χ values have PTE of 0.005 and 0.002 respectively corresponding to ≈ 3σ evidence of excess power.The compatibility of the BICEP2 auto and BICEP2×BICEP1 comb cross spectra combined with the detection of excess power in the cross spectra is powerful additional evidence against a systematic origin of the nominal signal given the significant differences in focal plane technology and beam imperfections.
The successor experiment to BICEP2 is the Keck Array which consists of five BICEP2 like receivers (Sheehy et al. 2010).The Keck Array data analysis is not yet complete and will be the subject of future publications.However as an additional systematics check we show in Figure 9 a cross spectrum between BICEP2 and preliminary Keck Array maps from the 2012 and 2013 seasons.This cross spectrum also shows obvious excess BB power at low .

COSMOLOGICAL PARAMETER CONSTRAINTS
We have shown that our observed B-mode spectrum i) cannot be explained by systematics (jackknifes, beam-map simulations, other systematics studies, and cross spectra with BICEP1 150 ), and ii) seems highly unlikely to be dominated by foregrounds (dust model projections, dust model cross correlations, and spectral index constraints from cross spectra with BICEP1 100 ).In this section we do some basic fitting of cosmological parameters while noting again that all the bandpowers and ancillary data are available for download so that others may conduct fuller studies.In Figure 2 we see a substantial excess of BB power in the region where an inflationary gravitational wave (IGW) signal would be expected to peak.We therefore proceed to find the most likely value of the tensor-to-scalar ratio r using the "direct likelihood" method introduced in B14.We first form additional sets of simulations for many values of r by combining the lensed-ΛCDM and scaled r = 0.2 simulations 37 .We then combine the bandpowers of these and the real bandpowers with s/n weighting where s is the IGW spectrum for a small value of r and n is the variance of the lensed-ΛCDM+noise simulations.Arranging the simulation pdf values as rows we can then read off the likelihood curve for r as the columns at the observed combined bandpower value.
The result of this process is shown in Figure 10.Defining the confidence interval as the equal likelihood contour which contains 68% of the total likelihood we find r = 0.20 +0.07 −0.05 .This uncertainty is driven by the sample variance in our patch of sky, and the likelihood falls off very steeply towards r = 0.The likelihood ratio between r = 0 and the maximum is 2.9 × 10 −11 equivalent to a PTE of 3.3 × 10 −12 or 7.0σ.The numbers quoted above are for bins 1-5 although due to the weighting step they are highly insensitive to this choice.(Absolute calibration and beam uncertainty are included in these calculations but have a negligible effect.) Evaluating our simple χ 2 statistic between bandpowers 1-5 and the lensed-ΛCDM+noise+r = 0.2 simulations yields a value of 1.1, which for 4 degrees of freedom has a PTE of 0.90.The model is therefore a perfectly acceptable fit to the data.
In Figure 11 we recompute the r constraint subtracting each of the foreground models shown in Figure 6.For the auto spectra the range of maximum likelihood r values is 0.12-0.19,while for the cross it is 0.16-0.21(random fluctuations in the cross can cause shifts up as well as down).The probability that each of these models reflects reality is hard to assess.Presumably greatest weight should be given to the DDM2 cross spectrum and we note that in this case the maximum likelihood value shifts down to r = 0.16 +0.06 −0.05 with a like-37 Hence we assume always nt = 0 making the value of r independent of the pivot scale.lihood ratio between r = 0 and maximum of 2.2 × 10 −8 , equivalent to a PTE of 2.9 × 10 −9 or 5.9σ.Performing this subtraction slightly increases χ 2 (to 1.46) but the fit remains perfectly acceptable (PTE 0.84).
The dust foreground is expected to have a power law spectrum which slopes modestly down ∝ ∼−0.6 in the usual l(l + 1)C l /2π units (Dunkley et al. 2009).In Figure 6 we see that the DDM2 model appears to do this in both auto and cross, before the auto spectrum starts to rise again due to noise in the polarization fraction and angle input maps.We note that the s/n bandpower weighting scheme described above weights the first bin very highly.Therefore if we were to exclude it the difference between the unsubtracted and foreground subtracted model lines in Figure 11 would be much smaller; i.e. while dust may contribute significantly to our first bandpower it definitely cannot explain bandpowers two through five.

Scaled-lensing + Tensors
Lensing deflections of the CMB photons as they travel from last scattering re-map the patterns slightly.In temperature this leads to a slight smoothing of the acoustic peaks, while in polarization a small B-mode is introduced with a spectrum similar to a smoothed version of the EE spectrum a factor ∼ 100 lower in power.Using their own and other data Planck Collaboration XVI (2013) quote a limit on the amplitude of the lensing effect versus the ΛCDM expectation of A L = 0.99 ± 0.05.
Figure 12 shows a joint constraint on the tensor-to-scalar ratio r and the lensing scale factor A L using our BB bandpowers 1-5.As expected there is an anti-correlation -one can partially explain the low excess by scaling up the lensing signal.However, since the lensing and IGW signals have different spectral shapes the degeneracy is not complete.The maximum likelihood scaling is ≈ 1.5.Marginalizing over r the likelihood ratio between peak and unity is 0.75 indicating compatibility, while the likelihood ratio between peak and zero is 0.03, equivalent to a PTE of 7.0 × 10 −3 or a 2.7σ detection of lensing in the BICEP2 BB auto spectrum.

Compatibility with Temperature Data
If present at last-scattering, tensor modes will add power to all spectra including T T .For an r value of 0.2 the contribution to T T at the largest angular scales ( < 10) would be ≈ 10% of the level measured by WMAP and Planck.The theoretical ΛCDM power level expected at these scales is dependent on several cosmological parameters including the spectral index of the initial scalar perturbations, n s , and the optical depth to the last scattering surface, τ .However by combining temperature data taken over a wide range of angular scales indirect limits on r have been set.Using WMAP+SPT data Story et al. (2013) quote r < 0.18 (95% confidence) tightening to r < 0.11 when also including measurements of the Hubble constant and baryon acoustic oscillations (BAO).More recently Planck Collaboration XVI (2013) quote r < 0.11 using a combination of Planck, SPT and ACT temperature data, plus WMAP polarization (to constrain τ ).
These limits appear to be in moderately strong tension with interpretation of our B-mode measurements as tensors.Since we have dispensed with the possibility of significant system- -Modified constraints on the tensor-to-scalar ratio r when subtracting each of the foreground models shown in Figure 6 from the BICEP2 BB bandpowers.The line styles and colors match Figure 6 with dashed for auto spectra and solid for cross spectra.The probability that each of these models reflects reality is hard to assess -see the text for discussion.atic contamination, and shown that foreground is highly unlikely to contribute a large fraction of our observed signal, we must ask what extensions to the standard model might resolve this situation.
One obvious modification is to allow the initial scalar perturbation spectrum to depart from the simple power law form which is assumed in the base ΛCDM model.A standard way in which this is done is by introducing a "running" parameter dn s /d ln k.In Planck Collaboration XVI (2013) the constraint relaxes to r < 0.26 (95% confidence) when running is allowed with dn s /d ln k = −0.022± 0.010 (68%) (for the Planck+WP+highL data combination).In Figure 13 we show the constraint contours when allowing running as taken from Figure 23  12.-Joint constraints on the tensor-to-scalar ratio r and the lensing scale factor A L using the BICEP2 BB bandpowers 1-5.One and two σ contours are shown.The horizontal dotted lines show the 1σ constraint from Planck Collaboration XVI (2013).The BICEP2 data are compatible with the expected amplitude of the lensing B-mode which is detected at 2.7σ.importance sampling (Hastings 1970) to these chains using our r likelihood as shown in Figure 10 to derive the blue contours, for which the running parameter constraint shifts to dn s /d ln k = −0.028± 0.009 (68%).
The point of Figure 13 is not to endorse running as the correct explanation of the observed deficit of low T T power.It is simply to illustrate one example of a simple model extension beyond standard ΛCDM+tensors which can resolve the apparent tension between previous T T measurements and the direct evidence for tensors provided by our B-mode measurements -probably there are others.Of course one might also speculate that the tension could be reduced within the standard ΛCDM+tensors model, for example if τ or other parameters were allowed to shift.We anticipate a broad range of possibilities will be explored.

CONCLUSIONS
We have described the observations, data reduction, simulation and power spectrum analysis of all three seasons of data taken by the BICEP2 experiment.The polarization maps presented here are the deepest ever made at degree angular wikiSI/planckpla section "Cosmological Parameters".13.-Indirect constraints on r from CMB temperature spectrum measurements relax in the context of various model extensions.Shown here is one example, following Planck Collaboration XVI (2013) Figure 23, where tensors and running of the scalar spectral index are added to the base ΛCDM model.The contours show the resulting 68% and 95% confidence regions for r and the scalar spectral index ns when also allowing running.The red contours are for the "Planck+WP+highL" data combination, which for this model extension gives a 95% bound r < 0.26 (Planck Collaboration XVI 2013).The blue contours add the BICEP2 constraint on r shown in the center panel of Figure 10.See the text for further details.
scales having noise level of 87 nK-degrees in Q and U over an effective area of 380 square degrees.
To fully exploit this unprecedented sensitivity we have expanded our analysis pipeline in several ways.We have added an additional filtering of the timestream using a template temperature map (from Planck) to render the results insensitive to temperature to polarization leakage caused by leading order beam systematics.In addition we have implemented a map purification step that eliminates ambiguous modes prior to Bmode estimation.These deprojection and purification steps are both straightforward extensions of the kinds of linear filtering operations that are now common in CMB data analysis.
The power spectrum results are perfectly consistent with lensed-ΛCDM with one striking exception: the detection of a large excess in the BB spectrum in exactly the range where an inflationary gravitational wave signal is expected to peak.This excess represents a 5.2σ excursion from the base lensed-ΛCDM model.We have conducted a wide selection of jackknife tests which indicate that the B-mode signal is common on the sky in all data subsets.These tests offer very strong empirical evidence against a systematic origin for the signal.
In addition we have conducted extensive simulations using high fidelity per channel beam maps.These confirm our understanding of the beam effects, and that after deprojection of the two leading order modes, the residual is far below the level of the signal which we observe.
Having demonstrated that the signal is real and "on the sky" we proceeded to investigate if it may be due to foreground contamination.Polarized synchrotron emission from our galaxy is easily ruled out using low frequency polarized maps from WMAP.For polarized dust emission public maps are not yet available.We therefore investigate a range of models including new ones which use all of the information which is currently available from Planck.These models all predict auto spectrum power well below our observed level.In addition none of them show any significant cross correlation with -BICEP2 BB auto spectra and 95% upper limits from several previous experiments (Leitch et al. 2005;Montroy et al. 2006;Sievers et al. 2007;Bischoff et al. 2008;Brown et al. 2009;QUIET Collaboration et al. 2011, 2012;Bennett et al. 2013;Barkats et al. 2014).The curves show the theory expectations for r = 0.2 and lensed-ΛCDM.our maps.
Taking cross spectra against 100 GHz maps from BICEP1 we find significant correlation and set a constraint on the spectral index of the signal consistent with CMB, and disfavoring synchrotron and dust by 2.3σ and 2.2σ respectively.The fact that the BICEP1 and Keck Array maps cross correlate is powerful further evidence against systematics.
The simplest and most economical remaining interpretation of the B-mode signal which we have detected is that it is due to tensor modes -the IGW template is an excellent fit to the observed excess.We therefore proceed to set a constraint on the tensor-to-scalar ratio and find r = 0.20 +0.07 −0.05 with r = 0 ruled out at a significance of 7.0σ.Multiple lines of evidence have been presented that foregrounds are a subdominant contribution: i) direct projection of the best available foreground models, ii) lack of strong cross correlation of those models against the observed sky pattern (Figure 6), iii) the frequency spectral index of the signal as constrained using BICEP1 data at 100 GHz (Figure 8), and iv) the spatial and power spectral form of the signal (Figures 3 and 10).
Subtracting the various dust models and re-deriving the r constraint still results in high significance of detection.For the model which is perhaps the most likely to be close to reality (DDM2 cross) the maximum likelihood value shifts to r = 0.16 +0.06 −0.05 with r = 0 disfavored at 5.9σ.These high values of r are in apparent tension with previous indirect limits based on temperature measurements and we have discussed some possible resolutions including modifications of the initial scalar perturbation spectrum such as running.However we emphasize that we do not claim to know what the resolution is.
Figure 14 shows the BICEP2 results compared to previous upper limits.The long search for tensor B-modes is apparently over, and a new era of B-mode cosmology has begun.
BICEP2 was supported by the US National Science Foundation under grants ANT-0742818 and ANT-1044978 (Caltech/Harvard) and ANT-0742592 and ANT-1110087 (Chicago/Minnesota).The development of antenna-coupled detector technology was supported by the JPL Research and Technology Development Fund and grants 06-ARPA206-0040 and 10-SAT10-0017 from the NASA APRA and SAT programs.The development and testing of focal planes were supported by the Gordon and Betty Moore Foundation at Caltech.Readout electronics were supported by a Canada Foundation for Innovation grant to UBC.The receiver development was supported in part by a grant from the W. M. Keck Foundation.The computations in this paper were run on the Odyssey cluster supported by the FAS Science Division Research Computing Group at Harvard University.Tireless ad-ministrative support was provided by Irene Coyle and Kathy Deniston.
FIG. 1.-BICEP2 T , Q, U maps.The left column shows the basic signal maps with 0.25 • pixelization as output by the reduction pipeline.The right column shows difference (jackknife) maps made with the first and second halves of the data set.No additional filtering other than that imposed by the instrument beam (FWHM 0.5 • ) has been done.Note that the structure seen in the Q&U signal maps is as expected for an E-mode dominated sky.
FIG.2.-BICEP2 power spectrum results for signal (black points) and temporal-split jackknife (blue points).The red curves show the lensed-ΛCDM theory expectations -in the case of BB an r = 0.2 spectrum is also shown.The error bars are the standard deviations of the lensed-ΛCDM+noise simulations.The probability to exceed (PTE) the observed value of a simple χ 2 statistic is given (as evaluated against the simulations).Note the very different y-axis scales for the jackknife spectra (other than BB).See the text for additional discussion of the BB spectrum.
FIG.4.-Distributions of the jackknife χ 2 and χ PTE values over the 14 tests and three spectra given in Table1.These distributions are consistent with uniform.
FIG.5.-Left:BB spectra from T -only input simulations using the measured per channel beam shapes compared to the lensed-ΛCDM+r = 0.2 spectrum.From top to bottom the curves are i) no deprojection, ii) deprojection of differential pointing only (dp), iii) deprojection of differential pointing and differential gain of the detector pairs (dp+dg), iv) adding deprojection of differential beam width (dp+dg+bw), and v) differential pointing, differential gain and differential ellipticity (dp+dg+ellip.).Right: Estimated levels of other systematics as compared to the lensed-ΛCDM+r = 0.2 spectrum.Solid lines indicate expected contamination.Dashed lines indicate upper limits.All systematics are comparable to or smaller than the extended beam mismatch upper limit.
, assuming a uniform polarization fraction of 5% and setting Q = U. BSS: Bi-Symmetric Spiral (BSS) model of the Galactic magnetic field (O'Dea et al. 2012) 30 .LSA: Logarithmic Spiral Arm (LSA) model of the Galactic magnetic field (O'Dea et al. 2012) 30 .PSM: Planck Sky Model (PSM) (Delabrouille et al. 2013) version 1.7.8, run as a "Prediction" with 15% dust intrinsic polarization fraction and Galactic magnetic field pitch angle of −30 •31 .DDM1: "Data Driven Model 1" (DDM1) constructed from publicly available Planck data products.The Planck dust model map at 353 GHz is scaled to 150 GHz assuming a conhttp://www.apc.univ-paris7.fr/~delabrou/PSM/psm FIG.7.-The BICEP2 EE and BB auto spectra (as shown in Figure2) compared to cross spectra between BICEP2 and the 100 and 150 GHz maps from BICEP1.The cross spectrum points are offset horizontally for clarity.
FIG.8.-The constraint on the spectral index of the BB signal based on joint consideration of the BICEP2 auto, BICEP1 100 auto, and BICEP2×BICEP1 100 cross spectra.The curve shows the marginalized likelihood as a function of assumed spectral index.The vertical solid and dashed lines indicate the maximum likelihood and the ±1σ interval.The blue vertical lines indicate the equivalent spectral indices under these conventions for the CMB, synchrotron, and dust.The observed signal is consistent with a CMB spectrum, while synchrotron and dust are both disfavored by 2σ.
FIG. 9.-Comparison of the BICEP2 BB auto spectrum and cross spectra taken between BICEP2 and BICEP1 combined, and BICEP2 and Keck Array preliminary.(For clarity the cross spectrum points are offset horizontally and the BICEP2×BICEP1 points are omitted at > 200.) FIG.10.-Left:The BICEP2 bandpowers plotted with the maximum likelihood lensed-ΛCDM+r = 0.20 model.The uncertainties are taken from that model and hence include sample variance on the r contribution.Middle: The constraint on the tensor-to-scalar ratio r.The maximum likelihood and ±1 σ interval is r = 0.20 +0.07 −0.05 , as indicated by the vertical lines.Right: Histograms of the maximum likelihood values of r derived from lensed-ΛCDM+noise simulations with r = 0 (blue) and adding r = 0.2 (red).The maximum likelihood value of r for the real data is shown by the vertical line.
FIG.12.-Joint constraints on the tensor-to-scalar ratio r and the lensing scale factor A L using the BICEP2 BB bandpowers 1-5.One and two σ contours are shown.The horizontal dotted lines show the 1σ constraint from Planck Collaboration XVI (2013).The BICEP2 data are compatible with the expected amplitude of the lensing B-mode which is detected at 2.7σ.
FIG.13.-Indirect constraints on r from CMB temperature spectrum measurements relax in the context of various model extensions.Shown here is one example, following Planck Collaboration XVI (2013) Figure23, where tensors and running of the scalar spectral index are added to the base ΛCDM model.The contours show the resulting 68% and 95% confidence regions for r and the scalar spectral index ns when also allowing running.The red contours are for the "Planck+WP+highL" data combination, which for this model extension gives a 95% bound r < 0.26(Planck Collaboration XVI  2013).The blue contours add the BICEP2 constraint on r shown in the center panel of Figure10.See the text for further details.