Abstract
The Cosmic Evolution Survey (COSMOS) has become a cornerstone of extragalactic astronomy. Since the last public catalog in 2015, a wealth of new imaging and spectroscopic data has been collected in the COSMOS field. This paper describes the collection, processing, and analysis of these new imaging data to produce a new reference photometric redshift catalog. Source detection and multiwavelength photometry are performed for 1.7 million sources across the 2 deg2 of the COSMOS field, ∼966,000 of which are measured with all available broadband data using both traditional aperture photometric methods and a new profile-fitting photometric extraction tool, The Farmer, which we have developed. A detailed comparison of the two resulting photometric catalogs is presented. Photometric redshifts are computed for all sources in each catalog utilizing two independent photometric redshift codes. A comparison is then made between the performance of the photometric methodologies and of the redshift codes to demonstrate an exceptional degree of self-consistency in the resulting photometric redshifts. The i < 21 sources have subpercent photometric redshift accuracy and even the faintest sources at 25 < i < 27 reach a precision of 5%. Finally, these results are discussed in the context of previous, current, and future surveys in the COSMOS field. Compared to COSMOS2015, COSMOS2020 reaches the same photometric redshift precision at almost one magnitude deeper. Both photometric catalogs and their photometric redshift solutions and physical parameters will be made available through the usual astronomical archive systems (ESO Phase 3, IPAC-IRSA, and CDS).
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
Photometric surveys are an essential component of modern astrophysics. The first surveys of the sky with photographic plates (Bigourdan 1888) permitted a quantitative understanding of our universe; longer exposures on increasingly larger telescopes led to the first accurate understanding of the true size and scale of our universe (Hubble 1934). Recent breakthroughs have been enabled by the advent of wide-field cameras capable of covering several square degrees at a time (such as MegaCam, Boulade et al. 2003), coupled with wide-field spectroscopic instruments capable of collecting large numbers of spectroscopic redshifts like the Visible Multi-Object Spectrograph (VIMOS; Le Fèvre et al. 2003) and the Multi-Object Spectrograph For Infrared Exploration (MOSFIRE; McLean et al. 2012).
The launch of the Hubble Space Telescope (HST) led to the first Hubble Deep Field catalog (HDF; Williams et al. 1996), which, although limited to an area of 7.5 arcmin2 in four optical bands to ∼28 AB depth, revealed the morphological complexity of the distant universe. This first step gave way to an explosion of data from similar surveys (see Madau & Dickinson 2014 and references therein). The installation of the Advanced Camera for Surveys (ACS) on HST led to a dramatic increase in the field of view and sensitivity of optical observations from space. This advancement laid the groundwork for the Great Observatories Origins Deep Survey (Giavalisco et al. 2004), which captured multiband ACS observations over two 16′ × 10′ fields, totaling over 40 times more area than the original HDF. These observations provided groundbreaking insights into the nature of high-redshift galaxies and their rest-frame properties and helped guide the development of methods to select different classes of objects. Although deep ground-based near-infrared imaging achieved notable successes (e.g., FIRESurvey; Labbé et al. 2003), the installation of the near-infrared camera WFC3 on HST in 2009 expanded our ability to probe the distant universe. This allowed, for the first time, spatially resolved measurements of rest-frame optical light at early cosmic times to depths unreachable from ground-based facilities, because of the high-infrared sky background. The combined power of ACS and WFC3 yielded the deepest “blank-field” image of the universe, the Hubble Ultra Deep Field (HUDF; Beckwith et al. 2006; Ellis et al. 2013; Illingworth et al. 2013; Teplitz et al. 2013), observed over the course of a decade in 13 filters, some reaching depths of ∼29.5–30 AB. Together with ground-based spectroscopy, it was then possible to confirm some of the most distant galaxies that likely contributed to the reionization of the universe (e.g., Robertson et al. 2013; Ishigaki et al. 2018). 
However, the transformative power of these forerunner observations was limited by their small area, complicating efforts to detect and characterize populations of rare high-redshift galaxies. To combat the effects of cosmic variance, the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS; Grogin et al. 2011; Koekemoer et al. 2011) placed observations over five separate fields, covering ∼100 times more area than the HUDF with ACS and WFC3/IR in multiple filters to depths of ∼28–29 AB, which enabled precise measurements of the physical parameters of galaxies over cosmic time. Despite these significant advantages and the groundbreaking science they allowed, their individual areas proved still too small to fully combat cosmic variance to the extent required to probe large numbers of galaxies at high redshift.
The Cosmic Evolution Survey (COSMOS; Scoville et al. 2007b) began in 2003 with a 1.7 deg2 mosaic with ACS over 583 HST orbits, reaching a 5σ depth of 27.2 AB in the F814W band (Koekemoer et al. 2007; Scoville et al. 2007a). This was the largest single allocation of HST orbits at the time and remains the largest contiguous area mapped with HST to date. Since then, the field has been covered with deep observations by virtually all major astronomical facilities that have consistently invested in extragalactic studies.
While various HST observations have been carried out with other bands in COSMOS, the programs completed to date generally cover no more than a few percent of the field. Ground-based broad- and narrowband observations with Subaru Suprime-Cam were some of the first to be performed over the entire area in 2006, providing one of the largest imaging data sets available at that time (Capak et al. 2007). Mid-infrared observations of the entire COSMOS field were also taken using the Spitzer Space Telescope (Sanders et al. 2007).
The key to exploiting these multiwavelength data sets has been “photometric redshift” estimation (hereafter photo-z), in which template spectral energy distributions (SEDs) are fit to photometry to estimate distances and physical parameters of galaxies (see Salvato et al. 2019 for a review). This has enabled the construction of large statistical samples of galaxies with well-characterized photometric redshifts calibrated to subsets of galaxies with accurate spectroscopic redshifts. COSMOS has been a benchmark testing ground for photo-z measurement techniques, due to its unrivaled multiwavelength imaging data and thousands of measured spectroscopic redshifts.
Over the years, several COSMOS photometric catalogs have been publicly released (Capak et al. 2007; Ilbert et al. 2009, 2013; Muzzin et al. 2013b; Laigle et al. 2016). Each of these releases followed new availability of progressively deeper data, such as the intermediate-band Subaru/Suprime-Cam data (Taniguchi et al. 2015) and the VISTA near-infrared coverage (McCracken et al. 2012; Milvang-Jensen et al. 2013). The most recent release, COSMOS2015 (Laigle et al. 2016), contains half a million galaxies detected in the combined zYJHKs images from the Subaru and VISTA telescopes. Four ultradeep stripes in VISTA and Spitzer, although nonuniform, cover a total area of 0.62 deg2 (e.g., Ashby et al. 2018). The reported photometric redshifts reach a subpercent precision at i < 22.5. This methodology was applied also to the Subaru-XMM Deep Field (Mehta et al. 2018), the only other deep degree-scale field to feature similarly deep near- and mid-infrared coverage.
For more than a decade, the COSMOS field has occupied an outstanding position in the modern landscape of deep surveys and has been relied upon to address fundamental scientific questions about our universe. The 2 deg2 of COSMOS have been used to trace large-scale structure (Scoville et al. 2013; Laigle et al. 2018), discover groups and clusters (e.g., Capak et al. 2011; Casey et al. 2015; Hung et al. 2016; Cucciati et al. 2018), and link galaxies to their dark matter halos (e.g., Leauthaud et al. 2007; McCracken et al. 2015; Legrand et al. 2019). The COSMOS photo-z distribution is used as a reference to establish the true redshift distribution in redshift slices in the Dark Energy Survey (DES; Troxel et al. 2018), a crucial component when estimating cosmological parameters with weak lensing (e.g., Mandelbaum 2018). COSMOS demonstrated the feasibility of combining space-based shape measurements with ground-based photometric redshifts to map the spatial distribution of dark matter (Massey et al. 2007), a method that will be used by the Euclid mission (Laureijs et al. 2011). COSMOS is already being used to prepare essential spectroscopic observations for the mission (Masters et al. 2019) and to study biases in shape analyses. COSMOS photometric data are being used to predict the quality of Euclid photo-z (G. Desprez et al. 2022, in preparation), as well as the number of [O ii] and Hα emitters expected for future dark energy surveys (Saito et al. 2020). Hence, the photometric catalogs created in COSMOS continue to play a crucial role in cosmic shear surveys (Albrecht et al. 2006).
The combination of its depth in the visible and near-infrared, and the wide area covered, makes COSMOS ideal for identifying the largest statistical samples of the rarest, brightest, and most massive galaxies, such as ultramassive quiescent galaxies up to z ∼ 4 (e.g., Stockmann et al. 2020; Schreiber et al. 2018; Valentino et al. 2020), as well as extremely luminous z ∼ 5–6 starbursts (e.g., Riechers et al. 2010, 2014, 2020; Pavesi et al. 2018; Casey et al. 2019), quasars (e.g., Prescott et al. 2006; Heintz et al. 2016), and UV-bright star-forming galaxies at 6 < z < 10 (e.g., Caputi et al. 2015; Stefanon et al. 2019; Bowler et al. 2020). With rich multiwavelength coverage at all accessible wavelengths from the X-ray (Civano et al. 2016) to the radio (Smolčić et al. 2017), an accurate picture of the galaxy stellar-mass assembly was established with this data set, including numerous estimates of the galaxy stellar-mass function (e.g., Ilbert et al. 2013; Muzzin et al. 2013a; Davidzon et al. 2017), star formation rate (SFR) density (e.g., Gruppioni et al. 2013; Novak et al. 2017), mass and SFR relation (Karim et al. 2011; Rodighiero et al. 2011; Ilbert et al. 2015; Lee et al. 2015; Leslie et al. 2020), and star formation quenching (e.g., Peng et al. 2010). A large number of follow-up programs have been conducted, including extensive spectroscopic coverage (e.g., Lilly et al. 2007; Le Fèvre et al. 2015; van der Wel et al. 2016; Hasinger et al. 2018), integral field spectroscopy (e.g., Förster Schreiber et al. 2009), and ALMA observations (Scoville et al. 2017; Le Fèvre et al. 2020).
This paper presents “COSMOS2020,” the latest release of the COSMOS catalog. The principal additions comprise new ultradeep optical data from the Hyper Suprime-Cam (HSC) Subaru Strategic Program (SSP) PDR2 (Aihara et al. 2019), new Visible Infrared Survey Telescope for Astronomy (VISTA) data from DR4 reaching at least one magnitude deeper in the Ks band over the full area, and the inclusion of all Spitzer IRAC data ever taken in COSMOS. Additionally, even deeper u*- and new u-band imaging from the Canada–France–Hawaii Telescope program CLAUDS (Sawicki et al. 2019) provides uniform, deep coverage over a greater area than available in 2015. Legacy data sets (such as the Suprime-Cam imaging) have also been reprocessed. All imaging data are now aligned with Gaia DR1 (Gaia Collaboration et al. 2016) for the optical and near-infrared data and DR2 (Gaia Collaboration et al. 2018) for the U bands and IRAC data (see A. Moneti et al. 2022, in preparation). This is reflected in band-to-band astrometric precision, which is comparably better than that in Laigle et al. (2016). Taken together, these additions result in a doubling of the number of detected sources and an overall increase in photometric and astrometric homogeneity of the full data set.
Previous COSMOS photometric catalogs were created with SExtractor (Bertin & Arnouts 1996), wherein each image is first homogenized to a common “target” point-spread function (PSF). Fluxes are then extracted within circular apertures (Capak et al. 2007; Ilbert et al. 2009; Laigle et al. 2016). While this approach is widely applied in the literature (e.g., Hildebrandt et al. 2012; Sawicki & Yee 1998), other approaches avoid this homogenization process in order to preserve the original PSFs. The most common alternative involves using a model profile to estimate fluxes, with a wide variety of implementations and variations thereof (e.g., Mobasher et al. 1996; Fernández-Soto et al. 1999; Labbé et al. 2006; Hsieh et al. 2012; Labbé et al. 2015). Of recent popularity are prior-based techniques (e.g., De Santis et al. 2007; Laidler et al. 2007; Merlin et al. 2016) that use the highest-resolution image as a prior, convolve it with the corresponding PSF kernel of the lower-resolution images, and utilize the normalization of the PSF-convolved prior image to estimate the flux in the lower-resolution images. Such an approach was instrumental to extract Spitzer/IRAC photometry in the CANDELS catalogs. Recently, The Tractor (Lang et al. 2016) was developed to perform profile-fitting photometry. Instead of a prior cut from a high-resolution image (e.g., HST), The Tractor derives entirely parametric models from one or more images containing some degree of morphological information. This has two immediate advantages: The Tractor does not require a high-resolution image from HST, and can hence be readily and consistently applied to ground-based data sets, nor does it require that all the images be aligned on the same or integer-multiple pixel grid. Because the models are purely parametric, The Tractor can provide shape measurements for resolved sources in addition to fluxes. The Tractor has already been applied to several deep-imaging surveys (Nyland et al. 2017; Dey et al. 2019), the methods of which have greatly influenced this work.
For COSMOS2020, two independent catalogs are created using different techniques. One is created using the same standard method as Laigle et al. (2016) where aperture photometry is performed on PSF-homogenized images, with the exception of IRAC where PSF fitting with the IRACLEAN software (Hsieh et al. 2012) is used. This is the Classic catalog. The other catalog is created with The Farmer (J. R. Weaver et al. 2022, in preparation), a software package that generates a full multiwavelength catalog utilizing The Tractor to perform the modeling. In this sense, The Farmer provides broadly reproducible source detection and photometry that The Tractor, requiring a custom driving script, cannot do by itself. Detailed comparisons of both photometric catalogs and the quality of the photo-z derived from each of them are presented. By utilizing these two methods in tandem it is possible to evaluate the reliability of COSMOS2020. This work presents a detailed analysis of the advantages of each method and provides quantitative arguments that could guide photometric extraction choices for future photometric surveys. The most compelling advantage, however, lies not in discriminating between the catalogs but rather in using them constructively to evaluate the significance, accuracy, and precision of scientific results, a feature that has not yet been possible from a single COSMOS catalog release.
The paper is organized as follows. In Section 2, the imaging data set and the data reduction are presented. Section 3 describes the source extraction and photometry. The photometry from the two photometric catalogs is compared in Section 4. Section 5 presents the photometric redshift measurements. In Section 6, the physical parameters of the sources in the catalog are presented. Section 7 presents our summary and conclusions.
The two catalog files contain the position, extracted multiband photometry, matched ancillary photometry, area flags, derived photometric redshifts, and physical parameters. Details of the catalog files including column names and descriptions will purposely not be presented in this paper, as at the time of writing the two catalog files have a combined 1181 columns. Instead, reliable and up-to-date information corresponding to the particular catalog release version can be found in their accompanying README file and separate release documentation currently in preparation. More information can be found in Appendix A.
The results presented in this paper adopt a standard ΛCDM cosmology with H0 = 70 km s−1 Mpc−1, Ωm,0 = 0.3, and ΩΛ,0 = 0.7. All magnitudes are expressed in the AB system (Oke 1974), for which a flux $f_\nu$ in microjansky (10−29 erg cm−2 s−1 Hz−1) corresponds to $\mathrm{AB}_\nu = 23.9 - 2.5\,\log_{10}(f_\nu/\mu\mathrm{Jy})$.
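As an illustration of the AB convention adopted here, the following minimal sketch converts between flux density in microjansky and AB magnitude. It assumes the standard AB zero point of 23.9 mag for a 1 μJy source (equivalent to the usual 3631 Jy zero point); the function names `ujy_to_ab` and `ab_to_ujy` are hypothetical and are not part of any COSMOS2020 software.

```python
import math

# Assumed zero point: AB magnitude of a 1 microjansky source.
AB_ZEROPOINT_UJY = 23.9


def ujy_to_ab(flux_ujy: float) -> float:
    """Return the AB magnitude for a flux density given in microjansky."""
    if flux_ujy <= 0:
        raise ValueError("flux must be positive to define a magnitude")
    return AB_ZEROPOINT_UJY - 2.5 * math.log10(flux_ujy)


def ab_to_ujy(mag_ab: float) -> float:
    """Return the flux density in microjansky for an AB magnitude."""
    return 10 ** ((AB_ZEROPOINT_UJY - mag_ab) / 2.5)


print(ujy_to_ab(1.0))   # 23.9
print(ab_to_ujy(23.9))  # 1.0
```

Non-positive fluxes, which do occur for faint sources in noisy bands, have no defined magnitude; the sketch raises an error for them, which is one reason catalog work is often done in flux units.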
2. Observations and Data Reduction
2.1. Overview of Included Data
The principal improvements in COSMOS2020 compared to previous catalogs are the significantly deeper optical and near-infrared images from ongoing Subaru-HSC and VISTA-VIRCAM surveys. In addition, this release contains the definitive reprocessing of all Spitzer data ever taken on COSMOS. “Legacy” or preexisting data sets present in COSMOS2015 have been reprocessed to take advantage of improved astrometry from Gaia (the only exceptions being external ancillary data such as GALEX). All images are resampled to make final stacks with a 0.15″ pixel scale. These stacks are aligned to the COSMOS tangent point, which has R.A. and decl. (J2000) of (10h00m27.92s, +02°12′03.50″).
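To make the role of the common tangent point concrete, the sketch below projects sky coordinates onto a tangent plane centered on the COSMOS tangent point and expresses the result in pixels at a 0.15″ scale. It uses the standard gnomonic (TAN) projection; the function name and the sign convention (offsets increasing toward the east and north) are assumptions for illustration and do not describe the actual image headers.

```python
import math

# COSMOS tangent point (J2000) converted to decimal degrees.
RA0_DEG = 15.0 * (10 + 0 / 60 + 27.92 / 3600)  # 10h00m27.92s
DEC0_DEG = 2 + 12 / 60 + 3.50 / 3600           # +02d12m03.50s
PIXEL_SCALE_ARCSEC = 0.15


def sky_to_pixel_offset(ra_deg: float, dec_deg: float) -> tuple:
    """Gnomonic projection: pixel offsets (dx, dy) from the tangent point.

    Hypothetical convention: dx grows toward the east, dy toward the north.
    """
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    ra0, dec0 = math.radians(RA0_DEG), math.radians(DEC0_DEG)
    # Cosine of the angular distance from the tangent point.
    cos_c = (math.sin(dec0) * math.sin(dec)
             + math.cos(dec0) * math.cos(dec) * math.cos(ra - ra0))
    # Standard coordinates on the tangent plane, in radians.
    xi = math.cos(dec) * math.sin(ra - ra0) / cos_c
    eta = (math.cos(dec0) * math.sin(dec)
           - math.sin(dec0) * math.cos(dec) * math.cos(ra - ra0)) / cos_c
    pixels_per_degree = 3600.0 / PIXEL_SCALE_ARCSEC
    return (math.degrees(xi) * pixels_per_degree,
            math.degrees(eta) * pixels_per_degree)


# A source one arcminute north of the tangent point lies about 400 pixels away.
dx, dy = sky_to_pixel_offset(RA0_DEG, DEC0_DEG + 1.0 / 60.0)
print(round(dx, 3), round(dy, 3))
```

Because every band shares the same tangent point and pixel scale, a given sky position maps to the same pixel in every stack, which is what allows aperture photometry to be measured consistently across bands.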
Figure 1 illustrates the footprint of the observations in the COSMOS field. Complete details of included data are listed in Table 1. Image quality of the optical and near-infrared data, typically reported as the FWHM of a Gaussian fit to the light profile, is excellent; with the exception of the Suprime-Cam g+-band stack, FWHM values are all between 0.6″ and 1.0″. Figure 2 shows the filter transmission curves. Figure 3 indicates the depths of the photometric data and provides a comparison with the COSMOS2015 depths. The depth computations are explained in Section 3.1.3 and largely follow the methods in Laigle et al. (2016). As in previous releases, in each band the image and the corresponding weight map are resampled on the same tangent point using SWarp (Bertin et al. 2002). These images will be made publicly available through the COSMOS website at the NASA/IPAC Infrared Science Archive (IRSA).
Figure 1. Schematic of the COSMOS field. The background image corresponds to the izYJHKs detection image. The solid lines represent survey limits, and the dashed lines indicate the deepest regions of the images. In the case of UltraVISTA, the dashed lines illustrate the “ultradeep” stripes. In the case of CLAUDS, the solid line shows the limit of the u-band image, and the dashed line shows the deepest region of the u*-band image.
Figure 2. Relative transmission curves for the photometric bands used. The effects of the atmosphere, telescope, camera optics, filter, and detector are included. The black curves represent medium and narrow bands. The profiles are normalized to a peak transmission of 1.0 for the broad bands, and to 0.3 for the medium and narrow bands.
Figure 3. Depths at 3σ measured in empty 3″ diameter apertures in PSF-homogenized images, except for NUV and IRAC images. The NUV depth is from Zamojski et al. (2007) and the F814W 3σ depth is derived from the 5σ value in Koekemoer et al. (2007). For the Y, J, H, and Ks bands, the depths in the ultradeep regions are indicated. The length of each segment is the FWHM of the filter transmission curve. The thin black segments show the depths of the medium and narrow bands. The gray segments indicate the depths of the images used in Laigle et al. (2016) for comparison.
Table 1. UV-optical-IR Data Used in the Catalogs
| Instrument/Telescope (Survey) | Band | Central λ a (Å) | Width b (Å) | Depth c (2″/3″) ±0.1 | Error Fact. d (2″/3″) ±0.1 |
|---|---|---|---|---|---|
| GALEX | FUV | 1526 | 224 | 26.0 e | ⋯ |
| | NUV | 2307 | 791 | 26.0 e | ⋯ |
| MegaCam/CFHT | u | 3709 | 518 | 27.8/27.2 | 1.7/2.0 |
| | u* | 3858 | 598 | 27.7/27.1 | 1.4/1.6 |
| ACS/HST | F814W | 8333 | 2511 | 27.8 f | ⋯ |
| HSC/Subaru (HSC-SSP PDR2) | g | 4847 | 1383 | 28.1/27.5 | 1.4/1.8 |
| | r | 6219 | 1547 | 27.8/27.2 | 1.4/1.7 |
| | i | 7699 | 1471 | 27.6/27.0 | 1.5/1.9 |
| | z | 8894 | 766 | 27.2/26.6 | 1.4/1.7 |
| | y | 9761 | 786 | 26.5/25.9 | 1.4/1.7 |
| Suprime-Cam/Subaru | B | 4488 | 892 | 27.8/27.1 | 1.5/1.8 |
| | g+ | 4804 | 1265 | 26.1/25.6 | 5.5/5.8 |
| | V | 5487 | 954 | 26.8/26.2 | 2.1/2.3 |
| | r+ | 6305 | 1376 | 27.1/26.5 | 1.6/1.9 |
| | i+ | 7693 | 1497 | 26.7/26.1 | 1.5/1.8 |
| | z+ | 8978 | 847 | 25.7/25.1 | 1.5/1.7 |
| | z++ | 9063 | 1335 | 26.3/25.7 | 2.3/2.6 |
| | IB427 | 4266 | 207 | 26.1/25.6 | 2.0/2.2 |
| | IB464 | 4635 | 218 | 25.6/25.1 | 3.1/3.3 |
| | IA484 | 4851 | 229 | 26.5/25.9 | 1.5/1.7 |
| | IB505 | 5064 | 231 | 26.1/25.6 | 1.6/1.8 |
| | IA527 | 5261 | 243 | 26.4/25.8 | 1.7/2.0 |
| | IB574 | 5766 | 273 | 25.8/25.3 | 2.4/2.5 |
| | IA624 | 6232 | 300 | 26.4/25.7 | 1.4/1.7 |
| | IA679 | 6780 | 336 | 25.6/25.1 | 2.5/2.7 |
| | IB709 | 7073 | 316 | 25.9/25.4 | 2.2/2.3 |
| | IA738 | 7361 | 324 | 26.1/25.5 | 1.5/1.7 |
| | IA767 | 7694 | 365 | 25.6/25.1 | 2.1/2.2 |
| | IB827 | 8243 | 343 | 25.6/25.1 | 2.4/2.6 |
| | NB711 | 7121 | 72 | 25.5/24.9 | 1.2/1.4 |
| | NB816 | 8150 | 120 | 25.6/25.1 | 2.3/2.5 |
| VIRCAM/VISTA (UltraVISTA DR4) | Y (UD) | 10216 | 923 | 26.6/26.1 | 2.8/3.1 |
| | Y (Deep) | | | 25.3/24.8 | 2.7/2.8 |
| | J (UD) | 12525 | 1718 | 26.4/25.9 | 2.7/2.9 |
| | J (Deep) | | | 25.2/24.7 | 2.5/2.7 |
| | H (UD) | 16466 | 2905 | 26.1/25.5 | 2.6/2.9 |
| | H (Deep) | | | 24.9/24.4 | 2.4/2.6 |
| | Ks (UD) | 21557 | 3074 | 25.7/25.2 | 2.4/2.6 |
| | Ks (Deep) | | | 25.3/24.8 | 2.4/2.6 |
| | NB118 | 11909 | 112 | 24.8/24.3 | 2.8/2.9 |
| IRAC/Spitzer | ch1 | 35686 | 7443 | 26.4/25.7 | ⋯ |
| | ch2 | 45067 | 10119 | 26.3/25.6 | ⋯ |
| | ch3 | 57788 | 14082 | 23.2/22.6 | ⋯ |
| | ch4 | 79958 | 28796 | 23.1/22.5 | ⋯ |
Notes.
a Median of the transmission curve. b Full width of the transmission curve at half maximum. c 3σ depth computed on PSF-homogenized images (except for IRAC images) in empty apertures with the given diameter, averaged over the UltraVISTA area. d Multiplicative correction factor for photometric flux uncertainties in the Classic catalog, averaged over the UltraVISTA area (see Section 3.1.3). e 3σ depth derived from the 5σ depth from http://cesam.lam.fr/galex-emphot/. f 3σ depth derived from the 5σ depth in Koekemoer et al. (2007).
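The conversion from a 5σ to a 3σ depth mentioned in the table notes can be sketched as follows. This is a minimal illustration that assumes the limiting flux scales linearly with the significance level for fixed noise, so that the depth shifts by 2.5 log10(5/3) ≈ 0.55 mag; the function name `rescale_depth` is hypothetical, and the exact procedure used for the table may differ.

```python
import math


def rescale_depth(depth_mag: float, from_sigma: float, to_sigma: float) -> float:
    """Rescale a limiting magnitude between significance levels.

    Assumes fixed noise, so the limiting flux is proportional to N-sigma.
    """
    return depth_mag + 2.5 * math.log10(from_sigma / to_sigma)


# The 5-sigma F814W depth of 27.2 AB quoted in the Introduction.
print(round(rescale_depth(27.2, 5, 3), 2))  # 27.75
```

The result is consistent with the F814W depth of 27.8 listed in Table 1 once rounded to one decimal place.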
2.2. U-band Data
Several programs have observed the COSMOS field in the U band using the Canada–France–Hawaii Telescope (CFHT) and the MegaCam instrument, the most efficient wide-field U-band instrument. For COSMOS2020, all archival MegaCam COSMOS U data are recombined in addition to new data taken as part of the CFHT Large Area U-band Deep Survey (CLAUDS), which uses a new bluer u filter (Sawicki et al. 2019) that lacks the red ∼5000 Å leakage present in the older and now retired u* filter. The methodology employed in the reprocessing is similar to that used by CLAUDS. For completeness, u* corresponds to the u band used in Laigle et al. (2016). The depths of the u and the u* images are reported in Table 1. The main motivations in reprocessing these data are to make deeper U-band images for the field, to make use of the new improved Gaia astrometric reference, and to resample each individual image onto the same COSMOS tangent point.
Starting with the complete data set in both filters, these data were preprocessed by the Elixir pipeline (Magnier & Cuillandre 2004) at the CFHT before being ingested into the Canadian Astronomy Data Center, where the astrometric and photometric calibrations are recomputed using the image-stacking pipeline MegaPipe (Gwyn 2008). Images with excessively high sky fluxes were rejected. The images were visually inspected and those with obvious flaws (bad tracking, bad seeing) were rejected. Several images with seeing worse than 1.4″ were rejected during the calibration stage. In total, there were 649 u*-band images and 500 u-band images. The median seeing of this final sample is 0.9″. The two final stacked images were separately resampled onto the COSMOS tangent point and pixel scale, and each was combined using a weighted 2.8σ clipping. The astrometric calibration used the Gaia DR2 reference catalog (Gaia Collaboration et al. 2018). The final images have an absolute astrometric uncertainty of 20 mas. The u-band calibration has been improved over earlier versions by carefully mapping the zero-point variation across the mosaic for each observing run. Without this correction, the zero point could vary as much as 0.05 mag across the field. After the correction, the variation is reduced to an estimated 0.005 mag, a tenfold improvement. This correction does not alter the average zero point. While the Sloan Digital Sky Survey (SDSS) is used as the photometric reference, it is not used as in-field standards to avoid propagating any local errors in the SDSS u-band calibration. Instead, zero points are computed per night using all available images. Images taken on photometric nights were used to calibrate data taken in nonphotometric conditions (see Section 3 of Sawicki et al. 2019 for more details). In summary, both u and u* images have equivalent average depths; however, the newer u images do not cover the entire COSMOS field but have two gaps at the left and right middle edges of the field (Figure 1). On the other hand, compared to the older u* data, which are around 0.3 mag deeper in the field center and substantially shallower outside of it, the newer u data have uniform depth over the whole survey area.
2.3. Optical Data
Wide-field optical data have played a key role in measuring COSMOS photometric redshifts. The commissioning of Subaru’s 1.8 deg2 HSC (Miyazaki et al. 2018) instrument has enabled more efficient and much deeper broadband photometric measurements over the entire COSMOS area. HSC/y data were already included in Laigle et al. (2016). COSMOS2020 uses the second public data release (PDR2) of the Hyper Suprime-Cam Subaru Strategic Program (HSC-SSP) comprising the g, r, i, z, and y bands (Aihara et al. 2019).
The public stacks in COSMOS suffer from scattered light caused by bright stars in the field and the small dithers used. These artifacts are not removed at the image combination stage. Therefore, all the individual calibrated prewarp CCD images (calexp data) from the SSP public server are processed. These images were recombined with SWarp using COMBINE_TYPE set to CLIPPED with a 2.8σ threshold (see Gruen et al. 2014 for details). This removes a large fraction of the scattered light and satellite trails. As for the other data, images are centered on the COSMOS tangent point with a 0.15″ pixel scale. The Gaia DR1 astrometric solution computed by the HSC-SSP team agrees well with the solutions used here in other bands.
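The idea behind a clipped combination can be illustrated for a single pixel observed in several exposures. The sketch below is not SWarp's CLIPPED algorithm (see Gruen et al. 2014 for that); it is a simplified stand-in that assumes outliers are identified relative to the median, with the scatter estimated from the median absolute deviation, and that the surviving values are averaged with their weights. The function name `clipped_weighted_mean` is hypothetical.

```python
import statistics


def clipped_weighted_mean(values, weights, nsigma=2.8):
    """Combine one pixel across exposures, rejecting outliers beyond nsigma.

    Illustrative only: the scatter is a robust estimate from the median
    absolute deviation (MAD), not the estimator used inside SWarp.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    sigma = 1.4826 * mad  # MAD scaled to a Gaussian-equivalent sigma
    kept = [(v, w) for v, w in zip(values, weights)
            if w > 0 and abs(v - med) <= nsigma * sigma]
    if not kept:
        return None  # no usable exposures for this pixel
    total_weight = sum(w for _, w in kept)
    return sum(v * w for v, w in kept) / total_weight


# Five exposures of one pixel; the last is hit by a satellite trail.
pixel_values = [10.0, 10.2, 9.9, 10.1, 55.0]
pixel_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
print(clipped_weighted_mean(pixel_values, pixel_weights))
```

In this example the contaminated value is excluded and the result is the mean of the remaining four exposures, about 10.05, whereas a plain average would be pulled up to roughly 19.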
Finally, the Subaru Suprime-Cam data used in COSMOS2015 are retained for this work (Taniguchi et al. 2007, 2015), including 7 broad bands (B, g+, V, r+, i+, z+, and z++), 12 medium bands (IB427, IB464, IA484, IB505, IA527, IB574, IA624, IA679, IB709, IA738, IA767, and IB827), and 2 narrow bands (NB711 and NB816). However, because the COSMOS2015 stacks had been computed with the old COSMOS astrometric reference, it was necessary to return to the individual images and recompute a new astrometric solution using Gaia DR1 with Scamp (Bertin 2006). The opportunity was taken to perform a tile-level PSF homogenization on the individual images (see Section 3.1.2).
2.4. Near-infrared Data
The YJHKs broadband and NB118 narrowband data from the fourth data release (DR4) of the UltraVISTA survey (McCracken et al. 2012; Moneti et al. 2019) are used. This release includes the images taken from 2009 December to 2016 June with the VIRCAM instrument on the VISTA telescope. Compared to DR2, the images are up to 0.8 mag deeper in the ultradeep stripes for the J and H bands, and 1 mag in the deep stripes for the Ks band, effectively homogenizing the Ks depth across the full field. The additional NB118 narrowband image only covers the ultradeep region. Characterization of the NB118 filter is in Milvang-Jensen et al. (2013). Only the publicly available stacks are used. These public stacks are aligned to the COSMOS tangent point described previously and have a 0.15″ pixel scale. Gaia DR1 has been used to compute the astrometric solution.
2.5. Mid-infrared Data
The infrared data comprise Spitzer/IRAC channel 1, 2, 3, and 4 images from the Cosmic Dawn Survey (A. Moneti et al. 2022, in preparation). This consists of all IRAC data taken in the COSMOS field up to the end of the mission in 2020 January. This includes the Spitzer Extended Deep Survey (Ashby et al. 2013), the Spitzer Large Area Survey with Hyper Suprime-Cam (SPLASH; Steinhardt et al. 2014), the Spitzer-Cosmic Assembly Deep Near-infrared Extragalactic Legacy Survey (S-CANDELS; Ashby et al. 2015), and the Spitzer Matching Survey of the UltraVISTA ultradeep Stripes survey (SMUVS; Ashby et al. 2018). The resulting images have a 0.6″ pixel scale and are resampled to the 0.15″ pixel scale of the optical and near-infrared images. The astrometric calibration used the Gaia DR2 reference. This work adopts the processed mosaics with stellar sources removed. A full listing of included programs and details of this processing are given in A. Moneti et al. (2022, in preparation).
2.6. X-Ray, Ultraviolet, and HST Data
The COSMOS2020 catalog provides basic measurements from ancillary data sets in COSMOS, including data unchanged from various source catalogs. Sources in COSMOS2020 are matched with ancillary photometric catalogs using positional cross-matching within a conservative radius of 0.6″ consistently for all ancillary catalogs, adopting only the most reliable sources, as described below. Measurements of the near-UV (0.23 μm) and far-UV (0.15 μm) are taken from the COSMOS GALEX catalog (Zamojski et al. 2007), and X-ray photometry is taken from the Chandra COSMOS Legacy survey (Civano et al. 2016; Marchesi et al. 2016). With the exception of the GALEX near-UV photometry of Zamojski et al. (2007), these ancillary data are not used in deriving photo-z or physical parameters. Sources with significant X-ray detections are not used to assess photo-z performance, presented in Section 5. HST/ACS morphological measurements are used in identifying stellar contaminants. Summaries of the ancillary photometric data sets can be found in the README files accompanying the COSMOS2020 catalogs. Also included are column descriptions and corresponding reference literature where details of these ancillary data including their construction and caveats can be found.
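A positional cross-match of the kind described above can be sketched as a nearest-neighbor search within a fixed radius. The example below is a minimal brute-force illustration that assumes coordinates in decimal degrees and a 0.6″ matching radius; the function names are hypothetical, and a production catalog of this size would use a spatial index rather than the O(N×M) loop shown here.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcseconds (haversine formula, inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


def cross_match(catalog, ancillary, radius_arcsec=0.6):
    """For each catalog source, return the index of the nearest ancillary
    source within radius_arcsec, or None if there is no such source."""
    matches = []
    for ra, dec in catalog:
        best_idx, best_sep = None, radius_arcsec
        for j, (ra2, dec2) in enumerate(ancillary):
            sep = angular_sep_arcsec(ra, dec, ra2, dec2)
            if sep <= best_sep:
                best_idx, best_sep = j, sep
        matches.append(best_idx)
    return matches


# Made-up positions: the first source has a counterpart 0.3" away in R.A.,
# the second has its nearest candidate 1.0" away, outside the radius.
catalog = [(150.1, 2.2), (150.2, 2.3)]
ancillary = [(150.1 + 0.3 / 3600, 2.2), (150.2, 2.3 + 1.0 / 3600)]
print(cross_match(catalog, ancillary))  # [0, None]
```

This sketch keeps only the nearest counterpart; it does not resolve cases where two catalog sources claim the same ancillary source, which a real match would need to handle.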
The HST/ACS F814W high-resolution photometry from Leauthaud et al. (2007) covering 1.64 deg2 of the COSMOS field are included for only unblended sources, as well as their morphological parameters. The ACS observations in the F475W and F606W bands cover about 5% of the field, so these are not included in the catalog.
Unlike Laigle et al. (2016), far-infrared to millimeter photometry from the COSMOS Super-deblended catalog (Jin et al. 2018) are not included as ancillary data in COSMOS2020. This is because the photometry was computed partly using a higher-resolution prior catalog from COSMOS2015, and as such, the identification of correct matches with COSMOS2020 is uncertain. Future work including Spitzer/MIPS (24 μm), Herschel/PACS (100, 160 μm) and SPIRE (250, 350, 500 μm), JCMT/SCUBA2 (850 μm), ASTE/AzTEC (1.1 mm), IRAM/MAMBO (1.2 mm), and VLA (1.4, 3 GHz) photometry will be provided in an updated super-deblended catalog using the COSMOS2020 positions as priors (S. Jin et al. 2022, in preparation).
2.7. Masking
Photometric extraction of sources can be significantly affected by the spurious flux of nearby bright stars, galaxies, and various other artifacts in the images. Thus, it is of interest to mark these sources. For this purpose, the COSMOS2020 catalogs provide flags for objects in the vicinity of bright stars, and for objects affected by various artifacts.
The bright-star masks from the HSC-SSP PDR2 (Coupon et al. 2018) are used to flag these sources. In particular, masks are taken from the Incremental Data Release 1 revised bright-star masks that use Gaia DR2 as a reference star catalog, where stars brighter than G = 18 mag are masked. About 18% of sources in the catalog are found within the masked regions in the vicinity of bright stars. Furthermore, artifacts in the Suprime-Cam images are masked using the same masks as in COSMOS2015.
Masks indicating the area covered by the observations for the UltraVISTA deep and ultradeep regions are provided as shown in Figure 1. Also included is a mask corresponding to coverage by Suprime-Cam. A conservative combined mask is prepared for sources within 1.27 deg2 that have coverage from HSC, UltraVISTA, and IRAC but that are not close to bright stars or large artifacts.
The most up-to-date descriptions of these masks and their respective flags can be found in the README files that accompany the catalogs.
2.8. Astrometry
The astrometry in the previous COSMOS catalogs was based on radio interferometric data. However, with the advent of Gaia, a new, highly precise astrometric reference is available. For COSMOS2020, astrometric solutions were computed using Gaia data for every data set described here. In the case where data presented in previous papers are included, the astrometric solutions were recomputed and the data resampled. The UltraVISTA, HSC, and the reprocessed Suprime-Cam images were calibrated using the Gaia DR1 astrometric reference (Gaia Collaboration et al. 2016). Figure 4 shows the difference in position between sources in the catalog with HSC i-band total magnitudes between 14 and 19 mag and sources in Gaia DR1. The agreement with the reference catalog is excellent, with a standard deviation in both axes of ∼10 mas and an offset of ∼1 mas. This is much better than any previous COSMOS catalog; for example, the size of the residuals shown in Figure 9 of Laigle et al. (2016) is ∼100 mas. Furthermore, there are no systematic trends of these offsets in either R.A. or decl. over the entire field, unlike previous catalogs. Consequently, this improved astrometric precision enables photometric measurements in smaller apertures for faint, unresolved sources.
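As an illustration of how such residual statistics can be obtained, the sketch below computes the mean offset and scatter in milliarcseconds for a list of already matched positions. The function name and the input layout are assumptions made for this example only.

```python
import math
import statistics

def astrometric_residuals(catalog, reference):
    """Offsets (catalog minus reference) in mas for matched (ra, dec) pairs in degrees."""
    d_ra, d_dec = [], []
    for (ra, dec), (ra_ref, dec_ref) in zip(catalog, reference):
        # The R.A. offset is scaled by cos(decl.) to give an on-sky angle.
        d_ra.append((ra - ra_ref) * math.cos(math.radians(dec_ref)) * 3.6e6)
        d_dec.append((dec - dec_ref) * 3.6e6)
    return {
        "mean_ra": statistics.fmean(d_ra), "std_ra": statistics.pstdev(d_ra),
        "mean_dec": statistics.fmean(d_dec), "std_dec": statistics.pstdev(d_dec),
    }
```

The mean values correspond to the systematic offset and the standard deviations to the scatter quoted per axis.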
Figure 4. Coordinate offset between sources in the Gaia DR1 catalog and sources extracted in the combined detection image as measured in the aperture-based Classic catalog (see Section 3.1.1). The spacing between the dashed lines corresponds to the linear dimension of a pixel in the resampled images. Light and dark shaded regions are ellipses containing 68% and 99% of all sources, respectively. For clarity, only 1 in 10 sources are plotted.
2.9. Spectroscopic Data
The spectroscopic data are collected from several spectroscopic surveys, conducted with different target selection criteria and instruments. In this paper, the spectroscopically confirmed redshifts (spec-z hereafter) are used to evaluate the accuracy of the photo-z. Therefore, this work only includes spec-z with the highest confidence level. If the observation of one object is duplicated, only the spec-z associated to the highest confidence level is used.
The spectroscopic surveys presented below share a common system to define the confidence level in the redshift measurement (Lilly et al. 2007; Le Fèvre et al. 2015; Hasinger et al. 2018; Kashino et al. 2019; Masters et al. 2019; Rosani et al. 2020). They follow a flagging system described in Section 6 of Le Fèvre et al. (2005). Each spectrum is inspected visually by two team members, who attribute a flag to the spec-z, depending on the robustness of the measurement. A flag 3 or 4 is associated with the spec-z if several prominent spectral features (e.g., emission and absorption lines, continuum break) support the same spec-z. While such a flagging system is subjective, a posteriori analysis based on duplicated spectroscopic observations indicates that the confidence level of flag 3 and 4 spec-z is above 95%.
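The selection of secure redshifts and the handling of duplicated observations can be sketched as follows. This is a simplified illustration: the tuple layout and function name are hypothetical, and the flags are treated as plain integers, whereas the survey catalogs may use richer flag values.

```python
def select_specz(measurements, min_flag=3):
    """Keep one spec-z per source: the highest-flag measurement with flag >= min_flag.

    measurements: iterable of (source_id, z_spec, flag) tuples.
    """
    best = {}
    for source_id, z_spec, flag in measurements:
        if flag < min_flag:
            continue  # discard low-confidence redshifts
        if source_id not in best or flag > best[source_id][1]:
            best[source_id] = (z_spec, flag)
    return best
```

When two measurements of the same source share the same flag, this sketch keeps the first one encountered; the text does not specify a tie-breaking rule.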
Two large programs were conducted at ESO-VLT with the VIMOS instrument (Le Fèvre et al. 2003) to cover the COSMOS field. The zCOSMOS survey (Lilly et al. 2007) gathered 600 hr of observation and is split into a bright and a faint component. The zCOSMOS-bright survey targeted 20,000 galaxies selected at i* ≤ 22.5, which by construction is highly representative of bright sources. The zCOSMOS-faint survey (D. Kashino et al. 2022, in preparation) targeted star-forming galaxies selected with BJ < 25 and falling within the redshift range 1.5 ≲ z ≲ 3. The VIMOS Ultra Deep Survey (VUDS; Le Fèvre et al. 2015) includes a randomly selected sample of galaxies at i < 25, as well as a preselected component at 2 < z < 6. Included are 8280, 739, and 944 galaxies from the zCOSMOS-bright, zCOSMOS-faint, and VUDS surveys, respectively.
Data from the Complete Calibration of the Color–Redshift Relation Survey (C3R2; Masters et al. 2019) are also used. The galaxies were selected to fill the color space using the self-organizing map algorithm (Kohonen 1982). Depending on the expected redshift range, various instruments from the Keck telescopes were used, specifically LRIS, DEIMOS, and MOSFIRE. While this sample of 2056 galaxies is representative in colors, it is not designed to be representative in brightness.
A large sample of 4353 galaxies taken at Keck with DEIMOS, with various selections over a large range of wavelengths from the X-ray to the far-infrared and radio (Hasinger et al. 2018), is also used. Such diversity of selection is crucial to estimate the quality of the photo-z for specific populations known to provide less robust results (e.g., Casey et al. 2012).
The FMOS near-infrared spectrograph at Subaru enables tests of the photo-z in the redshift range 1.5 < z < 3, sometimes referred to as the “redshift desert” (e.g., Le Fèvre et al. 2013). The sample from Kashino et al. (2019) contains 832 bright star-forming galaxies at z ∼ 1.6 with stellar masses
following the star-forming main sequence.
Also adopted are 447 sources observed with MUSE at ESO/VLT (Rosani et al. 2020). The sample includes faint star-forming galaxies at z < 1.5 and Lyα emitters at z > 3 and can be used to test the photo-z in a magnitude regime as faint as i > 26.
Finally, other smaller size samples are added including B. Darvish et al. (2022, in preparation) and J. Chu et al. (2022, in preparation) with MOSFIRE, passive galaxies at z > 1.5 (Onodera et al. 2012), and star-forming galaxies at 0.8 < z < 1.6 from Comparat et al. (2015). The full compilation of spec-z in the COSMOS field, including the contributing survey programs, is described in M. Salvato et al. (2022, in preparation).
3. Source Detection and Photometry
3.1. The Classic Catalog
3.1.1. Source Detection
The “chi-squared” izYJHKs detection image (Szalay et al. 1999) is created with SWarp from the combined original images without PSF homogenization using the CHI_MEAN option. The inclusion of the HSC/i, z-band data increases the catalog completeness for bluer objects. In particular, the HSC/i-band image is very deep and has excellent seeing of around 0.6″. The previous 2015 catalog (Laigle et al. 2016) did not include i-band data in their detection image. The inclusion of the deep i band in this detection strategy is the main reason for the higher number of sources detected in the COSMOS2020 catalog compared to COSMOS2015, likely driven by small, blue galaxies at low and intermediate redshift. The increased depth of the near-infrared bands also contributes to the greater number of detected sources.
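The idea behind a chi-squared detection image can be sketched in a few lines: each background-subtracted band is divided by its noise level, and the squared signal-to-noise values are summed pixel by pixel. The code below is a minimal illustration of that principle only; it assumes a single background rms per band and does not reproduce the exact normalization applied by the SWarp CHI_MEAN option.

```python
def chi2_detection(images, sigmas):
    """Sum of squared per-pixel S/N over bands.

    images: list of 2D lists (background-subtracted, same shape).
    sigmas: background rms of each image (assumed constant per band).
    """
    ny, nx = len(images[0]), len(images[0][0])
    chi2 = [[0.0] * nx for _ in range(ny)]
    for img, sigma in zip(images, sigmas):
        for y in range(ny):
            for x in range(nx):
                chi2[y][x] += (img[y][x] / sigma) ** 2
    return chi2
```

A source that is faint in every band but present in several of them can still stand out in the combined image, which is why adding deep i- and z-band data increases the number of detections.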
For the Classic catalog, the detection is performed using SExtractor (Bertin & Arnouts 1996) with parameters listed in Table 4. The main difference with respect to COSMOS2015 is DETECT_MINAREA set to 5 pix instead of 10 pix, which is made possible thanks to the lower number of spurious sources in the detection image compared to COSMOS2015, owing to the addition of the i band and deeper imaging in general. The number of detected sources reaches 1,720,700 over the whole field, with 790,579 sources in the UltraVISTA region outside the HSC bright-star masks.
3.1.2. Point-spread Function Homogenization
The procedure to homogenize the PSF in the optical/near-infrared images is similar to the one presented in Laigle et al. (2016). In the first step, SExtractor is used to build a catalog of bright sources. Stars are identified by cross-matching coordinates with point-like sources from the HST/ACS catalog in COSMOS (Koekemoer et al. 2007; Leauthaud et al. 2007). Saturated stars are removed in the masks (see Section 2.7). Bright but unsaturated stars are identified by their position in the half-light radius versus apparent magnitude diagram. The PSF of each image is modeled using PSFEx (Bertin 2013) adopting the polar shapelet basis functions (Massey & Refregier 2005). The same code also provides a convolution kernel that can modify the image’s response into a “target PSF,” which is modeled as a Moffat profile (Moffat 1969) with parameters θ = 0.8″ and β = 2.5 (the former being the FWHM while β is the atmospheric scattering coefficient). These two parameters are identical to Laigle et al. (2016), whereas the PSF_SAMPLING parameter is now set to 1 in order to fix the kernel pixel scale. The core of the homogenization process consists in convolving the entire images with these kernels, so that all of them are affected by the same Moffat-shaped PSF.
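For reference, the target Moffat profile and the fraction of its flux enclosed in a circular aperture have closed forms. The sketch below assumes the standard circular Moffat parameterization, in which the scale radius α follows from the FWHM and β; the function names are illustrative.

```python
import math

def moffat_alpha(fwhm, beta):
    """Scale radius alpha of a Moffat profile with the given FWHM and beta."""
    return fwhm / (2.0 * math.sqrt(2.0 ** (1.0 / beta) - 1.0))

def moffat_profile(r, fwhm=0.8, beta=2.5):
    """Surface brightness at radius r (same units as fwhm), normalized to unit total flux."""
    alpha = moffat_alpha(fwhm, beta)
    norm = (beta - 1.0) / (math.pi * alpha ** 2)
    return norm * (1.0 + (r / alpha) ** 2) ** (-beta)

def moffat_enclosed_fraction(radius, fwhm=0.8, beta=2.5):
    """Fraction of the total flux inside a circular aperture of the given radius."""
    alpha = moffat_alpha(fwhm, beta)
    return 1.0 - (1.0 + (radius / alpha) ** 2) ** (1.0 - beta)
```

For example, `moffat_enclosed_fraction(1.0)` gives the fraction of a point source's flux falling inside a 2″ diameter aperture for the target PSF.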
Figure 5 illustrates the precision of the PSF homogenization as a function of distance from the center of the source. The integral of the best-fitting PSF within different apertures is plotted for every band, before and after the homogenization; all of these functions are normalized by the integral of the target Moffat profile within the same apertures. The ratios of the integrals differ from 1 by less than 5% for all apertures with the exception of Suprime-Cam/g+, which has a particularly broad initial PSF. In this case, PSF homogenization kernels can still be consistently computed even when the input PSF is wider than the target PSF and will give a fraction of the weight to the wings (as opposed to the central region) of the PSF. Although the difference between the Suprime-Cam/g+ PSF and the target PSF is below 10% in all apertures, it is poor enough that the band is excluded from SED fitting.
Figure 5. Best-fitting Moffat profile PSF integrated in circular apertures, normalized to the target PSF, as a function of the aperture radius for all bands. Top: before PSF homogenization, for all bands except Suprime-Cam. Middle: after PSF homogenization, for all bands except Suprime-Cam. Bottom: after PSF homogenization, for Suprime-Cam bands. The horizontal dashed lines indicate a ±5% relative offset. The color map reflects the PSF FWHM before homogenization for all bands and after homogenization for the Suprime-Cam bands.
In principle, spatial variability of the PSF should be taken into account. For CLAUDS, HSC, and UltraVISTA bands, this effect is negligible. However, for Suprime-Cam bands the resulting impact of the PSF variability on aperture photometry can be as high as 0.1 mag (as discussed in Laigle et al. 2016). As an example, Figure 6 presents the variation of the PSF across the sky for the Suprime-Cam/IB464 band, which has the greatest spatial variability before homogenization among the considered bands.
Figure 6. Distribution of the difference between the local and the global median half-light radius for the selected stars in the IB464 band, as a function of position, before (top) and after (bottom) PSF homogenization.
In this work, the spatially dependent PSF homogenization of Suprime-Cam bands is performed starting from individual exposures, as they cover different patches of the field. First, the single exposure files (SEFs) at the original pixel scale of 0.2″ are resampled to the target tangent point with the pixel scale of 0.15″, to remove astrometric distortions. Then, the bright object extraction, PSF modeling, and kernel computation are done in the same way as for the other images. Stars are identified in the half-light radius versus apparent magnitude diagram, automatically adjusting the radius threshold using sigma clipping. The PSF-homogenized SEFs are finally coadded to build the final stacks. Frames with high sky noise (>3.5 × the median noise) are rejected, representing 1, 5, 28, 16, and 4 images in the B, g+, z+, z++, and NB816 bands, respectively, out of a total of 2219 images. In these high-noise images, only a few objects are detected, making it difficult to compute an astrometric solution.
3.1.3. Aperture Photometry
Optical and near-infrared fluxes measured in 2″ and 3″ diameter apertures are extracted using SExtractor in “dual-image” mode from PSF-homogenized images, using the CHI_MEAN as the detection image. Fixed apertures ensure that the same structures are sampled in different bands for each source, which is necessary for reliable measurement of colors and photometric redshifts.
The photometric errors computed with SExtractor are underestimated in the case of correlated noise in the image (e.g., Leauthaud et al. 2007). The aperture flux errors and magnitude errors are therefore rescaled with band-dependent correction factors applied to all sources (Bielby et al. 2012); see Mehta et al. (2018) for a detailed description. In the PSF-homogenized images, the flux is measured in empty apertures (using the segmentation map estimated in each image) randomly placed over the field. The depths are computed from the standard deviation (3σ clipped) of the fluxes in empty apertures inside the UltraVISTA area. The correction factors are then the ratio between the standard deviations of the fluxes measured in empty apertures and the median flux errors in the source catalog, as in Laigle et al. (2016). This is performed separately for 2″ and 3″ diameter apertures, and in the case of UltraVISTA photometry, the deep and ultradeep regions are treated separately. The 3σ depth estimates for each band computed over the central UltraVISTA area are listed in Table 1 and illustrated in Figure 3. Also included in Table 1 are the photometric uncertainty correction factors used in the Classic catalog. The flux and the magnitude errors are already corrected in the Classic catalog, as was done for the COSMOS2015 catalog. The 3σ depths of the IRAC bands are computed using the same approach, after tuning the SExtractor configuration to the IRAC images.
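The empty-aperture procedure can be summarized with a short sketch. It assumes that the fluxes measured in randomly placed empty apertures and the catalog flux errors are already available as lists; the function names, the iterative clipping scheme, and the zero-point argument are choices made for this illustration.

```python
import math
import statistics

def clipped_std(values, nsigma=3.0, max_iter=10):
    """Standard deviation after iterative n-sigma clipping around the mean."""
    vals = list(values)
    for _ in range(max_iter):
        mean = statistics.fmean(vals)
        std = statistics.pstdev(vals)
        kept = [v for v in vals if abs(v - mean) <= nsigma * std]
        if len(kept) == len(vals):
            break  # nothing was clipped, so the estimate has converged
        vals = kept
    return statistics.pstdev(vals)

def error_correction_factor(empty_fluxes, catalog_flux_errors):
    """Ratio of the empty-aperture scatter to the median catalog flux error."""
    return clipped_std(empty_fluxes) / statistics.median(catalog_flux_errors)

def depth_mag(empty_fluxes, zeropoint, nsigma=3.0):
    """n-sigma limiting magnitude from the empty-aperture flux scatter."""
    return zeropoint - 2.5 * math.log10(nsigma * clipped_std(empty_fluxes))
```

A correction factor above unity indicates that the pipeline errors underestimate the true scatter, as expected in the presence of correlated noise.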
Aperture photometry may underestimate the total flux of the sources. Optical and near-infrared aperture fluxes (and flux uncertainties) are converted to total fluxes using a source-dependent correction equivalent to the one adopted by Laigle et al. (2016). The correction for each object is computed from the pseudo-total flux fAUTO provided by SExtractor and defined as the flux contained within the band-independent Kron radius (Kron 1980) as set by PHOT_AUTOPARAMS (see Table 4), and the aperture flux fAPER, also provided by SExtractor. The ratio of these two measurements is then averaged over the HSC/g, r, i, z, y and UltraVISTA/Y, J, H, Ks broad bands and weighted by the inverted quadratic sum of the pseudototal and the aperture signal-to-noise ratio:

$$ o = \frac{1}{\sum_{i} w_i} \sum_{i} \left( \frac{f_{\rm AUTO}}{f_{\rm APER}} \right)_i w_i , $$

where the sum runs over the filters i and the weights are defined as

$$ w_i = \frac{1}{\left( \sigma_{\rm AUTO}^2 / f_{\rm AUTO}^2 + \sigma_{\rm APER}^2 / f_{\rm APER}^2 \right)_i} , $$

with σAUTO the fAUTO uncertainties, and σAPER the fAPER uncertainties (corrected for correlated noise). The sum only includes the filters in which both fAUTO and fAPER are positive and unsaturated. Because a single offset is applied to all bands of a given source, the optical and near-infrared colors remain unaffected. Because photometry from GALEX and IRAC is measured in total fluxes, this step is required in order to obtain meaningful colors using these bands. Offsets are available (in magnitude units) in the Classic catalog for both 2″ and 3″ diameter apertures.
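The aperture-to-total correction can be sketched as a weighted mean of the per-band flux ratios. In the sketch below, the weight of each band is taken to be the inverse of the summed squared relative uncertainties of the two fluxes, which is one reading of the weighting described above and should be treated as an assumption; the dictionary keys and function names are hypothetical.

```python
import math

def aperture_to_total_offset(bands):
    """Weighted mean of f_auto / f_aper over usable bands, or None if no band qualifies.

    bands: list of dicts with keys f_auto, f_aper, sig_auto, sig_aper
           (positive uncertainties assumed) and an optional 'saturated' flag.
    """
    num = den = 0.0
    for b in bands:
        if b["f_auto"] <= 0 or b["f_aper"] <= 0 or b.get("saturated", False):
            continue  # only positive, unsaturated measurements contribute
        rel_var = (b["sig_auto"] / b["f_auto"]) ** 2 + (b["sig_aper"] / b["f_aper"]) ** 2
        w = 1.0 / rel_var
        num += w * b["f_auto"] / b["f_aper"]
        den += w
    return num / den if den > 0 else None

def offset_in_mag(o):
    """Express a multiplicative flux offset in magnitudes."""
    return -2.5 * math.log10(o)
```

Because the same factor multiplies the aperture fluxes of a source in every band, applying it changes the total flux scale but not the colors.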
3.1.4. IRAC Photometry
Photometry is performed on the Spitzer/IRAC channels 1 and 2 images using the IRACLEAN software (Hsieh et al. 2012). The infrared images of IRAC have a larger PSF (with FWHM between 1.6″ and 2.0″) compared to the optical data and are significantly affected by source confusion, which prevents reliable photometric extraction. To tackle this issue, IRACLEAN uses a high-resolution image (and its segmentation map) as a prior to identify the centroid and the boundaries of the source and iteratively subtract a fraction of its flux (“cleaning”) until it reaches some convergence criteria specified by the user. IRACLEAN works in the approximation that an IRAC source can be modeled as a scaled Dirac delta function convolved with the PSF.
For each source identified in the segmentation map, the software uses a box of fixed size as a filter in the low-resolution image to find the centroid and estimate the flux within a given (square) aperture. The PSF is convolved with a Dirac delta function with an amplitude equal to a fraction of that aperture flux and then subtracted from the image. Filtering and centroid positioning are executed within the object’s boundaries as defined by the prior high-resolution segmentation map. This procedure is repeated on the residual image produced by the previous iteration until the flux of the treated source becomes smaller than a specified threshold. In this case, a minimum signal-to-noise ratio of 2.5 is set so that an object will be considered completed once its aperture flux, compared to the background, becomes smaller than that value. This also implies that not all sources detected in the prior image will be extracted by IRACLEAN. Moreover, because the global sky background is recomputed at each iteration, the signal of a faint source—initially disregarded—may emerge from the background after several passes on the nearby objects. The iterative procedure of centroid positioning within the object's boundaries allows extended sources to be treated, and the fact that the flux is subtracted by convolving the PSF with a Dirac delta function centered on the centroid controls the contamination by neighbors. For more details on the workings of IRACLEAN, the reader is referred to Section 7 of Hsieh et al. (2012) and their Figure 16 for an example of residual images.
User-controlled parameters are the threshold below which to stop cleaning, the filtering box size, the square aperture to measure IRAC flux, and the fraction of flux to subtract at each iteration. In this configuration, a box of 7 × 7 pixels is adopted to filter and to find the centroid, and a square aperture of 9 × 9 pixels to estimate the aperture flux; the fraction of flux subtracted for each cleaning step is 20%. The final flux of each object is the sum of the fluxes subtracted at each step. Because the centroid position is allowed to change at every iteration, the source is eventually modeled by a combination of Dirac delta functions that are not necessarily centered at the same point. The flux error is computed using the residual map by measuring the fluctuations in a local area around the object.
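The cleaning loop can be illustrated with a one-dimensional toy version. This is a sketch of the general CLEAN-style iteration under simplifying assumptions, not the IRACLEAN code: the centroid is taken as the brightest residual pixel inside the prior boundaries (IRACLEAN uses a filtering box), the background is not recomputed, and all names are hypothetical.

```python
def clean_source(image, psf, bounds, noise, gain=0.2, half_aper=4,
                 snr_min=2.5, max_iter=1000):
    """Iteratively subtract a scaled PSF from a 1D image.

    image:  list of pixel values; psf: odd-length list summing to 1.
    bounds: (lo, hi) inclusive pixel range of the source from the prior map.
    noise:  uncertainty of the aperture flux.
    Returns the accumulated flux and the residual image.
    """
    residual = list(image)
    half_psf = len(psf) // 2
    lo, hi = bounds
    total = 0.0
    for _ in range(max_iter):
        # Centroid: brightest residual pixel within the prior boundaries.
        peak = max(range(lo, hi + 1), key=lambda i: residual[i])
        a0 = max(0, peak - half_aper)
        a1 = min(len(residual), peak + half_aper + 1)
        aper_flux = sum(residual[a0:a1])
        if aper_flux / noise < snr_min:
            break  # remaining signal is below the stopping threshold
        step = gain * aper_flux
        for k, p in enumerate(psf):
            i = peak - half_psf + k
            if 0 <= i < len(residual):
                residual[i] -= step * p  # delta function convolved with the PSF
        total += step
    return total, residual
```

With a gain of 20%, the residual signal of an isolated source decreases geometrically, and the sum of the subtracted steps approaches the total flux until the stopping threshold is reached.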
This implementation adopts the high-resolution izYJHKs detection image and its segmentation map produced by SExtractor. In order to parallelize the processing of the images, a mosaic of 14 × 14 tiles is made with a
overlap in each direction. The PSF is modeled on a grid with spacing of 29″ across the full IRAC image in order to take into account its spatial variation using the software PRFMap (A. Faisst 2019, private communication). When modeling the PSF at each grid point, the code takes into account that the final IRAC mosaic is made of multiple overlapping frames that can have different orientations with a PSF that is not rotationally symmetric. PRFMap models the PSF in each of the frames that overlap at a grid point and stacks them to produce the PSF model of the mosaic at that location. IRACLEAN thus provides photometry in channels 1 and 2 for more than a million sources over the whole field.
3.2. The Farmer Catalog
3.2.1. Source Detection
The source detection step is entirely equivalent to the procedure adopted for the Classic catalog. The Farmer utilizes the SEP code (Barbary 2016) to provide source detection, extraction, and segmentation, as well as background estimation with near-identical performance to classical SExtractor. SEP is preferred because, like The Farmer, it is written in Python and hence is readily integrated into the existing workflow.
The detection parameters are configured identically between SExtractor and SEP where possible. Crucially, given that model-based photometry from The Farmer cannot be readily applied to saturated bright stars and sources contaminated by stellar halos, the HSC PDR2 bright-star masks are adopted a priori to ensure the reliability of the derived photometry (see Section 2.7). Photometric extraction with The Farmer for COSMOS2020 is limited to the UltraVISTA footprint as this area contains all the bands used in the detection image, which are used by The Farmer to construct galaxy models. Including areas that lack complete izYJHKs coverage introduces undesirable inhomogeneities to the model constraints and hence may adversely change the selection function. Photometry of sources within the HSC bright-star masks is also not attempted with The Farmer as the halo light and the saturated stars are difficult to account for in a model, resulting in poor measurements and exponentially longer computational times. While there are 964,506 sources in the entire The Farmer catalog, only 816,944 sources lie within the UltraVISTA footprint but outside the conservative HSC bright-star halo masks. This is marginally larger than the number of sources detected in the Classic catalog (difference ∼3%). Of these, ∼95% have counterparts in the Classic catalog within 0.6″. Conversely, virtually all (>99%) Classic catalog sources have counterparts in The Farmer catalog within the same radius over the same area. Generally, sources only included in The Farmer catalog are concentrated around unmasked bright-star halos and their diffraction spikes (further underscoring the need for accurate a priori masking); these are unlikely to possess well-fit models and so are easily flagged. Some, however, appear to be the result of comparably more accurate deblending of nearby sources by SEP. Given the ability to easily identify nonphysical detections, this is advantageous because two blended sources will not be well fit by models unless they are identified as separate objects at detection. This will be further discussed in the context of The Farmer in J. R. Weaver et al. (2022, in preparation).
Once sources are detected, The Farmer identifies crowded regions with multiple nearby sources that, although deblended at detection (i.e., have their own centroids), may have some overlapping flux which must be separated by the models. Hence, to avoid double-counting flux and to achieve the most robust modeling possible, these sources are modeled simultaneously. Such crowded regions are identified by dilating the source segmentation map, which assigns pixels to sources, in order to form groups of sources defined by contiguous dilated pixels. Sources that are not in crowded areas are expected to be a group of one source, whereas sources in crowded regions end up as members of larger groups to be modeled together.
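The grouping step can be sketched as a binary dilation of the segmentation map followed by a connected-component search. The code below is an illustrative pure-Python version with a square dilation kernel and 4-connectivity; both choices, the dilation size, and the function name are assumptions, not the settings used by The Farmer.

```python
from collections import deque

def group_sources(segmap, dilate=1):
    """Group source ids whose dilated footprints are contiguous.

    segmap: 2D list of ints (0 = background, >0 = source id).
    Returns a list of sets of source ids, one set per group.
    """
    ny, nx = len(segmap), len(segmap[0])
    # Binary dilation of the source footprints with a square kernel.
    mask = [[False] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            if segmap[y][x] > 0:
                for yy in range(max(0, y - dilate), min(ny, y + dilate + 1)):
                    for xx in range(max(0, x - dilate), min(nx, x + dilate + 1)):
                        mask[yy][xx] = True
    # Breadth-first search over contiguous dilated pixels.
    seen = [[False] * nx for _ in range(ny)]
    groups = []
    for y in range(ny):
        for x in range(nx):
            if not mask[y][x] or seen[y][x]:
                continue
            members, queue = set(), deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                if segmap[cy][cx] > 0:
                    members.add(segmap[cy][cx])
                for py, px in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= py < ny and 0 <= px < nx and mask[py][px] and not seen[py][px]:
                        seen[py][px] = True
                        queue.append((py, px))
            groups.append(members)
    return groups
```

An isolated source yields a group of one, whereas neighbors whose dilated footprints touch are returned together and would be modeled simultaneously.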
3.2.2. PSF Creation
In contrast with the PSF-homogenization strategy employed in the Classic catalog for all optical and NIR bands, The Tractor does not operate on images that are PSF-homogenized. Because the models it uses are purely parametric, The Tractor can simply convolve a given model with the PSF of a given band, which is generally a more tractable operation than PSF homogenization. The approach to generate PSFs for The Farmer catalog is similar to that of Classic, using spatially constant PSFs for the broad bands and spatially varying PSFs for the Subaru medium bands and IRAC bands.
A spatially constant PSF is computed for u, u*, as well as all HSC and UltraVISTA bands with PSFEx. Point-source candidates are selected as described in Section 3.1.2. Because models are sensitive to the wings of sources, The Farmer benefits from particularly large PSF renderings. Typical unsaturated point sources in optical and NIR images in this work are well described by PSF stamps generated with 201 pixel diameters (30.15″).
Another consideration, introduced for the Classic catalog in Section 3.1.2, is the highly variable PSF of the Suprime-Cam medium bands. Although The Farmer does not use any kind of PSF-homogenization procedure and hence cannot overcome this variability in the same way as for the Classic catalog, it is still possible to overcome highly variable PSFs in model-based photometry by providing a particular PSF to a group of sources, similar to PRFMap, which produces a theoretical PSF sampled over a fixed grid. However, this exact approach cannot be readily replicated for other bands, as there is a lack of sufficient theoretical PSFs for the Subaru medium bands. Instead, a spatial grid is constructed using the PSF FWHM measured from a sample of point-like sources nearest to each grid point. The FWHM distribution is then discretized to form a set of PSFs at a gauge small enough to provide accurate PSFs for each grid point while maintaining the spatial sampling required to describe the variations across the field. Hence, for each medium band a 20 × 20 grid consisting of 10 PSFs is built with a typical resolution of less than a tenth of a pixel. Then, for a particular group of sources The Farmer provides the nearest PSF sample to be used in the forced photometry modeling.
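The grid-based PSF assignment can be illustrated as follows. The sketch assumes equal-width FWHM bins and a simple nearest-neighbor lookup; the text does not state how the FWHM distribution is discretized, so both choices, as well as the function names, are assumptions for this example.

```python
def discretize_fwhm(fwhm_values, n_bins=10):
    """Assign each FWHM value to one of n_bins equal-width bins (0-based index)."""
    lo, hi = min(fwhm_values), max(fwhm_values)
    width = (hi - lo) / n_bins or 1.0  # guard against identical values
    return [min(int((f - lo) / width), n_bins - 1) for f in fwhm_values]

def nearest_grid_psf(x, y, grid_points):
    """Return the PSF identifier of the grid point closest to (x, y).

    grid_points: list of (gx, gy, psf_id) tuples.
    """
    return min(grid_points, key=lambda g: (g[0] - x) ** 2 + (g[1] - y) ** 2)[2]
```

In use, each grid point would carry the bin index of its locally measured FWHM, and a group of sources would be modeled with the PSF attached to its nearest grid point.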
Lastly, for IRAC, The Farmer employs PRFMap to provide a spatially varying PSF to each group of sources based on their nearest PRF sampling point, consistent with the IRACLEAN procedure described in Section 3.1.4. The PSFs are then resampled to match the 0.15″ pixel scale of the mosaics.
3.2.3. Model Determination
Details of the model determination procedure will be found in J. R. Weaver et al. (2022, in preparation). This is a brief summary. The Farmer employs five discrete models to describe resolved and unresolved, stellar and extragalactic sources:
- 1. PointSource models are taken directly from the PSF used. They are parameterized by flux and centroid position and are appropriate for unresolved sources.
- 2. SimpleGalaxy models use a circularly symmetric, exponential light profile with a fixed 0.45″ effective radius such that they describe marginally resolved sources and mediate the choice between a PointSource and a resolved galaxy model. They are parameterized also by flux and centroid position.
- 3. ExpGalaxy models use an exponential light profile. They are parameterized by flux, centroid position, effective radius, axis ratio, and position angle.
- 4. DevGalaxy models use a de Vaucouleurs light profile. They are parameterized by flux, centroid position, effective radius, axis ratio, and position angle.
- 5. CompositeGalaxy models use a combination of ExpGalaxy and DevGalaxy models. They are concentric and hence share one centroid. There is a total flux parameter as well as a fraction of total flux parameter to distribute the flux between the two components. Components have their own effective radii, axis ratios, and position angles.
These five models form The Farmer's decision tree, whose goal is to both determine the most suitable model for a given source and provide an optimized set of parameters to describe the shape and position of the source. Unlike some other model-based photometric techniques, the models in The Tractor are purely parametric and hence do not require a high-resolution image stamp that must undergo PSF kernel convolution when photometering a different band. Although the exact implementation of the modeling can vary (e.g., choice of bands, library of models, etc.), for the present catalog The Farmer attempts to jointly model a group of nearby sources, using simultaneous constraints from each of the six individual izYJHKs bands used in the detection image. This ensures that the selection function is preserved by providing a model even for sources detected from one band.
The Farmer then uses its decision tree to select the most appropriate model type for each source in the group. The decision tree starts with unresolved or marginally resolved models (1, 2) and moves toward more complex, resolved ones (3, 4, 5). Each level of the decision tree assumes the same initial conditions, excepting that some sources may already be assigned a model type in the latter stages. The tree must be tuned according to the data being used. In this work, marginally resolved SimpleGalaxy models must achieve a lower
by a margin of 0.1 compared with an unresolved PointSource model, thereby preferring the PointSource model whenever possible. If either model achieves a
, then the next level is tried. If the ExpGalaxy and DevGalaxy models are indistinguishable by
or neither achieves a
, the most complex CompositeGalaxy is tried (see J. R. Weaver et al. 2022, in preparation, for more details). Once a model type has been assigned to each source, the final ensemble of models is reoptimized to ensure that the derived model parameters reflect the actual model ensemble. If instead the parameters were adopted during the initial stages of the decision tree, then it would be possible for one source that has not yet been fit with the appropriate model type to influence the parameters of another nearby source. By recomputing the model parameters at the very end, when all the model types have been assigned, this case is avoided.
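One possible reading of this decision tree is sketched below for a single source whose goodness-of-fit values are already known for each model type. The acceptance threshold `chi2_accept` and the indistinguishability margin `delta_indistinct` are hypothetical parameters with no default values, since their values are not given here; only the 0.1 margin for SimpleGalaxy is taken from the text. The sketch also assumes that CompositeGalaxy is adopted whenever it is tried, which is a simplification.

```python
def choose_model(chi2, chi2_accept, delta_indistinct, simple_margin=0.1):
    """Pick a model type from per-model goodness-of-fit values (lower is better).

    chi2: dict with keys PointSource, SimpleGalaxy, ExpGalaxy, DevGalaxy.
    """
    # Level 1: prefer PointSource unless SimpleGalaxy is better by the margin.
    if chi2["SimpleGalaxy"] < chi2["PointSource"] - simple_margin:
        first = "SimpleGalaxy"
    else:
        first = "PointSource"
    if chi2[first] < chi2_accept:
        return first
    # Level 2: resolved single-component models.
    exp, dev = chi2["ExpGalaxy"], chi2["DevGalaxy"]
    best = "ExpGalaxy" if exp <= dev else "DevGalaxy"
    distinguishable = abs(exp - dev) >= delta_indistinct
    if distinguishable and chi2[best] < chi2_accept:
        return best
    # Level 3: fall back to the most complex model.
    return "CompositeGalaxy"
```

In practice the fits would be evaluated level by level rather than all at once, and the final ensemble of models would be reoptimized after every source in the group has been assigned a type.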
An example of the modeling procedure is shown in Figure 7, whereby two models are jointly determined for two nearby sources using each of the individual izYJHKs bands, simultaneously. It is stressed that the models are not constructed on the detection image itself, whose PSF inhomogeneity makes it unsuitable for deriving morphologically sensitive model constraints. The i band is shown as it is the deepest high-resolution band in the detection image and hence provides the greatest constraints on the morphology. Forced photometry on IRAC channel 1 (see Section 3.2.4) is shown to demonstrate the extent to which the prior information derived jointly from izYJHKs can adequately model IRAC flux, even for the most severely blended sources that apertures cannot accurately photometer.
Figure 7. Demonstration of the model-fitting method from The Tractor. A pair of detected but overlapping sources is shown in the HSC i band (top). They are jointly modeled using The Farmer with constraints from each of the izYJHKs images in order to provide a parameterized solution that is suitably optimized and from which the total flux is measured. The same pair of sources is shown in the less resolved IRAC channel 1 (bottom), where the two models are convolved with the channel 1 PSF and reoptimized using the channel 1 image to measure the flux contributed by each source. The extremely blended nature of this pair is underscored by the overlapping 2″ apertures, consistent with the methodology of the Classic catalog. Pixel values are logarithmically scaled between the rms level and 95% of the peak flux per pixel.
3.2.4. Forced Photometry
With the model catalog complete for all detected sources, The Farmer can measure total model fluxes for every band of interest. The Farmer does this in a “forced photometry” mode, similar to the “dual-image” mode in SExtractor. In brief, the model catalog of a given group is initialized with the optimized parameters from the preceding stage. For each band, model centroids are allowed to vary with a strict Gaussian prior of 0.3 pix to prevent catastrophic failures. By doing so, The Farmer can overcome subtle offsets in astrometric frames between different images, and this can be done on an object-by-object basis to even overcome spatially varying offsets that may arise due to bulk flows in the astrometry. The optimization of these models produces total fluxes and flux uncertainties for each band of interest, keeping the shape parameters fixed. The flux measurement is obtained directly from the scaling factor required to match the models, which are normalized to unity, to the source in question. However, the flux uncertainties are derived by computing a quadrature sum over the weight map, weighted by the unit profile of the model, producing a similar result to traditional aperture methods but where the model profile is used in place of a fixed aperture. The weight maps are the same as those used by Classic. Importantly, the flux uncertainties reported in The Farmer catalog are not corrected with empty apertures, in contrast with the Classic catalog (see Section 3.1.3). The aperture-derived procedure used in Classic is inappropriate for model-based photometry, and although it may be expected that model-based methods would produce more precise measurements, they may still underestimate the true extent of correlated noise in the images and hence underestimate the uncertainty. This will be further discussed in J. R. Weaver et al. (2022, in preparation) and briefly evaluated later in Section 5.3 in terms of photometric redshift precision.
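The flux and uncertainty estimates described above can be illustrated with a minimal sketch for a single, isolated source. It assumes the standard weighted least-squares solution for a linear scaling of a unit-normalized profile, with the centroid and shape held fixed; The Farmer itself fits groups of sources jointly and lets centroids vary under a prior, so this is not its implementation. The function name and the flat-list pixel layout are hypothetical.

```python
import math


def forced_flux(image, weight, profile):
    """Best-fit flux and 1-sigma uncertainty for one isolated source.

    image   : background-subtracted pixel values
    weight  : inverse-variance (1/sigma^2) weight-map values
    profile : PSF-convolved model per pixel, normalized to sum to 1
    All three are flat sequences of equal length (hypothetical layout).
    """
    num = sum(w * p * d for d, w, p in zip(image, weight, profile))
    den = sum(w * p * p for w, p in zip(weight, profile))
    if den <= 0:
        raise ValueError("no valid pixels under the model profile")
    flux = num / den                 # scaling factor applied to the unit model
    flux_err = 1.0 / math.sqrt(den)  # profile-weighted sum over the weight map
    return flux, flux_err


# Toy example: a source of flux 10 observed without noise.
flux, flux_err = forced_flux(
    image=[2.5, 5.0, 2.5], weight=[4.0, 4.0, 4.0], profile=[0.25, 0.5, 0.25]
)
```

Pixels under the peak of the profile dominate both sums, which is why the result resembles an aperture measurement in which the model profile replaces the fixed aperture.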
Photometry is performed with The Farmer for all CFHT, HSC, VISTA, and IRAC bands, as well as the Suprime-Cam intermediate bands. As such, there are two main differences with respect to Classic. First, the older Suprime-Cam broad bands suffer from high spatial PSF variability, which is resolved in the Classic catalog by PSF-homogenizing each tile (see Section 3.1.3). However, this cannot be done for profile-fitting methods like The Farmer that do not operate on PSF-homogenized images. Combined with the fact that these broad bands are eclipsed by deeper imaging from HSC in almost all cases, they contribute very little to improving photo-z precision and can indeed even decrease accuracy if the PSF variability is not properly controlled. For these reasons the Suprime-Cam broad bands are only used when deriving photo-z’s from the Classic photometry using LePhare, as described in Section 5. Second, photometry for IRAC channels 3 and 4 is performed with The Farmer to extend the wavelength baseline. This is largely due to the significantly lower computational cost of The Farmer relative to IRACLEAN. Although these channels are relatively shallow, in limited cases they can help place constraints on the rest-frame optical emission of potentially high-z sources. Details as to precisely which bands are available with each catalog can be found in associated README files.
3.2.5. Advantages and Caveats
An important distinction between the two catalogs is that The Farmer provides total fluxes natively, without the need to correct for aperture sizes or perform PSF homogenization. Because this advantage can be leveraged over different resolution regimes, The Farmer computes photometric measurements that are self-consistent. Additional metrics are also readily available from The Farmer. This includes the goodness-of-fit reduced $\chi^2$ estimate computed for the best-fit model of each source on a per-band basis, obtained by dividing the χ2 value by the number of degrees of freedom, i.e., the pixels belonging to the segment for each source minus the number of fitted parameters. Measurements of source shape are provided for resolved sources, and as such they yield estimates of effective radii, axis ratios, and position angles. These measurements are directly fitted in The Farmer, unlike in SExtractor where they are estimated from moments of the flux distribution. Uncertainties on shape parameters are delivered as well, because each shape parameter is fitted by likelihood maximization rather than calculated directly. Likewise, centroids for both the modeling and forced photometry stages are also fitted parameters and are delivered with associated uncertainties.
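As a minimal sketch of the reduced χ2 definition given above, the following assumes that the inputs are already restricted to the pixels of the source segment and that the per-pixel χ2 contribution is the weighted squared residual. The function and argument names are illustrative and are not taken from The Farmer.

```python
def reduced_chi2(image, model, weight, n_params):
    """Reduced chi-squared of a model over the pixels of one source segment.

    image, model : pixel values of the data and of the best-fit model
    weight       : inverse-variance weight per pixel
    n_params     : number of fitted parameters
    """
    chi2 = sum(w * (d - m) ** 2 for d, m, w in zip(image, model, weight))
    dof = len(image) - n_params  # segment pixels minus fitted parameters
    if dof <= 0:
        raise ValueError("not enough pixels for the number of fitted parameters")
    return chi2 / dof


# Five pixels, one fitted parameter, one pixel off by 0.5 sigma-units squared.
value = reduced_chi2(
    image=[1.0, 2.0, 3.0, 4.0, 5.0],
    model=[1.0, 2.0, 3.0, 4.0, 4.0],
    weight=[4.0] * 5,
    n_params=1,
)
```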
Another important consideration is that given the diversity of galaxy shapes and source crowding across ultradeep imaging, it is inevitable that a model, or group of models, will fail to converge. Often it is due to either a bright, resolved source not being well described by smooth light profiles, an extremely dense group of sources, or a failure at detection to separate nearby sources (and hence assign the correct number of models to use), or a combination of all three. This problem is endemic to these methods and one that cannot be practically solved by manually tuning each fit, nor at this time by selecting tuning parameters based on statistics, which are unlikely to be effective in the most ill-conditioned cases. Thankfully, as in SExtractor, which indicates failures by a combination of Boolean flags, model-based photometry can also be accompanied by a flag to indicate a failure to converge. Importantly, for those that do converge, however, model-based methods can provide more information about untrustworthy measurements than any aperture-based method by leveraging the statistical properties of the residual pixel distribution (e.g., χ2 and other χ-pixel statistics) to precisely indicate the extent of these failures, and hence convey in comparably greater detail the extent to which the user can rely on any given measurement.
4. Photometry Comparison
With the photometry from the two independent methods in hand, this section presents a comparison of the photometric catalogs as measured by differences in magnitudes, colors, and photometric uncertainties. In addition, a comparison is made with literature results of galaxy number counts. The primary motivation for these tests is to validate the two catalogs, in particular the performance of the relatively newer photometry from The Tractor generated with The Farmer. The performance of The Tractor code has been demonstrated previously (see Lang et al. 2016), hence this work focuses on additional validation of the performance particular to The Farmer configuration used here. Additional validation of The Farmer, where its performance is benchmarked against simulated galaxy images, is provided in J. R. Weaver et al. (2022, in preparation).
A matched sample of sources common to both The Farmer and Classic is constructed consisting of 854,734 sources matched within 0.″6, for which The Farmer obtained a valid model and hence has extracted photometry. The sample contains 95.8% of valid The Farmer sources, most of which are matched well below 0.″6. As explained in Section 3.2.1, those that are unmatched are typically marginally detected sources or blends that are deblended by only one of the detection procedures.
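The positional match can be sketched as a nearest-neighbor search within a fixed radius. The text does not specify the matching algorithm, so the one-directional, brute-force search below is an assumption made for illustration; a catalog of this size would in practice call for a spatial index. Coordinates are taken to be (R.A., decl.) pairs in degrees, and all names are hypothetical.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); degrees in, arcsec out."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0


def match_catalogs(cat_a, cat_b, radius_arcsec=0.6):
    """Return (i, j, sep) for each source i of cat_a whose nearest
    neighbor j in cat_b lies within radius_arcsec."""
    matches = []
    for i, (ra, dec) in enumerate(cat_a):
        best_j, best_sep = None, float("inf")
        for j, (rb, db) in enumerate(cat_b):
            sep = angular_sep_arcsec(ra, dec, rb, db)
            if sep < best_sep:
                best_j, best_sep = j, sep
        if best_j is not None and best_sep <= radius_arcsec:
            matches.append((i, best_j, best_sep))
    return matches


# Two sources per catalog; only the first pair lies within 0.6 arcsec.
cat_a = [(150.0, 2.0), (150.1, 2.1)]
cat_b = [(150.0, 2.0 + 0.3 / 3600.0), (150.2, 2.2)]
matches = match_catalogs(cat_a, cat_b)
```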
4.1. Magnitudes
A comparison of broadband magnitudes derived independently with the two methods is shown in Figure 8. One medium band is included for reference. Here the rescaled 2″ total aperture magnitudes are used to compare with the model magnitudes from The Farmer. The comparison is limited only to sources brighter than the 3σ depth as reported in Table 1 and indicated by the vertical dashed lines. For bands not included in the detection CHI_MEAN, these depths are upper bounds. The quadrature-combined ±3σ and ±1σ uncertainty envelopes on ΔMag, computed by the quadrature addition of the photometric uncertainties from both catalogs, are shown for reference by the gray dotted curves.
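The uncertainty envelope on ΔMag can be written compactly as below. This sketch treats the uncertainties of the two catalogs as independent, which is an approximation since both are measured on the same images; $\sigma_{\rm F}$ and $\sigma_{\rm C}$ are labels introduced here for the per-source magnitude uncertainties from The Farmer and Classic, respectively.

```latex
\sigma_{\Delta{\rm Mag}} = \sqrt{\sigma_{\rm F}^{2} + \sigma_{\rm C}^{2}},
\qquad
|\Delta{\rm Mag}| \le n\,\sigma_{\Delta{\rm Mag}}, \quad n = 1,\,3 .
```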
Figure 8. Summary of the difference between broadband magnitudes measured by The Farmer and Classic catalogs, ΔMag. Magnitudes for Classic are the rescaled 2″ total magnitudes. For UltraVISTA, sources in both the ultradeep and deep regions are shown. Agreement for individual sources is shown by the underlying density histogram, which is described by the overlaid median binned by 0.2 AB with an envelope containing 68% of points per bin (solid line and shaded area). 1σ and 3σ photometric uncertainty estimates on ΔMag are indicated by the gray dotted curves. The 3σ depths measured with 3″ diameter apertures as reported in Table 1 are shown by vertical dashed lines. The median Δmagnitudes for sources brighter than the depth limit are reported in each panel.
In general, there is excellent agreement between the photometric measurements from the two methods. As shown in Figure 8, the median systematic difference taken over all magnitudes is typically below 0.1 mag in all bands and in some cases is noticeably smaller. If one were to remove this systematic median difference, then the remaining median differences in each magnitude bin would, for all bands, lie within the 3σ uncertainty threshold expected given the stated photometric uncertainties. In other words, the two sets of photometry are consistent within the expected uncertainties. The largest median differences occur for the faintest sources, but in most cases this is found to be ≲0.25 mag, which is on the order of the expected uncertainty at these magnitudes. There is also noticeably low scatter between the measurements, as illustrated by the tight 68% range envelopes about the medians. In most cases, the 68% range envelope on the median spans the same range as the expected ±1σ uncertainty envelope, the coincidence of which provides the first evidence validating the photometric uncertainties, discussed in full later in this section. Hence, it is established by multiple quantitative means that the two photometric measurements are broadly consistent.
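The running median and 68% envelope used in this comparison can be sketched as follows. The 0.2 mag bin width follows the caption of Figure 8; the 16th and 84th percentiles as envelope bounds, the minimum number of sources per bin, and the function names are assumptions made for illustration.

```python
import math


def percentile(sorted_vals, q):
    """Linearly interpolated percentile q (0-100) of a sorted, non-empty list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def binned_median(mag, dmag, bin_width=0.2, min_count=5):
    """Median and 16th/84th percentiles of dmag in bins of mag.

    Returns a list of (bin_center, median, p16, p84) tuples.
    """
    bins = {}
    for m, d in zip(mag, dmag):
        bins.setdefault(math.floor(m / bin_width), []).append(d)
    rows = []
    for k in sorted(bins):
        vals = sorted(bins[k])
        if len(vals) < min_count:
            continue  # skip sparsely populated bins
        rows.append(((k + 0.5) * bin_width,
                     percentile(vals, 50),
                     percentile(vals, 16),
                     percentile(vals, 84)))
    return rows


# Two populated bins with different median offsets.
mag = [20.05] * 5 + [20.25] * 5
dmag = [-0.2, -0.1, 0.0, 0.1, 0.2] + [0.0, 0.1, 0.2, 0.3, 0.4]
rows = binned_median(mag, dmag)
```

Comparing the width of the resulting envelope with the expected ±1σ envelope in each bin is the check described in the text.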
A closer inspection, however, reveals a minor second-order curvature observed in all comparisons (including IA484) at the threshold where sources become unresolved in our ground-based NIR detection images, around ∼24.5 mag. At these magnitudes, photometry from The Farmer tends to be slightly fainter than that reported by SExtractor (or IRACLEAN for channel 1 and channel 2). However, these differences are generally very small and by median estimate are within the 3σ uncertainties for all bands. The fact that these features occur around the magnitude of each band where increasingly fainter sources are more likely to be point sources may suggest that these sources are inadequately modeled because The Farmer chose a resolved model for a point source or conversely an unresolved model for a resolved source. If a resolved source is fitted with an unresolved model, then the flux may be underestimated. Differences (in bands other than IRAC) may also arise from imperfections in rescaling the 2″ apertures to total fluxes, compared to the native total fluxes obtained with The Tractor. This is particularly relevant given the high density of sources that can lead to inaccurate estimates of object size, consequently producing inaccurate total flux measurements.
Regarding the IRAC photometry, which was obtained in both instances by profile-fitting techniques, discrepancies for faint sources cannot arise from aperture corrections. However, whereas IRACLEAN performs iterative subtraction of the PSF until convergence and sums all of the flux that has been subtracted, The Farmer solves for the flux as a model parameter without iterative subtraction. Yet, there is no evidence that any residual flux remaining from The Farmer fitting is significant enough to explain the observed discrepancy. Another potential difference that might explain the trend with brightness is that IRACLEAN performs iterative local background subtraction whereas The Farmer performs a static background subtraction before performing photometry. However, it remains unclear which methodology is more accurate. The cause of this discrepancy can only be established definitively through simulation and is hence addressed in detail in J. R. Weaver et al. (2022, in preparation).
4.2. Colors
A comparison of six colors that contribute significantly to constraining an SED is shown in Figure 9. In a similar fashion to the previous comparison, the distributions are described with a running median and 68% range up to the nominal 3σ depth, which is averaged for the two bands of interest. The expected ±3σ and ±1σ uncertainty thresholds on Δcolor, computed by the quadrature addition of the color uncertainties for each catalog, are shown by the gray dotted curves.
Figure 9. Comparison of broadband colors between The Farmer and Classic catalogs, Δcolor. The Farmer magnitudes of the first color term in each panel are shown on the x-axis. Colors for individual sources are shown by the underlying density histogram, which is described by the overlaid median binned by 0.2 AB with a 68% confidence interval. 1σ and 3σ photometric uncertainty estimates on the colors are indicated by the gray dotted curves, and the mean 3σ depths computed from both bands of interest and measured with 3″ diameter apertures as reported in Table 1 are shown by vertical dashed lines. The median Δcolor for sources brighter than this depth is reported in each panel.
There is excellent agreement in colors, in some cases well beyond the level of agreement achieved between individual bands. The median difference in Δcolor is below 0.1 mag for all colors, with the best agreement seen for u* − g, g − r, and r − z. Indeed, there is a lack of systematic difference in color, and the observed scatter is well below the 1σ uncertainty expected for the color difference. The remaining panels show some level of systematic disagreement, which is significant for bright sources. However, colors for faint sources are statistically consistent as they lie within the ±1σ thresholds on the color uncertainty. This may be helped by the fact that the Classic catalog does not require aperture-to-total rescaling to compute colors, thereby eliminating any relevant uncertainties present when comparing magnitudes only. In general, there is no evidence for a significant systematic difference in colors obtained by the two methods. Second-order curvatures are only visible at the faintest magnitudes and are not significant even at the 1σ level after correcting for median shifts. The most significant deviation in color shown here is Ks − ch1, which features a relatively large systematic offset for bright sources and a strong second-order curvature for faint sources whereby The Farmer obtains systematically bluer colors. Given that Ks magnitudes are well matched between the two catalogs, this discrepancy in color must originate from the disagreement in faint IRAC channel 1 fluxes demonstrated in Figure 8. However, after correcting for the systematic median offset, the median curvature of the Ks − ch1 lies between the 3σ color uncertainty thresholds.
4.3. Photometric Uncertainties
One critically important aspect to compare is photometric uncertainties. The uncertainties from SExtractor are measured by quadrature summation of the 1/σ2 inverse-variance per-pixel (i.e., weight) map corresponding to the aperture on the source in the image. In contrast, The Tractor reports minimum variance estimates on the photometric uncertainty, although still using the same weight map. The Tractor computes flux uncertainties by a quadrature summation of weight map pixels, weighted by the unit-normalized model profile, which for point sources is simply the PSF. This thereby prioritizes the per-pixel uncertainty directly under the peak of the model profile and places less weight on the per-pixel uncertainty near the edges of the model.
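The two estimators can be contrasted in a short sketch. The notation is introduced here and is not taken from either code: $w_i$ is the inverse-variance weight of pixel $i$, $A$ is the set of pixels inside the aperture, and $p_i$ is the unit-normalized model profile. The second expression is the standard minimum-variance result for a single isolated source and is assumed, not confirmed, to correspond to what The Tractor reports.

```latex
\sigma_{\rm ap}^{2} = \sum_{i \in A} \frac{1}{w_i},
\qquad
\sigma_{\rm model}^{2} = \left( \sum_{i} w_i\, p_i^{2} \right)^{-1},
\qquad
\sum_{i} p_i = 1 .
```

For uniform noise, $w_i = 1/\sigma^2$, these reduce to $\sigma_{\rm ap} = \sigma \sqrt{N_{\rm ap}}$ and $\sigma_{\rm model} = \sigma / \sqrt{\sum_i p_i^2}$, so the model profile acts as an effective aperture of $1/\sum_i p_i^2$ pixels.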
Figure 10 shows a comparison of magnitude uncertainties between The Farmer and the Classic catalogs. Unlike the magnitude and color comparisons, the sources that constitute this particular comparison are not matched between catalogs. They are however restricted to sources within the UltraVISTA area and clear of stellar halos indicated by the HSC bright-star masks.
Figure 10. Growth of photometric uncertainties as a function of magnitude. The colored curves indicate the distributions for individual sources in The Farmer catalog, described by the running median and a tight envelope containing 68% of sources. The gray curves represent the median growth of uncertainty for the total magnitudes in the Classic catalog derived from 2″ aperture photometry, shown by the dashed and dotted curves for the uncorrected and corrected uncertainties, respectively. The 3σ depths measured with 3″ diameter apertures as reported in Table 1 are shown by vertical dashed lines. The two curves shown for each band in YJHKs are due to different depths of the deep and ultradeep regions.
The distributions of magnitude uncertainties as a function of magnitude as measured by The Farmer for the primary broad bands, as well as a medium band for reference, are shown by colored binned medians with an envelope enclosing 68% of sources per bin. The uncertainties in the UltraVISTA bands grow more quickly for the deep region compared to the ultradeep region, and hence they are visualized here separately. The greatest differences between the rate of growth of uncertainties can be seen most noticeably for the Y, J, and H bands which feature the greatest difference in depth (see Table 1). Ks does not feature a significantly different growth rate between the deep and ultradeep regions due to the near-homogeneous coverage in DR4, a fact that will be useful when determining the mass completeness of the catalog.
For comparison, binned medians on the uncorrected magnitude uncertainties from the Classic catalog are indicated by the gray dashed curves. As described in Section 3.1.3, the uncertainties for most bands were then corrected using empty apertures and are indicated by the gray dotted curves. The exception is IRAC, where the uncertainties for Classic are computed with IRACLEAN (see Section 3.1.4). As with The Farmer, the magnitude uncertainties for the UltraVISTA bands are split by depth. The faster growing curve is from the deep region, and the slower is from the ultradeep region.
Photometric uncertainties smoothly and monotonically increase for fainter sources. For The Farmer, there is no evidence for discontinuities related to the transition between the resolved and unresolved regimes. There is, however, a difference between the magnitude uncertainties in that those measured with SExtractor and corrected are always larger than those from The Farmer for all bands except IRAC, where IRACLEAN was used. Yet in the case of the initial, uncorrected SExtractor uncertainties, this difference is much smaller. Moreover the two sets of uncertainties are in better agreement in the bluest bands (e.g., u, u*, and HSC) where the spatial resolution is generally better than in the UltraVISTA bands. The opposite is true when comparing IRAC photometry, whereby The Farmer reports larger uncertainties than IRACLEAN. However, a noticeable level of consistency is achieved by The Farmer in that uncertainties from IRAC are similar to those from UltraVISTA, which should be expected given the similarity in the depths reported in Table 1. This consistency is not present in the Classic catalog, due to the difference between the methods of extraction from UltraVISTA and IRAC images.
Given that the photometric uncertainties measured with The Farmer are intrinsically linked to the underlying weight map, it is possible to quantify the internal consistency of these uncertainties using the reduced $\chi^2$ statistic, described in Section 3.2.5. In general, reduced $\chi^2$ values are roughly unity for all bands. While this provides one measure of internal consistency, both the uncertainties reported by The Farmer and the χ2 statistics fail to take into account pixel covariance, which may be quite large, particularly in the lower-resolution UltraVISTA mosaics, which have been upsampled from their native 0.″34 per pixel to 0.″15 per pixel. It is then reasonable to conclude on this basis that although the uncertainties provided by The Farmer may be underestimated, they are indeed internally consistent with measurements that likewise ignore correlated noise, such as χ2, and are in general suitable for use in SED fitting. Additional correction of the photometric uncertainties from both The Farmer and Classic catalogs appropriate for SED fitting is discussed further in Section 5.
4.4. Galaxy Number Counts
The galaxy number counts measured in COSMOS2020 are now compared to measurements in the literature. Figure 11 shows the galaxy number counts measured for bands on the reddest and bluest ends of the CHI_MEAN detection image, namely Ks (left panel) and i (right panel). The star–galaxy classification is adopted from the photometric redshift code LePhare, as described in Section 5.1, and is carried out similarly for both catalogs.
Figure 11. i- and Ks -band galaxy number counts of the izYJHKs -detected galaxies in the UltraVISTA ultradeep and deep regions, compared to a selection of literature measurements, including previous COSMOS catalogs. The bins follow increments of 0.5 mag, with the exception of Fontana et al. (2014), who use 0.25 mag.
The effective area of The Farmer catalog is smaller than that of the Classic as photometry is not returned in the case of model failure with The Tractor, most often due to the presence of unexpected bright stars or large resolved galaxies, which cannot be adequately modeled with one of the assumed smooth galaxy profiles (see Section 3.2.5). In this case the effective survey area is corrected by subtracting the area occupied by sources for which a model is not available. Galaxy counts from COSMOS2015 are included for the deep and ultradeep regions as the detection and photometry are equivalent to the Classic approach. The left panel of Figure 11 shows the izYJHKs -detected Ks -band galaxy number counts computed over the 0.812/0.757 deg2 of the HSC-masked ultradeep region of UltraVISTA and over the 0.592/0.536 deg2 of the deep region as measured by photometry from both the Classic and The Farmer catalogs, respectively (see the corresponding README file for most up-to-date areas). There is good agreement with previous studies both within COSMOS (McCracken et al. 2012; Laigle et al. 2016) and from other surveys (Aihara et al. 2011; Bielby et al. 2012; Fontana et al. 2014) over the regime where comparison is possible. The counts from both COSMOS2020 catalogs are in excellent agreement. The Farmer counts have slightly better completeness, which may be due to the larger number of deblended sources at faint magnitudes. Notably, the COSMOS2020 completeness limit is ∼1 mag deeper compared to COSMOS2015, which is due to a combination of both deeper infrared data and a much deeper detection image.
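A minimal sketch of how logarithmic number counts per unit magnitude and area may be computed, including the effective-area correction described above, is given below. The function name, the default magnitude range, and the way the lost area is supplied are assumptions; in practice the lost area would be measured from the segments of sources without a valid model.

```python
import math


def log_number_counts(mags, area_deg2, lost_area_deg2=0.0,
                      bin_width=0.5, m_min=19.0, m_max=27.5):
    """Return (bin_center, log10 N) with N in mag^-1 deg^-2.

    area_deg2      : masked survey area
    lost_area_deg2 : area occupied by sources lacking a valid model
    """
    eff_area = area_deg2 - lost_area_deg2
    if eff_area <= 0:
        raise ValueError("effective area must be positive")
    n_bins = int(round((m_max - m_min) / bin_width))
    counts = [0] * n_bins
    for m in mags:
        k = int(math.floor((m - m_min) / bin_width))
        if 0 <= k < n_bins:
            counts[k] += 1
    result = []
    for k, n in enumerate(counts):
        if n == 0:
            continue  # the logarithm is undefined for empty bins
        center = m_min + (k + 0.5) * bin_width
        result.append((center, math.log10(n / (bin_width * eff_area))))
    return result


# 50 and 100 galaxies in two adjacent bins over 2.0 deg^2 of effective area.
counts = log_number_counts([19.1] * 50 + [19.6] * 100,
                           area_deg2=2.5, lost_area_deg2=0.5)
```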
Similarly, the right panel of Figure 11 shows the izYJHKs -detected i-band galaxy number counts computed over the entire 1.403/1.234 deg2 of the HSC-masked UltraVISTA region for the Classic and The Farmer catalogs, respectively. Literature results from the i-selected counts of Ilbert et al. (2009) are included for reference. At the bright end, these counts are in excellent agreement with our measurements. At the faint end, however, Ilbert et al. are above our COSMOS2020 measurements. To identify the cause of this disagreement, a representative sample of 24 < i < 25 objects detected only in Ilbert et al. was visually inspected in the detection CHI_MEAN, i, and Ks images, finding virtually all to be within the halos of bright foreground objects and stars. This is especially true for the Ks image, whose halos are significantly more extended relative to i, which in a CHI_MEAN construction can lead to noise structures resembling real sources even at i ∼ 25. A reasonable explanation, therefore, is that the higher counts of Ilbert et al. are due to spurious sources created by an overly aggressive deblending threshold. Values used in both the Ks and i-band galaxy number counts are shown in Table 2.
5. Photometric Redshifts
Photometric redshifts are computed using both Classic and The Farmer catalogs. First, photometric measurements are corrected for Galactic extinction at each object position using the Schlafly & Finkbeiner (2011) dust map. In the next sections, photometric redshifts are computed using both LePhare (Arnouts et al. 2002; Ilbert et al. 2006) and EAZY (Brammer et al. 2008), followed by a comparison between the two methods.
5.1. LePhare
The first set of photo-z is computed following the same method as in Laigle et al. (2016). Both galaxy and stellar templates are fitted to the observed photometry using the code LePhare (Arnouts et al. 2002; Ilbert et al. 2006) with the same configuration as Ilbert et al. (2013).
Before fitting, 0.02 mag is added in quadrature to the photometric errors of the data in the optical; 0.05 mag for J, H, Ks , ch1, and the three narrow bands; and 0.1 mag for ch2. Such an approach is common in numerous surveys (e.g., Arnouts et al. 2007) and serves to include uncertainties in the color modeling, which are more important in the near-infrared and in the narrow bands owing to emission lines. Fluxes are used to perform the fit (as opposed to magnitudes), with the clear advantage of not introducing upper limits. Given the uncertainties in the calibration of the Suprime-Cam/g+, and the availability of deeper HSC images covering the same wavelength, this band is not included. Similarly, the shallow z+ photometry is not used, because the Suprime-Cam/z++ and HSC/z images are deeper and already cover this wavelength range. IRAC channels 3 and 4 are not included given the difficulty in modeling the emission from polycyclic aromatic hydrocarbons (PAHs) in the mid-infrared and their shallower depth (Sanders et al. 2007).
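Because the fit is performed on fluxes, a systematic term quoted in magnitudes has to be expressed as a fractional flux error before being added in quadrature. The sketch below uses the first-order conversion $\sigma_f / f \approx 0.4 \ln(10)\, \sigma_m$, which is an assumption about how the term is applied and not a description of the LePhare internals. The dictionary keys are hypothetical labels for the three groups of bands listed above.

```python
import math

# Systematic terms in magnitudes, as quoted in the text; keys are hypothetical.
SYS_MAG = {"optical": 0.02, "nir_ch1_narrow": 0.05, "ch2": 0.1}


def inflate_flux_error(flux, flux_err, sys_mag):
    """Add a magnitude-based systematic term in quadrature to a flux error."""
    frac = 0.4 * math.log(10.0) * sys_mag  # fractional flux error (first order)
    return math.sqrt(flux_err ** 2 + (frac * flux) ** 2)


# A bright source with a small formal error is dominated by the systematic term.
total_err = inflate_flux_error(flux=100.0, flux_err=1.0,
                               sys_mag=SYS_MAG["nir_ch1_narrow"])
```

For bright sources the systematic term sets a floor of a few percent on the fractional flux error, while for faint sources the formal uncertainty dominates.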
Stellar templates include the library from Pickles (1998), the white dwarf templates of Bohlin et al. (1995), and the brown dwarf templates from Chabrier et al. (2000), Baraffe et al. (2015, BT-Settl/CIFIST2011_2015), and Morley et al. (2012, 2014). All the brown dwarf templates extend to at least 10 μm in the infrared. The blue limit of these templates is between 0.3 and 0.6 μm, and the flux density at bluer wavelengths is set to zero. Indeed, cool brown dwarfs belong to the very faint population of sources and are expected to not be detected in the optical. Stellar templates with an effective temperature Teff < 4000 K are rejected in the case where the physical parameters do not satisfy the constraints from Saumon & Marley (2008).
Regarding galaxy templates, the original library (Ilbert et al. 2009) includes elliptical and spiral galaxy models from Polletta et al. (2007) interpolated into 19 templates to increase the resolution, and 12 blue star-forming galaxy models from Bruzual & Charlot (2003, hereafter BC03). Two additional BC03 templates with exponentially declining SFR were added to improve the photo-z of quiescent galaxies (Onodera et al. 2012). Extinction is a free parameter with reddening E(B − V) ≤ 0.5, and the considered attenuation curves are those of Calzetti et al. (2000), Prevot et al. (1984), and two modifications of the Calzetti law including the bump at 2175 Å (Fitzpatrick & Massa 1986) with two different amplitudes. Emission lines are added using the relation between the UV luminosity and [O ii] emission-line flux, as well as fixed ratios between dust-corrected emission lines following Ilbert et al. (2009). It is imposed that the absolute magnitude in the rest-frame Suprime-Cam/B band is MB ≥ –24 mag, which acts as a unique prior. The predicted fluxes for the templates are computed using a redshift grid with a step of 0.01 and a maximum redshift of 10.
Table 2. Bin Centers and Values of the izYJHKs -selected Logarithmic Galaxy Number Counts Widths
| Mag | Ks Deep (Farmer) | Ks Deep (Classic) | Ks Ultradeep (Farmer) | Ks Ultradeep (Classic) | i (Farmer) | i (Classic) |
|---|---|---|---|---|---|---|
| 19.25 | 3.64 | 3.61 | 3.65 | 3.66 | 3.01 | 3.04 |
| 19.75 | 3.85 | 3.80 | 3.86 | 3.86 | 3.23 | 3.27 |
| 20.25 | 4.03 | 3.99 | 4.02 | 4.02 | 3.44 | 3.47 |
| 20.75 | 4.18 | 4.14 | 4.16 | 4.16 | 3.64 | 3.66 |
| 21.25 | 4.29 | 4.26 | 4.29 | 4.29 | 3.85 | 3.86 |
| 21.75 | 4.42 | 4.40 | 4.42 | 4.43 | 4.03 | 4.03 |
| 22.25 | 4.56 | 4.53 | 4.54 | 4.54 | 4.21 | 4.21 |
| 22.75 | 4.68 | 4.66 | 4.66 | 4.68 | 4.38 | 4.38 |
| 23.25 | 4.79 | 4.78 | 4.78 | 4.80 | 4.54 | 4.55 |
| 23.75 | 4.90 | 4.90 | 4.88 | 4.90 | 4.71 | 4.71 |
| 24.25 | 5.00 | 5.00 | 4.97 | 4.99 | 4.86 | 4.87 |
| 24.75 | 5.11 | 5.08 | 5.07 | 5.07 | 4.97 | 5.00 |
| 25.25 | 5.22 | 5.15 | 5.18 | 5.14 | 5.08 | 5.10 |
| 25.75 | 5.21 | 4.12 | 5.24 | 5.16 | 5.20 | 5.19 |
| 26.25 | 5.03 | 4.96 | 5.13 | 5.06 | 5.29 | 5.25 |
| 26.75 | ⋯ | ⋯ | ⋯ | ⋯ | 5.35 | 5.26 |
| 27.25 | ⋯ | ⋯ | ⋯ | ⋯ | 5.22 | 5.00 |
Note. Shown in Figure 11 for both The Farmer and Classic catalogs, in units of mag−1 deg−2 with bin widths of 0.5 mag.
Also included are a set of templates to account for active galactic nuclei (AGNs) as well as quasars (see Table 3 of Salvato et al. 2009, and Salvato et al. 2011 for details). A measure of the goodness of fit and photo-z are provided for the best-fit AGN template, which can be readily compared with that of the galaxy template to identify cases where the SED can be explained by emission from an AGN. This is especially important when considering stellar-mass estimates, which can be inflated in the case of an undiagnosed AGN where the stellar continuum emission is unknowingly contaminated.
An initial run of LePhare fitting galaxies with spectroscopic redshifts provides a method for optimizing the absolute calibration in each band. The method is the same as Ilbert et al. (2006): after having fixed the redshift to the spec-z value, the photometric offset of each band is derived by minimizing the difference between the predicted and observed fluxes. This procedure is applied iteratively until the offsets converge. The offset values are given in Table 3.
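The iterative procedure can be sketched as a loop that alternates between predicting template magnitudes at the fixed spec-z and updating a per-band offset. Several simplifications are assumed: the offset is taken as the median magnitude residual and not as the result of a flux-space minimization, the template fit is abstracted into a user-supplied callable (`predict`, hypothetical), and the normalization to Ks is omitted. This is not the LePhare implementation.

```python
from statistics import median


def calibrate_offsets(observed, predict, bands, tol=1e-3, max_iter=20):
    """Iteratively derive additive magnitude offsets per band.

    observed : list of dicts mapping band -> observed magnitude
    predict  : callable returning best-fit template magnitudes (band -> mag)
               for one source, given its offset-corrected magnitudes, with
               the redshift fixed to the spec-z (hypothetical interface)
    """
    offsets = {b: 0.0 for b in bands}
    for _ in range(max_iter):
        corrected = [{b: src[b] + offsets[b] for b in bands} for src in observed]
        predicted = [predict(src) for src in corrected]
        step = {
            b: median(p[b] - c[b] for p, c in zip(predicted, corrected))
            for b in bands
        }
        for b in bands:
            offsets[b] += step[b]
        if max(abs(s) for s in step.values()) < tol:
            break  # offsets have converged
    return offsets


# Toy "template": g - r is always 0.5, anchored on the observed r magnitude.
def toy_predict(src):
    return {"g": src["r"] + 0.5, "r": src["r"]}


observed = [{"g": 19.9, "r": 19.5}, {"g": 20.9, "r": 20.5}, {"g": 21.9, "r": 21.5}]
offsets = calibrate_offsets(observed, toy_predict, bands=["g", "r"])
```

In this convention the offsets are added to the observed magnitudes, matching the note to Table 3.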
Table 3. Values of the Magnitude Offsets used to Optimize the Absolute Calibration in Each Band
| Band | LePhare (The Farmer) | LePhare (Classic) | EAZY (The Farmer) | EAZY (Classic) |
|---|---|---|---|---|
| NUV | −0.145 | 0.005 | ⋯ | ⋯ |
| u | −0.092 | 0.001 | −0.128 | −0.097 |
| u* | −0.002 | 0.058 | −0.182 | −0.151 |
| g | 0.058 | 0.133 | −0.010 | 0.020 |
| r | 0.081 | 0.133 | 0.046 | 0.057 |
| i | 0.018 | 0.102 | 0.006 | 0.054 |
| z | 0.019 | 0.090 | 0.038 | 0.078 |
| y | 0.070 | 0.105 | 0.091 | 0.103 |
| B | ⋯ | −0.069 | ⋯ | ⋯ |
| V | ⋯ | 0.128 | ⋯ | ⋯ |
| r+ | ⋯ | 0.044 | ⋯ | ⋯ |
| i+ | ⋯ | 0.058 | ⋯ | ⋯ |
| z++ | ⋯ | 0.101 | ⋯ | ⋯ |
| IB427 | −0.111 | −0.007 | −0.187 | −0.135 |
| IB464 | −0.057 | 0.014 | −0.119 | −0.094 |
| IA484 | −0.036 | 0.027 | −0.086 | −0.066 |
| IB505 | −0.035 | 0.031 | −0.074 | −0.051 |
| IA527 | −0.062 | 0.009 | −0.092 | −0.066 |
| IB574 | −0.104 | −0.027 | −0.120 | −0.089 |
| IA624 | −0.015 | 0.037 | −0.027 | −0.012 |
| IA679 | 0.145 | 0.213 | 0.146 | 0.174 |
| IB709 | −0.043 | 0.015 | −0.036 | −0.017 |
| IA738 | −0.054 | 0.009 | −0.047 | −0.021 |
| IA767 | −0.052 | −0.009 | −0.038 | −0.032 |
| IB827 | −0.087 | 0.007 | −0.060 | −0.008 |
| NB711 | −0.030 | 0.028 | ⋯ | ⋯ |
| NB816 | −0.082 | −0.016 | ⋯ | ⋯ |
| Y | 0.039 | 0.055 | 0.065 | 0.058 |
| J | 0.005 | 0.028 | 0.037 | 0.050 |
| H | −0.049 | −0.043 | −0.029 | −0.023 |
| Ks | 0.000 | 0.000 | 0.000 | 0.000 |
| NB118 | −0.034 | −0.013 | ⋯ | ⋯ |
| ch1 | −0.184 | −0.067 | −0.127 | −0.119 |
| ch2 | −0.186 | −0.091 | −0.200 | −0.174 |
| ch3 | ⋯ | ⋯ | −0.168 | ⋯ |
| ch4 | ⋯ | ⋯ | −0.265 | ⋯ |
Note. When no value is indicated, the band was not used in the fit. The relative calibrations are normalized in Ks. Although included in The Farmer catalog, IRAC channels 3 and 4 are not used during the zero-point calibration by LePhare. Observed photometry may be corrected by adding the appropriate values.
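As a hedged illustration of the last sentence of the note, the sketch below adds a Table 3 offset to an observed AB magnitude and applies the equivalent multiplicative factor to a flux density. The function names and the dictionary layout are hypothetical; only the three offset values are copied from the LePhare/The Farmer column of Table 3.

```python
# Offsets copied from Table 3 (LePhare, The Farmer column).
# The dictionary layout and function names are illustrative assumptions.
OFFSETS_MAG = {"g": 0.058, "Ks": 0.000, "ch1": -0.184}


def correct_magnitude(mag_obs, band, offsets=OFFSETS_MAG):
    """Return the corrected magnitude: observed value plus the band offset."""
    return mag_obs + offsets[band]


def correct_flux(flux_obs, band, offsets=OFFSETS_MAG):
    """Equivalent flux correction: m -> m + dm implies f -> f * 10**(-0.4 * dm)."""
    return flux_obs * 10.0 ** (-0.4 * offsets[band])
```

A negative offset (e.g., ch1) makes the corrected source brighter, so the corresponding flux factor is larger than one.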
A key output of the photo-z code is the likelihood of the observed photometry given the redshift, $\mathcal{L}(\mathrm{data}\,|\,z)$, after having marginalized over the template set. The official photo-z estimate included in the catalog, noted zphot hereafter, is defined as the median of the likelihood distribution. The zphot error bar comprises $z_{\mathrm{phot}}^{\min}$ and $z_{\mathrm{phot}}^{\max}$, which are defined as the bounds enclosing 34% of the likelihood surface below and above the median, respectively. The galaxy spectroscopic sample can be used to verify that these error bars actually represent 68% confidence level intervals (see Section 5.3 for more details).
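The median and 34% bounds described above can be sketched as follows. This is a minimal illustration on a regular redshift grid, not the LePhare implementation; the function name, the trapezoidal integration, and the linear interpolation of the cumulative distribution are all assumptions.

```python
import numpy as np


def median_and_interval(z_grid, likelihood, half_width=0.34):
    """Median of a 1D likelihood on a redshift grid, plus the bounds that
    enclose `half_width` of the total probability on each side of it."""
    z_grid = np.asarray(z_grid, dtype=float)
    like = np.asarray(likelihood, dtype=float)
    # Cumulative trapezoidal integral, normalized to 1 at the last grid point.
    steps = 0.5 * (like[1:] + like[:-1]) * np.diff(z_grid)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf /= cdf[-1]
    # Invert the cumulative distribution by linear interpolation.
    quantiles = [0.5, 0.5 - half_width, 0.5 + half_width]
    z_med, z_lo, z_hi = np.interp(quantiles, cdf, z_grid)
    return z_med, z_lo, z_hi
```

For a single-peaked, roughly Gaussian likelihood the two bounds sit close to one standard deviation on either side of the median; for multimodal likelihoods the interval can be strongly asymmetric.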
Galaxies are separated from stars and AGNs in LePhare by combining morphological and SED criteria. The stellar sequence is isolated by comparing half-light radii and magnitude for bright sources in the HST/ACS and Subaru/HSC images. All the point-like sources falling on this sequence are classified as stars at i < 23 and i < 21.5 for ACS and HSC images, respectively. Point-like AGN sources are also removed by this criterion. Sources with $\chi^2_{\mathrm{star}} < \chi^2_{\mathrm{gal}}$ are also classified as stars, with $\chi^2_{\mathrm{star}}$ and $\chi^2_{\mathrm{gal}}$ being the best χ2 obtained using the stellar and galaxy templates, respectively. This criterion is applied only for sources detected at 3σ in the Ks band or IRAC channel 1, because the lack of near-infrared data could increase the risk of stellar contamination in the galaxy sample (Daddi et al. 2004; Coupon et al. 2009). We do not apply the criteria based on the χ2 if the source is resolved to avoid creating incompleteness in the galaxy sample.
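The decision logic of this paragraph can be written compactly. The sketch below is an assumption-laden illustration rather than the LePhare code: the function and argument names are hypothetical, the SED criterion is assumed to be that the best stellar χ2 is smaller than the best galaxy χ2, and the point-like flag is assumed to have been derived beforehand from the half-light radius versus magnitude sequence.

```python
def classify_star(is_point_like, mag_i, chi2_star, chi2_gal,
                  detected_nir, is_resolved, mag_limit=23.0):
    """Return True if a source would be flagged as a star.

    mag_limit is the i-band limit of the morphological criterion
    (23 for HST/ACS and 21.5 for Subaru/HSC in the text).
    """
    # Morphological criterion: point-like and brighter than the limit.
    if is_point_like and mag_i < mag_limit:
        return True
    # SED criterion (assumed inequality): applied only to unresolved sources
    # with a 3-sigma detection in Ks or IRAC channel 1.
    if detected_nir and not is_resolved and chi2_star < chi2_gal:
        return True
    return False
```

Keeping the two criteria separate makes it easy to relax one of them, for instance when galaxies with star-like SEDs are the science targets.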
The result of this star–galaxy separation is shown in Figure 12. Here again The Farmer photometry is used, and the result is unchanged with Classic. Most of the sources classified as stars fall on the expected stellar locus of the two color–color diagrams (e.g., Figure 2 of Arcila-Osejo & Sawicki 2013).
Figure 12. Color–color diagrams showing stars (black) and galaxies (colored by zphot) classified by LePhare for The Farmer photometry, shown in gzKs (top) and gzch1 (bottom) color–color diagrams. For simplicity, galaxies with zphot > 5 are shown also by red points. Only sources with S/N > 3 in g, z, Ks, and ch1 in the UltraVISTA area outside the HSC bright-star halos are shown.
Although these classifications are made available in the catalogs (and explained in detail in the accompanying release documentation), it should be cautioned that this precise classification scheme may be suboptimal for certain science investigations (e.g., where galaxies with stellar-like SEDs are science targets). Hence, this star–galaxy separation method is aimed at providing a baseline, conservative galaxy population from which to demonstrate the overall effectiveness of these catalogs, for instance with the galaxy number counts in Figure 11.
5.2. EAZY
Photometric redshifts are computed along with physical parameters using an updated version of the EAZY code (Brammer et al. 2008) rewritten in Python. EAZY shares much of the strategy outlined for LePhare in the previous section, with the primary difference being the source of the population synthesis templates and how they are fit to the observed photometry. This computation uses a set of 17 templates derived from the Flexible Stellar Population Synthesis models (Conroy et al. 2009; Conroy & Gunn 2010) with a variety of dust attenuation and ages from log-normal star formation histories that are chosen to broadly span the rest-frame UVJ color space populated by galaxies over 0 < z < 3. For each galaxy in the catalog, EAZY fits a nonnegative linear combination of these templates integrated through the redshifted filter bandpasses to the observed flux densities and associated uncertainties. In this way, EAZY fits combinations of dust attenuation and star formation histories to efficiently span the continuous color space populated by the majority of galaxies across the survey. For the EAZY photo-z estimates, the Subaru Suprime-Cam broadband photometric measurements are not used, as these are generally significantly shallower than other nearby filters. Furthermore, the GALEX FUV and NUV are ignored, as these bands are relatively shallow and have broad PSFs that are difficult to combine with the other deeper filters.
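A nonnegative linear combination of templates at a single trial redshift can be sketched with `scipy.optimize.nnls`. This is a minimal illustration of the technique, not the EAZY source code: the function name and array layout are assumptions, and the template fluxes are taken as already integrated through the redshifted bandpasses.

```python
import numpy as np
from scipy.optimize import nnls


def fit_templates(template_fluxes, flux_obs, flux_err):
    """Fit nonnegative template coefficients at one trial redshift.

    template_fluxes : array of shape (n_bands, n_templates)
    flux_obs, flux_err : arrays of shape (n_bands,)
    Returns (coefficients, chi2).
    """
    err = np.asarray(flux_err, dtype=float)
    # Dividing by the uncertainties turns the weighted problem into plain NNLS.
    A = np.asarray(template_fluxes, dtype=float) / err[:, None]
    b = np.asarray(flux_obs, dtype=float) / err
    coeffs, resid_norm = nnls(A, b)
    return coeffs, resid_norm ** 2
```

Repeating the fit over a grid of trial redshifts and converting the resulting χ2 values to likelihoods would give a redshift likelihood curve; priors and the template error function are left out of this sketch.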
As with LePhare, EAZY iteratively derives multiplicative corrections to the individual photometric bands (Table 3). For such a task, galaxies without a spec-z are also used to mitigate the possible bias due to selection effects in the spectroscopic sample. At each step of the iteration, the median fractional residual is computed both for all bands individually and for all measurements in all bands sorted as a function of rest-frame wavelength. With many filters that overlap in the observed frame and galaxies across a broad range of redshifts, the catalog can largely break the degeneracy between systematic offsets in individual filters (e.g., from poor photometric calibration) and systematic effects resulting from the properties of the template set (e.g., continuum shape and emission-line strengths). The correction routine is stopped after five iterations, where the updates are generally less than 1%. For the final photometric redshift estimates, EAZY uses the “template error function” and apparent magnitude prior as described by Brammer et al. (2008).
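The per-band part of this iteration can be illustrated with a deliberately simplified toy. It assumes every galaxy is described by one fixed template with a free amplitude, which is far cruder than the actual fit, and it ignores the rest-frame wavelength residuals; the function name, the single-template model, and the choice of reference band are all assumptions made for illustration.

```python
import numpy as np


def iterate_band_corrections(flux_obs, template, ref_band, n_iter=5):
    """Toy iterative derivation of multiplicative band corrections.

    flux_obs : array of shape (n_gal, n_bands)
    template : array of shape (n_bands,), the assumed common SED shape
    ref_band : index of the band whose correction is held at 1
    """
    flux_obs = np.asarray(flux_obs, dtype=float)
    template = np.asarray(template, dtype=float)
    corr = np.ones(flux_obs.shape[1])
    for _ in range(n_iter):
        corrected = flux_obs * corr
        # Least-squares amplitude of the template for every galaxy.
        amp = corrected @ template / (template @ template)
        model = amp[:, None] * template
        # The median model-to-data ratio per band is the multiplicative update.
        corr *= np.median(model / corrected, axis=0)
        # Corrections are only defined up to a global factor; pin one band.
        corr /= corr[ref_band]
    return corr
```

The normalization step mirrors the convention of Table 3, where the offsets are expressed relative to Ks.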
Regarding star–galaxy separation, the current Python implementation of EAZY provides functionality for fitting stellar templates to the observed photometry, similar to LePhare. By default, EAZY uses a set of theoretical PHOENIX BT-Settl stellar templates (Allard et al. 2012) spanning a range of effective temperatures and calculates the χ2 goodness of fit for each template individually (i.e., not as linear combinations). Included in the catalog is the minimum χ2 of the fits to the stellar templates, as well as the effective temperature of the best-fit stellar model, which together may be used to separate stars from galaxies, possibly with the addition of morphological information to determine point-like sources.
5.3. Photometric Redshift Validation
One aspect of this work that differs from Laigle et al. (2016) is the availability of two photometric catalogs created with different photometric extraction methods (see Section 3). By applying the same photo-z code to the Classic and The Farmer catalogs, it is possible to assess whether one photometric extraction method produces better results than the other. This is done by quantifying the precision of the photo-z using the normalized median absolute deviation (NMAD, Hoaglin et al. 1983), defined as

$$\sigma_{\mathrm{NMAD}} = 1.48 \times \mathrm{median}\left(\frac{|\Delta z - \mathrm{median}(\Delta z)|}{1 + z_{\mathrm{spec}}}\right),$$

where Δz = zphot − zspec, following Brammer et al. (2008) as it is less sensitive to outliers compared to the normal definition (e.g., Ilbert et al. 2006). The fraction of outliers is denoted by η and defined, following Hildebrandt et al. (2012), as galaxies whose photo-z deviates from their spec-z by ∣Δz∣ > 0.15(1 + zspec). Lastly, the bias b is computed as the median difference between photo-z and spec-z.
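The three validation statistics can be computed in a few lines. The sketch below assumes the NMAD form of Brammer et al. (2008), i.e., 1.48 times the median of |Δz − median(Δz)| / (1 + zspec) with Δz = zphot − zspec; the function name and return order are illustrative choices.

```python
import numpy as np


def photoz_metrics(z_phot, z_spec, outlier_threshold=0.15):
    """Return (sigma_nmad, outlier_fraction, bias) for matched redshifts."""
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = z_phot - z_spec
    # Normalized median absolute deviation (assumed Brammer et al. 2008 form).
    sigma_nmad = 1.48 * np.median(np.abs(dz - np.median(dz)) / (1.0 + z_spec))
    # Fraction of sources with |dz| > 0.15 (1 + z_spec).
    eta = np.mean(np.abs(dz) > outlier_threshold * (1.0 + z_spec))
    # Bias: median difference between photo-z and spec-z.
    bias = np.median(dz)
    return sigma_nmad, eta, bias
```

Because both the precision and the bias rely on medians, a handful of catastrophic failures changes η but barely moves the other two numbers.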
Comparisons between photo-z and spec-z are shown for both Classic and The Farmer catalogs in combination with LePhare and EAZY in Figure 13. In general, the photo-z precision (given by σNMAD) is on the order of 0.01(1 + z) at i < 22.5, and the precision is degraded at fainter magnitudes, but is still better than 0.025(1 + z) at i < 25. For both catalogs, there is a population of galaxies with zspec > 2 and zphot < 1. This population is explained by the misidentification between the Lyman and Balmer breaks in the observed SED. This degeneracy appears clearly when comparing the photo-z derived for the full catalogs in Figure 14, especially for fainter objects where the lower signal-to-noise ratio is not sufficient to constrain the identity of the break. The figure provides a straightforward demonstration of the remarkable similarity between the catalogs computed using the same photo-z code (LePhare) and the photo-z codes with the same catalog (The Farmer). The photo-z quality is similar between both catalogs, with a slight trend of having better results at i < 22.5 for the Classic catalog, while The Farmer catalog provides better results at fainter magnitudes. These results are summarized in Figure 15.
Figure 13. Photometric redshifts computed with LePhare and EAZY, split by apparent magnitude bin (from i < 22.5 on the left to 25 < i < 27 on the right). Top: photo-z versus spec-z for the Classic and The Farmer photometric catalogs computed with LePhare. Bottom: photo-z versus spec-z for the Classic and The Farmer photometric catalogs computed with EAZY. The red solid line corresponds to the one-to-one relation, and the dashed lines correspond to the photo-z at ±0.15 (1 + zspec). The fraction of sources outside the dashed lines (noted η), the precision measured with the normalized absolute deviation (noted σ), and the overall bias (noted b) are indicated in each panel. The nature of the off-diagonal points, shown individually, is discussed in the text. Bin color increases on a log10 scale. The spec-z of i > 26 comprise 18% of sources shown in the rightmost 25 < i < 27 panels.
Figure 14. Photometric redshifts computed with LePhare and EAZY for the Classic and The Farmer photometric catalogs, split by apparent magnitude bin (from i < 22.5 on the left to 25 < i < 27 on the right). Top: comparison between the photometric redshifts computed with LePhare and EAZY for the full The Farmer photometric catalog. Bottom: Comparison between the photo-z derived from the Classic and The Farmer full catalogs computed with LePhare (excluding masked regions). The nature of the two groups of off-diagonal points is discussed in the text. Bin color increases on a log10 scale. Note that the magnitude bins are different than in Figure 13 to illustrate the behavior at faint magnitudes.
Figure 15. Comparison between the precision (σNMAD) and the outlier fraction for the two catalogs (the Classic in blue and The Farmer in red) and for the two photo-z codes (LePhare with circles and EAZY with stars). The statistics are computed per i-band apparent magnitude bin, as indicated on the side of the points.
The photo-z uncertainties are also an important aspect of the photo-z quality. If correctly estimated (i.e., representing the 1σ uncertainty), the fraction of spec-z that belong to the interval $[z_{\mathrm{phot}}^{\min}, z_{\mathrm{phot}}^{\max}]$ spanned by the zphot error bar should be 0.68. Initially, this fraction was significantly smaller due to the photometric uncertainties being underestimated; therefore, the error bars associated with the observed fluxes have been multiplied by a factor of 2 for the SED fitting. Figure 16 shows the cumulative distribution of the ratio between ∣zphot − zspec∣ and the 1σ uncertainty derived for the LePhare photo-z solutions after boosting the flux error bars. The 1σ uncertainty is defined as the maximum between $(z_{\mathrm{phot}} - z_{\mathrm{phot}}^{\min})$ and $(z_{\mathrm{phot}}^{\max} - z_{\mathrm{phot}})$. The cumulative distribution of the bright sample (i < 22.5) now reaches 0.68 as expected, while the photo-z uncertainties of objects at i > 22.5 are still underestimated. This effect was already discussed in Laigle et al. (2016) and is seen also in EAZY. Because it is limited to faint galaxies, it may be due to a selection bias in the spectroscopic sample rather than a problem in the photo-z uncertainties (see Laigle et al. 2019). For this reason, no further correction is applied to the uncertainties of i > 22.5 objects. The effect is more pronounced in The Farmer catalog because its photometric uncertainties are typically smaller, as they are not rescaled to the same extent as in the Classic catalog (see Section 4.3). These larger uncertainties explain the more realistic photo-z errors in Classic and may also help to explain the lower precision for faint sources as the photo-z are more uncertain.
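The coverage test in this paragraph amounts to counting how often the spec-z falls within the quoted 1σ uncertainty. The sketch below is an illustration with hypothetical names; it takes the 1σ uncertainty as the larger of the two one-sided error bars, with `z_min` and `z_max` standing for the lower and upper bounds of the zphot error bar.

```python
import numpy as np


def coverage_fraction(z_phot, z_spec, z_min, z_max):
    """Fraction of sources with |z_phot - z_spec| within the 1-sigma
    uncertainty, defined as max(z_phot - z_min, z_max - z_phot)."""
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    sigma = np.maximum(z_phot - np.asarray(z_min, dtype=float),
                       np.asarray(z_max, dtype=float) - z_phot)
    return np.mean(np.abs(z_phot - z_spec) <= sigma)
```

A value well below 0.68 indicates underestimated uncertainties, which is the situation described for the faint sample.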
Figure 16. Cumulative distribution of the ratio between ∣zphot − zspec∣ and the photo-z 1σ uncertainty for both photometric catalogs and using LePhare. The photo-z 1σ uncertainty is taken as the maximum between $(z_{\mathrm{phot}} - z_{\mathrm{phot}}^{\min})$ and $(z_{\mathrm{phot}}^{\max} - z_{\mathrm{phot}})$. The solid and dashed lines correspond to the uncertainties from the Classic and The Farmer catalogs, respectively. For an unbiased estimate of the photo-z 1σ uncertainties, the cumulative number should reach 0.68 when the ratio equals 1 (black dotted line). The distributions are shown per bin of i-band magnitude.
Figure 17 illustrates the evolution with redshift of the 1σ photo-z uncertainties in several i-band magnitude bins, as derived from the LePhare photo-z. There is an increase of the 1σ uncertainty between z < 1 and 1.5 < z < 2.5. This increase is explained by the Balmer break being shifted out of the medium-band coverage, as well as blue galaxies at high redshift with low signal-to-noise ratio in the near-infrared bands. As the photo-z based on the Classic catalog are estimated using similar techniques to Laigle et al. (2016), the photo-z uncertainties computed with both catalogs can be compared. For this comparison, the photo-z uncertainties in both catalogs are rescaled in order to make them consistent with 68% of the spec-z falling into the 1σ error. The result is that the photo-z are improved at 1.4 < z < 3 at all magnitudes owing to the gain in UltraVISTA depth and at faint magnitudes (i > 25) over the full redshift range thanks to the new HSC and CFHT data. While COSMOS2015 photo-z were unreliable at i > 26, the new catalog can be used also at fainter magnitudes, depending on the scientific application. In summary, photo-z uncertainties reported in COSMOS2020 match those found 0.7 mag brighter in COSMOS2015, a considerable gain.
Figure 17. Median of the photo-z 1σ uncertainties (defined as in Section 5.3) shown as a function of redshift. The shaded areas correspond to the COSMOS2020 Classic catalog computed with LePhare, and the dashed lines correspond to the COSMOS2015 catalog. The distributions are shown per bin of i-band magnitude.
Figure 18 shows the photo-z distribution of sources common to both the Classic and The Farmer catalogs in four selections of H-band magnitude. As expected, the mean redshift increases toward faint magnitude from z ∼ 0.82 at H < 22 to z ∼ 1.37 at H < 25. There is an excellent agreement between the mean redshifts of both catalogs, within ∼0.01–0.02. The mainly near-infrared selection in izYJHKs allows for the detection of a significant sample of galaxies above z > 6 (100–300 at H < 25 depending on the catalog). The Farmer catalog includes a higher density of z > 6 sources (by a factor of almost 2 in the faintest bin). This is discussed in detail in O. B. Kauffmann et al. (2022, in preparation).
Figure 18. Redshift distribution for the Classic (blue) and The Farmer (red) full catalogs computed with LePhare. Each panel corresponds to a different magnitude limit in H band from The Farmer.
6. Physical Properties of COSMOS Galaxies
Now a first characterization of the sources classified as galaxies in Section 5.1 can be presented. Physical properties such as absolute magnitudes and stellar mass are computed using LePhare with the same configuration as COSMOS2015: a template library generated by BC03 models is fit to the observed photometry after fixing the redshift of each target to the photo-z estimated in the previous LePhare run (for more details, see Laigle et al. 2016). It should be noted that this standard configuration has been selected to be consistent with previous SED-fitting results, even though recent work shows that the resulting stellar masses could be underestimated. For example, Leja et al. (2019b) find that M* estimates are 0.1–0.3 dex larger when using complex SFHs to build their library, instead of the standard templates of FAST (Kriek et al. 2018). However, integrated fluxes (as provided by these catalogs) merge together the light of young stellar populations outshining the older ones. Sorba & Sawicki (2018) show that when these different stellar components can be resolved (e.g., in the Hubble eXtreme Deep Field), a pixel-by-pixel SED fitting results in a galaxy stellar mass a factor of 2–5 larger (see also Abdurro’uf & Akiyama 2018; Mosleh et al. 2020). On the other hand, tests with mock galaxy catalogs in Laigle et al. (2019) did not find such a significant bias, with an underestimation <20%.
The present analysis is limited to a classification of COSMOS2020 galaxies between star-forming and quiescent, and a subsequent determination of their stellar-mass completeness as a function of redshift; further investigation is deferred to future studies. Moreover, the following illustrates only the results generated with The Farmer and LePhare to provide the most direct comparison to the Laigle et al. (2016) template fitting while demonstrating the effectiveness of the new The Farmer photometry. There are no significant differences when repeating the analysis with either Classic photometry or with EAZY.
6.1. Galaxy Classification
Previous studies have devised a variety of techniques to identify quiescent galaxies using broadband photometry. Williams et al. (2009) provide a prescription utilizing U − V and V − J rest-frame colors, which has been broadly adopted in the literature (e.g., Muzzin et al. 2013a; Tomczak et al. 2014). Ilbert et al. (2013) and Arnouts et al. (2013) proposed improving the selection by replacing U − V with NUV − r, as the latter can better separate galaxies with different star formation histories (see also Leja et al. 2019a).
This analysis adopts the rest-frame NUV − r versus r − J diagram described in Ilbert et al. (2013), where quiescent galaxies are defined to be those with MNUV − Mr > 3(Mr − MJ) + 1 and MNUV − Mr > 3.1. Measurements are provided by LePhare by convolving the best-fit template with the appropriate passband in the observed frame. Figure 19 shows the rest-frame NUVrJ color–color diagram in six redshift bins from z = 0.1 to 6. The assembly of the quiescent population at late cosmic times is evident. Quiescent galaxies are rare at z > 2 (e.g., Ilbert et al. 2013; Muzzin et al. 2013a; Tomczak et al. 2014; Davidzon et al. 2017) but the large cosmic volume probed by COSMOS allows us to identify a significant number of candidates. However, a portion of them are expected to be star-forming galaxies that contaminate the high-z quiescent locus due to large uncertainties in their rest-frame colors (especially at z > 2.6, where MJ corresponds to observed wavelengths redder than channel 2).
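The NUVrJ selection quoted above translates directly into two inequalities on rest-frame absolute magnitudes. The sketch below is a minimal illustration; the function and argument names are hypothetical.

```python
def is_quiescent(m_nuv, m_r, m_j):
    """Apply the NUVrJ cut of Ilbert et al. (2013) to rest-frame absolute
    magnitudes: NUV - r > 3 (r - J) + 1 and NUV - r > 3.1."""
    nuv_r = m_nuv - m_r
    r_j = m_r - m_j
    return nuv_r > 3.0 * r_j + 1.0 and nuv_r > 3.1
```

The slanted boundary is what separates red quiescent galaxies from dusty star-forming galaxies, which are red in both colors and therefore fall below the line.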
Figure 19. Identification of quiescent galaxies in bins of redshift by selection in rest-frame NUV − r and r − J colors using the LePhare results, computed with The Farmer for sources that lie above their respective mass-completeness limit. The selection is made using the prescription of Ilbert et al. (2013), shown in orange. For clarity, quiescent galaxies at z > 2.25 are shown by individual red points. r − J colors are highly uncertain at z > 2.6 where the rest-frame J band is extrapolated redward of the available photometry and hence have an uncertain classification marked by an orange dashed line.
6.2. Stellar-mass Completeness
The stellar-mass completeness of our galaxy sample is empirically computed following the method described in Pozzetti et al. (2010), discriminating between star-forming and quiescent populations. This method is commonly used in the literature (e.g., Ilbert et al. 2013; Moustakas et al. 2013). It converts the detection limit of a given survey, given by the apparent magnitude mlim, into a redshift-dependent threshold in stellar mass Mlim computed using the mass-to-light ratio of galaxies brighter than mlim. Their stellar masses, estimated via template fitting, are rescaled by a factor of $10^{0.4(m_i - m_{\mathrm{lim}})}$, where $m_i$ is the magnitude of the ith galaxy. One can determine Mlim in a given redshift bin from the distribution of such rescaled masses: e.g., their 95th percentile can define the smallest mass at which most of the objects would still be observable.
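The rescaling step can be sketched for a single redshift bin as follows. This is a hedged illustration of the approach, not the code used for the catalogs: the function name is hypothetical, and the rescaling is assumed to follow the usual convention log Mresc = log M + 0.4 (m − mlim), so that a galaxy brighter than the limit is assigned the lower mass it would have if faded to mlim at fixed mass-to-light ratio.

```python
import numpy as np


def mass_limit(log_mass, mag, m_lim, percentile=95.0):
    """Limiting stellar mass (log10) in one redshift bin.

    log_mass : log10 stellar masses of the galaxies in the bin
    mag      : their apparent magnitudes in the limiting band
    m_lim    : limiting apparent magnitude of the survey
    """
    log_mass = np.asarray(log_mass, dtype=float)
    mag = np.asarray(mag, dtype=float)
    bright = mag < m_lim  # keep only galaxies brighter than the limit
    # Mass each galaxy would have if its magnitude were equal to m_lim.
    log_mass_rescaled = log_mass[bright] + 0.4 * (mag[bright] - m_lim)
    return np.percentile(log_mass_rescaled, percentile)
```

Calling this function in successive redshift bins gives the points to which a smooth function of 1 + z can then be fitted.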
The case of COSMOS2020 is more complicated because it is now possible to quantify mlim not in a single band but for the CHI_MEAN izYJHKs detection image itself. Adopting the sensitivity limit in the Ks band (Table 1) is a conservative choice that disregards the numerous NIR-faint objects detected thanks to the deep HSC photometry. This bias has already been discussed for COSMOS2015 (see Davidzon et al. 2017), and it is now more relevant after the addition of the i band in the CHI_MEAN image, which was not considered in 2015. Therefore, the analysis proceeds as in Davidzon et al. (2017) by computing mlim in IRAC channel 1, using the CANDELS-COSMOS catalog (Nayyeri et al. 2017) as a reference parent catalog. Source completeness in channel 1 is related not only to the properties of the IRAC mosaic itself but also to the depth of the izYJHKs image, which is used as a prior for source extraction (Sections 3.1.4 and 3.2.1). The choice to use channel 1 over Ks is motivated by the fact that channel 1 probes the bulk of stellar mass at z > 2.5, where the Balmer break is shifted beyond the optical–NIR bands. While LePhare and EAZY estimate broadly similar masses, this particular mass completeness is computed with masses reported by LePhare with The Farmer. Other combinations may produce a marginally different mass completeness limit, and should be rederived for specific science applications.
A common sample is constructed by cross-matching IRAC channel 1 sources of COSMOS2020 to the deeper CANDELS catalog in the ∼200 arcmin2 where the two overlap. At the adopted limiting magnitude mlim, about 75% of the CANDELS sources are also recovered by The Farmer; the completeness at that magnitude was <50% in COSMOS2015. With mlim in hand, galaxy masses are rescaled to compute Mlim in bins of redshift (see Figure 20), to which a polynomial function in 1 + z is fitted. The result is

for z < 6, which is more complete by ∼0.5 dex compared to Davidzon et al. (2017). Because the boundary used here is the 95th percentile of the rescaled mass distribution and the choice of mlim already implied that about 25% of the objects are missing, it is expected that Mlim corresponds to a 70% completeness threshold.
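The 70% figure can be checked with one line of arithmetic, under the assumption (made here for illustration) that the two selections act independently: 95% of the rescaled masses lie below Mlim, and about 75% of the parent sources are recovered at mlim.

```latex
0.95 \times 0.75 \approx 0.71 \simeq 70\%
```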
Figure 20. Mass completeness for the total sample (yellow), as well as the star-forming (blue) and quiescent (red) populations using quantities derived from The Farmer and LePhare considering magnitude limits of IRAC channel 1. Limits are calculated based on the method introduced in Pozzetti et al. (2010) in a manner consistent with COSMOS2015 (Davidzon et al. 2017, yellow dashed). For clarity, the total sample limit has been raised by 0.02 dex so that both it and the star-forming limit are visible.
The procedure is repeated separately for the star-forming and the quiescent sample, both shown in Figure 20. Quiescent galaxies start to be incomplete at stellar masses ∼0.4 dex higher than the total sample because they have larger mass-to-light ratios. Mlim at z < 2.5 is additionally computed starting from the Ks limit (Table 1) and following precisely the procedure of Laigle et al. (2016). However, due to the nearly uniform coverage of the new data set, there is not a significant difference between the completeness limits of the ultradeep and deep regions. The Ks-based completeness is well described by the function

for z < 2.5 and is more complete by ∼0.5 dex compared to the same threshold found in COSMOS2015 (Laigle et al. 2016).
7. Conclusions
This paper describes the creation and validation of COSMOS2020, a new set of two multiwavelength catalogs of the distant universe, each of which includes photometric redshifts and other physical parameters computed from two independent codes. COSMOS2020 builds on more than a decade of panchromatic observations of the COSMOS field. Compared to previous releases, COSMOS2020 features significantly deeper optical, infrared, and near-infrared data all tied to a highly precise astrometric reference frame, Gaia.
Starting from a very deep multiband detection image and using two different photometric extraction codes, one based on aperture photometry and one based on a profile-fitting technique, two photometric catalogs have been extracted. These photometric catalogs were then used to estimate photometric redshifts and stellar masses using two different codes, LePhare and EAZY. This enables us, for the first time, to make a robust estimate of the systematic errors introduced by photometric extraction and photometric redshift estimation over a large redshift baseline with an unprecedented number of objects over 2 deg2. Our results show that all methods are in remarkable agreement. Compared to COSMOS2015 (Laigle et al. 2016), COSMOS2020 reaches the same photometric redshift precision at almost one magnitude deeper. In the brightest bin, i < 22.5, the redshift precision and outlier fraction are both below 1%. Even in the faintest bin, 25 < i < 27, photometric redshift precision is still ∼4% with an outlier fraction of ∼20%. A detailed comparison in Section 5 shows that at bright magnitudes, the classic aperture catalog is marginally superior whereas at faint magnitudes the trend is reversed with the profile-fitting technique providing a better result. This close agreement provides a unique validation of our measurement and photometric redshift techniques. Superseding our previous catalogs, COSMOS2020 represents an unparalleled deep and wide picture of the distant universe. It will be of invaluable assistance in preparing for the next generation of large telescopes and surveys.
One can already start to imagine what COSMOS2025 might contain. After 15 years of observations, the UltraVISTA survey will have been completed, providing an unparalleled near-infrared view of COSMOS. These data, combined with the Spitzer data presented here, will lay the foundation for a next-generation catalog combining deep high-resolution optical and infrared imaging data from Euclid and the James Webb Space Telescope with ultradeep optical data from Rubin. Such a catalog will be an important step toward producing a mass-complete survey comprising every single galaxy in a representative volume from the present day to the epoch of reionization.
This paper is dedicated to Olivier Le Fèvre. Spectroscopic redshifts from his VIMOS instrument (often collected in surveys that he designed and led) played an invaluable role in preparing this catalog.
The authors thank Nathaniel Strickley, Dustin Lang, Clara Giménez Arteaga, Istvan Szápudi, Andrew Repp, and Emmanuel Bertin for helpful discussions. We are also grateful for the many helpful and constructive comments from the anonymous referee. We gratefully acknowledge the contributions of the entire COSMOS collaboration consisting of more than 100 scientists. The HST COSMOS program was supported through NASA grant HST-GO-09822. More information on the COSMOS survey is available at https://cosmos.astro.caltech.edu/.
The Cosmic Dawn Center (DAWN) is funded by the Danish National Research Foundation under grant No. 140. S.T., G.B., and J.W. acknowledge support from the European Research Council (ERC) Consolidator Grant funding scheme (project ConTExt, grant No. 648179). O.I. acknowledges the funding of the French Agence Nationale de la Recherche for the project “SAGACE”. H.J.Mc.C. acknowledges support from the PNCG. I.D. has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 896225. This work used the CANDIDE computer system at the IAP supported by grants from the PNCG and the DIM-ACAV and maintained by S. Rouberol. B.M.J. is supported in part by Independent Research Fund Denmark grant DFF—7014–00017. C.M.C. thanks the National Science Foundation for support through grants AST-1714528, AST-1814034 and AST-2009577, and additionally the University of Texas at Austin College of Natural Sciences, and the Research Corporation for Science Advancement from a 2019 Cottrell Scholar Award sponsored by IF/THEN, an initiative of Lydia Hill Philanthropies. The work of D.S. was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with NASA. J.D.S. is supported by KAKENHI (26400221) through the Japan Society for the Promotion of Science (JSPS), the World Premier International Research Center Initiative (WPI), MEXT, Japan. G.E.M. acknowledges the Villum Fonden research grant 13160 “Gas to stars, stars to dust: tracing star formation across cosmic time.” D.R. acknowledges support from the National Science Foundation under grant numbers AST-1614213 and AST-1910107. D.R. also acknowledges support from the Alexander von Humboldt Foundation through a Humboldt Research Fellowship for Experienced Researchers. M.S. acknowledges the support of the Natural Sciences and Engineering Research Council of Canada (NSERC).
The authors wish to recognize and acknowledge the very significant cultural role and reverence that the summit of Maunakea has always had within the indigenous Hawaiian community. We are most fortunate to have the opportunity to conduct observations from this mountain. This work is based on data products from observations made with ESO Telescopes at the La Silla Paranal Observatory under ESO program ID 179.A-2005 and on data products produced by CALET and the Cambridge Astronomy Survey Unit on behalf of the UltraVISTA consortium. This work is based in part on observations made with the NASA/ESA Hubble Space Telescope, obtained from the Data Archive at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555. Some of the data presented herein were obtained at the W.M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W.M. Keck Foundation. This research is also partly supported by the Centre National d’Etudes Spatiales (CNES). These data were obtained and processed as part of the CFHT Large Area U-band Deep Survey (CLAUDS), which is a collaboration between astronomers from Canada, France, and China described in Sawicki et al. (2019, MNRAS 489, 5202). CLAUDS is based on observations obtained with MegaPrime/MegaCam, a joint project of CFHT and CEA/DAPNIA, at the CFHT which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l’Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii.
CLAUDS uses data obtained in part through the Telescope Access Program (TAP), which has been funded by the National Astronomical Observatories, Chinese Academy of Sciences, and the Special Fund for Astronomy from the Ministry of Finance of China. CLAUDS uses data products from TERAPIX and the Canadian Astronomy Data Centre (CADC) and was carried out using resources from Compute Canada and Canadian Advanced Network For Astrophysical Research (CANFAR).
Authors contributed to the paper as follows: A.M., H.J.Mc.C., P.C., S.G. processed the imaging data; J.W., O.K., I.D., M.Sh., and B.C.H. produced the photometric catalogs; J.W., O.I., and G.B. produced the photometric redshifts and physical parameters catalogs; H.J.Mc.C. and S.T. supervised this study. All of these authors contributed to the validation and testing of the catalogs. The second group of authors (C.L. to Z.L.) covers those who have either made a significant contribution to assemble the data products or to the scientific analysis. The remaining authors (S.A. to G.Z.) contributed in some way to the data products, conceptualization, validation, and/or analysis of this work.
Facilities: ESO:VISTA (European Southern Observatory's 4.1 meter Visible and Infrared Survey Telescope for Astronomy), Subaru (HSC), Spitzer (IRAC), CFHT, GALEX, HST, Gaia.
Software: numpy (van der Walt et al. 2011), matplotlib (Hunter 2007), astropy (Astropy Collaboration et al. 2013, 2018), SExtractor (Bertin & Arnouts 1996), PSFEx (Bertin 2013), The Tractor (Lang et al. 2016), and The Farmer (J. R. Weaver et al. 2022, in preparation).
Appendix A: Data Release
Both the Classic and The Farmer catalogs detailed in this work are publicly available in FITS format through the ESO Phase 3 System (http://eso.org/rm/publicAccess#/dataReleases under the label UltraVISTA DR4.1) and through servers at the Institut d’Astrophysique de Paris (https://cosmos2020.calet.org). Each catalog includes object positions, region mask flags, photometry, limited ancillary data (e.g., HST/ACS, GALEX), as well as photometric redshifts and physical parameters measured by both LePhare and EAZY, for each set of photometry. Four additional files will contain the redshift probability distributions for the two photometric catalogs in combination with both photometric redshift codes. Corresponding documentation will include information about the use of mask flags and their respective regions. This data set will also be made available through the IPAC-IRSA and CDS VizieR systems. Each catalog is a distinct item in the digital object identifier (DOI) system. In work relying on COSMOS2020 data, the DOI name(s) should be cited, in addition to a reference to the present article, to keep track of which file(s) are actually used.
Science investigators who publish software analyzing these catalogs are encouraged to link their (e.g., github) repository to the website https://paperswithcode.com/; in this way the code will be also visible in the arXiv entry of the present publication under the section “Code & Data—Community Code”.
Appendix B: Source Detection Parameters
A listing of detection parameters from SExtractor is shown in Table 4.
Table 4. SExtractor Parameters Used for the Aperture Detection and Photometry
| Name | Value |
|---|---|
| ANALYSIS_THRESH | 1.5 |
| BACKPHOTO_THICK | 30 |
| BACKPHOTO_TYPE | LOCAL |
| BACK_FILTERSIZE | 3 |
| BACK_SIZE | 128 |
| BACK_TYPE | AUTO |
| CLEAN | Y |
| CLEAN_PARAM | 1.0 |
| DEBLEND_MINCONT | 0.00001 |
| DEBLEND_NTHRESH | 32 |
| DETECT_MAXAREA | 100000 |
| DETECT_MINAREA | 5 |
| DETECT_THRESH | 1.5 |
| DETECT_TYPE | CCD |
| FILTER | Y |
| FILTER_NAME | gauss_4.0_7x7.conv |
| GAIN | band-dependent |
| MAG_ZEROPOINT | band-dependent |
| MASK_TYPE | CORRECT |
| PHOT_APERTURES | 13.33,20.00,47.33 |
| PHOT_AUTOAPERS | 13.3,13.3 |
| PHOT_AUTOPARAMS | 2.5,3.5 |
| PHOT_FLUXFRAC | 0.2,0.5,0.8 |
| RESCALE_WEIGHTS | N |
| SATUR_LEVEL | 30000 |
| THRESH_TYPE | ABSOLUTE |
| WEIGHT_GAIN | N |
| WEIGHT_TYPE | MAP_WEIGHT,MAP_WEIGHT |
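As an illustration of how the parameters in Table 4 might be passed to SExtractor, the following minimal sketch renders them as the plain-text `KEYWORD value` lines of a configuration file. The function name `render_sextractor_config` and the example gain and zero-point values are hypothetical; the paper lists GAIN and MAG_ZEROPOINT only as band-dependent, so they must be supplied per band.

```python
# Fixed detection parameters transcribed from Table 4.
# GAIN and MAG_ZEROPOINT are band-dependent and are supplied by the caller.
TABLE4_PARAMS = {
    "ANALYSIS_THRESH": "1.5",
    "BACKPHOTO_THICK": "30",
    "BACKPHOTO_TYPE": "LOCAL",
    "BACK_FILTERSIZE": "3",
    "BACK_SIZE": "128",
    "BACK_TYPE": "AUTO",
    "CLEAN": "Y",
    "CLEAN_PARAM": "1.0",
    "DEBLEND_MINCONT": "0.00001",
    "DEBLEND_NTHRESH": "32",
    "DETECT_MAXAREA": "100000",
    "DETECT_MINAREA": "5",
    "DETECT_THRESH": "1.5",
    "DETECT_TYPE": "CCD",
    "FILTER": "Y",
    "FILTER_NAME": "gauss_4.0_7x7.conv",
    "MASK_TYPE": "CORRECT",
    "PHOT_APERTURES": "13.33,20.00,47.33",
    "PHOT_AUTOAPERS": "13.3,13.3",
    "PHOT_AUTOPARAMS": "2.5,3.5",
    "PHOT_FLUXFRAC": "0.2,0.5,0.8",
    "RESCALE_WEIGHTS": "N",
    "SATUR_LEVEL": "30000",
    "THRESH_TYPE": "ABSOLUTE",
    "WEIGHT_GAIN": "N",
    "WEIGHT_TYPE": "MAP_WEIGHT,MAP_WEIGHT",
}


def render_sextractor_config(gain, mag_zeropoint):
    """Return configuration-file text for one band (hypothetical helper)."""
    params = dict(TABLE4_PARAMS)
    params["GAIN"] = str(gain)
    params["MAG_ZEROPOINT"] = str(mag_zeropoint)
    width = max(len(name) for name in params)
    # One "KEYWORD  value" pair per line, sorted by keyword as in Table 4.
    lines = [f"{name:<{width}}  {params[name]}" for name in sorted(params)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only; they are not from the paper.
    print(render_sextractor_config(gain=1.0, mag_zeropoint=30.0))
```

Keeping the fixed parameters in a single mapping makes it straightforward to generate one configuration per band while changing only the two band-dependent entries.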
Appendix C: Comparison with Reference Photometry
The comparisons shown in Section 4 are supplemented here by comparing selected bands in this work to two well-known COSMOS-field literature catalogs with which this work is readily comparable: CANDELS (Nayyeri et al. 2017, using UltraVISTA DR1 and IRAC/SPLASH) and COSMOS2015 (Laigle et al. 2016, using UltraVISTA DR2 and IRAC/SPLASH). As shown in Figure 21, broadband Ks and IRAC channel 1 magnitudes and their colors are compared up to the depth limit of the shallower literature data set, indicated by the vertical dashed line. For fairness, the sample includes only the ∼18,000 sources that are common to all three catalogs within 0.″6.
Figure 21. Comparison of broadband Ks and IRAC channel 1 magnitudes and color between The Farmer catalog of this work and those of CANDELS (Nayyeri et al. 2017) and COSMOS2015 (Laigle et al. 2016). Individual sources are shown by the underlying density histogram, which is summarized by the overlaid median binned by 0.2 AB with an envelope containing 68% of sources per bin. For the magnitudes, depths are shown for the comparison sample (dashed) and for COSMOS2020 (dotted), corresponding to 3σ depths measured with 3″ diameter apertures. For the colors, the depths shown are the averaged 3σ depths computed from both bands of interest, measured with 3″ diameter apertures. The median Δmag offsets are reported for sources below the dashed magnitude limit.
A brief analysis reveals three main points. First, the COSMOS2020 depths in the bands considered exceed both those in CANDELS and COSMOS2015, as indicated by the vertical dashed and dotted lines, which manifests in the high scatter beyond the brightest magnitude limit. This restricts a meaningful comparison to sources below this limit. Second, the comparison with COSMOS2015 looks identical to the comparison of those bands between The Farmer and Classic, both in terms of offset and any trends with magnitude. This suggests that the Classic photometry is highly consistent with COSMOS2015, as verified directly during the catalog preparation process. Finally, the comparison of The Farmer photometry with CANDELS is broadly similar. Although the Ks offset is larger than in comparisons with COSMOS2015 and Classic, the trend with magnitude in channel 1 is more constant than with either COSMOS2015 or Classic. The differences in Ks and channel 1 are similarly reflected in the colors, being more constant when comparing with CANDELS but not COSMOS2015. The similarity in the comparison with COSMOS2015 and Classic is expected, because both employed the same methodologies, by design. Similarly, the model-fitting employed in the IRAC photometry in CANDELS is more similar to that used by The Farmer and hence their agreement is unsurprising.
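The running median and 68% envelope summarized in Figure 21 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used for the figure: the function names, the sign convention (Δmag taken as new minus reference), the reading of "below the limit" as brighter than the limit, the minimum bin occupancy, and the synthetic demo data are all hypothetical. Only the 0.2 mag bin width and the 68% envelope come from the figure caption.

```python
import math
import random
import statistics


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 100]."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def binned_offsets(mag_ref, mag_new, bin_width=0.2, mag_limit=None, min_count=5):
    """Median and central-68% envelope of (mag_new - mag_ref) per magnitude bin.

    Sources with mag_ref fainter than mag_limit are excluded (assumed convention).
    Bins holding fewer than min_count sources are skipped.
    """
    bins = {}
    for m_ref, m_new in zip(mag_ref, mag_new):
        if mag_limit is not None and m_ref > mag_limit:
            continue
        idx = math.floor(m_ref / bin_width)
        bins.setdefault(idx, []).append(m_new - m_ref)

    rows = []
    for idx in sorted(bins):
        deltas = sorted(bins[idx])
        if len(deltas) < min_count:
            continue
        rows.append({
            "mag_center": (idx + 0.5) * bin_width,
            "median": statistics.median(deltas),
            "p16": percentile(deltas, 16),  # lower edge of the 68% envelope
            "p84": percentile(deltas, 84),  # upper edge of the 68% envelope
            "n": len(deltas),
        })
    return rows


if __name__ == "__main__":
    # Synthetic magnitudes for illustration only; not survey data.
    rng = random.Random(0)
    ref = [rng.uniform(20.0, 25.0) for _ in range(5000)]
    new = [m + rng.gauss(0.02, 0.05) for m in ref]
    for row in binned_offsets(ref, new, mag_limit=24.5):
        print(f"{row['mag_center']:.1f}  {row['median']:+.3f}  "
              f"[{row['p16']:+.3f}, {row['p84']:+.3f}]  n={row['n']}")
```

Using the 16th and 84th percentiles rather than a standard deviation keeps the envelope robust to the outliers that appear near the depth limit of the shallower catalog.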
Footnotes
- 41
- 42
- 43
The reported u*-band depth is deeper than COSMOS2015 because this work averages over the UltraVISTA layout, compared to the entire field in Laigle et al. (2016).
- 44
- 45
- 46
- 47
The 6.2 and 7.7 μm PAH lines contribute to the IRAC channel 4 photometry at z < 0.3, and the 3.3 μm line to both channels 3 and 4 with a lower contribution.
- 48
- 49
The COSMOS2020 photo-z uncertainties are rescaled by a factor of 1 + 0.1(i − 21) for galaxies fainter than i = 21. Applying the same method and using the new spec-z sample, the COSMOS2015 photo-z uncertainties are rescaled by a factor of 1.3.
- 50
In the COSMOS field, the CANDELS detection image HST/F160W has a 5σ limit at 27.56 mag within 0.″34 diameter apertures, corresponding to twice the PSF FWHM.
- 51
The fraction of recovered CANDELS sources is the same with the Classic catalog.






















