T. Jarrett, IPAC
(980410)
Several nights of 2MASS data have been specifically designated as RTB data sets, to be used to debug and check different versions of the 2MAPPS pipeline reduction. For one such night, 971205n, the extended source results (GALWORKS output) have been fully analyzed to gauge the performance of GALWORKS on a much larger scale than previously attempted. The plots and tables given below summarize the results.
Night 971205n consists of 82 full 6-degree science scans (and, of course, a whole slew of 1-degree calibration scans). Most of the scans are from low stellar source density fields (< 10**3.2 stars/deg^2, K < 14), with only 16 scans located in mid- to high-density fields. For the most part, only the scans subject to low stellar source density are discussed here.
Each source deemed "extended" by GALWORKS was carefully examined by eye using tools developed by Jarrett. Nearly 3000 sources were part of the 971205n extended source output (from low density scans). Each source was classified by viewing the J, H and K postage stamp images, as well as the corresponding DSS (optical) stamp images and the "3-color" image generated from the J, H and K images. The latter is quite powerful, since it helps the user see "color" and also increases the SNR for the low surface brightness "fuzz" around galaxies. Sources were classified into the following categories: galaxies, stars, double stars, triple stars, artifacts, and unknowns.
GALWORKS is "tuned" to optimize galaxy completeness at the expense of reliability. The minimum acceptable reliability chosen to optimize completeness is about 80%. That is to say, we tolerate a 20% contamination of our extended source database by false galaxies in order to be >95% complete in galaxy detection/extraction. However, in order to meet the level-1 specification for catalog generation (reliability > 98%), we will apply various methods to weed out the false sources (which, as will be shown below, are double stars and artifacts such as meteor streaks). These methods will be applied in a "post-processing" phase; they include the application of decision trees.
There are in total about 2700 extended source candidates with J, H or K brighter than 15.5 mag. A number count summary of the sources is given in the tables below. The first table shows the sources binned according to their J, H and K mags, where the mags correspond to fixed circular aperture photometry (radius = 7"). No "shape" score limit is applied. The second table is the same as the first, except that a "shape" score limit of 10.0 (part of the level-1 spec) has been applied to give some idea of how many sources (in %) are relevant to the spec. It turns out that 75 to 85% of the total number of extended source candidates have "shape" scores greater than 10.0. Notes: the first two columns (J-, J+ and likewise for H and K) are the bright and faint mag limits of each bin, nG = # galaxies, nBog = total number of false galaxies, nS = # stars, nD = # doubles, ntr = # triple stars, nA = # artifacts, nU = # unknowns.
With no "shape" score limit applied:

J- | J+ | nG | nBog | reliab | nS | nD | nA | ntr | nU |
---|---|---|---|---|---|---|---|---|---|
5.00 | 9.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
9.00 | 10.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
10.00 | 11.00 | 2 | 4 | 0.33 | 0 | 4 | 0 | 0 | 0 |
11.00 | 12.00 | 15 | 10 | 0.60 | 0 | 10 | 0 | 0 | 0 |
12.00 | 13.00 | 51 | 30 | 0.63 | 0 | 29 | 0 | 1 | 0 |
13.00 | 13.50 | 117 | 23 | 0.84 | 0 | 22 | 0 | 1 | 0 |
13.50 | 14.00 | 195 | 38 | 0.84 | 1 | 34 | 0 | 3 | 0 |
14.00 | 14.50 | 439 | 71 | 0.86 | 3 | 65 | 1 | 2 | 6 |
14.50 | 15.00 | 730 | 127 | 0.85 | 22 | 93 | 4 | 8 | 59 |
15.00 | 15.50 | 376 | 59 | 0.86 | 27 | 26 | 5 | 1 | 115 |
15.50 | 16.00 | 18 | 5 | 0.78 | 0 | 2 | 3 | 0 | 7 |

H- | H+ | nG | nBog | reliab | nS | nD | nA | ntr | nU |
---|---|---|---|---|---|---|---|---|---|
5.00 | 9.00 | 0 | 1 | 0.00 | 1 | 0 | 0 | 0 | 0 |
9.00 | 10.00 | 0 | 1 | 0.00 | 0 | 1 | 0 | 0 | 0 |
10.00 | 11.00 | 13 | 6 | 0.68 | 0 | 6 | 0 | 0 | 0 |
11.00 | 12.00 | 36 | 16 | 0.69 | 0 | 16 | 0 | 0 | 0 |
12.00 | 13.00 | 218 | 47 | 0.82 | 0 | 46 | 0 | 1 | 0 |
13.00 | 13.50 | 275 | 40 | 0.87 | 2 | 34 | 0 | 4 | 1 |
13.50 | 14.00 | 564 | 85 | 0.87 | 8 | 73 | 0 | 4 | 18 |
14.00 | 14.50 | 729 | 176 | 0.81 | 67 | 98 | 4 | 7 | 130 |
14.50 | 15.00 | 113 | 33 | 0.77 | 17 | 14 | 2 | 0 | 43 |
15.00 | 15.50 | 0 | 7 | 0.00 | 1 | 1 | 5 | 0 | 2 |
15.50 | 16.00 | 0 | 4 | 0.00 | 1 | 1 | 2 | 0 | 0 |

K- | K+ | nG | nBog | reliab | nS | nD | nA | ntr | nU |
---|---|---|---|---|---|---|---|---|---|
5.00 | 9.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
9.00 | 10.00 | 2 | 0 | 1.00 | 0 | 0 | 0 | 0 | 0 |
10.00 | 11.00 | 18 | 7 | 0.72 | 0 | 7 | 0 | 0 | 0 |
11.00 | 12.00 | 60 | 19 | 0.76 | 0 | 19 | 0 | 0 | 0 |
12.00 | 13.00 | 369 | 60 | 0.86 | 0 | 55 | 2 | 3 | 2 |
13.00 | 13.50 | 530 | 62 | 0.90 | 3 | 57 | 0 | 2 | 11 |
13.50 | 14.00 | 799 | 165 | 0.83 | 64 | 94 | 3 | 4 | 115 |
14.00 | 14.50 | 165 | 113 | 0.59 | 52 | 55 | 2 | 4 | 56 |
14.50 | 15.00 | 2 | 14 | 0.12 | 3 | 7 | 3 | 1 | 3 |
15.00 | 15.50 | 0 | 5 | 0.00 | 0 | 2 | 1 | 2 | 1 |
15.50 | 16.00 | 0 | 1 | 0.00 | 0 | 0 | 1 | 0 | 1 |

With the "shape" score limit of 10.0 applied:

J- | J+ | nG | nBog | reliab | nS | nD | nA | ntr | nU |
---|---|---|---|---|---|---|---|---|---|
5.00 | 9.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
9.00 | 10.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
10.00 | 11.00 | 2 | 1 | 0.67 | 0 | 1 | 0 | 0 | 0 |
11.00 | 12.00 | 15 | 3 | 0.83 | 0 | 3 | 0 | 0 | 0 |
12.00 | 13.00 | 51 | 12 | 0.81 | 0 | 12 | 0 | 0 | 0 |
13.00 | 13.50 | 117 | 10 | 0.92 | 0 | 9 | 0 | 1 | 0 |
13.50 | 14.00 | 194 | 19 | 0.91 | 1 | 15 | 0 | 3 | 0 |
14.00 | 14.50 | 424 | 31 | 0.93 | 1 | 28 | 1 | 1 | 6 |
14.50 | 15.00 | 679 | 66 | 0.91 | 9 | 51 | 3 | 3 | 42 |
15.00 | 15.50 | 313 | 26 | 0.92 | 11 | 9 | 5 | 1 | 73 |
15.50 | 16.00 | 13 | 5 | 0.72 | 0 | 2 | 3 | 0 | 5 |

H- | H+ | nG | nBog | reliab | nS | nD | nA | ntr | nU |
---|---|---|---|---|---|---|---|---|---|
5.00 | 9.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
9.00 | 10.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
10.00 | 11.00 | 13 | 2 | 0.87 | 0 | 2 | 0 | 0 | 0 |
11.00 | 12.00 | 36 | 5 | 0.88 | 0 | 5 | 0 | 0 | 0 |
12.00 | 13.00 | 214 | 16 | 0.93 | 0 | 16 | 0 | 0 | 0 |
13.00 | 13.50 | 263 | 21 | 0.93 | 1 | 16 | 0 | 4 | 1 |
13.50 | 14.00 | 513 | 36 | 0.93 | 1 | 32 | 0 | 3 | 12 |
14.00 | 14.50 | 594 | 79 | 0.88 | 19 | 52 | 4 | 4 | 71 |
14.50 | 15.00 | 68 | 6 | 0.92 | 3 | 2 | 1 | 0 | 19 |
15.00 | 15.50 | 0 | 4 | 0.00 | 0 | 0 | 4 | 0 | 1 |
15.50 | 16.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |

K- | K+ | nG | nBog | reliab | nS | nD | nA | ntr | nU |
---|---|---|---|---|---|---|---|---|---|
5.00 | 9.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
9.00 | 10.00 | 2 | 0 | 1.00 | 0 | 0 | 0 | 0 | 0 |
10.00 | 11.00 | 18 | 3 | 0.86 | 0 | 3 | 0 | 0 | 0 |
11.00 | 12.00 | 60 | 5 | 0.92 | 0 | 5 | 0 | 0 | 0 |
12.00 | 13.00 | 333 | 20 | 0.94 | 0 | 17 | 2 | 1 | 2 |
13.00 | 13.50 | 415 | 30 | 0.93 | 0 | 29 | 0 | 1 | 7 |
13.50 | 14.00 | 555 | 47 | 0.92 | 10 | 33 | 3 | 1 | 56 |
14.00 | 14.50 | 84 | 27 | 0.76 | 7 | 17 | 2 | 1 | 15 |
14.50 | 15.00 | 2 | 4 | 0.33 | 0 | 3 | 1 | 0 | 1 |
15.00 | 15.50 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
15.50 | 16.00 | 0 | 0 | 0.00 | 0 | 0 | 0 | 0 | 0 |
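
The tabulated values are consistent with nBog = nS + nD + nA + ntr (unknowns are not counted as false galaxies) and reliab = nG / (nG + nBog), with empty bins printed as 0.00. This relation is inferred from the numbers in the tables, not stated in the text, so the short Python sketch below should be read as an assumption; the function name is hypothetical.

```python
def reliability(n_gal, n_star, n_double, n_artifact, n_triple):
    """Return (n_bogus, reliab) for one magnitude bin.

    Assumed relation (inferred from the tables, not from the text):
      n_bogus = stars + doubles + artifacts + triples (unknowns excluded)
      reliab  = n_gal / (n_gal + n_bogus), or 0.0 for an empty bin
    """
    n_bogus = n_star + n_double + n_artifact + n_triple
    total = n_gal + n_bogus
    if total == 0:
        return n_bogus, 0.0
    return n_bogus, n_gal / total


# J band, 14.00 to 14.50 bin, no "shape" limit: nG=439, nS=3, nD=65, nA=1, ntr=2
n_bog, reliab = reliability(439, 3, 65, 1, 2)
print(n_bog, round(reliab, 2))  # 71 0.86, matching the table row
```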
Star - Galaxy Discrimination Parameters
Notes:
galaxies == filled white circles
stars == small red triangles
doubles == large red triangles
triples == blue crosses
artifacts == red crosses
unknowns == yellow dots
Extended Source Colors
Notes: colors are computed using fixed circular apertures (radius = 7"); galaxies are denoted by the white filled circles and double stars by the red triangles; mean colors are also shown; K-corrections for spirals are represented by the magenta line and triangle points (each point is 0.1 in redshift) and for ellipticals by the grey line and square points; the main sequence (dwarf and giant branches) is shown in green.
Preliminary Effort at Automated Classification
We can use the "truth" set of 971205n stars, galaxies, and artifacts to build an automated "classification" scheme that uses all or nearly all of the 2-D and 1-D information we have for each source. The basic method is explained in Optimized Star-Galaxy Discrimination.
To summarize the method (with updated changes):
Using RTB data fields, we can fully explore the space formed by the score weighted average and determine the optimum weights by maximizing the reliability.
For each source (since we are using RTB data, we know what each source is: galaxy, double star, triple star, artifact, etc.), we perform the following calculation (a weighted average):
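
The expression itself is not reproduced here. A form consistent with the description (a weighted average of the scores using the weights W(j)) would be the following, where S(i,j), the jth normalized score of the ith source, is a symbol assumed for illustration:

```latex
SS(i) = \frac{\sum_{j} W(j)\, S(i,j)}{\sum_{j} W(j)}
```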
SS(i) is the "super" or combined score for the ith source. We do this for all of the sources, then apply a threshold to SS and compute the resultant completeness (C) and reliability (R). Our goal is to find the set of weights W(j) that maximizes R subject to C >= Clim (say, 95%).
Suppose that each score may be assigned an integer weight between 0 and 9, and that we have a total of 7 scores: "mxdn", "sh", "flux ratio", "wsh", "r23", "r1", and "J-K color". Our score space then has 7 dimensions (ignoring integrated flux), and the total number of possible weight combinations is 10**7, or 10 million. We thus explore 10 million different combinations of the 7 score weights to arrive at the best combination. We can use these optimized weights to compute a "super" score for each object and threshold on this score to separate bogies from real extended sources (galaxies). Since the relative importance of the scores changes with both integrated flux and source density, we want to perform this operation separately for different mag bins and for different source densities. This study considers three mag bins: bright (everything brighter than the last level-1 spec bin), the last level-1 spec bin (e.g., 13.0 < K < 13.5), and a faint bin beyond the level-1 spec (e.g., 13.5 < K < 13.8). And of course we do the operation for each band (J, H and K) separately.
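
The exhaustive search over weights can be sketched in Python as follows. This is a minimal illustration, not the actual GALWORKS code: the function names, the toy data, and the convention that a source is called a galaxy when its super score is at or above the threshold are all assumptions. C is taken as the fraction of true galaxies that pass the threshold, and R as the fraction of passing sources that are true galaxies.

```python
from itertools import product


def super_score(scores, weights):
    """Weighted average of one source's scores (0.0 if all weights are zero)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * s for w, s in zip(weights, scores)) / total


def completeness_reliability(sources, weights, threshold):
    """sources: list of (scores, is_galaxy) pairs from the truth set.

    Assumption: a source is classified as a galaxy when SS >= threshold.
    """
    n_gal = sum(1 for _, is_gal in sources if is_gal)
    passed = [is_gal for s, is_gal in sources
              if super_score(s, weights) >= threshold]
    n_pass_gal = sum(passed)
    c = n_pass_gal / n_gal if n_gal else 0.0
    r = n_pass_gal / len(passed) if passed else 0.0
    return c, r


def best_weights(sources, n_scores, max_weight, threshold, c_lim=0.95):
    """Try every integer weight vector; keep the one with the highest R
    among those with C >= c_lim. Returns (R, C, weights) or None."""
    best = None
    for weights in product(range(max_weight + 1), repeat=n_scores):
        if not any(weights):
            continue  # the all-zero weight vector carries no information
        c, r = completeness_reliability(sources, weights, threshold)
        if c >= c_lim and (best is None or r > best[0]):
            best = (r, c, weights)
    return best


# Toy truth set (made up): score 0 separates the classes, score 1 is noise.
toy = [
    ([0.9, 0.2], True), ([0.8, 0.9], True), ([0.7, 0.1], True),
    ([0.2, 0.9], False), ([0.1, 0.8], False), ([0.3, 0.2], False),
]
print(best_weights(toy, n_scores=2, max_weight=2, threshold=0.5))
```

With 7 scores and weights 0 to 9 the loop visits 10**7 weight vectors, so in practice the C and R evaluation would need to be vectorized or otherwise made cheap; the sketch favors clarity over speed.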
After normalizing the scores (so that they can be inter-compared) and finding the optimized weights, we get the following "super" scores and galaxy probabilities for 971205n. The weights explored here ranged from 0 to 6, giving 7**7 = 823,543 possible combinations; the truth table included sources from 971205n, one scan of VIRGO, and several scans from Abell 262.
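
How the scores are normalized is not specified here. One common choice, shown below purely as an assumption, is to standardize each score across the truth set to zero mean and unit variance so that scores with different native ranges can be combined in a weighted average. The function name is hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one score across all sources: (value - mean) / std.

    Assumed normalization, not necessarily the one used for GALWORKS.
    A constant score carries no information and maps to all zeros.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up "sh" scores for three sources.
print(zscore([8.0, 10.0, 12.0]))

# Size of the weight grid quoted in the text: 7 weight values, 7 scores.
print(7 ** 7)  # 823543
```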