Summary of GALWORKS Results for RTB 971205n
-- low source density fields --

T. Jarrett, IPAC
(980410)

Several nights of 2MASS data have been specifically designated as RTB data sets (to be used to debug and check different versions of the 2MAPPS pipeline reduction). For one such night, 971205n, the extended source results (GALWORKS output) have been fully analyzed to gauge the performance of GALWORKS on a much larger scale than previously attempted. The plots and tables given below summarize the results.

The 971205n data set consists of 82 full 6-degree science scans (and, of course, a whole slew of 1-degree calibration scans). Most of the scans are from low stellar source density fields (< 10**3.2 stars/deg^2, K < 14), with only 16 scans located in mid- to high source density fields. For the most part, only the scans subject to low stellar source density are discussed here.

Each source deemed "extended" by GALWORKS was carefully examined by eye using tools developed by Jarrett. Nearly 3000 sources were part of the 971205n extended source output (from low density scans). Each source was classified by viewing the J, H and K postage stamp images, as well as the corresponding DSS (optical) stamp images and the "3-color" image generated from the J, H and K images (the latter is quite powerful since it helps the user see "color" and increases the SNR for the low surface brightness "fuzz" around galaxies). Sources were classified into the following categories:

GALWORKS is 'tuned' to optimize galaxy completeness at the expense of reliability. The minimum acceptable reliability chosen to optimize completeness is about 80%. That is to say, we tolerate a 20% contamination of our extended source database by false galaxies in order to be >95% complete in galaxy detection/extraction. However, in order to meet the level-1 specification for catalog generation (reliability > 98%), we will apply various methods to weed out the false sources (which, as will be shown below, are double stars and artifacts, like meteor streaks). These methods will be applied in a 'post-processing' phase; they include application of decision trees, neural nets, and N-space classification. The latter will be discussed in more detail in a coming memo by Jarrett.

There are in total about 2700 extended source candidates with J, H or K brighter than 15.5 mag. A number count summary of the sources is given in the tables below. The first table shows the sources binned according to their J, H and K mags, where the mags correspond to photometry in a fixed circular aperture of radius 7". No limit is applied to the "shape" score. The second table is the same as the first except that now we have applied a "shape" score limit of 10.0 (part of the level-1 spec) to give some idea how many sources (in %) are relevant to the spec. It turns out that 75 to 85% of the total number of extended source candidates have "shape" scores greater than 10.0. Notes: nG = # galaxies, nBog = total number of false galaxies, nS = # stars, nD = # doubles, ntr = # triple stars, nA = # artifacts, nU = # unknowns.

Binned Number Counts for GALWORKS Extracted Sources; no "sh" limit

J-        J+    nG  nBog  reliab   nS   nD   nA  ntr   nU
5.00    9.00     0     0    0.00    0    0    0    0    0
9.00   10.00     0     0    0.00    0    0    0    0    0
10.00  11.00     2     4    0.33    0    4    0    0    0
11.00  12.00    15    10    0.60    0   10    0    0    0
12.00  13.00    51    30    0.63    0   29    0    1    0
13.00  13.50   117    23    0.84    0   22    0    1    0
13.50  14.00   195    38    0.84    1   34    0    3    0
14.00  14.50   439    71    0.86    3   65    1    2    6
14.50  15.00   730   127    0.85   22   93    4    8   59
15.00  15.50   376    59    0.86   27   26    5    1  115
15.50  16.00    18     5    0.78    0    2    3    0    7

H-        H+    nG  nBog  reliab   nS   nD   nA  ntr   nU
5.00    9.00     0     1    0.00    1    0    0    0    0
9.00   10.00     0     1    0.00    0    1    0    0    0
10.00  11.00    13     6    0.68    0    6    0    0    0
11.00  12.00    36    16    0.69    0   16    0    0    0
12.00  13.00   218    47    0.82    0   46    0    1    0
13.00  13.50   275    40    0.87    2   34    0    4    1
13.50  14.00   564    85    0.87    8   73    0    4   18
14.00  14.50   729   176    0.81   67   98    4    7  130
14.50  15.00   113    33    0.77   17   14    2    0   43
15.00  15.50     0     7    0.00    1    1    5    0    2
15.50  16.00     0     4    0.00    1    1    2    0    0

K-        K+    nG  nBog  reliab   nS   nD   nA  ntr   nU
5.00    9.00     0     0    0.00    0    0    0    0    0
9.00   10.00     2     0    1.00    0    0    0    0    0
10.00  11.00    18     7    0.72    0    7    0    0    0
11.00  12.00    60    19    0.76    0   19    0    0    0
12.00  13.00   369    60    0.86    0   55    2    3    2
13.00  13.50   530    62    0.90    3   57    0    2   11
13.50  14.00   799   165    0.83   64   94    3    4  115
14.00  14.50   165   113    0.59   52   55    2    4   56
14.50  15.00     2    14    0.12    3    7    3    1    3
15.00  15.50     0     5    0.00    0    2    1    2    1
15.50  16.00     0     1    0.00    0    0    1    0    1

Binned Number Counts for GALWORKS Extracted Sources w/ "sh" limit = 10.0

J-        J+    nG  nBog  reliab   nS   nD   nA  ntr   nU
5.00    9.00     0     0    0.00    0    0    0    0    0
9.00   10.00     0     0    0.00    0    0    0    0    0
10.00  11.00     2     1    0.67    0    1    0    0    0
11.00  12.00    15     3    0.83    0    3    0    0    0
12.00  13.00    51    12    0.81    0   12    0    0    0
13.00  13.50   117    10    0.92    0    9    0    1    0
13.50  14.00   194    19    0.91    1   15    0    3    0
14.00  14.50   424    31    0.93    1   28    1    1    6
14.50  15.00   679    66    0.91    9   51    3    3   42
15.00  15.50   313    26    0.92   11    9    5    1   73
15.50  16.00    13     5    0.72    0    2    3    0    5

H-        H+    nG  nBog  reliab   nS   nD   nA  ntr   nU
5.00    9.00     0     0    0.00    0    0    0    0    0
9.00   10.00     0     0    0.00    0    0    0    0    0
10.00  11.00    13     2    0.87    0    2    0    0    0
11.00  12.00    36     5    0.88    0    5    0    0    0
12.00  13.00   214    16    0.93    0   16    0    0    0
13.00  13.50   263    21    0.93    1   16    0    4    1
13.50  14.00   513    36    0.93    1   32    0    3   12
14.00  14.50   594    79    0.88   19   52    4    4   71
14.50  15.00    68     6    0.92    3    2    1    0   19
15.00  15.50     0     4    0.00    0    0    4    0    1
15.50  16.00     0     0    0.00    0    0    0    0    0

K-        K+    nG  nBog  reliab   nS   nD   nA  ntr   nU
5.00    9.00     0     0    0.00    0    0    0    0    0
9.00   10.00     2     0    1.00    0    0    0    0    0
10.00  11.00    18     3    0.86    0    3    0    0    0
11.00  12.00    60     5    0.92    0    5    0    0    0
12.00  13.00   333    20    0.94    0   17    2    1    2
13.00  13.50   415    30    0.93    0   29    0    1    7
13.50  14.00   555    47    0.92   10   33    3    1   56
14.00  14.50    84    27    0.76    7   17    2    1   15
14.50  15.00     2     4    0.33    0    3    1    0    1
15.00  15.50     0     0    0.00    0    0    0    0    0
15.50  16.00     0     0    0.00    0    0    0    0    0
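
The "reliab" column is not defined explicitly in the notes. The tabulated rows are consistent with nBog = nS + nD + nA + ntr (unknowns excluded) and reliab = nG / (nG + nBog), so the following minimal sketch assumes that relation. The function name and argument names are illustrative only.

```python
def reliability(n_gal, n_star, n_double, n_artifact, n_triple):
    """Return (n_bogus, reliability) for one magnitude bin.

    Assumes nBog = nS + nD + nA + ntr and reliab = nG / (nG + nBog);
    unknowns (nU) are left out, as the tabulated rows suggest.
    """
    n_bogus = n_star + n_double + n_artifact + n_triple
    total = n_gal + n_bogus
    # The tables print 0.00 for empty bins; mirror that convention here.
    rel = n_gal / total if total else 0.0
    return n_bogus, rel


# J band, 14.50 to 15.00 mag, no "sh" limit: nG=730, nS=22, nD=93, nA=4, ntr=8
n_bogus, rel = reliability(730, 22, 93, 4, 8)
print(n_bogus, round(rel, 2))  # prints: 127 0.85
```

The same call with the "sh"-limited counts for that bin (nG=679, nS=9, nD=51, nA=3, ntr=3) gives 66 false sources and a reliability of 0.91, matching the second table.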


Star - Galaxy Discrimination Parameters

Notes:


Extended Source Colors


Preliminary Effort at Automated Classification

We can use the "truth" set of 971205n stars, galaxies, and artifacts to build an automated "classification" scheme that uses all or nearly all of the 2-D and 1-D information we have for each source. The basic method is explained in "Optimized Star-Galaxy Discrimination".

To summarize the method (with updated changes):

Using RTB data fields, we can fully explore the space formed by the score-weighted average and determine the optimum weights by maximizing the reliability.

For each source (note: since we are using RTB data, we know what each source is: galaxy, double star, triple star, artifact, etc.), we perform the following calculation (weighted average):

SS(i) is the "super" or combined score for the ith source. We do this for all of the sources, then apply a threshold to SS, and compute the resultant completeness (C) and reliability (R). Our goal is to find the set of weights W(j) that maximizes R (with C >= Clim, say 95%).
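
A minimal sketch of this step, assuming the combined score is the ordinary weighted average SS(i) = SUM_j [W(j) * S(i,j)] / SUM_j W(j), where S(i,j) is a hypothetical name for the jth normalized score of source i. It also assumes that sources with SS at or above the threshold are called galaxies (the direction of the cut is not stated here), that C is the fraction of true galaxies recovered, and that R is the fraction of selected sources that are true galaxies. The function names and the toy values are illustrative, not 971205n data.

```python
def super_score(scores, weights):
    """Weighted average of one source's normalized scores (assumed form)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(w * s for w, s in zip(weights, scores)) / total_weight


def completeness_reliability(score_rows, is_galaxy, weights, threshold):
    """Return (C, R) when sources with SS >= threshold are called galaxies."""
    selected = [super_score(row, weights) >= threshold for row in score_rows]
    n_true_gal = sum(is_galaxy)
    n_selected = sum(selected)
    n_hit = sum(1 for sel, gal in zip(selected, is_galaxy) if sel and gal)
    completeness = n_hit / n_true_gal if n_true_gal else 0.0
    reliability = n_hit / n_selected if n_selected else 0.0
    return completeness, reliability


# Synthetic truth set: four sources with two scores each.
score_rows = [(0.9, 0.8), (0.7, 0.9), (0.2, 0.6), (0.1, 0.3)]
is_galaxy = [True, True, False, False]
c, r = completeness_reliability(score_rows, is_galaxy, weights=(2, 1), threshold=0.3)
print(c, r)
```

In this toy case the threshold admits one false source along with both galaxies, so completeness is 1 and reliability is 2/3; raising the threshold trades one against the other.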

Let's suppose that we may assign each score an integer weight between 0 and 9. And let's suppose that we have a total of 7 scores: "mxdn", "sh", "flux ratio", "wsh", "r23", "r1", and "J-K color". Our score space then has 7 dimensions (ignoring integrated flux), and the total number of possible combinations is 10**7, or 10 million. We thus explore 10 million different combinations of the 7 score weights to arrive at the best combination. We can use these optimized weights to compute a "super" score for each object and threshold on this score to separate bogies from real extended sources (galaxies).

Since the relative importance of the scores changes with both integrated flux and with source density, we want to perform this operation for different mag bins and for different source densities (as noted above). This study considers three different mag bins: bright (everything brighter than the last level-1 spec bin), the last level-1 spec bin (e.g., 13.0 < K < 13.5), and a faint bin beyond the level-1 spec (e.g., 13.5 < K < 13.8). And of course we do the operation for each band, J, H and K, separately.
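
The exhaustive search over integer weights can be sketched as follows. This is an illustration under stated assumptions, not the GALWORKS implementation: the combined score is taken to be a plain weighted average, sources at or above a fixed threshold are called galaxies, and the function name best_weights and the toy truth set are hypothetical.

```python
from itertools import product


def best_weights(score_rows, is_galaxy, max_weight, threshold, c_min):
    """Exhaustively search integer weights 0..max_weight for each score.

    Returns (weights, completeness, reliability) for the combination with
    the highest reliability among those with completeness >= c_min, or
    None if no combination qualifies.
    """
    n_scores = len(score_rows[0])
    n_true_gal = sum(is_galaxy)
    best = None
    for weights in product(range(max_weight + 1), repeat=n_scores):
        total_weight = sum(weights)
        if total_weight == 0:
            continue  # the all-zero combination has no defined average
        n_selected = n_hit = 0
        for row, gal in zip(score_rows, is_galaxy):
            ss = sum(w * s for w, s in zip(weights, row)) / total_weight
            if ss >= threshold:
                n_selected += 1
                n_hit += gal
        if n_selected == 0 or n_true_gal == 0:
            continue
        completeness = n_hit / n_true_gal
        reliability = n_hit / n_selected
        if completeness >= c_min and (best is None or reliability > best[2]):
            best = (weights, completeness, reliability)
    return best


# Synthetic truth set: the first score separates galaxies, the second misleads.
score_rows = [(0.9, 0.2), (0.8, 0.9), (0.3, 0.9), (0.2, 0.1)]
is_galaxy = [True, True, False, False]
result = best_weights(score_rows, is_galaxy, max_weight=2, threshold=0.5, c_min=0.95)
print(result)
```

For 7 scores the loop visits (max_weight + 1)**7 combinations: 10**7 for weights 0 to 9, and 7**7 = 823,543 for weights 0 to 6. Because a weighted average is unchanged when all weights are multiplied by a common factor, many of these combinations are equivalent (for example, (1, 0) and (2, 0) above); the sketch keeps the first one found.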

After normalizing the scores (so that they can be inter-compared) and finding the optimized weights (possible weights explored: 0 to 6, or 7**7 = 823,543 possible combinations; the truth table used included sources from 971205n, one scan of VIRGO and several scans from Abell 262), we get the following "super" scores and galaxy probabilities for 971205n.