T. Jarrett, IPAC
(980121)
As the surface density of stars increases exponentially toward the disk of the Galaxy, the "confusion noise" becomes appreciable and is one of the primary obstacles to galaxy detection and extraction. To avoid wasting valuable processor time on objects fainter than the confusion limit (defined below), GALWORKS estimates the confusion noise (via the stellar number density) and throttles back the magnitude (sensitivity) threshold limits accordingly. In addition to the mag limits, the star-galaxy discrimination score thresholds must be adjusted with density, both to minimize stellar contamination and to cap the total number of sources that GALWORKS churns out. The discussion of score thresholding, or "tuning" of the scores, follows the discussion of confusion noise.
The confusion noise calculation is explained briefly in the GALWORKS SDS document. The reader may also want to refer to the similar topic of Expected Number of Multiple Star Systems as a Function of Galactic Latitude.
The crux of the confusion noise matter is reproduced below:
Starcounts are performed for each coadd and the number density
as a function of total integrated flux is computed. Flux thresholds are
set by the coadd source density. If the density is sufficiently high
(e.g. near the Galactic plane), then the limiting flux (i.e., the total
integrated flux) is modified according to the estimated confusion, as given in the memo
by Beichman dated 02-19-93. The estimated
confusion noise is illustrated in Figure 5 of the noted memo (where the confusion+sky noise
is plotted vs. source density; here the
confusion noise is computed as a mean value in a 5 arcsec aperture).
For example, for a stellar density of 1100 stars/deg2 (K < 14) the sky
surface brightness due to confusion noise has increased by 0.2 mag
compared to the pole (300 stars/deg2); for 5000 stars/deg2 the
increase is 0.5 mag; for 10000 stars/deg2 the increase is 1.3 mag; for 20000 stars/deg2
the increase is 2.0 mag.
The estimated confusion noise as a function of the K stellar number density (log10 of cumulative stars per degree**2, with Kmag < 14.0) is given below. The solid white line represents the nominal confusion noise, while the red dashed line represents the actual value used by GALWORKS.
mid30, located at glat = 30 deg (glong = 40)
mid20, located at glat = 19 deg
mid10, located at glat = 8 to 9 deg
low5, located at glat = 3 to 6 deg
NAN (North America Nebula), located at glat = -2 deg
msx, located near glat = 0 to 1 deg
field | density | dmag | J | H | K |
---|---|---|---|---|---|
NAN | 4.3 | 2.0 | 13.5 | 12.75 | 12.25 |
low5 | 4.1 | 1.4 | 14.1 | 13.40 | 12.80 |
mid10 | 3.8 | 0.9 | 14.6 | 13.85 | 13.40 |
mid20 | 3.35 | 0.3 | 15.2 | 14.45 | 14.00 |
mid30 | 2.90 | 0.1 | 15.4 | 14.65 | 14.15 |
abell262 | <2.70 | 0.0 | 15.5 | 14.75 | 14.25 |
coma | <2.50 | 0.0 | 15.5 | 14.75 | 14.25 |
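
The table above can be turned into a small lookup that brightens the mag limits as the density rises. The following is a minimal sketch, not the GALWORKS implementation: it assumes piecewise-linear interpolation between the tabulated (density, dmag) pairs, treats the "<2.70" upper bound as a node with dmag = 0, and takes the abell262/coma limits as the unconfused values. The function names are hypothetical.

```python
from bisect import bisect_right

# (log10 stars/deg^2 with K < 14, dmag) pairs copied from the table above.
# Assumption: the "<2.70" upper bound is treated as a node with dmag = 0.
LOG_DENSITY = [2.70, 2.90, 3.35, 3.80, 4.10, 4.30]
DMAG = [0.0, 0.1, 0.3, 0.9, 1.4, 2.0]

# Unconfused limits, taken from the abell262/coma rows (dmag = 0).
POLE_LIMITS = {"J": 15.5, "H": 14.75, "K": 14.25}


def confusion_dmag(log_density):
    """Piecewise-linear interpolation of dmag, clamped at both ends."""
    if log_density <= LOG_DENSITY[0]:
        return DMAG[0]
    if log_density >= LOG_DENSITY[-1]:
        return DMAG[-1]
    i = bisect_right(LOG_DENSITY, log_density) - 1
    x0, x1 = LOG_DENSITY[i], LOG_DENSITY[i + 1]
    y0, y1 = DMAG[i], DMAG[i + 1]
    return y0 + (y1 - y0) * (log_density - x0) / (x1 - x0)


def mag_limits(log_density):
    """Brighten each band's limit by the interpolated confusion term."""
    dmag = confusion_dmag(log_density)
    return {band: limit - dmag for band, limit in POLE_LIMITS.items()}
```

For the NAN density of 4.3 this subtracts the full 2.0 mag from each band. Some tabulated H and K limits appear to be rounded to the nearest 0.05 mag, so the sketch can differ from the table by that amount for other fields.
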
Threshold Tuning: Case Study of "low5"
For low density fields where the level-1 specs apply, the "score" thresholds are tuned to maximize the internal completeness of the database (galaxy candidates), with contamination reduced only as far as is possible without adversely affecting completeness. The lev-1 completeness spec is >90% for low density fields (~glat > 30 deg) and 80% for mid-density fields (glat between ~10 and 30 deg). We therefore choose our score thresholds such that the internal completeness is at least 95 to 99% for low density and 90% for mid density. For mid-density and high density fields (see below) the critical score thresholds are on "msh" and "r23".
For high density fields (~glat < 20 deg), the "score" thresholds are tuned such that the total number of galaxy candidates is no more than 50 or so objects per scan. The critical score threshold is on "r23".
The following plot shows the "r23" score for galaxy candidates in the "low5" field, which has a very high source density. The objects include real galaxies (filled circles), double stars (red triangles) and triple stars (blue crosses). This score is the most robust discriminator between real extended sources and compact double and triple stars. The "msh" score works well with double stars, but is insensitive to triple stars. It does, however, function as the "preliminary" score threshold in the GALWORKS processor, so it is important to set this threshold as high as possible (to reduce runtime) without affecting completeness. Note: yellow crosses are sources that cannot be accurately classified (they tend to be at the faint limit of the survey).
One can see from the plot that double/triple stars are a major headache for galaxy detection and extraction. The best we can do is minimize the total numbers of these "false" galaxy candidates while preserving most of our brighter galaxies (brighter than the mag limit after adjusting for the confusion noise).
The source counts for the galaxy candidates found in the "low5" fields are given below. The first gif image shows the counts before any "tuning" of "r23" or "msh" is attempted. The important column to look at is the last one, entitled "GTOT/scan", which is the total number of sources per scan. The total reflects a "cumulative" count, including real galaxies, "bogies" (stars, doubles, triples), artifacts (pieces of trails), "unknown" objects and "no verify" objects. We desire no more than 50 objects per scan (6 degree scan) in total, up to the mag limit. For "low5" the mag limit is around 14.1, 13.4 and 12.8 at J, H and K, respectively. The limit of 50 objects per scan comes from runtime considerations (as well as disk space considerations).
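
The cumulative per-scan count described above is simple to compute from a list of candidate magnitudes. The sketch below is illustrative only: the function names, the input layout (a flat list of magnitudes for all candidate types combined) and the strict "brighter than" comparison are assumptions, not the GALWORKS code.

```python
def cumulative_per_scan(mags, n_scans, bin_edges):
    """Return [(faint_edge, cumulative count per scan)] for each bin edge.

    mags holds the magnitudes of all candidates (galaxies, bogies,
    artifacts, unknowns) pooled over n_scans scans.
    """
    return [
        (edge, sum(1 for m in mags if m < edge) / n_scans)
        for edge in bin_edges
    ]


def within_budget(mags, n_scans, mag_limit, budget=50.0):
    """True if candidates brighter than mag_limit number <= budget per scan."""
    n_bright = sum(1 for m in mags if m < mag_limit)
    return n_bright / n_scans <= budget
```

With the "low5" J limit, for example, one would call `within_budget(j_mags, n_scans, 14.1)`, where `j_mags` is a hypothetical list of candidate J magnitudes.
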
Raising the threshold for "msh" and "r23" to values of 3 and 4, respectively, gives a satisfactory total number of objects per scan at the limit we desire:
Notice that the total number of sources per scan is about 50 at the mag thresholds. Note, however, that the reliability is still rather horrible.
"Moderate" tuning, with an r23 threshold of 4.0, cuts the total number of objects by nearly one half, with very little effect on the completeness:
Finally, setting r23 to 5.0 results in a completeness hit in the last relevant mag bin:
Note that the total counts are now down to 20 or so objects per scan brighter than the mag threshold limit.
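
The trade-off between total counts and completeness at each trial threshold can be tabulated with a short sweep. This is a hedged sketch rather than the procedure actually used: the dictionary keys ("mag", "r23", "is_galaxy") are hypothetical, the galaxy flag is assumed to come from visual classification, and a candidate is assumed to pass when its score is greater than or equal to the threshold.

```python
def sweep_r23(candidates, n_scans, mag_limit, thresholds):
    """For each r23 threshold, report candidates per scan and completeness.

    candidates: list of dicts with hypothetical keys
        "mag" (float), "r23" (float), "is_galaxy" (bool).
    Assumption: a candidate passes when its score is >= the threshold.
    Completeness is measured against the real galaxies brighter than
    mag_limit, i.e. it is an internal completeness.
    """
    bright = [c for c in candidates if c["mag"] < mag_limit]
    n_real = sum(1 for c in bright if c["is_galaxy"])
    rows = []
    for thr in thresholds:
        kept = [c for c in bright if c["r23"] >= thr]
        kept_real = sum(1 for c in kept if c["is_galaxy"])
        completeness = kept_real / n_real if n_real else float("nan")
        rows.append((thr, len(kept) / n_scans, completeness))
    return rows
```

Each returned row is (threshold, candidates per scan, completeness), so the rows for thresholds of 4.0 and 5.0 can be compared directly when deciding how aggressively to tune.
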
Since we would like to err on the side of completeness, the "moderate" score thresholds will be used. After applying the confusion noise adjustment to the mag limits and employing the "moderate" score thresholds (msh = 3.0, r23 = 4.0), the resulting galaxy database counts are:
and the 2-D score plots are the following:
As we go deeper into the plane of the Milky Way, the confusion noise rises accordingly (since the near-IR bands are relatively insensitive to extinction). The odds of finding a galaxy in any given scan are remote. We did, however, find one beautiful galaxy in the NAN field scans. The following plots show the scores for this galaxy along with all of the other junk (i.e., false galaxies) that GALWORKS extracts in one scan, as well as the galaxy candidate source counts. Note: there is one other possible galaxy (it is very difficult to classify, but it looks compelling nonetheless).
Using the "moderate" score thresholds determined for "low5" (which is at a slightly lower stellar number density) and the confusion noise adjusted mag limits of 13.5, 12.75, and 12.25 at J, H and K, respectively, we arrive at the following source counts:
... And so we arrive at a satisfactory conclusion: the total number of sources per scan is less than 50 and our one beautiful galaxy is preserved. Hurray!
The "mid10" field has an average confusion noise of about 0.9 mag. False detections still dominate the total source counts. Since there is no lev-1 spec on reliability, we want to again be as complete as possible, while limiting the total number counts to less than 50 or so per scan.
The optimum tuned thresholds for this field are
msh = 2 and r23 = 2.5.
The "mid20" field has an average confusion noise of about 0.3 mag. Real galaxies now dominate the total source counts. Since there is no lev-1 spec on reliability, we want to again be as complete as possible, while limiting the total number counts to less than 50 or so per scan.
The optimum tuned thresholds for this field are
msh = 2 and r23 = 1.5.
The "mid30" field has an average confusion noise of about 0.1 mag. The level-1 specs now apply, with a reliability limit of 99% and a completeness limit of 90%. Since the 99% reliability can only be achieved with post-processing tricks (e.g., using OBDT blackboxes or N-cube classifications and a human in the loop to eliminate extraneous artifacts) our goal for the database is to achieve excellent completeness (at least 95%), while minimizing the impact of bogus sources as much as possible.
The optimum tuned thresholds for this field are
msh = 1.5 and r23 = 1.5.
To summarize the "tuning" results for confusion noise and score thresholding ("r23" and "msh"):
field | density | dmag | J | H | K | msh | r23 |
---|---|---|---|---|---|---|---|
NAN | 4.3 | 2.0 | 13.5 | 12.75 | 12.25 | >3.0 | >4.0 |
low5 | 4.1 | 1.4 | 14.1 | 13.40 | 12.80 | 3.0 | 4.0 |
mid10 | 3.8 | 0.9 | 14.6 | 13.85 | 13.40 | 2.0 | 2.5 |
mid20 | 3.35 | 0.3 | 15.2 | 14.45 | 14.00 | 2.0 | 1.5 |
mid30 | 2.90 | 0.1 | 15.4 | 14.65 | 14.15 | 1.5 | 1.5 |
coma | <2.50 | 0.0 | 15.5 | 14.75 | 14.25 | 1.5 | 1.0 |
The confusion noise, score thresholds and densities are also conveniently plotted below:
Here the confusion noise is denoted by the white solid line,
the "r23" thresholds by the blue crosses, and the "msh" thresholds
by the green filled circles. It turns out that a convenient "fit"
to the thresholds is given by a scaling of the confusion mag.
For "r23" the formula is:
The "fits" are shown with the green (msh) and blue (r23) dashed lines.
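
The fit formulas themselves do not appear in this text. As an independent illustration of how thresholds can be scaled with the confusion mag, the sketch below fits an ordinary least-squares line to the (dmag, threshold) pairs in the summary table. The linear form, the exclusion of NAN (whose thresholds are given only as lower bounds) and the function names are all assumptions; the coefficients it produces should not be read as the fits plotted in the figure.

```python
# (dmag, msh, r23) rows copied from the summary table above.
# NAN is left out because its thresholds are given only as lower bounds.
ROWS = [
    (1.4, 3.0, 4.0),  # low5
    (0.9, 2.0, 2.5),  # mid10
    (0.3, 2.0, 1.5),  # mid20
    (0.1, 1.5, 1.5),  # mid30
    (0.0, 1.5, 1.0),  # coma
]


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


dmag = [row[0] for row in ROWS]
msh_fit = fit_line(dmag, [row[1] for row in ROWS])
r23_fit = fit_line(dmag, [row[2] for row in ROWS])


def threshold(fit, confusion_dmag):
    """Evaluate a fitted line at the given confusion noise (mag)."""
    intercept, slope = fit
    return intercept + slope * confusion_dmag
```

Calling `threshold(r23_fit, 0.9)` then gives an interpolated "r23" threshold for a field with 0.9 mag of confusion noise, which can be compared against the hand-tuned value for "mid10".
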
field | density | scans | pre-tune real | pre-tune user | pre-tune dt/scan | tuned real | tuned user | tuned dt/scan | improve | tuned #stamps | tuned #/scan | short #/scan |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ab262 | 2.70 | 15 | 720 | 448 | 30 | 564 | 482 | 32 | 0.94 | 1960 | 131 | .. |
coma | 2.50 | 5 | 283 | 170 | 34 | 183 | 156 | 31 | 1.10 | 818 | 164 | 127 |
herc | 2.80 | 5 | 382 | 246 | 49 | 258 | 235 | 47 | 1.04 | 1350 | 270 | 193 |
mid30 | 2.90 | 9 | 461 | 291 | 32 | 310 | 290 | 32 | 1.00 | 1017 | 113 | .. |
mid47 | 2.75 | 8 | 353 | 237 | 30 | 211 | 200 | 25 | 1.20 | 886 | 111 | 43 |
mid40 | 2.80 | 5 | 209 | 138 | 28 | 131 | 124 | 25 | 1.12 | 506 | 101 | 37 |
misc | 3.05 | 6 | 333 | 206 | 34 | 182 | 151 | 25 | 1.36 | 562 | 94 | .. |
mid20 | 3.35 | 4 | 326 | 200 | 50 | 190 | 181 | 45 | 1.11 | 590 | 148 | 77 |
mid10 | 3.80 | 6 | 903 | 655 | 109 | 623 | 376 | 63 | 1.73 | 257 | 43 | .. |
maffei | 3.75 | 3 | 400 | 290 | 97 | 305 | 201 | 69 | 1.41 | n/a | n/a | .. |
low5 | 4.10 | 7 | 1640 | 1351 | 193 | 884 | 611 | 87 | 2.22 | 455 | 65 | .. |
neb | 4.20 | 1 | 282 | 226 | 226 | 104 | 49 | 49 | 4.62 | n/a | n/a | .. |
nan | 4.30 | 1 | 260 | 216 | 216 | 103 | 59 | 59 | 3.66 | 30 | 30 | .. |
msx | 4.50 | 1 | 264 | 209 | 209 | 100 | 50 | 50 | 4.18 | 13 | 13 | .. |
Notes: