Star-Galaxy Discrimination in the Galactic Plane:
Memo 2: COMA + MSX

T. Jarrett, IPAC

This is the second experiment in a series of analysis runs on the abundantly star-rich MSX field. The first memo, Star-Galaxy Discrimination in the Galactic Plane: Memo 1: 9th mag Galaxy Placed within MSX, should be read before this one in order to become familiar with the technique and jargon. This memo concerns the GALWORKS processing results for MSX scans with Coma galaxies added to the fields.

I have chosen one coadd that contains a large number of Coma galaxies to add to the MSX scans. In addition to this one Coma coadd, I also added a rather bright Coma spiral (from an adjacent coadd) to MSX. In total, there are at least 20 easily identifiable galaxies, ranging in K mag from 10.2 to 13.9. The fainter galaxies will obviously be lost in the confusion noise of MSX, but there will be plenty of galaxies that will be picked up by GALWORKS. The Coma coadds are shown here:

To mimic extinction, I apply an attenuation factor to the J and H images, leaving K as is. Assuming a uniform visual extinction of 5 magnitudes, the J image is attenuated by 0.86 mag relative to the K image, and H is attenuated by 0.32 mag relative to K (e.g., A(H) - A(K) = 0.20 * A(V) / R). I do not attempt to model variable extinction (which undoubtedly exists in the MSX field) or the change in extinction as we move from the bottom of the scan (glat = 1 degree) to the top of the scan (glat = 3 degrees). These sorts of complications are second-order effects in an admittedly first-order experiment. Our goal is to measure the parametric characteristics of the galaxies and compare them with stars and multiples. In this way we can better tune the thresholds to eliminate triple stars without eliminating the galaxies.
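As a minimal sketch of the attenuation step, the quoted magnitude offsets can be converted to multiplicative flux factors with the standard magnitude-to-flux relation. The function and variable names below are illustrative and are not taken from GALWORKS. The value R = 3.1 is an assumption (the memo does not state R); it is chosen because it reproduces the quoted 0.32 mag for H when A(V) = 5.

```python
def attenuation_factor(delta_mag):
    """Flux scale factor corresponding to a dimming of delta_mag magnitudes."""
    return 10.0 ** (-0.4 * delta_mag)


A_V = 5.0   # uniform visual extinction assumed in the memo (mag)
R_V = 3.1   # ASSUMPTION: the memo does not give R; 3.1 reproduces the 0.32 mag

# Differential extinction of H relative to K, from the relation quoted above.
dmag_h = 0.20 * A_V / R_V

# Differential extinction of J relative to K, taken directly from the memo.
dmag_j = 0.86

# Factors by which the J and H images would be multiplied; K is left as is.
scale_j = attenuation_factor(dmag_j)
scale_h = attenuation_factor(dmag_h)
scale_k = 1.0

print(f"A(H) - A(K) = {dmag_h:.2f} mag")
print(f"J scale = {scale_j:.3f}, H scale = {scale_h:.3f}, K scale = {scale_k:.3f}")
```

Under these assumptions the J image keeps a bit under half of its flux and the H image roughly three quarters.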

Coma + MSX

Two big Coma galaxies (10th mag) are easily visible in the image (center left and center right). The bright spiral is barely visible (upper right, near a bright star). The remaining galaxies are not easily seen in this image. The bright galaxies should be detected (nearly) every time, except when one is located near a very bright star, in which case it can be "blanked" away along with the star. Bright star blanking is unavoidable and devastating to anything in its vicinity, as will be shown below. In memo 1, we found that the bright 9th mag galaxy repeated 84% of the time; the other 16% of the time it was lost in the bright star blanking. We would expect similar results with the two bright Coma galaxies. The remaining Coma galaxies may be lost (at times) in the confusion noise.

Bright Star Blanking

One of the first operations that GALWORKS performs is bright star blanking, which is necessary in order to accurately compute the background and to eliminate false detections associated with the numerous artifacts generated by bright stars. See GALWORKS Bright Star Cleansing for more information on this topic.

MSX has many bright stars in each coadd; thus a large fraction of each coadd is blanked away, particularly at K band. The following images show typical blankings in the MSX scan. (Note: I have applied a blanking algorithm in which the thresholds for blanking are increased according to the stellar density. In this case the thresholds have been raised about 1 mag, so fewer stars are blanked; for more information on this step, see Bright Stars in the MSX Field.)

Only one of the big Coma galaxies survives in this coadd after bright star blanking. The detailed subimage processing of this galaxy is given below:


Coma Galaxy Repeat Detections

The brightest galaxy repeated more than 80% of the time (consistent with our previous result from adding a bright M51-scan galaxy into the MSX scans; see Star-Galaxy Discrimination in the Galactic Plane: Memo 1: 9th mag Galaxy Placed within MSX). The remaining bright galaxies (including the big spiral) repeated about 70% of the time. The fainter galaxies, as expected, repeated only a small fraction of the time, while the faintest galaxies, K > 13.5, did not repeat. We conclude that the completeness for galaxy detection is very good (>50%) for galaxies brighter than K = 12 or so.

The plot below shows the repeatability. K(10) refers to the mean magnitude measured in a fixed circular aperture of radius = 10. The dotted green line marks the value computed for the 9th mag M51-scan galaxy. The maximum number of repeats is 23.


The following images show the galaxy repeat JHK postage stamp images. Note the change in 2-D shape for the repeats due to star contamination, as well as the change in integrated flux.

A repeat is simply defined as a detection made in at least one band; many repeats are, in fact, only one or two band detections.

The first column is the J band image, second column the H band image and the third column the K band image. The dark blue elliptical contour represents the 20 mag per sq. arcsec isophotal area, and the light blue contour the "flux growth" elliptical area. Sources that had been "subtracted" from the object fields are circled in red with the size of the circle given by the subtraction radius. Sources circled with a green circle/ellipse represent sources that were previously processed and subsequently blanked from the object field (blanked pixels are then substituted with corresponding isophotal values given by the object of interest, thereby recovering pixel information).

The galaxies are presented in order of K brightness as measured in a radius=10 circular aperture. Note that this does not translate perfectly to its effective brightness as far as GALWORKS is concerned (that is, the gal might be brighter in J than K, or the r=10 aperture might be too small to reflect the real brightness of the gal). The order presented here is for convenience only.

Coma Galaxy: K(10) = 10.28

The brightest galaxy in the Coma coadd repeats 20 times out of a possible 23 coadds = 87% repeatability. The following images correspond to 18 repeats of this giant elliptical galaxy.

Coma Galaxy: K(10) = 10.9

The second brightest galaxy in the Coma coadd, a giant elliptical galaxy, repeats 15 times out of a possible 23 coadds = 65% repeatability. 14 repeats are shown below.

Coma Galaxy: K(10) = 11.4

The third brightest galaxy in the Coma coadd, a big spiral galaxy, repeats 15 times out of a possible 23 coadds = 65% repeatability. Notice how the elliptical 2-D shape changes according to the star contamination: either too many stars are subtracted or, as is usually the case, too few (faint) stars are subtracted.

Coma Galaxy: K(10) = 11.9

The fourth brightest galaxy in the Coma coadd repeats 11 times out of a possible 23 coadds = 48% repeatability.

Coma Galaxy: K(10) = 12.0

The fifth brightest galaxy in the Coma coadd repeats 12 times out of a possible 23 coadds = 52% repeatability.

Coma Galaxy: K(10) = 12.0

The sixth brightest galaxy in the Coma coadd repeats 11 times out of a possible 23 coadds = 48% repeatability.

Coma Galaxy: K(10) = 12.1

The seventh brightest galaxy in the Coma coadd repeats 16 times out of a possible 23 coadds = 70% repeatability.

Coma Galaxy: K(10) = 12.2

The eighth brightest galaxy in the Coma coadd repeats 16 times out of a possible 23 coadds = 70% repeatability.

Coma Galaxy: K(10) = 12.3

The ninth brightest galaxy in the Coma coadd repeats 16 times out of a possible 23 coadds = 70% repeatability.

Coma Galaxy: K(10) = 12.4

The 10th brightest galaxy in the Coma coadd did not have any repeats.

Coma Galaxy: K(10) = 12.4

The 11th brightest galaxy in the Coma coadd repeats 15 times out of a possible 23 coadds = 65% repeatability.

Coma Galaxy: K(10) = 12.8

The 13th brightest galaxy in the Coma coadd repeats 10 times out of a possible 23 coadds = 43% repeatability.

Coma Galaxy: K(10) = 13.0

The 14th brightest galaxy in the Coma coadd repeats 4 times out of a possible 23 coadds = 17% repeatability.

Coma Galaxy: K(10) = 13.0

The 15th brightest galaxy in the Coma coadd repeats 13 times out of a possible 23 coadds = 57% repeatability.

Coma Galaxy: K(10) = 13.0

The 16th brightest galaxy in the Coma coadd repeats 1 time out of a possible 23 coadds = 4% repeatability.

Coma Galaxy: K(10) = 13.1

The 17th brightest galaxy in the Coma coadd repeats 3 times out of a possible 23 coadds = 13% repeatability.

Coma Galaxy: K(10) = 13.2

The 18th brightest galaxy in the Coma coadd repeats 3 times out of a possible 23 coadds = 13% repeatability.

Coma Galaxy: K(10) = 13.3

The 19th brightest galaxy in the Coma coadd does not have any repeats.

Coma Galaxy: K(10) = 13.4

The 19th brightest galaxy in the Coma coadd repeats 1 time out of a possible 23 coadds = 4% repeatability.

Coma Galaxy: K(10) = 13.9

The 20th brightest galaxy in the Coma coadd does not have any repeats.
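The repeat counts quoted in the galaxy list above can be tabulated into a rough completeness per magnitude bin. The following is only a bookkeeping sketch: the counts and K(10) values are copied from the list, while the bin edges and function names are illustrative choices, not part of GALWORKS.

```python
N_COADDS = 23  # maximum number of repeats per galaxy

# (K(10) magnitude, number of repeat detections), copied from the list above.
REPEATS = [
    (10.28, 20), (10.9, 15), (11.4, 15), (11.9, 11), (12.0, 12),
    (12.0, 11), (12.1, 16), (12.2, 16), (12.3, 16), (12.4, 0),
    (12.4, 15), (12.8, 10), (13.0, 4), (13.0, 13), (13.0, 1),
    (13.1, 3), (13.2, 3), (13.3, 0), (13.4, 1), (13.9, 0),
]


def repeat_fraction(n_repeats, n_coadds=N_COADDS):
    """Fraction of coadds in which a single galaxy was detected."""
    return n_repeats / n_coadds


def binned_completeness(data, k_min, k_max, n_coadds=N_COADDS):
    """Detections divided by opportunities for galaxies with k_min <= K(10) < k_max."""
    counts = [n for k, n in data if k_min <= k < k_max]
    if not counts:
        return None  # no galaxies in this bin
    return sum(counts) / (len(counts) * n_coadds)


# Illustrative bin edges (an assumption, chosen to match the limits in the text).
for k_min, k_max in [(10.0, 12.0), (12.0, 13.0), (13.0, 14.0)]:
    frac = binned_completeness(REPEATS, k_min, k_max)
    print(f"{k_min:.1f} <= K(10) < {k_max:.1f}: {100 * frac:.0f}% of possible detections")
```

Pooling detections over a bin weights every galaxy equally; a per-galaxy view is available from `repeat_fraction`.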


False Galaxy Detections

The following images show the false detection JHK postage stamp images. The false detections include double, triple, quadruple, etc, grouped stars, as well as stars contaminated by nearby bright stars and Coma galaxies.


Star - Galaxy Discrimination

A brief description of the various scoring parameters to discriminate stars from galaxies can be found in Star - Galaxy Discrimination Parameters and additional information may also be found in the GALWORKS SDS.

The following plots show the multiple detections of the Coma galaxies as a function of their integrated flux (fixed circular aperture, radius = 10) and the various discrimination parameters. The plots can be used to "tune" the scoring thresholds in order to eliminate the star systems with minimal effect upon the real galaxies.

"sh" Score

Also notice that there are few points beyond K = 13 mag due to a flux threshold set early on in GALWORKS (it is tied to the stellar density; the GALWORKS SDS explains this step).

The galaxies and the false detections have a similar distribution in this score. The false detections appear as 'extended' as the galaxies because they are triple stars and worse. See false galaxy images.

"wsh" Score

The "wedge" shape is designed to minimize contamination from double stars. The "wsh" score is a very effective parameter for eliminating nearly all sources early in the GALWORKS processing (remember that doubles dominate multiple systems in sheer numbers; see Expected Number of Multiple Star Systems as a Function of Galactic Latitude).

Again, the false detections do not readily separate from the galaxies (except for K < 11).

"r23" Score

The "r23" score is very effective against triple stars. This parameter, however, can also eliminate plenty of real galaxies (this is based on past experience and simulation results).

It can be seen that only a few false galaxy detections separate out from the real galaxy distribution. Apparently we have hit our limit with culling out non-galaxies. The remaining "triple" killers bear this out.

"vint" Score

Similarly, "vint" is an effective parameter for separating galaxies from multiple stars. However, as with "r23", the threshold on this parameter must be tuned carefully to avoid eliminating any real galaxies.

"trip" Score

The "trip" score is a weighted combination of the "wsh", "r23", "vmean" and "vint" scores, with "wsh" having the largest weight. Although the four scores are slightly correlated, the combination does provide a convenient measure of the "extendedness" of an object.
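To make the idea of a weighted combination concrete, here is a minimal sketch. The weights and the score scale are hypothetical: the memo states only that "wsh" carries the largest weight, and the actual values are in the GALWORKS SDS, not here. The function name is likewise illustrative.

```python
# HYPOTHETICAL weights; only the ordering ("wsh" largest) comes from the memo.
WEIGHTS = {"wsh": 0.4, "r23": 0.2, "vmean": 0.2, "vint": 0.2}


def trip_score(scores, weights=WEIGHTS):
    """Weighted mean of the individual scores, normalized by the total weight."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * scores[name] for name in weights)
    return weighted_sum / total_weight


# Made-up input scores on an arbitrary scale, purely for illustration.
example = {"wsh": 1.0, "r23": 0.5, "vmean": 0.0, "vint": 0.5}
print(f"trip = {trip_score(example):.2f}")
```

Normalizing by the total weight keeps the combined score on the same scale as its inputs, so a threshold on "trip" can be compared directly with thresholds on the individual scores.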


Reliability

We may conclude that the non-galaxies (false detections) that remain in our sample cannot be distinguished from real galaxies without severely impacting the completeness for galaxies fainter than K = 11 or so. However, things are not quite as bad as they may seem. We detect so many galaxies relative to the false detections that our reliability is in fact very good. See the reliability plot below (note the bottom panel). The reliability is better than 80% for K < 13, H < 14 and J < 14.

But we must keep in mind that I have created a somewhat artificial situation by placing a rich galaxy cluster (Coma) in the galactic plane. I did attenuate the J and H images relative to K, but did not attenuate K. Thus, if we are to detect galaxies in the plane with decent reliability, then the galaxies must be big and bright (K < 12 or 13). If we were to take away the Coma cluster from the MSX scans, then all that would be left is multiple star groups and multiples contaminated by brighter stars (i.e., one of innumerable permutations of star groupings). Our reliability would be zero, since we would not detect any galaxies (as expected; galaxies in the plane will be rare). Consequently, we must apply strict thresholds on the triple score parameters to minimize the number of non-galaxy detections.
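The reliability argument above comes down to a simple ratio of real detections to all candidates. As a sketch (the counts are invented for illustration and are not measurements from the MSX runs; the function name is hypothetical):

```python
def reliability(n_real, n_false):
    """Fraction of candidates that are real galaxies; None if there are no candidates."""
    total = n_real + n_false
    if total == 0:
        return None
    return n_real / total


# Invented counts: many injected galaxies relative to false detections.
with_coma = reliability(n_real=40, n_false=10)

# Same false detections, but with the injected cluster removed.
without_coma = reliability(n_real=0, n_false=10)

print(f"with Coma:    {with_coma:.2f}")
print(f"without Coma: {without_coma:.2f}")
```

The same set of false detections gives a high reliability when many real galaxies are present and a reliability of zero when none are, which is why the thresholds must be strict in a real galactic plane field.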

-- I await comment from S. Schneider and Co. before adding any further details on what should be done --


Photometric Repeatability

The photometric repeatability for the brighter galaxies (K < 12) is characterized by (1) a tight Gaussian distribution with a dispersion of about 5-10% and (2) deviant points well offset from the Gaussian distribution. The deviant points are caused by star contamination. The following histograms show the photometric repeatability for four of the galaxies: the first three correspond to the brightest galaxies (thus the most repeats) and the fourth to the galaxy with K(10) = 12.1 (the seventh brightest galaxy). The apertures highlighted here are the fixed circular aperture with radius = 10, the adaptive elliptical aperture (total mag) and the isophotal aperture.
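One way to separate the tight core from the deviant points is a robust dispersion estimate. The sketch below uses the median absolute deviation (MAD); this is a technique swapped in here for illustration, not the method used in the memo, and the magnitudes are synthetic.

```python
import statistics


def robust_dispersion(mags, clip=3.0):
    """Return (median, sigma, outliers) using a MAD-based estimate of sigma.

    The factor 1.4826 scales the MAD to the standard deviation of a Gaussian.
    """
    med = statistics.median(mags)
    mad = statistics.median([abs(m - med) for m in mags])
    sigma = 1.4826 * mad
    if sigma == 0:
        return med, 0.0, []
    outliers = [m for m in mags if abs(m - med) > clip * sigma]
    return med, sigma, outliers


# SYNTHETIC repeat magnitudes for one galaxy: a tight core plus one repeat
# made too bright, as would happen with an unsubtracted star in the aperture.
mags = [10.30, 10.25, 10.28, 10.33, 10.27, 10.31, 10.29, 9.80]

median_mag, sigma_mag, deviant = robust_dispersion(mags)
print(f"median = {median_mag:.3f}, sigma = {sigma_mag:.3f}, deviant = {deviant}")
```

Because the median and MAD are insensitive to a few contaminated repeats, the core dispersion is not inflated by the deviant points it is meant to identify.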

Fixed Circular Aperture, Radius = 10

20/21 mag/arcsec**2 Isophotal Aperture

Adaptive Elliptical Aperture Photometry

And finally, the 2-D elliptical orientation (axial ratio) parameter: the spread in the axial ratio is very large, again due to star contamination of the 3-sigma isophote.


Conclusions

The MSX scan 041 (night of 970418) covers one square degree extending from the teeth of the galactic plane (glong = 33 deg, glat = 1.1 deg) to just above the plane (glong = 38 deg, glat = 3.8 deg). The stellar density is frightening: 62000 srcs/deg^2 for K < 14 near the plane, exponentially decreasing to 35000 srcs/deg^2 for K < 14 at glat = 4 deg. As such, finding galaxies will be a major challenge, primarily due to the overwhelming presence of triples, quadruples and any number of multiple star permutations. The confusion noise due to stars completely washes out any galaxies fainter than K = 13. For bright galaxies, K < 11, the completeness is limited by the presence of bright stars, which are blanked from the coadd images early in the GALWORKS production pipeline.

In order to understand the unique problems with galaxy detection in the plane, the MSX coadds have been modified to include a number of real galaxies belonging to the Coma cluster. Operationally, I have added one Coma coadd (the central core, containing many bright galaxies, including two giant ellipticals and one large spiral) to each MSX coadd (there are 23 coadds), thus providing a means of measuring repeatability/completeness -- each galaxy can repeat up to 23 times. The sample consists of 20 easily identifiable galaxies (in Coma that is!!), ranging in K brightness between 10 and 14 mag.

The brightest galaxy in the sample repeats 87% of the time. This galaxy is lost 13% of the time due to the close proximity of bright stars. The other bright galaxies (K < 12) in the sample repeated about 70% of the time. These results are consistent with previous results found with bright galaxy detection in the MSX field (see text). Bright star blanking is the fundamental limiting factor of completeness for the brightest galaxies. We must be very careful to tune the blanking parameters to minimize extraneous blanking while avoiding the pitfalls associated with bright stars (halos, diffraction spikes, ghosts, etc.).

The reliability of the total sample of galaxy candidates (real Coma galaxies plus false detections) is very good, >80% up to our limiting mag (K = 13). The non-galaxy (false detection) objects consist of triple stars or worse and multiples contaminated by nearby bright stars (that are not blanked) and by the Coma galaxies themselves. These false detections appear "extended" in every score parameter (star-galaxy discriminant) and thus represent the limit of what GALWORKS can classify as real and extended. It is these sources that will dominate the extended source catalog (assuming that there are no Coma clusters hiding behind the Milky Way) and as such will probably have to be cleaned away by a "human-in-the-loop" if we are to find real galaxies in the plane of the Galaxy.