Star-Galaxy Classification Using Data Mining Techniques with Considerations for Unbalanced Datasets

Astronomy and Astrophysics – Astronomy

Scientific paper

We used a range of data-mining techniques in an effort to improve the classification of stars and galaxies in imaging data from the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS), with objects extracted using SExtractor. We found that the Artificial Neural Network (ANN) achieved higher accuracies than Support Vector Machines, but was outperformed by the Random Forest and Decision Tree techniques on 5000 randomly sampled objects. This has potentially negative implications for SExtractor, which uses an ANN to produce a measure of stellarity for each object. We found that the classification of stars and galaxies can be improved by voting (between Decision Trees, Random Forests and ANNs) and by using balanced datasets. For the balanced datasets that we created, the three data-mining techniques agreed over 80% of the time on the type of object.
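
The balancing and voting steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the balanced datasets were built, so random undersampling of the larger class is an assumption here, and the function names (`balance_by_undersampling`, `majority_vote`, `unanimous_fraction`) and the toy prediction lists are hypothetical. The three prediction lists stand in for the outputs of a Decision Tree, a Random Forest and an ANN.

```python
import random
from collections import Counter


def balance_by_undersampling(objects, labels, seed=0):
    """Randomly drop objects from larger classes so every class has equal size.

    Assumption: undersampling is one common way to balance a dataset;
    the abstract does not state which method was actually used.
    """
    rng = random.Random(seed)
    by_class = {}
    for obj, label in zip(objects, labels):
        by_class.setdefault(label, []).append(obj)
    smallest = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        for obj in rng.sample(members, smallest):
            balanced.append((obj, label))
    rng.shuffle(balanced)
    return [obj for obj, _ in balanced], [label for _, label in balanced]


def majority_vote(*predictions):
    """Combine per-classifier label lists by majority vote, object by object."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def unanimous_fraction(*predictions):
    """Fraction of objects on which all classifiers give the same label."""
    rows = list(zip(*predictions))
    return sum(len(set(row)) == 1 for row in rows) / len(rows)


# Toy, made-up predictions for five objects (not data from the paper).
dt_pred = ["star", "galaxy", "galaxy", "star", "galaxy"]
rf_pred = ["star", "galaxy", "star", "star", "galaxy"]
ann_pred = ["galaxy", "galaxy", "galaxy", "star", "galaxy"]

voted = majority_vote(dt_pred, rf_pred, ann_pred)
agreement = unanimous_fraction(dt_pred, rf_pred, ann_pred)

# Toy unbalanced sample: eight galaxies and two stars.
ids, classes = balance_by_undersampling(
    list(range(10)), ["galaxy"] * 8 + ["star"] * 2
)
```

With three voters and two classes a majority always exists, so no tie-breaking rule is needed. In this toy case `unanimous_fraction` plays the role of the agreement rate quoted in the abstract, although the paper may define agreement differently.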
