Statistics – Applications
Scientific paper
Jan 2010
adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=2010aas...21543814b&link_type=abstract
American Astronomical Society, AAS Meeting #215, #438.14; Bulletin of the American Astronomical Society, Vol. 42, p.394
We present the details of an image processing pipeline and a new Python library providing a convenient interface to Planetary Data System (PDS) data products. The library aims to be a useful tool for general-purpose PDS processing. Test images have so far been extracted from existing PDS data products using the library, and it will also work with lunar images from LRO/LROC. To process high-volume data sets we employ Hadoop, an open-source framework implementing the Map/Reduce paradigm for writing data-intensive distributed applications. By harnessing a cluster of processing nodes we are able to extract raw images from data products and convert them to web-friendly formats at a rate of gigabytes per minute. The images are converted using the Python Imaging Library. Additionally, the images are cropped into postage-stamp images supporting various zoom levels. The final images, along with some metadata, are uploaded to Amazon's S3 data storage system, from which they are served. Preliminary tests of the pipeline are promising: 10,000 sample files totaling 30 GB were processed in 15 minutes, or about 2 GB per minute. The resultant JPEGs totaled only 3 GB after compression. The code base has not only proven successful in its own right, but also shows Python, an interpreted language, to be a viable alternative to more mainstream compiled languages such as C/C++ or Fortran, especially when combined with Hadoop. This work was funded through NASA ROSES NNX09AD34G.
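The cropping step and the reported throughput can be illustrated with a short sketch. The abstract does not give the tile size, the zoom scheme, or any function names, so everything below is an assumption made for illustration: a hypothetical 256-pixel tile, zoom levels that halve the image size at each step, and the helper names `tile_boxes`, `zoom_sizes`, and `throughput_gb_per_min`. The boxes use the (left, upper, right, lower) order that the Python Imaging Library's `Image.crop` expects, but the sketch itself uses only the standard library and is not the authors' code.

```python
from typing import Iterator, List, Tuple

# (left, upper, right, lower): the box order used by PIL's Image.crop.
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int = 256) -> Iterator[Box]:
    """Yield crop boxes covering a width x height image in row-major order.

    The 256-pixel default is an assumed value, not one from the abstract.
    Tiles on the right and bottom edges are clipped to the image bounds.
    """
    for upper in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, upper, min(left + tile, width), min(upper + tile, height))


def zoom_sizes(width: int, height: int, levels: int) -> List[Tuple[int, int]]:
    """Return the image size at each zoom level, halving per level.

    Level 0 is full resolution. Halving is an assumed scheme.
    """
    return [(max(1, width >> z), max(1, height >> z)) for z in range(levels)]


def throughput_gb_per_min(total_gb: float, minutes: float) -> float:
    """Average processing rate in gigabytes per minute."""
    return total_gb / minutes


if __name__ == "__main__":
    # Figures reported in the abstract: 30 GB in 15 minutes, 3 GB of output.
    print(throughput_gb_per_min(30, 15))   # input rate in GB per minute
    print(30 / 3)                          # input-to-output size ratio
    for size in zoom_sizes(1024, 512, 3):
        print(size, len(list(tile_boxes(*size))))
```

In a real pipeline each box would be passed to an image library's crop call inside a map task, one task per data product; that wiring is omitted here because the abstract does not describe it.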
Armbrust Michael
Balfanz Ryan
Gay Pamela L.
Smith Aaron
Creating Data Pipelines for PDS Datasets