Creating Data Pipelines for PDS Datasets

Statistics – Applications

Scientific paper

Details

We present the details of an image processing pipeline and a new Python library that provides a convenient interface to Planetary Data System (PDS) data products. The library aims to be a useful tool for general-purpose PDS processing. Test images were extracted from existing PDS data products using the library, which will also work with lunar images from LRO/LROC. To process high-volume data sets we employ Hadoop, an open-source framework implementing the Map/Reduce paradigm for writing data-intensive distributed applications. By harnessing a cluster of processing nodes, we are able to extract raw images from data products and convert them to web-friendly formats at a rate of gigabytes per minute. The images are converted using the Python Imaging Library and are additionally cropped to postage-stamp images supporting various zoom levels. The final images, along with some metadata, are uploaded to Amazon's S3 data storage system, from which they are served. Preliminary tests of the pipeline are promising: 10,000 sample files totaling 30 GB were processed in 15 minutes, or about 2 GB per minute. The resultant JPEGs totaled only 3 GB after compression. The code base has not only proven successful in its own right, but also shows Python, an interpreted language, to be a viable alternative to more mainstream compiled languages such as C/C++ or Fortran, especially when combined with Hadoop. This work was funded through NASA ROSES NNX09AD34G.
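The abstract does not say how the postage-stamp images or zoom levels are laid out, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes square tiles of 256 pixels and zoom levels that halve the image dimensions at each step; the names `tile_boxes`, `zoom_sizes`, and `plan` are hypothetical and are not part of the library described above.

```python
from typing import Dict, Iterator, List, Tuple

# (left, upper, right, lower): the box layout that Pillow's Image.crop accepts.
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int = 256) -> Iterator[Box]:
    """Yield crop boxes covering a width x height image, row by row.

    Edge tiles are clipped to the image bounds, so they may be smaller
    than `tile` pixels on a side. The 256-pixel default is an assumption.
    """
    for upper in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, upper, min(left + tile, width), min(upper + tile, height))


def zoom_sizes(width: int, height: int, levels: int) -> List[Tuple[int, int]]:
    """Return the (width, height) of the image at each zoom level.

    Level 0 is full resolution; each further level halves both
    dimensions (assumed scheme), never dropping below 1 pixel.
    """
    return [(max(1, width >> z), max(1, height >> z)) for z in range(levels)]


def plan(width: int, height: int, levels: int = 3, tile: int = 256) -> Dict[int, List[Box]]:
    """Map each zoom level to the list of postage-stamp boxes at that level."""
    sizes = zoom_sizes(width, height, levels)
    return {z: list(tile_boxes(w, h, tile)) for z, (w, h) in enumerate(sizes)}


if __name__ == "__main__":
    # Print how many postage stamps each zoom level needs for a 1000 x 600 image.
    for level, boxes in plan(1000, 600).items():
        print(level, len(boxes))
```

In a pipeline like the one described, each box could be handed to an imaging library's crop call after the image has been resized to that level's dimensions; a planning step of this kind is pure arithmetic and is cheap to run inside a map task before any pixels are touched.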
