Reconstructing Websites for the Lazy Webmaster

Computer Science – Information Retrieval

Scientific paper

Details

13 pages, 11 figures, 4 tables

Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, "lazy" webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine caches. We introduce the concept of "lazy preservation": digital preservation performed as a result of the normal operations of the Web infrastructure (search engines and caches). We present Warrick, a tool to automate the process of website reconstruction from the Internet Archive, Google, MSN, and Yahoo. Using Warrick, we have reconstructed 24 websites of varying sizes and composition to demonstrate the feasibility and limitations of website reconstruction from the public Web infrastructure. To measure Warrick's window of opportunity, we have profiled the time required for new Web resources to enter and leave search engine caches.
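The core lookup step that Warrick automates can be illustrated with a short sketch. The Python snippet below is an illustration only, not code from the paper: it uses the Internet Archive's present-day Wayback Machine availability endpoint (which postdates the paper), and the function name `closest_snapshot` and the example URL are inventions of this summary. It asks whether the Archive holds a snapshot of a given page; a full reconstruction would repeat this for every URL of the lost site and fall back to search engine caches for pages the Archive missed.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

WAYBACK_API = "https://archive.org/wayback/available"

def closest_snapshot(url: str, timestamp: Optional[str] = None) -> Optional[dict]:
    """Return the Wayback Machine snapshot of `url` closest to `timestamp`
    (YYYYMMDDhhmmss), or the most recent one if no timestamp is given."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(f"{WAYBACK_API}?{query}") as resp:
        data = json.load(resp)
    snap = data.get("archived_snapshots", {}).get("closest")
    # The API reports availability explicitly; treat anything else as a miss.
    return snap if snap and snap.get("available") else None

if __name__ == "__main__":
    # Hypothetical lost page; substitute a URL from the site being rebuilt.
    snap = closest_snapshot("http://example.com/", timestamp="20060101")
    if snap:
        print(f"Recoverable copy: {snap['url']} (captured {snap['timestamp']})")
    else:
        print("Not in the Archive; try the search engine caches next.")
```

A real reconstructor would also parse each recovered page for links to other resources on the lost site and queue those URLs for the same lookup, which is broadly the crawl the paper describes.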
