Computer Science – Information Retrieval
Scientific paper
2009-07-20
6 pages, 8 figures, 1 table, 1 appendix
Inaccessible web pages are part of the browsing experience. The content of these pages, however, is often not completely lost but rather missing. Lexical signatures (LSs) generated from a web page's textual content have been shown to be suitable as search engine queries when trying to discover a (missing) web page. Since LSs are expensive to generate, we investigate the potential of web pages' titles, which are available at a lower cost. We present the results of studying how titles change over time. We take titles from copies of randomly sampled web pages provided by the Internet Archive and show the frequency of change as well as the degree of change in terms of the Levenshtein score. We found very low frequencies of change and high Levenshtein scores, indicating that titles, on average, change little from their original, first observed values (rooted comparison) and even less from the values of their previous observation (sliding comparison).
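The two comparison modes named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so the normalization below (1 minus the edit distance divided by the longer title's length, so that a high score means little change) is an assumption, as are all function names and the definition of change frequency as the share of consecutive observations whose titles differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 means identical, 0.0 means fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def rooted_scores(titles: list[str]) -> list[float]:
    """Compare every later observation with the first observed title."""
    root = titles[0]
    return [similarity(root, title) for title in titles[1:]]


def sliding_scores(titles: list[str]) -> list[float]:
    """Compare every observation with the one directly before it."""
    return [similarity(prev, cur) for prev, cur in zip(titles, titles[1:])]


def change_frequency(titles: list[str]) -> float:
    """Assumed definition: share of consecutive observations that differ."""
    if len(titles) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(titles, titles[1:]) if prev != cur)
    return changes / (len(titles) - 1)
```

Given a chronologically ordered list of titles for one page (for example, taken from archived copies), `rooted_scores` measures drift from the first observation, while `sliding_scores` measures step-to-step change. A title that changes gradually can therefore score high in the sliding comparison and lower in the rooted one.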
Klein Martin
Nelson Michael L.
Investigating the Change of Web Pages' Titles Over Time