Web Archive Analyses Tool
Oral Defence Date:
Professors Dragutin Petkovic & Ilmi Yoon
The World Wide Web (WWW) is continuously evolving to better serve the needs of internet users. Internet users may now find themselves inundated with relevant digital information at the click of a mouse. However, information and WWW pages available online on a remote server today may be gone tomorrow. To efficiently find, use, and retrieve the vast amounts of valuable information they have collected from the internet, users need effective web archival tools. We conducted a small market survey and a comprehensive study of the web archival services available today. We examined the tools and services that web users want in web archival systems and found that current web archival products provide the basic services and fare well in most regards, except in user functionality and user interface design. Most of them, for example, lack cataloging and search functionality and store the archived data mainly on the user’s machine, which limits its accessibility. To address these needs, we have designed, developed, and tested a catalog-like application for archiving WWW content. This project offers several novelties. It lets users store and archive their web pages online on a remote server so that they can access them later from any computer connected to the internet. It lets users store their web pages with their own names and descriptions. It also features a unique option: it can save subsequent versions of a given web page over a specific period of time according to a user-determined schedule. It also notifies the user of any content differences between the periodically archived versions of a web page. Users can search their archive and change their settings. The project is built on a modern WWW architecture using open source technologies and offers an easy-to-use, browser-based user interface for all of these services.
World Wide Web archiving, online cataloging tools for WWW, online-offline archives.