Chapter 5: Plug-in 23 - Check Links
The two previous plug-ins provide the foundation for being able to crawl the Internet by:
- Reading in a third party web page
- Extracting all URLs from the page
- Converting all the URLs to absolute
Armed with these abilities it's now a simple matter for this plug-in to offer the facility to check all links on a web page and test whether the pages they refer to actually load or not; a great way to alleviate the frustration of your users upon encountering dead links or mistyped URLs.
The Figure shows links on a website being checked.
The plug-in has been run on the alexa.com homepage, with all URLs reported present and correct
|