Checking Links In Your Online Catalog
This guide should help you locate software that can be used to check
links in your catalog if your online catalog does not directly support link
checking. At the time of this writing (5/1999) no online systems could be
identified that directly support the validation of URLs in the MARC 856
field. Hopefully this situation will soon change.
It goes without saying that the World Wide Web is a very changeable
medium, and as more and more libraries begin to catalog networked
resources, it quickly becomes very important to be able to determine
which links in the catalog need updating. Furthermore, many libraries that
participate in the Federal Depository Library Program subscribe to
services such as Marcive, which have been automatically populating library
catalogs with MARC records that contain 856 fields.
This situation has left many libraries with the task of validating
thousands of URLs in their catalogs. Fortunately, each link does not
need to be checked manually.
The task of validating the links in an online catalog is not as simple
as one might initially think. Problems arise because the URLs that you
want to check are buried in the MARC 856 field in your catalog, while the
various software packages that automatically check URLs were designed to
work on HTML documents.
So, in most cases the process can be separated into two steps:
- Create a file (a list of URLs or an HTML page) from your
MARC records
- Select and run a link checking program across the file
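The first step can be sketched in Python for a catalog that can export
unblocked MARC records. This is only a minimal sketch under stated
assumptions: it assumes the export is in the standard MARC transmission
format (ISO 2709) and that each URL sits in subfield $u of an 856 field.
The function names are made up for illustration and do not come from any
of the tools listed in this guide.

```python
import html

FIELD_END = b"\x1e"    # field terminator
SUBFIELD = b"\x1f"     # subfield delimiter
RECORD_END = b"\x1d"   # record terminator


def urls_from_record(record: bytes) -> list[str]:
    """Return every 856 $u value found in one MARC record."""
    base = int(record[12:17])            # leader 12-16: base address of data
    directory = record[24:base - 1]      # directory follows the 24-byte leader
    urls = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]      # tag (3), field length (4), start (5)
        if entry[:3] != b"856":
            continue
        length, start = int(entry[3:7]), int(entry[7:12])
        field = record[base + start:base + start + length]
        # The first chunk holds the two indicators, so skip it.
        for sub in field.split(SUBFIELD)[1:]:
            if sub[:1] == b"u":
                value = sub[1:].rstrip(FIELD_END)
                urls.append(value.decode("utf-8", "replace"))
    return urls


def urls_from_marc(data: bytes) -> list[str]:
    """Collect 856 $u values from a file of concatenated MARC records."""
    urls = []
    for record in data.split(RECORD_END):
        if len(record) > 24:             # ignore the empty tail after the last record
            urls.extend(urls_from_record(record))
    return urls


def html_page(urls) -> str:
    """Build a bare HTML page with one link per URL."""
    items = "\n".join(
        f'<li><a href="{html.escape(u)}">{html.escape(u)}</a></li>' for u in urls
    )
    return f"<html><body>\n<ul>\n{items}\n</ul>\n</body></html>\n"
```

Reading an exported file in binary mode, passing its contents to
urls_from_marc, and saving the result of html_page gives a page that a
link checking program can then be pointed at.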
Software for creating an HTML page from MARC records
- [http://www.du.edu/~ttyler/freeware/marcxgen.zip] Marcxgen (All
Catalogs) : Marcxgen was created by Tom Tyler (University of
Denver), and is the only general purpose utility for generating HTML from
MARC records (that this author knows of). Marcxgen will take a group of
unblocked MARC records and create an HTML page out of them. Most online
systems will let you output MARC records, and very few systems still use
blocked MARC records, since the blocked format was designed for use with
older magnetic tape systems. Marcxgen will run in most Windows
environments, and you can download the zipped version by clicking on the
link above. It is available for free, and [mailto:ttyler@du.edu] Tom
Tyler (who is very helpful) can be reached at ttyler@du.edu.
<http://www.du.edu/~ttyler/freeware/marcxgen.zip>
- [http://www.nemonline.org/sirsi/url_checking_options.html] Urlchecker
(Sirsi) : This Perl script was written by Dennis Boone
(Michigan State University). It uses the selcatalog command to
extract link data from a Sirsi catalog and writes the results to a
pipe-delimited file.
<http://www.nemonline.org/sirsi/url_checking_options.html>
- Mungeurl (Innovative) : Mike Corlee (University of Missouri)
wrote this Perl script, which will transform a review list of 856 fields
into HTML.
As a Perl script, Mungeurl will run in a Unix environment, or on Windows
and Mac machines if you have installed Perl (Perl for a variety of
platforms can be downloaded for free from [http://www.perl.com/CPAN-local/ports/index.html] CPAN:
Perl Ports). Mungeurl isn't currently available for download on the
WWW, but [mailto:mulmcorl@showme.missouri.edu] Mike Corlee will send
you a copy if you contact him at mulmcorle@showme.missouri.edu.
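When an extraction tool writes a pipe-delimited file, as Urlchecker is
described as doing above, the URLs still have to be pulled out before a
link checker can use them. The Python sketch below is an illustration
only. The column layout of such a file is not described in this guide, so
rather than assume a particular column, the sketch keeps any field that
begins with a common URL scheme. The function name and the list of
schemes are assumptions.

```python
# URL schemes to look for; this list is an assumption, adjust as needed.
SCHEMES = ("http://", "https://", "ftp://", "gopher://")


def urls_from_pipe_file(lines):
    """Return every field that looks like a URL from pipe-delimited lines."""
    found = []
    for line in lines:
        for field in line.rstrip("\n").split("|"):
            field = field.strip()
            if field.lower().startswith(SCHEMES):
                found.append(field)
    return found
```

Passing an open text file to urls_from_pipe_file works because a file
object yields its lines one at a time.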
Commercial software for checking HTML links
The benefit of going with commercial products is that they tend to be
easier to set up and install, and some of them generate very useful and
attractive reports that can help in the correction of broken URLs.
- [http://www.tetranetsoftware.com] Linkbot (Win95/WinNT) :
An excellent (but costly) linkchecker that is easy to install and run,
and that generates very detailed reports on broken links in HTML page(s).
<http://www.tetranetsoftware.com>
- [http://www.visiontec.com/rename/] HTML Rename
(Win95/WinNT/Mac) : this program was initially developed to aid in
the transfer of web sites between DOS/Windows, Macintosh, and Unix
environments. Part of this package is a link checking utility. It is
reasonably priced, and a shareware evaluation version is available.
<http://www.visiontec.com/rename/>
- [http://www.linkalarm.com] Link Alarm (Web Based) :
This web-based service will email you a detailed report on the broken
links in a specified page at the cost of $20/year. Before subscribing you
might want to check out some of the freely available web-based services
below.
<http://www.linkalarm.com>
- [http://transcend.labs.bt.com/spot] Spot (Sun Solaris) :
this full featured web site analysis tool will only run in the Sun Solaris
environment, although it can analyze websites on any kind of server.
Evaluation version available. <http://transcend.labs.bt.com/spot>
Free software for checking HTML links
The advantage of free software is of course that it is free! Another
benefit is that the programs are open, and can be adapted to your local
needs (if you are so inclined). The downside is that they can take a bit
more time to get up and running (except for the web-based ones!).
- [http://www.netmechanic.com] NetMechanic (Web-based) :
you can submit a URL to NetMechanic, which will check your site
and email you a report (for free!). <http://www.netmechanic.com>
- [http://www.weblint.org/gateways.html] WebLint Gateways
(Web-based) : this page lists several sites where you can submit a
URL and receive a report on the broken links contained at that URL.
These services use the WebLint software package listed below, and allow
you to use this software without having to install it locally.
<http://www.weblint.org/gateways.html>
- [http://www.goldwarp.com/bowlin/linklint/] Linklint
(Perl) : this shareware program (regular users are encouraged to pay $10)
is a highly configurable and powerful linkchecking program that will run
on any platform that supports Perl. It has proven to be popular with
libraries at the University of Pennsylvania and the University of
Virginia. <http://www.goldwarp.com/bowlin/linklint/>
- [http://www.ics.uci.edu/pub/websoft/MOMspider] MOMspider
Multi-Owner Maintenance Spider (Perl/Unix) : MOMspider is a "web roaming
robot that specializes in the maintenance of distributed hypertext
infostructures" (that is, HTML pages).
<http://www.ics.uci.edu/pub/websoft/MOMspider>
- [http://members.tripod.com/~rtiess/linkchecker.html] Link
Checker V1.6B (Java) : this Java application requires the
installation of the Java Development Kit, which is freely available from
[http://www.javasoft.com] Sun.
<http://members.tripod.com/~rtiess/linkchecker.html>
- [http://www.weblint.org] Weblint (Perl) : Weblint
is a multipurpose utility that will check the syntax and style of HTML
page(s). One function it performs very well is linkchecking.
<http://www.weblint.org>
- [http://www.cs.dartmouth.edu/~crow/lvrfy.html] lvrfy
(Unix) : this is a simple Unix shell script which will check the
links in an HTML page and generate a list of broken links.
<http://www.cs.dartmouth.edu/~crow/lvrfy.html>
- [http://zoutmijn.bpa.nl/rick/Web/index2.html] Webxref
(Perl) : will check links in specified page(s), and doesn't require
extensive configuration before running.
<http://zoutmijn.bpa.nl/rick/Web/index2.html>
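To show roughly what the link checkers listed above do with an HTML page,
here is a minimal Python sketch. It is not the method used by any of
those programs. It collects the href of every anchor tag, asks for each
URL with an HTTP HEAD request, and reports the ones that cannot be
reached or that return a status of 400 or higher. The function names are
made up for illustration, and the sketch assumes servers answer HEAD
requests properly, which some do not.

```python
from html.parser import HTMLParser
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Gather the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(page: str) -> list[str]:
    collector = LinkCollector()
    collector.feed(page)
    return collector.links


def http_status(url: str, timeout: float = 10.0):
    """Return the HTTP status code, or None if the URL cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code                  # the server answered with an error code
    except (urllib.error.URLError, OSError, ValueError):
        return None                      # no answer, or not a usable URL


def broken_links(page: str, status=http_status):
    """Return (url, status) pairs for links that look broken."""
    report = []
    for url in extract_links(page):
        code = status(url)
        if code is None or code >= 400:
            report.append((url, code))
    return report
```

The status argument lets the checking function be swapped out, for
example to try the report logic on a saved page without touching the
network.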
Created: May 4, 1999
Ed Summers ([mailto:esummers@odu.edu] esummers@odu.edu)
VIVA User Services Committee.
Subcommittee on Cataloging and Intellectual Access
for Spring 1999 Workshop