OK, so at work I've been perplexed by the myriad ways we link to things. The link checkers we have are not very good and only follow textual links. The one that does follow JavaScript links is flaky and hangs my honking W2K analysis machine when working locally. I won't mention any names [ahem ... Coast], but otherwise it does a very nice job of sending me Perl-parseable results in email.
So, I finally figured out what to do: write a Perl script that walks an entire directory tree and builds a link to each HTML, CFM, or JSP page, what-have-you. Then point the link checker at that site-map page and you're guaranteed not to miss any pages that are orphaned. Whew, that one was giving me heartburn for quite a while.
#!/usr/local/bin/perl
# 1/12/2003 - Greg Rushton
# greg {at} gregrushton(.)com
#
# Designed to map an entire directory of pages
# so that a link checker can find anything that's
# orphaned/linked to unusually and check the links on it.
use CGI ':standard';
use CGI::Carp 'fatalsToBrowser';
use File::Find;
my @dirs = (".");
my @pages = ();
my @rows = ();
find (\&get_html_files, @dirs);
foreach (@pages) {
    # Swap only the leading "." of each path for the site's base URL.
    (my $relurl = $_) =~ s/^\./http:\/\/your.domain.com/;
    push @rows, td("<a href=\"$relurl\" target=\"new\">$relurl</a>");
}
print header,
    start_html("Site Listing"),
    h3("Site Listing"),
    p("Currently looking at @dirs"),
    hr,
    table( {-border=>1}, Tr( \@rows ) ),
    end_html;
### Subroutines ###
sub get_html_files {
    # Called by File::Find for every entry; keep names ending in .htm, .html or .cfm.
    push @pages, $File::Find::name if m/\.(?:html?|cfm)$/i;
}
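
If you'd rather not run this as a CGI (CGI.pm was dropped from the core Perl distribution as of 5.22), the same idea can be sketched as a plain command-line script that prints a static site-map page. This is only a rough sketch, not a drop-in replacement: the helper names path_to_url, find_pages and site_map_html are made up for illustration, the base URL is the same placeholder as above, and adding jsp to the extension list is my assumption based on the pages mentioned earlier.

```perl
#!/usr/bin/perl
# Sketch only: a static, non-CGI variant of the site-map script above.
# The base URL and the extension list are assumptions; adjust for your site.
use strict;
use warnings;
use File::Find;

my $base_url = 'http://your.domain.com';   # placeholder, as in the script above

# Turn a File::Find path such as "./docs/a.html" into an absolute URL.
sub path_to_url {
    my ($path, $base) = @_;
    (my $url = $path) =~ s{^\.}{$base};    # replace only the leading "."
    return $url;
}

# Collect every entry under $dir whose name ends in a known page extension.
sub find_pages {
    my ($dir) = @_;
    my @found;
    find(sub { push @found, $File::Find::name if /\.(?:html?|cfm|jsp)$/i }, $dir);
    return sort @found;
}

# Build the whole page as a string, one link per list item.
# Note: URLs are not HTML-escaped here; fine for a sketch, not for odd filenames.
sub site_map_html {
    my ($base, @paths) = @_;
    my $html = "<html><head><title>Site Listing</title></head><body>\n<ul>\n";
    for my $path (@paths) {
        my $url = path_to_url($path, $base);
        $html .= qq{<li><a href="$url">$url</a></li>\n};
    }
    return $html . "</ul>\n</body></html>\n";
}

# When run directly, print the site map for the current directory.
print site_map_html($base_url, find_pages('.')) unless caller;
```

Run it from the site's document root, redirect the output to something like sitemap.html, and point the link checker at that file.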