[SlugBug] Testing server links

Alex Hudson home at alexhudson.com
Fri Jul 18 02:57:45 BST 2003


On Fri, Jul 18, 2003 at 01:46:55AM +0100, Beneath wrote:
> > What the OP asked for was something that would test possible links
> > given a list of files in a filesystem.
> 
> I thought the simple problem was just checking whether all the <a
> href's (and equivalent JavaScript code and anything else, etc.) led to
> files that actually existed.

Nah. He said, "all the weblinks that it served". Something doesn't actually
have to be on an HTML page for it to be a weblink. After all, you don't know
who is linking to what.

> It seems a kind of overcomplicated way of doing things if you're going
> to try to parse every single webserver-related file... and it becomes
> EXTREMELY complicated if you have Perl/PHP/etc generated content. The
> logical answer to me is just let the webserver do all that work, and
> yourself write a simple script that accesses the webserver, parses the
> HTML, follows links, and checks HTTP response codes.

Sure, but how do you know what URLs map to which files? The short answer is
that you don't, although you can make decent guesses if you know the various
rules on one site that pertain to how you get something. 
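
To illustrate the kind of guess I mean, here is a minimal Python sketch
that assumes the simplest possible rule: static files served from a single
document root, with a directory index. The docroot path, the index file
name and the function name are all hypothetical; a real site with aliases,
rewrites or generated content would need its own rules.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def candidate_files(url, docroot="/var/www/html", index_names=("index.html",)):
    """Guess which files a URL might map to under a plain static-site rule.

    docroot and index_names are assumptions about the server's config,
    not something that can be read off the URL itself.
    """
    path = unquote(urlsplit(url).path) or "/"
    # Drop empty, "." and ".." segments so a guess never leaves the docroot.
    parts = [p for p in path.split("/") if p not in ("", ".", "..")]
    base = PurePosixPath(docroot).joinpath(*parts)
    index_guesses = [str(base / name) for name in index_names]
    if path.endswith("/"):
        # A trailing slash suggests a directory, so only the index applies.
        return index_guesses
    # Otherwise it may be a file, or a directory requested without the slash.
    return [str(base)] + index_guesses


print(candidate_files("http://example.org/docs/"))
print(candidate_files("http://example.org/a/b.html"))
```

Each result is only a list of candidates; nothing here proves that the
server would actually serve any of those files for that URL.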

> If he's not trying to do what I just said above (he did /say/
> 'weblinks') then I'm probably stupid and misunderstood the original post
> :)

All I'm saying is that checking <a href="">s isn't necessarily enough - to
check all the weblinks a server is serving requires a lot, lot more magic.
Sadly, that is often the interesting question too.
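
For comparison, the simple approach (access the webserver, parse the HTML,
follow links, check HTTP response codes) might look roughly like the
Python sketch below. It uses only the standard library; the function names
and the example.org URL are made up for illustration, and it only looks at
<a href> links, not JavaScript or anything else.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urldefrag, urljoin, urlsplit
from urllib.request import urlopen


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def extract_links(base_url, html):
    """Return absolute http(s) URLs, minus fragments, for the <a href> links."""
    collector = HrefCollector()
    collector.feed(html)
    urls = (urldefrag(urljoin(base_url, h)).url for h in collector.hrefs)
    return [u for u in urls if urlsplit(u).scheme in ("http", "https")]


def fetch(url):
    """Return (status, body); status is None if no HTTP response arrived."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        return err.code, ""
    except URLError:
        return None, ""


def check_site(start_url, fetcher=fetch):
    """Check every reachable link once; follow links only on the start host."""
    host = urlsplit(start_url).netloc
    statuses = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in statuses:
            continue
        status, body = fetcher(url)
        statuses[url] = status
        # Off-site pages are checked but not crawled any further.
        if status == 200 and urlsplit(url).netloc == host:
            queue.extend(extract_links(url, body))
    return statuses


if __name__ == "__main__":
    for url, status in sorted(check_site("http://example.org/").items()):
        print(status, url)
```

Note that this only ever reports on URLs that some crawled page links to.
Anything the server would serve but that nothing on the site links to never
shows up, which is exactly the gap described above.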

Cheers,

Alex.


