WWW Issues


Well I'm certainly sorry that there is only one article in our Web Issues area, but that's one better than none, eh? One day Krazy Dan answered the following question for the 10th time and realized the answer needed to be posted here for the benefit of all. The question is:

"I've got a problem setting up my SY:HTTPD.CON file. My TSX-Online users use the "tilde" construct to identify their web page. When people remember to put a trailing backslash at the end of the URL, you can see the graphics fine, but when people do NOT put a trailing backslash at the end of the URL, the graphics don't show up. What do I need to do in the HTTPD.CON file to make this work?"

Now, John Mott says that any question about HTTPD.CON can be answered using one of the following 5 techniques:

  • Look in the HTTPRULE event log.
  • Look in the HTTPRULE event log.
  • Look in the HTTPRULE event log.
  • Look in the HTTPRULE event log.
  • Look in the HTTPRULE event log.

    People think John's just being obstructive, but he's right! Let's take a real live example -- with our thanks to Harry Moyles -- and work through it. Harry reported that when hitting the web site:

        http://evansbbs.com/~sunshine/
    

    everything works hunky-dory -- the graphics show, but when you hit the web site:

        http://evansbbs.com/~sunshine
    

    everything is NOT so hunky-dory, namely, the graphics do NOT show. So what's happening? First, let's have a look in the HTTPRULE event log for the SUCCESSFUL case, with the trailing slash, which permits the graphics to show up. I'm only showing you the first request, for the home page, and the first request for a graphics file. Sunshine has several graphics files named in her home page but the principle is the same for all of them:

        Searching for "/~sunshine/"
        Match: applied template "/__c__/web/0/0/19/sunshine.htm"
        Match: result is "/__c__/web/0/0/19/sunshine.htm"
        Passing..... "/__c__/web/0/0/19/sunshine.htm"
    
        Searching for "/~sunshine/sunigm.gif"
        Match: applied template "/__c__/web/0/0/19/*.gif"
        Match: result is "/__c__/web/0/0/19/sunigm.gif"
        Passing..... "/__c__/web/0/0/19/sunigm.gif"
    

    What we ascertain from the first request is that Sunshine's personal web area resides in directory c:\web\0\0\19\, and that her registered home page is sunshine.htm. Now, why the heck would the next request that comes in be for "/~sunshine/sunigm.gif"? Is this what Sunshine put in her home page HTML code? No. In fact, her home page has this line in it:

        <IMG SRC="sunigm.gif">
    

    Now here is a RELEVANT question: why would a web browser (I used Microsoft Internet Explorer for this test) take a name of the form "sunigm.gif" and turn it into a name of the form "/~sunshine/sunigm.gif"? The answer is that the web browser decided that "sunigm.gif" must be a partial (relative) URL, and that the REST of the URL should come from the URL of the HTML document containing it. In other words, Internet Explorer constructed the complete URL for "sunigm.gif" by combining the missing part of it (i.e. the directory name) with the URL of its parent document, "/~sunshine/".

    The programmer in me wants to make you think for a minute about how you would go about extracting the directory portion of a filename. Remember, the world wide web consists of many different file systems with many different syntaxes, but they all tend to use forward slashes to indicate components of the directory name. Consider the file named:

        /cappannari/finances/1997/tax-return.irs
    

    We all agree that the directory portion is:

        /cappannari/finances/1997/
    

    We "visually" scanned backwards, from the right side, until we came to the first slash, and chopped off the filename portion to get to the directory portion, right? So Internet Explorer did the correct, rational, and wise thing when it looked at:

        /~sunshine/
    

    and decided that the directory name is:

        /~sunshine/
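
The "scan backwards to the last slash" rule can be sketched in a few lines of Python. This is only an illustration of the idea, not the code any particular browser uses; the function name `directory_of` is made up for this example, and the paths are written with forward slashes, as they appear in URLs.

```python
def directory_of(url_path: str) -> str:
    """Return everything up to and including the last '/' in url_path."""
    last_slash = url_path.rfind("/")
    # rfind returns -1 when there is no slash, so the slice is then empty.
    return url_path[: last_slash + 1]


print(directory_of("/cappannari/finances/1997/tax-return.irs"))
print(directory_of("/~sunshine/"))
print(directory_of("/~sunshine/sunshine.html"))
```

The first call prints /cappannari/finances/1997/ and the other two both print /~sunshine/, because only the text after the last slash is treated as a filename.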
    

    There was no filename in this url to chop off, but that's ok. The filename was supplied by TSX when I hit the page. But even if I had connected to:

        /~sunshine/sunshine.html
    

    the browser would still have figured the same directory name. OK, you are in the groove, I hope, let's now look at the HTTPRULE event log for the failing case:

        Searching for "/~sunshine"
        Match: applied template "/__c__/web/0/0/19/sunshine.htm"
        Match: result is "/__c__/web/0/0/19/sunshine.htm"
        Passing..... "/__c__/web/0/0/19/sunshine.htm"
    
        Searching for "/sunigm.gif"
        Match: applied template "/__c__/web/*"
        Match: result is "/__c__/web/sunigm.gif"
        Passing..... "/__c__/web/sunigm.gif"
        Error 30010005 opening c:\web\sunigm.gif
    

    What went wrong? Well, I'll tell you what went wrong: the web browser saw the string "sunigm.gif" and wanted to supply the directory portion. So it scanned the initial URL backwards until it hit a slash, trying to chop off the filename portion, just like before:

        original url was:    "/~sunshine"
        so the directory is: "/"
    

    Hence the final full URL that comes in is "/sunigm.gif". The entire tilde construct has been hacked off by the web browser. How do you think you are going to put something in HTTPD.CON that is going to find Sunshine's directory from that? YOU CAN'T! IT CAN NOT BE SOLVED! THERE IS NO WAY THAT /mott.htm could map to \0\0\23\mott.htm while /sunigm.gif mapped to \0\0\19\sunigm.gif. There's just not enough information provided by the REQUEST.
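
If you want to reproduce the two cases without a browser, Python's standard `urllib.parse.urljoin` applies the usual relative-URL resolution rules. This is a sketch for experimenting, not a record of what Internet Explorer does internally; the variable names are made up for the example.

```python
from urllib.parse import urljoin

IMAGE_REF = "sunigm.gif"  # the SRC value from Sunshine's <IMG> tag

# Same page, requested with and without the trailing slash.
with_slash = urljoin("http://evansbbs.com/~sunshine/", IMAGE_REF)
without_slash = urljoin("http://evansbbs.com/~sunshine", IMAGE_REF)

print(with_slash)     # http://evansbbs.com/~sunshine/sunigm.gif
print(without_slash)  # http://evansbbs.com/sunigm.gif
```

In the second case "~sunshine" is treated as the filename and chopped off, which is exactly the "/sunigm.gif" request seen in the HTTPRULE event log.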

    Let me repeat myself: there is nothing you can put in HTTPD.CON to fix this problem. Let me repeat John: if you want to figure out problems finding pieces of web pages, look in the HTTPRULE event log.

    So what constructive advice can we give? Lots:

  • Tell your tilde type users -- the ones whose names are known to the TSX-Online system, authorized via SYSOP, who access your system via LOGON.TPL, and who are authorized by having the TPR.EXP communicate with the name server; in other words, "bbs users" -- that putting constructs of the form "/~sunshine/sunigm.gif" in their web pages will help browsers find the graphics files when people forget to put the trailing slash on the URL.
  • For customers who have big web sites aching to move over to your system, who are big customers, and are going to have their own domain, this won't be a problem. That's because the domain name portion of the url will be different than yours, and no tilde construct is involved. The IPMAP statement in HTTPD.CON will tell the web server that all their stuff is in the \acme directory, or whatever, so no deep \web\0\0\19 stuff will be involved. These people will get authorized via TSAUTH and log on with the LOGON program, but that's the topic of another paper.
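
To see why the first suggestion helps the tilde users, the sketch below resolves an image reference that spells out the tilde directory, using Python's standard `urllib.parse.urljoin`. The leading slash in the reference is an assumption of this example: it makes the browser resolve the name from the site root. Without it, the tilde directory would be appended a second time whenever the page URL does end in a slash. The helper name `resolve_image` is made up for the illustration.

```python
from urllib.parse import urljoin

# An image reference that carries the tilde directory itself.
ABSOLUTE_REF = "/~sunshine/sunigm.gif"


def resolve_image(page_url: str) -> str:
    """Resolve ABSOLUTE_REF the way a browser would for the given page URL."""
    return urljoin(page_url, ABSOLUTE_REF)


for page_url in ("http://evansbbs.com/~sunshine/", "http://evansbbs.com/~sunshine"):
    print(page_url, "->", resolve_image(page_url))
```

Both page URLs give http://evansbbs.com/~sunshine/sunigm.gif, so the request that reaches the server still contains the tilde construct for the rules to match.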