Harold Hallikainen wrote:

>> Dave Tweed wrote:
>>
>>>> To find out whether a path is relative or absolute depends on what
>>>> kind of paths you can expect on the input. If it's either a
>>>> complete http URL (starting with "http:") or a relative path, then
>>>> that's it: check for a starting "http:".
>>>
>>> The presence or absence of the optional scheme field does not
>>> determine whether a URL path is absolute or relative -- it's the
>>> leading slash after that (single or double) that tells you.
>>
>> That depends on what the specific spec of the OP determines. It
>> could very well be that a path "/NormallyConsideredAbsolutePath/file.ext"
>> is considered a "relative" path, relative to a base path, say
>> "http://myserver/" (or even relative to
>> "http://myserver/MyTopLevelDir/"). It all depends on the specific
>> situation and how the "relative" paths are created.
>>
>> This also determines to what depths he needs to go in parsing the
>> paths.
>
> This is what I'm afraid of... The proposed standard just says the URL can
> be absolute or relative. It does not limit how convoluted it may be.

Convolution towards the end doesn't have to bother you. The only thing
that matters is the beginning.

Dave seemed to imply that a relative URL may include a scheme or a server
(the "user:password@host:port" part he mentioned). I'm not sure this is
correct. If it isn't, you're still back to the simplicity of my post a few
posts back: if it starts with a scheme ("http:" in your case), you consider
it absolute; if not, you consider it relative.

However, if your base URL contains a path part in addition to the server
(like "http://myserver/basepath/"), things become a bit ambiguous. What
does a "relative" URL of the type "/path1/path2/file.ext" mean in the
context of this base URL?
Would this expand to (accepting the base URL as base for all paths that
don't specify a server) or to (accepting the idea that a starting slash
means the top level directory that is accessible on the specified server)?
I don't think that there is a standard for this, so it has to be specified
for this application.

So you need a spec of some kind... Either restricting the base URL to URLs
that don't contain a path part, or defining what it means if the base URL
contains a path and the "relative" URL starts with a slash.

> In evaluating URLs, I'd always thought that // preceded the host, ...

AFAIK it does, but it may also be part of the path.

> ... and / preceded the path or file.

Not really -- it's a separator, and if it is not preceded by a path part,
it means the root.

> In a relative URL, we should never see //.

Not at the beginning, but in the middle of the path part it's perfectly
legal.

> If the URL starts with /, we're starting at the base or root of the
> server directory tree.

Actually, if it starts with /, you're referencing the local file system.
Unless, of course, it is in the context of some base URL -- but then it
depends on that base URL.

> I just tried accessing a URL by adding the relative URL to the
> previous URL, and it seems to work. This will make it a lot easier
> than my trying to make the URL absolute. My test URL was
> http://www.hallikainen.com/FccRules/2009/36/382/../../../2008/36/382/index.php

Yes, this is perfectly normal. So are

http://www.hallikainen.com/FccRules/2009/36/382/..//../../2008/36/382/index.php
http://www.hallikainen.com/FccRules/2009/36/382/.././../../2008/36/382/index.php

Gerhard
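
For anyone who wants to experiment with the cases discussed above, here is
a minimal sketch in Python. It is only an illustration, not something the
proposed standard prescribes. The helper name classify_reference is made
up for this example, and the category names are the ones RFC 3986 uses for
relative references. The resolution itself is done by the standard
library's urllib.parse.urljoin, which follows the RFC 3986 rules: under
those rules a reference starting with a single slash replaces the whole
path of the base URL. Whether that is the behaviour wanted for this
application is still a decision for its spec.

```python
from urllib.parse import urljoin, urlsplit


def classify_reference(ref: str) -> str:
    """Classify a URL reference by how it begins (hypothetical helper)."""
    parts = urlsplit(ref)
    if parts.scheme:
        return "absolute"          # has a scheme, e.g. "http://myserver/a"
    if ref.startswith("//"):
        return "network-path"      # names a host, scheme comes from the base
    if ref.startswith("/"):
        return "absolute-path"     # path starts at the server root
    return "relative-path"         # resolved against the base URL's path


BASE = "http://myserver/basepath/"   # example base URL from this thread

# Under RFC 3986 rules a leading slash discards the base URL's path.
rooted = urljoin(BASE, "/path1/path2/file.ext")

# Without the leading slash the reference is appended to the base path.
nested = urljoin(BASE, "path1/path2/file.ext")

# ".." segments are collapsed while resolving.
dotted = urljoin(
    "http://www.hallikainen.com/FccRules/2009/36/382/",
    "../../../2008/36/382/index.php",
)

if __name__ == "__main__":
    for ref in ("http://myserver/a", "//myserver/a", "/a", "a/b"):
        print(ref, "->", classify_reference(ref))
    print(rooted)
    print(nested)
    print(dotted)
```

With this base URL, rooted comes out as
http://myserver/path1/path2/file.ext and nested as
http://myserver/basepath/path1/path2/file.ext. If the application should
instead treat a leading slash as relative to the base path, urljoin is not
the right tool and the reference would have to be joined by hand.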