Today I came across an interesting piece of code used to expand URLs, that is, to join a base URL with a relative part. For example:
    expand_url('http://www.test.com/', '/me') => http://www.test.com/me
    expand_url('http://www.test.com/abc/', 'test') => http://www.test.com/abc/test
The code was quite "crazy" (not mine):
    import re
    from urlparse import urlparse, urljoin, urlunparse

    def expand_url(home, url):
        if re.match(r"^\w+\://", url):
            return url
        else:
            parts = home.split('/')
            if len(parts) > 2:
                if re.match(r"^/", url):
                    return "%s//%s%s" % (parts[0], parts[2], url)
                else:
                    url = url.split('/')
                    if url[0] == '.':
                        del(url[0])
                    proto = parts.pop(0)
                    return "%s//%s" % (proto, "/".join(parts[1:-1] + url))
            else:
                return False
It had one big disadvantage: when expanding URLs with relative parts, it did not resolve the hierarchy levels (the '..' segments), so the output URLs looked like this:
    expand_url('http://www.test.com/', './../me') => http://www.test.com/../me
    expand_url('http://www.test.com/abc', './../../test') => http://www.test.com/../../test
A little bit messy, I think. After some googling I found a nicer and shorter way to expand URLs, which works like a charm:
    import posixpath
    from urlparse import urlparse, urljoin, urlunparse

    def expand_url(home, url):
        # Let the standard library resolve the relative reference first
        join = urljoin(home, url)
        url2 = urlparse(join)
        # Collapse any remaining '.' and '..' segments in the path
        path = posixpath.normpath(url2.path)
        return urlunparse(
            (url2.scheme, url2.netloc, path, url2.params, url2.query, url2.fragment)
        )
    expand_url('http://www.test.com/', './../me') => http://www.test.com/me
    expand_url('http://www.test.com/abc', './../../test') => http://www.test.com/test
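The snippets above use the Python 2 module urlparse. For anyone on Python 3, here is a rough sketch of the same idea using urllib.parse. It is not the code I found, just my adaptation. The two guards in it are my own assumptions about what you would want: normpath('') returns '.', so an empty path is left alone, and normpath drops a trailing slash, so it is put back for directory-style URLs.

```python
import posixpath
from urllib.parse import urljoin, urlparse, urlunparse


def expand_url(home, url):
    """Join url onto home and collapse '.' and '..' path segments."""
    joined = urlparse(urljoin(home, url))
    path = joined.path
    if path:
        # Only normalize non-empty paths: normpath('') would return '.'
        normalized = posixpath.normpath(path)
        # normpath drops a trailing slash; restore it (assumption: we want
        # directory-style URLs to keep their final '/')
        if path.endswith('/') and not normalized.endswith('/'):
            normalized += '/'
        path = normalized
    # ParseResult is a namedtuple, so _replace swaps in the cleaned path
    return urlunparse(joined._replace(path=path))


if __name__ == '__main__':
    print(expand_url('http://www.test.com/', './../me'))
    print(expand_url('http://www.test.com/abc', './../../test'))
```

On current Python 3 versions urljoin already resolves most dot segments by itself, so normpath acts more as a safety net here than as the main fix.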