Today I've got quite an interesting piece of code, used to expand URLs (that is, to join a base URL with a part that may be relative). For example:
expand_url('http://www.test.com/', '/me')
=> http://www.test.com/me
expand_url('http://www.test.com/abc/', 'test')
=> http://www.test.com/abc/test
The original code was quite "crazy" (not mine):
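import re
from urlparse import urlparse, urljoin, urlunparse

def expand_url(home, url):
    # An absolute URL (scheme://...) is returned untouched
    if re.match(r"^\w+\://", url):
        return url
    else:
        parts = home.split('/')
        if len(parts) > 2:
            if re.match(r"^/", url):
                # Root-relative URL: keep only scheme and host from home
                return "%s//%s%s" % (parts[0], parts[2], url)
            else:
                url = url.split('/')
                # Only a single leading '.' is dropped; '..' is never resolved
                if url[0] == '.':
                    del(url[0])
                proto = parts.pop(0)
                # Drop the last segment of home and append the relative part
                return "%s//%s" % (proto, "/".join(parts[1:-1] + url))
        else:
            # home does not look like scheme://host/...
            return False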
It had one big disadvantage: when expanding URLs with relative parts, it didn't resolve the hierarchy levels (the ".." segments), so the output URLs looked like this:
expand_url('http://www.test.com/', './../me')
=> http://www.test.com/../me
expand_url('http://www.test.com/abc', './../../test')
=> http://www.test.com/../../test
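To make the missing step concrete, here is a minimal sketch of what resolving those relative segments means. The helper name resolve_dot_segments is my own invention for illustration (it is not part of the original code), and it deliberately ignores details such as trailing slashes:

```python
def resolve_dot_segments(path):
    """Collapse '.' and '..' segments in an absolute URL path.

    Hypothetical helper for illustration only: it drops empty segments
    and any trailing slash, which a real implementation may want to keep.
    """
    resolved = []
    for segment in path.split("/"):
        if segment == "..":
            # Step up one level, but never climb above the root
            if resolved:
                resolved.pop()
        elif segment not in ("", "."):
            # Ordinary segment: descend one level
            resolved.append(segment)
    return "/" + "/".join(resolved)


print(resolve_dot_segments("/abc/../../test"))
```

Feeding it a path like /abc/../../test gives /test, which is the kind of result the old expand_url never produced.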
Leaving those segments unresolved is a little bit messy, I think. After some googling I found a nicer and shorter way to expand URLs (which works like a charm):
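import posixpath
# Python 2 module; in Python 3 these names live in urllib.parse
from urlparse import urlparse, urljoin, urlunparse

def expand_url(home, url):
    # Join the base URL with the (possibly relative) part
    join = urljoin(home, url)
    url2 = urlparse(join)
    # Collapse the '.' and '..' segments left in the path
    path = posixpath.normpath(url2.path)
    return urlunparse(
        (url2.scheme, url2.netloc, path, url2.params, url2.query, url2.fragment)
    )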
expand_url('http://www.test.com/', './../me')
=> http://www.test.com/me
expand_url('http://www.test.com/abc', './../../test')
=> http://www.test.com/test
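A note for Python 3 readers: the urlparse module was folded into urllib.parse, and since Python 3.5 urljoin resolves "." and ".." segments on its own. Below is a hedged sketch of a Python 3 port, written under my own assumptions. The name expand_url_py3 is hypothetical, and the trailing-slash handling is an addition the snippet above does not have, because posixpath.normpath silently drops a trailing slash:

```python
import posixpath
from urllib.parse import urljoin, urlparse, urlunparse


def expand_url_py3(home, url):
    """Hypothetical Python 3 variant of expand_url (illustration only)."""
    joined = urlparse(urljoin(home, url))
    # normpath('') would return '.', so treat an empty path as the root
    path = posixpath.normpath(joined.path) if joined.path else "/"
    # Assumption: a trailing slash is meaningful, so put it back
    if joined.path.endswith("/") and not path.endswith("/"):
        path += "/"
    return urlunparse(joined._replace(path=path))


print(expand_url_py3("http://www.test.com/", "./../me"))
print(expand_url_py3("http://www.test.com/abc", "./../../test"))
```

Both calls should give the same URLs as the examples above. On Python 3.5 and later the normpath step is mostly a safety net, since urljoin has already done the dot-segment work.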