expand Archives - Closer to Code

Today I came across an interesting piece of code used to expand URLs, that is, to join a base URL with a relative or absolute reference. For example:

expand_url('http://www.test.com/', '/me') 
  => http://www.test.com/me
expand_url('http://www.test.com/abc/', 'test')
  => http://www.test.com/abc/test

The code was quite "crazy" (not mine):

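import re

def expand_url(home, url):
    # Already absolute (has a scheme such as "http://"): return it unchanged.
    if re.match(r"^\w+\://", url):
        return url
    else:
        # "http://host/a/b".split('/') -> ['http:', '', 'host', 'a', 'b']
        parts = home.split('/')
        if len(parts) > 2:
            if re.match(r"^/", url):
                # Root-relative url: keep only the scheme and host of home.
                return "%s//%s%s" % (parts[0], parts[2], url)
            else:
                url = url.split('/')
                # Drop a single leading "."; any ".." segments are left untouched.
                if url[0] == '.':
                    del(url[0])
                proto = parts.pop(0)
                # Replace the last segment of home's path with url.
                return "%s//%s" % (proto, "/".join(parts[1:-1] + url))
        else:
            # Fewer than two slashes in home: no "scheme://host" to build on.
            return False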

It had one big disadvantage: when expanding URLs with relative parts, it did not resolve the hierarchy levels (the ".." segments), so the output URLs looked like this:

expand_url('http://www.test.com/', './../me')
  => http://www.test.com/../me
expand_url('http://www.test.com/abc', './../../test')
  => http://www.test.com/../../test
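
To make the missing step concrete, here is a minimal sketch of what resolving those segments means. Note that collapse_dot_segments is a hypothetical helper written for illustration, not part of the original code, and it assumes an absolute path (one starting with "/"):

```python
def collapse_dot_segments(path):
    """Resolve "." and ".." in an absolute URL path (hypothetical helper)."""
    resolved = []
    for segment in path.split('/'):
        if segment == '.':
            continue  # "." means "this level": nothing to do
        if segment == '..':
            # ".." climbs one level; resolved[0] is the empty string in front
            # of the leading "/", so never pop it (cannot go above the root).
            if len(resolved) > 1:
                resolved.pop()
            continue
        resolved.append(segment)
    return '/'.join(resolved) or '/'


print(collapse_dot_segments('/a/b/./../c'))  # /a/c
print(collapse_dot_segments('/../me'))       # /me
```

This only touches the path; the scheme and host have to be split off first and glued back on afterwards.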

A little bit messy, I think. After some googling I found a nicer, shorter way to expand URLs (it works like a charm):

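import posixpath

try:
    from urllib.parse import urljoin, urlparse, urlunparse  # Python 3
except ImportError:
    from urlparse import urljoin, urlparse, urlunparse  # Python 2

def expand_url(home, url):
    # Resolve url against home, then split the result into its components.
    parts = urlparse(urljoin(home, url))
    # normpath collapses leftover "." and ".." segments; skip an empty path,
    # which normpath would otherwise turn into ".".
    path = posixpath.normpath(parts.path) if parts.path else ''

    return urlunparse(
        (parts.scheme, parts.netloc, path, parts.params, parts.query, parts.fragment)
    )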

expand_url('http://www.test.com/', './../me') 
  => http://www.test.com/me
expand_url('http://www.test.com/abc', './../../test') 
  => http://www.test.com/test
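
A note for Python 3 readers: the urlparse module became urllib.parse there, and since Python 3.5 urljoin follows RFC 3986 and resolves "." and ".." segments by itself, so the normpath step should no longer be needed. A minimal sketch, assuming Python 3.5 or newer (expand_url_py3 is just an illustrative name):

```python
from urllib.parse import urljoin


def expand_url_py3(home, url):
    # Assumes Python 3.5+, where urljoin resolves "." and ".." on its own.
    return urljoin(home, url)


print(expand_url_py3('http://www.test.com/', './../me'))
print(expand_url_py3('http://www.test.com/abc', './../../test'))
```

One side effect worth knowing: posixpath.normpath strips a trailing slash ('/docs/' becomes '/docs'), while plain urljoin keeps it, so directory-style URLs stay intact in this variant.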

Relative and absolute urls expanding in Python