Unicode Stuff In Django

  1. Create the database with UTF-8 encoding.

  2. django.utils.encoding functions:

    • smart_unicode
    • force_unicode
    • smart_str

    Normally, you'll only need to use smart_unicode(). Call it as early as possible on any input data that might be either Unicode or a bytestring, and from then on, you can treat the result as always being Unicode.
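
The "convert early, then treat everything as Unicode" idea can be sketched without Django. The helper below is a hypothetical stand-in written for Python 3, where str is the Unicode type and bytes is the bytestring type. The name to_text and its defaults are assumptions for illustration; this is not Django's implementation of smart_unicode().

```python
def to_text(value, encoding="utf-8", errors="strict"):
    """Return value as a Unicode string (hypothetical helper, not Django's API)."""
    if isinstance(value, bytes):
        # Bytestrings are decoded exactly once, at the boundary.
        return value.decode(encoding, errors)
    if isinstance(value, str):
        # Already Unicode: pass through unchanged.
        return value
    # Anything else (numbers, objects) is converted via its string form.
    return str(value)


# Input may arrive either as bytes or as text; after this call it is always text.
name_from_bytes = to_text(b"Fran\xc3\xa7ois")
name_from_text = to_text("François")
```

Once the conversion has happened at the input boundary, the rest of the code never needs to check which kind of string it holds.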

  3. The functions django.utils.http.urlquote() and django.utils.http.urlquote_plus() are versions of Python’s standard urllib.quote() and urllib.quote_plus() that work with non-ASCII characters. (The data is converted to UTF-8 prior to encoding.)

    The iri_to_uri() function (in django.utils.encoding) will not change ASCII characters that are otherwise permitted in a URL. So, for example, the character ‘%’ is not further encoded when passed to iri_to_uri(). This means you can pass a full URL to this function without damaging the query string or any part that has already been quoted.

    An example might clarify things here:

    >>> urlquote(u'Paris & Orléans')
    u'Paris%20%26%20Orl%C3%A9ans'
    >>> iri_to_uri(u'/favorites/François/%s' % urlquote(u'Paris & Orléans'))
    '/favorites/Fran%C3%A7ois/Paris%20%26%20Orl%C3%A9ans'

If you look carefully, you can see that the portion that was generated by urlquote() in the second example was not double-quoted when passed to iri_to_uri(). This is a very important and useful feature.
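
The same behaviour can be reproduced with only the Python 3 standard library, which may help show why the '%' is left alone. This is a rough sketch under stated assumptions: quote_component and quote_iri are hypothetical names standing in for urlquote() and iri_to_uri(), and the IRI_SAFE set is an assumption about which ASCII characters a whole-URL quoting step should leave untouched.

```python
from urllib.parse import quote

# ASCII characters left untouched when quoting a whole IRI (assumed set).
# '%' is included, so existing percent-escapes are not encoded a second time.
IRI_SAFE = "/#%[]=:;$&()+,!?*@'~"


def quote_component(text):
    """Quote one path component; non-ASCII text is encoded as UTF-8 first."""
    return quote(text, safe="/")


def quote_iri(iri):
    """Quote a full IRI, leaving reserved characters and '%' alone."""
    return quote(iri, safe=IRI_SAFE)


part = quote_component("Paris & Orléans")
url = quote_iri("/favorites/François/%s" % part)

# Quoting the component a second time shows the problem being avoided:
# every '%' would itself be escaped as '%25'.
double_quoted = quote_component(part)
```

Here part is 'Paris%20%26%20Orl%C3%A9ans' and url is '/favorites/Fran%C3%A7ois/Paris%20%26%20Orl%C3%A9ans', while double_quoted contains '%2520' where the space used to be.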

  • All strings are returned from the database as Unicode strings.

    In particular, rather than giving your model a __str__() method, we recommend that you implement a __unicode__() method.

  • Take care in get_absolute_url(): quote each variable component with urlquote() first, then pass the whole URL through iri_to_uri().

    def get_absolute_url(self):
        url = u'/person/%s/?x=0&y=0' % urlquote(self.location)
        return iri_to_uri(url)

  • Always return Unicode strings from a template tag's render() method and from template filters. Use force_unicode() in preference to smart_unicode() in these places.

