Generating Non-HTML Content

Usually when we talk about developing Web sites, we’re talking about producing HTML. Of course, there’s a lot more to the Web than HTML; we use the Web to distribute data in all sorts of formats: RSS, PDFs, images, and so forth.

So far, we’ve focused on the common case of HTML production, but in this chapter we’ll take a detour and look at using Django to produce other types of content. Django has convenient built-in tools that you can use to produce some common non-HTML content:

  • Comma-delimited (CSV) files for importing into spreadsheet applications.
  • PDF files.
  • RSS/Atom syndication feeds.
  • Sitemaps (an XML format originally developed by Google that gives hints to search engines).

We’ll examine each of those tools a little later, but first we’ll cover the basic principles.

The basics: views and MIME types

Recall from Chapter 2 that a view function is simply a Python function that takes a Web request and returns a Web response. This response can be the HTML contents of a Web page, or a redirect, or a 404 error, or an XML document, or an image …or anything, really. More formally, a Django view function must:

  1. Accept an HttpRequest instance as its first argument; and
  2. Return an HttpResponse instance.

The key to returning non-HTML content from a view lies in the HttpResponse class, specifically the content_type argument. By default, Django sets content_type to “text/html”. You can, however, set content_type to any of the official Internet media types (MIME types) managed by IANA.
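If the right MIME type for a file is not obvious, Python’s standard mimetypes module can guess one from the filename. The sketch below is only an illustration and not something Django requires: guess_content_type is a hypothetical helper name, and falling back to application/octet-stream for unknown extensions is an assumption about what you would want.

```python
import mimetypes


def guess_content_type(filename, default="application/octet-stream"):
    """Return a MIME type for filename, or default if none is known.

    Hypothetical helper: the result could be passed to HttpResponse
    as its content_type argument.
    """
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


png_type = guess_content_type("image.png")
pdf_type = guess_content_type("report.pdf")
```

A view could then pass the returned string as content_type instead of hard-coding it.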

By tweaking the MIME type, we can indicate to the browser that we’ve returned a response of a different format. For example, let’s look at a view that returns a PNG image. To keep things simple,
we’ll just read the file off the disk:

from django.http import HttpResponse

def my_image(request):
    # Read the image bytes; the with block closes the file afterwards.
    with open("/path/to/my/image.png", "rb") as image_file:
        image_data = image_file.read()
    return HttpResponse(image_data, content_type="image/png")

That’s it! If you replace the image path in the open() call with a path to a real image, you can use this very simple view to serve an image, and the browser will display it correctly.

The other important thing to keep in mind is that HttpResponse objects implement Python’s standard file-like object API. This means that you can use an HttpResponse instance in any place Python (or a third-party library) expects a file. For an example of how that works, let’s take a look at producing CSV with Django.
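To see why this works, note that many Python writers only need an object with a write() method. The sketch below uses a hypothetical stand-in class, FakeResponse, rather than a real HttpResponse, so that it runs without Django. It shows the general duck-typing idea only and is not a model of how Django implements its response class.

```python
import json


class FakeResponse:
    """Hypothetical stand-in for a response: it only implements write()."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        # Collect whatever the caller writes, much as a response body would.
        self.chunks.append(text)

    @property
    def content(self):
        return "".join(self.chunks)


response = FakeResponse()
print("hello", file=response)      # print() only needs a write() method
json.dump({"ok": True}, response)  # so does json.dump()
```

Neither print() nor json.dump() knows or cares that the target is not a real file.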

Producing CSV

Python comes with a CSV library, csv. The key to using it with Django is that the csv module’s CSV-creation capability acts on file-like objects, and Django’s HttpResponse objects are file-like objects. Here’s an example:

import csv
from django.http import HttpResponse

def some_view(request):
    # Create the HttpResponse object with the appropriate CSV header.
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

    writer = csv.writer(response)
    writer.writerow(['First row', 'Foo', 'Bar', 'Baz'])
    writer.writerow(['Second row', 'A', 'B', 'C', '"Testing"'])

    return response  

The code and comments should be self-explanatory, but a few things deserve a mention:

  • The response gets a special MIME type, text/csv. This tells browsers that the document is a CSV file, rather than an HTML file. If you leave this off, browsers will probably interpret the output as HTML, which will result in ugly, scary gobbledygook in the browser window.
  • The response gets an additional Content-Disposition header, which contains the name of the CSV file. This filename is arbitrary; call it whatever you want. It’ll be used by browsers in the Save as…
    dialogue, etc.
  • Hooking into the CSV-generation API is easy: Just pass response as the first argument to csv.writer. The csv.writer function expects a file-like object, and HttpResponse objects fit the bill.
  • For each row in your CSV file, call writer.writerow, passing it an iterable object such as a list or tuple.
  • The CSV module takes care of quoting for you, so you don’t have to worry about escaping strings with quotes or commas in them. Just pass writerow() your raw strings, and it’ll do the right thing.
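The quoting behaviour described in the last point can be checked without Django by writing to an in-memory buffer. This is a small sketch using only the standard library; the third row is a hypothetical addition to show a field that contains a comma.

```python
import csv
import io

# newline="" leaves the csv module in charge of line endings.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["First row", "Foo", "Bar", "Baz"])
writer.writerow(["Second row", "A", "B", "C", '"Testing"'])
writer.writerow(["Third row", "has, a comma"])  # hypothetical extra row

csv_text = buffer.getvalue()
print(csv_text)
```

Fields containing quotes or commas are wrapped in double quotes, and embedded quotes are doubled, so "Testing" is written as """Testing""".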

Streaming large CSV files

When dealing with views that generate very large responses, you might want to consider using Django’s StreamingHttpResponse instead. For example, by streaming a file that takes a long time to generate you can avoid a load balancer dropping a connection that might have otherwise timed out while the server was generating the response. In this example, we make full use of Python generators to efficiently handle the assembly and transmission of a large CSV file:

import csv

# On Python 3 the built-in range is already lazy, so no compatibility
# import is needed (django.utils.six was removed in Django 3.0).
from django.http import StreamingHttpResponse

class Echo(object):
    """An object that implements just the write method of the file-like
    interface.
    """
    def write(self, value):
        """Write the value by returning it, instead of storing in a buffer."""
        return value

def some_streaming_csv_view(request):
    """A view that streams a large CSV file."""
    # Generate a sequence of rows. The range is based on the maximum number of
    # rows that can be handled by a single sheet in most spreadsheet
    # applications.
    rows = (["Row {}".format(idx), str(idx)] for idx in range(65536))
    pseudo_buffer = Echo()
    writer = csv.writer(pseudo_buffer)
    response = StreamingHttpResponse(
        (writer.writerow(row) for row in rows),
        content_type="text/csv",
    )
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
    return response 
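The trick in this view is that writer.writerow() returns whatever the underlying write() call returns, so an Echo object turns each row into a formatted line without buffering anything. The following sketch isolates that idea so it can be run without Django; iter_csv_lines is a hypothetical helper name, and the three-row input is only for illustration.

```python
import csv


class Echo:
    """Implements only write(), returning the value instead of storing it."""

    def write(self, value):
        return value


def iter_csv_lines(rows):
    """Lazily yield one formatted CSV line per input row."""
    writer = csv.writer(Echo())
    for row in rows:
        # writerow() hands back the return value of Echo.write().
        yield writer.writerow(row)


lines = iter_csv_lines([f"Row {i}", str(i)] for i in range(3))
first = next(lines)  # only the first row has been formatted so far
rest = list(lines)   # the remaining rows are produced on demand
```

Because both the row source and the line producer are generators, memory use stays flat no matter how many rows are streamed.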

Using The Template System

Alternatively, you can use the Django template system to generate CSV. This is lower-level than using the convenient Python csv module, but the solution is presented here for completeness. The idea here is to pass a list of items to your template, and have the template output the commas in a for loop. Here’s an example, which generates the same CSV file as above:

from django.http import HttpResponse
from django.template import loader

def some_view(request):
    # Create the HttpResponse object with the appropriate CSV header.
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

    # The data is hard-coded here, but you could load it
    # from a database or some other source.
    csv_data = (
        ('First row', 'Foo', 'Bar', 'Baz'),
        ('Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"),
    )

    t = loader.get_template('my_template_name.txt')
    # Templates returned by get_template() take a plain dict as context.
    c = {'data': csv_data}
    response.write(t.render(c))
    return response

The only difference between this example and the previous example is that this one uses template loading instead of the CSV module. The rest of the code, such as the content_type='text/csv', is the same. Next, create the template my_template_name.txt, with this template code:

{% for row in data %}"{{ row.0|addslashes }}", "{{ row.1|addslashes }}", "{{ row.2|addslashes }}", "{{ row.3|addslashes }}", "{{ row.4|addslashes }}"
{% endfor %}

This template is quite basic. It just iterates over the given data and displays a line of CSV for each row. It uses the addslashes template filter to ensure there aren’t any problems with quotes.

Other Text-Based Formats

Notice that very little here is specific to CSV: only the output format is. You can use either of these techniques to output any text-based format you can dream of. You can also use a similar technique to generate arbitrary binary data, such as PDFs.
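As one example of another text-based format, the same csv writer can emit tab-separated values by changing the delimiter. This is a minimal sketch using only the standard library; rows_to_tsv is a hypothetical helper name, and the choice of "\n" as the line terminator is an assumption. In a view, the resulting text would be returned with a matching content type such as text/tab-separated-values.

```python
import csv
import io


def rows_to_tsv(rows):
    """Format an iterable of rows as tab-separated text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = rows_to_tsv([("First row", "Foo"), ("Second row", "A")])
```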