I responded to a little challenge from Matt Layman in Episode 15 of the Django Riffs podcast, User Session Data.
Toward the end of the episode, Matt asked what might happen if more than 4 KB of text were passed into messages when the backend storage method is cookies.
What I found was unexpected.
First, you can put far more than 4 KB of text into a messages message with a cookie backend and sail right on by. The actual limit is much higher.
But more importantly, looking into this demonstrated exactly the point of the episode overall: diving into a framework can teach you a lot about what governs high-level behavior.
I put a sample Django project, over4kmessages, up on GitHub along with my exploratory path on this topic. The README is as follows:
over4kmessages
This is a test project to demo Django’s behavior when large amounts of data are passed through the messages application.
Why
Episode 15 of the Django Riffs podcast focuses on auth, which includes a detailed look at session.
Toward the end of the episode, Matt Layman (@mblayman) mentions a limit on the size of cookies and asks listeners to report what happens if too much message data is passed through.
How this test project works
settings.py sets the MESSAGE_STORAGE setting to cookie storage. The main view then takes a trivial form POST and includes a very long string of text in the message.
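For reference, switching the messages framework to cookie storage is a one-line setting. A minimal fragment, assuming a standard Django project layout (the rest of settings.py is omitted here):

```python
# settings.py (fragment): store messages in a signed cookie
# instead of the session.
MESSAGE_STORAGE = "django.contrib.messages.storage.cookie.CookieStorage"
```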
The contacts.const.py file contains increasingly large strings that are duplications of Charles Bukowski’s poem, “Style”. These were created using an online text size calculator and verified by saving on disk.
The contacts.views.py file contains an easy way to toggle which of these text blobs is passed to messages. The project is set by default to use the 166 KB text.
Result
Increasingly large text blocks, well beyond the believed 4 KB maximum, were still allowed to pass through the cookie storage.
However, somewhere between 72 KB and 166 KB is too much text.
When the 166 KB of text is passed as the message in contacts.views, Django throws an exception:
Not all temporary messages could be stored.
This occurs in django.contrib.messages.middleware.py.
Digging
It turns out Django limits the max cookie size to 2048 bytes, in django.contrib.messages.storage.cookie.CookieStorage.
A comment in the code points to the decade-old Django ticket #18781, which details the need to reduce the max cookie size, then 3072 as set by Django, to make room for large headers.
This doesn’t explain why a message of 72 KB or more would make it through a cookie. Perhaps compression is involved here!
Searching contrib.messages.storage.cookie.py for ‘compress’ yields the _encode() method, which passes the message data, along with a compress=True argument, to django.core.signing.Signer.sign_object.
sign_object() has a conditional for compress that shows the Python standard library module zlib is being used.
That is what is allowing these larger messages to make it through.
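To make the effect concrete, here is a minimal standard-library sketch of the compress-then-encode idea. It is an approximation, not Django's actual code: the function name pack, the sample text, and the leading-dot marker for compressed payloads are my assumptions from reading django.core.signing, and the signature and message metadata are left out entirely.

```python
import base64
import json
import zlib

MAX_COOKIE_SIZE = 2048  # the CookieStorage limit discussed above


def pack(obj, compress=True):
    """Serialize obj to JSON, optionally zlib-compress, then base64-encode.

    Hypothetical helper: it only approximates the shape of sign_object()
    and does NOT add a signature.
    """
    data = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    is_compressed = False
    if compress:
        compressed = zlib.compress(data)
        # Keep the compressed form only when it is actually smaller.
        if len(compressed) < len(data) - 1:
            data = compressed
            is_compressed = True
    encoded = base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
    # Assumed convention: a leading dot marks a compressed payload.
    return "." + encoded if is_compressed else encoded


# Placeholder text standing in for the repeated poem in the test project.
message = "placeholder line of text, repeated many times\n" * 2000
raw = pack([message], compress=False)
packed = pack([message], compress=True)
print(len(message), len(raw), len(packed))
print("fits without compression:", len(raw) <= MAX_COOKIE_SIZE)
print("fits with compression:", len(packed) <= MAX_COOKIE_SIZE)
```

Highly repetitive text like this shrinks dramatically, so the numbers here say little about ordinary prose. Real cookies also carry a signature and per-message metadata that this sketch ignores.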
New questions
- What is the true max text size that can be compressed using zlib.compress() to duck the 2048 threshold for cookie storage?
- Should Django still be using zlib.compress() to pack data into cookies?
- Really, if we’re passing along a message to our user on page load, should it be longer than 140 chars anyway? (Probably not, but that’s the opposite of the point of all of this!)
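The first question has no single answer, because the compressed size depends on how repetitive the text is. One rough way to probe it for a given sample is to keep doubling the number of repetitions until the compressed, base64-encoded form no longer fits. The sketch below uses only the standard library. The names encoded_len and bracket_limit and the sample line are hypothetical, and the signing and cookie overhead Django adds are ignored.

```python
import base64
import zlib

LIMIT = 2048  # the max cookie size found in CookieStorage


def encoded_len(text):
    """Length of text after zlib compression and URL-safe base64 encoding."""
    compressed = zlib.compress(text.encode("utf-8"))
    return len(base64.urlsafe_b64encode(compressed))


def bracket_limit(sample, limit=LIMIT):
    """Return (fits, too_big): repetition counts that bracket the limit.

    fits is the last doubled count whose encoding stayed within limit
    (0 if even one copy is too large); too_big is the first that did not.
    """
    fits, reps = 0, 1
    while encoded_len(sample * reps) <= limit:
        fits, reps = reps, reps * 2
    return fits, reps


sample = "a line of placeholder text\n"  # stand-in for the real message text
fits, too_big = bracket_limit(sample)
print("still fits at", fits * len(sample), "characters")
print("too large at", too_big * len(sample), "characters")
```

This only brackets the threshold for one particular text. Less repetitive input will hit the limit far sooner.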
A few more thoughts
After realizing some compression was being done, I wondered why Django chose zlib.compress and whether a faster, more efficient algorithm might exist. I have a few links I came across looking at this idea below.
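As a rough way to explore that question, the standard library ships two other compressors, bz2 and lzma, that can be timed against zlib on the same input. This is only a sketch under my own assumptions: the function name compare_compressors and the sample text are hypothetical, and a single run on repetitive text is not a real benchmark.

```python
import bz2
import lzma
import time
import zlib


def compare_compressors(data):
    """Return {name: (compressed_size, seconds)} for three stdlib codecs."""
    codecs = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        size = len(compress(data))
        results[name] = (size, time.perf_counter() - start)
    return results


# Placeholder input; swap in real message text for a fairer comparison.
sample = ("a hypothetical line of message text\n" * 2000).encode("utf-8")
for name, (size, seconds) in compare_compressors(sample).items():
    print(f"{name}: {len(sample)} -> {size} bytes in {seconds:.4f}s")
```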
However, what was more interesting was how deep you can go to find out how a web framework is designed and why. It pointed out a Django core method I wasn’t aware of, and touched on using the Python standard library to get something done.
I used PyCharm to jump into routines and pause the debugger where the exception occurred. A simple question can lead to a deep exploration if you’re willing to debug.
References for more on compression
- Writing a custom compression method to outperform zlib.compress(): Don Cross, Winning the Data Compression Game
- Question related to improving on zlib.compress(): Stack Overflow, zlib compress() produces awful compression rate