I responded to a little challenge from Matt Layman of the Django Riffs podcast Episode 15, User Session Data.
Toward the end of the episode, Matt asked what might happen if more than 4 KB of text were passed into messages while the storage backend was cookies.
What I found was unexpected.
First, you can put way more than 4 KB of text into a messages message with a cookie backend and sail right on by. The actual max limit is much higher.
But more importantly, looking into this demonstrated exactly the point of the episode overall: diving into a framework can teach you a lot about what governs high-level behavior.
I put a sample Django project, over4kmessages, up on GitHub along with my exploratory path on this topic. The README is as follows:
This is a test project to demo Django’s behavior when large amounts of data are passed through the
Episode 15 of Django Riffs podcast focuses on auth, which includes a detailed look at
Toward the end of the episode, Matt Layman (@mblayman) mentions a limit on the size of cookies and asks listeners to report what happens if too much message data is passed through.
How this test project works
settings.py sets the MESSAGE_STORAGE setting to cookie storage, then the main view takes a trivial form POST and includes a very long string of text.
The contacts.const.py file contains increasingly large strings that are duplications of Charles Bukowski’s poem, “Style”. These were created using this online text size calculator and verified by saving on disk.
The contacts.views.py file contains an easy way to toggle which of these text blobs is passed to messages. The project is by default set to use the 166 KB text.
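For readers who have not switched the storage backend before, the setting described above would look roughly like this. It is a fragment for illustration, not a copy of the project's settings.py; the dotted path is Django's built-in cookie storage class.

```python
# settings.py (fragment, for illustration)
# Keep temporary messages in a signed cookie rather than in the session.
MESSAGE_STORAGE = "django.contrib.messages.storage.cookie.CookieStorage"
```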
Increasingly large text blocks, well beyond the believed 4 KB max, were still allowed to pass through the cookie storage.
However, somewhere between 72 KB and 166 KB is too much text.
When the 166 KB of text is passed as the message in contacts.views, Django throws an exception:
Not all temporary messages could be stored.
This occurs in
It turns out Django limits the max cookie size to 2048 in django.contrib.messages.storage.cookie.CookieStorage.
A comment in the code points to a decade-old Django ticket, #18781, which details the need to reduce the max cookie size created by Django, then 3072, to make room for large headers.
This doesn’t explain why a message of 72 KB or more would make it through a cookie. Perhaps compression is involved here!
Searching contrib.messages.storage.cookie.py for ‘compress’ yields the _encode() method, which passes the message along with a compress=True argument to sign_object().
sign_object() has a conditional for compress that shows zlib, from the Python standard library, is being used.
That is what allows these larger messages to make it through.
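The effect is easy to reproduce outside Django. The sketch below is not Django's code: it only compresses repeated text with zlib and then base64-encodes it, which I am assuming is close enough to a cookie-safe payload for a rough size check. It ignores the signature and JSON overhead, and the filler line is a made-up stand-in for the poem.

```python
import base64
import zlib

MAX_COOKIE_SIZE = 2048  # the limit discussed above

# Hypothetical filler text; the project itself repeats a poem instead.
UNIT = "A stand-in line of text, not the poem used in the project.\n"


def repeated_text(unit: str, target_bytes: int) -> str:
    """Repeat `unit` and trim the result to exactly `target_bytes` characters."""
    repeats = target_bytes // len(unit) + 1
    return (unit * repeats)[:target_bytes]


def encoded_size(text: str) -> int:
    """Length after zlib compression plus URL-safe base64 encoding."""
    compressed = zlib.compress(text.encode("utf-8"))
    return len(base64.urlsafe_b64encode(compressed))


if __name__ == "__main__":
    for kb in (4, 16, 72):
        text = repeated_text(UNIT, kb * 1024)
        size = encoded_size(text)
        fits = size <= MAX_COOKIE_SIZE
        print(f"{kb:>3} KB raw -> {size} bytes encoded, fits: {fits}")
```

Duplicated text is close to a best case for zlib, so ordinary prose of the same size would compress far less.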
- What is the true max text size that can be compressed using zlib.compress() to duck the 2048 cookie threshold?
- Should Django still be using zlib.compress() to pack data into cookies?
- Really, if we’re passing along a message to our user on page load, should it be longer than 140 chars anyway?
(Probably not, but that’s the opposite of the point of all of this!)
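One way to poke at the first question is to grow a block of repeated text until its compressed, encoded form no longer fits. This is only a sketch under my own assumptions: the filler line, the 16 KB step, and the 4 MB cap are arbitrary, base64 stands in for whatever cookie-safe encoding is applied, and signing overhead is left out. The answer depends heavily on the text itself.

```python
import base64
import zlib

LIMIT = 2048            # cookie size threshold discussed above
STEP = 16 * 1024        # grow the text 16 KB at a time (arbitrary)
CAP = 4 * 1024 * 1024   # stop after 4 MB so the loop always ends

# Hypothetical filler text; the project itself repeats a poem instead.
UNIT = "A stand-in line of text, not the poem used in the project.\n"


def encoded_size(text: str) -> int:
    """Length after zlib compression plus URL-safe base64 encoding."""
    compressed = zlib.compress(text.encode("utf-8"))
    return len(base64.urlsafe_b64encode(compressed))


def first_size_over_limit(unit: str = UNIT):
    """Return the first raw size in bytes whose encoded form exceeds LIMIT."""
    size = STEP
    while size <= CAP:
        text = (unit * (size // len(unit) + 1))[:size]
        if encoded_size(text) > LIMIT:
            return size
        size += STEP
    return None  # the limit was never crossed within CAP


if __name__ == "__main__":
    print("first raw size over the limit:", first_size_over_limit())
```

Swapping UNIT for the actual text being tested gives a number that is more relevant than the one for this filler line.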
A few more thoughts
After realizing some compression was being done, I wondered why Django chose zlib.compress and whether a faster, more efficient algorithm might exist. I have a few links I came across looking at this idea below.
However, what was more interesting was how deep you can go to find out how a web framework is designed and why. It pointed out a Django core method I wasn’t aware of, and touched on using the Python standard library to get something done.
I used PyCharm to jump into routines and pause the debugger where the exception occurred. A simple question can lead to a deep exploration if you’re willing to debug.
References for more on compression
- Writing a custom compression method to outperform
Don Cross, Winning the Data Compression Game
- Question related to improving on
Stack Overflow, zlib compress() produces awful compression rate