Storing a binary hash value in a Django model field
S

5

9

I have a twenty byte hex hash that I would like to store in a django model. If I use a text field, it's interpreted as unicode and it comes back garbled.

Currently I'm encoding it and decoding it, which really clutters up the code, because I have to be able to filter by it.

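# Note: bin and hex are presumably Mercurial's helpers (hex string <-> raw bytes),
# not the Python builtins of the same name.
from mercurial.node import bin, hex

def get_changeset(self):
    # Decode the stored hex string back into the raw twenty-byte hash.
    return bin(self._changeset)

def set_changeset(self, value):
    # Encode the raw hash as a hex string before storing it.
    self._changeset = hex(value)

changeset = property(get_changeset, set_changeset)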

Here's an example for filtering

Change.objects.get(_changeset=hex(ctx.node()))

This is the approach that was recommended by a Django developer, but I'm really struggling to come to terms with the fact that it's this ugly just to store twenty bytes.

Maybe I'm too much of a purist, but ideally I would be able to write

Change.objects.get(changeset=ctx.node())

The properties allow me to write:

change.changeset = ctx.node()

So that's as good as I can ask.

Seigler answered 5/2, 2009 at 18:51 Comment(0)
B
3

I'm assuming if you were writing raw SQL you'd be using a Postgres bytea or a MySQL VARBINARY. There's a ticket with a patch (marked "needs testing") that purportedly makes a field like this (Ticket 2417: Support for binary type fields (aka: bytea in postgres and VARBINARY in mysql)).

Otherwise, you could probably try your hand at writing a custom field type.
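As a rough illustration of what such a custom field would have to do, here is a minimal sketch of the two conversions, written as plain Python 3 functions so that it runs without Django. The names to_db_value and to_python_value are hypothetical, and storing the hash as a 40-character hex string is an assumption, not something the ticket prescribes. In a real custom field this logic would presumably live in the field's conversion methods.

```python
import binascii

HASH_BYTES = 20  # assumed size: a twenty-byte hash, as in the question


def to_db_value(raw):
    """Encode a raw hash (bytes) as an ASCII hex string for a text column."""
    if len(raw) != HASH_BYTES:
        raise ValueError("expected %d bytes, got %d" % (HASH_BYTES, len(raw)))
    return binascii.hexlify(raw).decode("ascii")


def to_python_value(stored):
    """Decode the stored hex string back into the raw hash bytes."""
    return binascii.unhexlify(stored)


node = bytes(range(20))          # stand-in for a value such as ctx.node()
stored = to_db_value(node)       # what would be written to the database
print(stored)
print(to_python_value(stored) == node)
```

With the conversion hidden inside the field, calling code could pass raw bytes everywhere and never see the hex form.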

Brie answered 5/2, 2009 at 19:23 Comment(2)
n.b. despite this answer being four years old, BinaryField isn't in the latest release of Django (1.5) but is in the current development version. - Gunzburg
It's now available as BinaryField: docs.djangoproject.com/en/dev/ref/models/fields/#binaryfield - Vicereine
S
4

Starting with 1.6, Django has BinaryField, which stores raw binary data. However, for hashes and other values up to 128 bits it's more efficient (at least with the PostgreSQL backend) to use UUIDField, available in Django 1.8+. Note that the twenty-byte hash in the question is 160 bits, so it would not fit in a UUIDField.
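To make the 128-bit limit concrete, here is a small sketch using only the Python 3 standard library (no Django involved). It assumes the hash is a SHA-1 digest, which the question does not state outright, and uses an arbitrary 16-byte value as a stand-in for a 128-bit hash.

```python
import hashlib
import uuid

sha1_digest = hashlib.sha1(b"example").digest()  # 20 bytes = 160 bits
value_128 = bytes(range(16))                     # stand-in for any 128-bit value

# A 128-bit value maps directly onto a UUID and back.
as_uuid = uuid.UUID(bytes=value_128)

# A 160-bit value does not: uuid.UUID accepts exactly 16 bytes.
try:
    uuid.UUID(bytes=sha1_digest)
    fits = True
except ValueError:
    fits = False

print(len(sha1_digest) * 8, "bits; fits in a UUID:", fits)
```

So the UUID route only applies to hashes of 16 bytes or fewer; a longer hash would need a binary or text column instead.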

Sankey answered 9/8, 2017 at 12:24 Comment(1)
BinaryField does not support querying, which is sad. - Hampson
K
3

"I have a twenty byte hex hash that I would like to store in a django model."

Django does this. They use hex digests, which are -- technically -- strings. Not bytes.

Do not use someHash.digest() -- you get bytes, which you cannot easily store.

Use someHash.hexdigest() -- you get a string, which you can easily store.

Edit -- The code is nearly identical.

See http://docs.python.org/library/hashlib.html
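A short sketch of the difference, assuming Python 3 and a SHA-1 hash (the input string is arbitrary):

```python
import hashlib

h = hashlib.sha1(b"hello world")

raw = h.digest()      # raw bytes: 20 of them for SHA-1
text = h.hexdigest()  # str: 40 hexadecimal characters

print(len(raw), len(text))

# Both forms carry the same information and convert losslessly.
print(raw.hex() == text)
print(bytes.fromhex(text) == raw)
```

The hex form takes twice the space but can go into an ordinary character field and be used in filters without further encoding.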

Krenek answered 6/2, 2009 at 0:59 Comment(2)
Using a different encoding doesn't make the code any cleaner. If I still have to encode and decode I haven't gained anything. - Seigler
Sorry if my answer confused you. I've revised it. digest() and hexdigest() are nearly identical, except that you can persist hexdigest(). You can't easily persist digest(). - Krenek
K
2

You could also write your own custom Model Manager that does the escaping and unescaping for you.

Kaycekaycee answered 5/2, 2009 at 19:52 Comment(0)
V
1

If this issue is still of interest, Disqus' django-bitfield fits the bill:

https://github.com/disqus/django-bitfield

... the example code on GitHub is a little confusing at first w/r/t the module's actual function, because of the asinine variable names -- generally I am hardly the sort of person with either the wherewithal or the high ground to take someone else's goofy identifiers to task... but flaggy_foo?? Srsly, U guys.

If that project isn't to your taste, and you're on Postgres, you have a lot of excellent options, as many people have written and released code for an assortment of Django fields that take advantage of Postgres' native types. Here's an hstore model field:

https://github.com/jordanm/django-hstore -- I have used this and it works well.

Here's a full-text search implementation that uses Postgres' tsvector type:

https://github.com/aino/django-pgindex

And while I cannot vouch for this specific project, there are Django bytea fields as well:

https://github.com/aino/django-arrayfields

Viewfinder answered 19/9, 2012 at 14:22 Comment(1)
Personally I store all my hex hashes as text, but I have never had to create indexes on any of them, so seek performance hasn't been an issue (I take it you are facing something like that). - Viewfinder
