rpkilby/jsonfield

Problems withs PostGres 9.3, JSON Field and Query Distinct

ewjoachim opened this issue · 20 comments

As mentionned in #47, using ".distinct()", which is used by the admin panel creating users but also in several other cases in django processes, triggers the same bug as #47 describes.

A workaround has been found by @mkhattab :
The work around for this in the Django admin is to subclass QuerySet and Manager and override the distinct method [to remove the JSON fields from it].

What would be the definite fix ?

@ewjoachim do you have a test case by chance I could use to build a fix on top of? Not easily able to re-produce the issue on my end.

Got the same error by doing a distinct or an annotate on a field that has a json field.

Edit : It seems this guy runs into the same problem : http://stackoverflow.com/questions/19117933/django-db-error-could-not-identify-an-equality-operator-for-type-json-when-tryi
2nd Edit : I wrote a workaround, explained in the link below, involving writing the operator PGSQL asks for.

This is probably related:

I tried running the OrderedDictSerializationTest class which came in the out-of-the-box tests.py for 0.9.20. I'm using Postgres 9.3.2 and Django 1.5. The class creates the OrderedDict, but I get "no tests were found" in my (PyCharm) TestRunner. Also, "print" commands at the start of the test_*() methods don't execute. (Traceback below.) I tried commenting out the db_type function in JSONFieldBase and that worked better.

(Partial) Traceback:
  File "/usr/local/lib/python2.7/dist-packages/django/test/simple.py", line 367, in run_tests
    old_config = self.setup_databases()
  File "/usr/local/lib/python2.7/dist-packages/django/test/simple.py", line 315, in setup_databases
    self.verbosity, autoclobber=not self.interactive)
  File "/usr/local/lib/python2.7/dist-packages/django/db/backends/creation.py", line 293, in create_test_db
    load_initial_data=False)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 161, in call_command
    return klass.execute(*args, **defaults)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 255, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 385, in handle
    return self.handle_noargs(**options)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/commands/syncdb.py", line 102, in handle_noargs
    cursor.execute(statement)
  File "/usr/local/lib/python2.7/dist-packages/django/db/backends/postgresql_psycopg2/base.py", line 58, in execute
    six.reraise(utils.DatabaseError, utils.DatabaseError(*tuple(e.args)), sys.exc_info()[2])
  File "/usr/local/lib/python2.7/dist-packages/django/db/backends/postgresql_psycopg2/base.py", line 54, in execute
    return self.cursor.execute(query, args)
DatabaseError: data type json has no default operator class for access method "btree"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.

I had this issue popping up in Admin list views for models that contained JSON columns (even though I wasn't showing those field in the view). Interestingly enough, when the view is called, it doesn't actually call the distinct method on the Manager. My workaround was to subclass the manager, and override the get_query_set method as follows:

def get_queryset(self):
    """
    Returns a new QuerySet object which excludes JSON `key_values` column.
    """
    return QuerySet(self.model, using=self._db).defer('key_values')

Solved my case.

Model.some_query_here.order_by('id').distinct('id')

Model.some_query_here.order_by('-views_count', 'id').distinct('views_count', 'id')

Hope, this helps to somebody

when will you release a version that gets rid of the Postgres JSON column ?

@diwu1989 you can use the branch at https://github.com/bradjasper/django-jsonfield/tree/postgresql for now which is supported and kept up-to-date with master. Will likely be making it a official release soon.

I have tried to install the postgresql version from github, but I keep getting error code 128 in none. Do you know when you will be merging the two branches? Or can you post instructions to manually install from github.

The PostgreSQL change has been merged into master and the version has been bumped to 1.0.0. This is a backwards incompatible change for anyone using PostgreSQL & django-jsonfield 0.9.x.

ksze commented

If I understand correctly, it should be completely ok to upgrade jsonfield to 1.0.0 if I have never created JSONFields in PostgreSQL 9.2+, correct?

i.e.
If using PostgreSQL 9.2+, but no Django model has any existing JSONField yet => ok
If using PostgreSQL 9.2, then it depends on whether existing JSONFields were created pre- or post- c2b5907.
If using PostgreSQL 9.1 or lower, then existing JSONFields have always been backed by PostgreSQL's text type => ok

@ksze this looks right. The easiest way to be sure is to look at your column type in the database. If it's "text" you're ok. If it's "json" then it needs to be migrated or you should stick to an older version of django-jsonfield.

Hi,

I'm interested in updating to 1.0.0 but I am using postgres 9.3 and have models with the json column type from before the change. If I update to 1.0.0 and run ./manage.py schemamigration --auto then it doesn't notice any changes.

I am quite new to Django/South and migrations, are there any recommendations for how best to handle this case?

Cheers,
Michael

ksze commented

@michaeljones Using Django/South migrations is a little weird for this situation because you would then need to consider what happens when you run your whole suite of migrations on a pristine database (e.g. during ./manage.py test).

Instead, you may want to do it with some raw, throw-away SQL:

  1. create a new text type column;
  2. UPDATE my_table SET new_text_column = old_json_column::text;
  3. delete the json column;
  4. rename the text column.

WARNING: try this in a snapshot of your database first!

@ksze Thank you for such a quick response. I see now how South migrations would be the wrong approach. Thank you for pointing that out.

I will try my hand at raw, throw-away SQL then :) Much appreciated!

If anyone of my level comes by, here is what I used:

ALTER TABLE "myapp_mymodel"
    ALTER COLUMN "myfield" TYPE text,
    ALTER COLUMN "myfield" SET NOT NULL,
    ALTER COLUMN "myfield" DROP DEFAULT;

So for the links field on my Organisation model in my organisations app it is:

ALTER TABLE "organisations_organisation"
    ALTER COLUMN "links" TYPE text,
    ALTER COLUMN "links" SET NOT NULL,
    ALTER COLUMN "links" DROP DEFAULT;

You can make an sql file with as many of these changes as you need, save it to something like sql/change-json-fields-to-text.sql and then run:

$ ./manage.py dbshell
psql (9.3.5)
SSL connection (...)
Type "help" for help.

dbname=# \i sql/change-json-fields-to-text.sql

Where \i is the psql prompt command to read a file and execute the statements inside it.

If anyone more experienced comes along, please point out anything that is wrong. The sql comes from a verbose output of a South migration which represents the change I was after. I think that postgresql knows how to cast data from json to text so it is not necessary to inform it how to do that.

Edit: I thought I has something wrong as some of the new text fields seem to have json data with an extra level of quoting, but as this has happened to only some and not all the rows, I assume there is something wrong with my data which doesn't surprise me in this instance.

Xowap commented

If I might, the jsonb type appeared in PostgreSQL 9.4, and won't be causing any issue with DISTINCT anymore. Having an actual JSON support from the DB is useful in my application, not using the native PostgreSQL type is not an option, so I'd suggest to get native PostgreSQL type support merged back :)

@Xowap I'd be open to taking another look at this now that it looks like some of the support in django has been improved. Plan is to schedule some work on this in the next couple of weeks and this is something I can look at.

Any updates on getting support for using the native jsonb?

Hi @bradjasper any news on jsonb support?

g-as commented

It might be a problem if you cannot easily upgrade Django, but jsonb is natively supported by 1.9

https://docs.djangoproject.com/en/1.9/ref/contrib/postgres/fields/#jsonfield