When indexing, using distinct() on fields except for model's primary key field, gives an error
class CountryIndex(indexes.SearchIndex, indexes.Indexable):
text = indexes.CharField(document=True, use_template=True)
def get_model(self):
return Country
def index_queryset(self, using=None):
return self.get_model().objects.order_by('code').distinct('code')
Primary key field on Country
model is default pk
/id
.
When trying to rebuild index with setup like so i get an error saying:
File "/home/post/.virtualenvs/proj/local/lib/python2.7/site-packages/django/db/backends
/util.py", line 53, in execute
return self.cursor.execute(sql, params)
ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: SELECT DISTINCT ON ("app_country"."code") "app_country"."id"...
When investigating, I placed a print statement at line 53 in django/db/backends/util.py
to print sql
variable to see what SQL is generated. And here is what i saw when launching rebuild_index.
All documents removed.
SELECT COUNT(*) FROM (SELECT DISTINCT ON ("app_country"."code") "app_country"."id",
"app_country"."title", "app_country"."title_en", "app_country"."title_ru", "app_country"."title_es",
"app_country"."title_de", "app_country"."code" FROM "app_country") subquery
Indexing 242 Countries
SELECT DISTINCT ON ("app_country"."code") "app_country"."id", "app_country"."title",
"app_country"."title_en", "app_country"."title_ru", "app_country"."title_es", "app_country"."title_de",
"app_country"."code" FROM "app_country" ORDER BY "app_country"."id" ASC LIMIT 242
As you can see, the second SQL snipped tries to get distinct elements by country.code, but orderring from index_queryset
method is overriden by ordering by country.id
, thus raising a Postgress error.
Looking through the django-haystack code i found a place where the ordering by model's pk field is happening: https://github.com/toastdriven/django-haystack/blob/92559b1f6c03229d6166112145db16bc4d60fb71/haystack/indexes.py#L183
order_by in index_queryset is overridden by hardcoded order in the code
When indesing, using distinct() on fields except for model's primary key field, gives an error
Nov 20, 2014
changed the title
When indesing, using distinct() on fields except for model's primary key field, gives an error
When indexing, using distinct() on fields except for model's primary key field, gives an error
Nov 20, 2014
I have to make an horrible hack to deal with this issue.
ids = list(queryset_with_distinct.values_list('pk', flat=True))
return self.get_model().objects.filter(id__in=ids)
This will work because my queryset is small but god help me when this scale.
Couldn't you just use Django's native subquery support?
pks = queryset_with_distinct.values_list('pk', flat=True)
return self.get_model().objects.filter(id__in=pks)