Every field in Solr index stored as array - Python django-haystack

  • [x] Tested with the latest Haystack release
  • [ ] Tested with the current Haystack master branch

Expected behaviour

A CharField should be stored as a single string rather than as an array containing one string.

Actual behaviour

Every field stored in the Solr index (apart from the id field), including django_ct and django_id, is an array.

Steps to reproduce the behaviour

  1. Create an index with a CharField
  2. Index some data
  3. Check the Solr index
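
Step 3 can be done by fetching one document from Solr's `/select` handler and listing the fields whose value came back as a list. The following is only a minimal sketch: the core URL is an assumption (a core named `haystack` on `localhost:8983`), and `array_valued_fields` and `fetch_first_doc` are hypothetical helper names, not part of Haystack.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed location of the Solr core; adjust host, port and core name.
SOLR_CORE_URL = "http://localhost:8983/solr/haystack"


def array_valued_fields(doc):
    """Return the sorted names of fields whose value is a list."""
    return sorted(name for name, value in doc.items() if isinstance(value, list))


def fetch_first_doc(core_url=SOLR_CORE_URL):
    """Fetch a single document from the core's /select handler as a dict."""
    query = urlencode({"q": "*:*", "rows": 1, "wt": "json"})
    with urlopen(f"{core_url}/select?{query}") as response:
        payload = json.load(response)
    docs = payload["response"]["docs"]
    return docs[0] if docs else None
```

With a running core, `array_valued_fields(fetch_first_doc())` should list only the fields that are genuinely multi-valued (for example a `MultiValueField`); if single-valued fields such as `django_ct` show up, the core is not using the expected schema.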

Configuration

  • Operating system version: mcr.microsoft.com/vscode/devcontainers/python:0-3
  • Search engine version: Solr 6.6.6
  • Python version: 3.6.12
  • Django version: 3.1.8
  • Haystack version: 3
Asked Oct 12 '21 15:10
sennierer

3 Answers:

Please submit a failing test case since this is not the default behavior.

Answered Apr 13 '21 at 14:30
acdha

Using this index definition:

```python
from django.conf import settings
from haystack import indexes

# Person, classes and get_mitgliedschaft_from_relation come from the
# project's own modules; their imports are not shown in the original post.


class PersonIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    name = indexes.CharField()
    academy_member = indexes.BooleanField(default=False)
    birth_date = indexes.DateField(model_attr="start_date", null=True, faceted=True)
    death_date = indexes.DateField(model_attr="end_date", null=True, faceted=True)
    place_of_birth = indexes.CharField(null=True, faceted=True)
    place_of_death = indexes.CharField(null=True, faceted=True)
    gender = indexes.CharField(null=True, model_attr="gender", faceted=True)
    profession = indexes.MultiValueField(null=True, faceted=True)
    akademiemitgliedschaft = indexes.MultiValueField(null=True, faceted=True)

    def get_model(self):
        return Person

    def prepare_akademiemitgliedschaft(self, obj):
        # One string per membership relation: type, institution, start and end date.
        res = obj.personinstitution_set.filter(
            related_institution_id__in=[2, 3, 500],
            relation_type_id__in=classes["mitgliedschaft"][0],
        )
        res_fin = []
        for mitglied in res:
            mitgliedschaft = get_mitgliedschaft_from_relation(mitglied.relation_type)
            res_fin.append(
                f"{mitgliedschaft}__{str(mitglied.related_institution)}__{mitglied.start_date}__{mitglied.end_date}"
            )
        return res_fin

    def prepare_profession(self, obj):
        return [x.label for x in obj.profession.all()]

    def prepare_place_of_birth(self, obj):
        # Only index the place when exactly one birth relation exists.
        rel = obj.personplace_set.filter(
            relation_type_id__in=getattr(settings, "BIRTH_REL_NAME", [])
        )
        if rel.count() == 1:
            return rel[0].related_place.name
        return None

    def prepare_place_of_death(self, obj):
        # Only index the place when exactly one death relation exists.
        rel = obj.personplace_set.filter(
            relation_type_id__in=getattr(settings, "DEATH_REL_NAME", [])
        )
        if rel.count() == 1:
            return rel[0].related_place.name
        return None

    def prepare_academy_member(self, obj):
        # True if the person has at least one membership relation.
        return obj.personinstitution_set.filter(
            relation_type_id__in=classes["mitgliedschaft"][0],
            related_institution_id__in=[2, 3, 500],
        ).exists()

    def prepare_name(self, obj):
        return str(obj)
```

That actually produces a fine schema.xml with the correct field definitions.

However, the documents in the index look something along the lines of:

```json
{
  "id": "apis_entities.person.502",
  "django_ct": ["apis_entities.person"],
  "django_id": [502],
  "text": ["{'first_name': 'Rainer', 'name': 'Abart', 'alternative_names': [], 'texts': []}"],
  "name": ["Abart, Rainer"],
  "academy_member": [true],
  "birth_date": ["1963-10-30T00:00:00Z"],
  "birth_date_exact": ["1963-10-30T00:00:00Z"],
  "place_of_birth": ["Mödling"],
  "place_of_birth_exact": ["Mödling"],
  "gender": ["male"],
  "gender_exact": ["male"],
  "akademiemitgliedschaft": ["k. M. I.__MATHEMATISCH-NATURWISSENSCHAFTLICHE KLASSE__2013-04-19__None"],
  "akademiemitgliedschaft_exact": ["k. M. I.__MATHEMATISCH-NATURWISSENSCHAFTLICHE KLASSE__2013-04-19__None"],
  "_version_": 1696936035385081856
}
```

where even the django_id is an array rather than an int.

Answered Apr 13 '21 at 14:49
sennierer

I am sorry, I should have looked a bit closer. When debugging the problem I found that I had created a schemaless Solr core. Now it's working as intended. Closing the issue.
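
For anyone seeing the same symptom: a schemaless (data-driven) core guesses field types from the incoming documents instead of using the schema.xml generated by Haystack, and in the data-driven configset shipped with Solr 6 the guessed types are, as far as I know, multi-valued, which would explain the arrays. One rough way to check is to look in the core's solrconfig.xml for the update processor that adds unknown fields to the schema. The sketch below is a heuristic, not an official check; `looks_schemaless` is a hypothetical helper and both XML snippets are trimmed, illustrative samples rather than real config files.

```python
import xml.etree.ElementTree as ET

# Update processor used by Solr's schemaless mode to add unknown fields.
SCHEMALESS_PROCESSOR = "AddSchemaFieldsUpdateProcessorFactory"


def looks_schemaless(solrconfig_xml):
    """Return True if the config text declares the add-unknown-fields processor."""
    root = ET.fromstring(solrconfig_xml)
    return any(
        proc.get("class", "").endswith(SCHEMALESS_PROCESSOR)
        for proc in root.iter("processor")
    )


# Trimmed, illustrative samples (assumptions, not complete configs).
SAMPLE_SCHEMALESS = """
<config>
  <updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
    <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
      <str name="defaultFieldType">strings</str>
    </processor>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>
</config>
"""

SAMPLE_CLASSIC = """
<config>
  <schemaFactory class="ClassicIndexSchemaFactory"/>
</config>
"""

print(looks_schemaless(SAMPLE_SCHEMALESS))  # True
print(looks_schemaless(SAMPLE_CLASSIC))     # False
```

If the check returns True, the usual remedy is the one described above: create the core from a configset that uses the schema.xml generated for Haystack, then reindex.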

Answered Apr 13 '21 at 19:26
sennierer