Failed to add documents to Solr: Solr responded with an error (HTTP 400) (django + haystack + solr)
I currently have Solr 4.2.0 working in production (set up around 2012). I have set up a new development environment where I upgraded all packages (Django 1.8.10, PySolr 3.4.0, Haystack 2.4.1) and installed Solr 5.5.0.

In short

I have Solr running, my core/collection created with 'basic_configs' and it seems to work well, except that during indexing I get a lot of errors similar to these:

All documents removed.
Indexing 9604 contracts
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.22] unknown field 'status']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.70556] unknown field 'date_signed']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.72059] unknown field 'date_signed']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.73458] unknown field 'date_signed']

Judging by the IDs, most documents are indexed fine, but these errors recur often (the list goes on) across all tables/indexes.
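To see at a glance which fields Solr rejects, the error output can be tallied by field name. This is only a rough sketch: the helper name `count_unknown_fields` is made up for illustration, and it assumes the output of `rebuild_index` has been captured as text.

```python
import re
from collections import Counter

# Matches the field name in messages such as: unknown field 'date_signed'
UNKNOWN_FIELD = re.compile(r"unknown field '([^']+)'")


def count_unknown_fields(log_text):
    """Return a Counter mapping each rejected field name to how often it appears.

    Hypothetical helper, not part of Haystack or PySolr.
    """
    return Counter(UNKNOWN_FIELD.findall(log_text))


# The four error lines quoted above, used as sample input.
sample = """\
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.22] unknown field 'status']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.70556] unknown field 'date_signed']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.72059] unknown field 'date_signed']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.73458] unknown field 'date_signed']
"""

unknown = count_unknown_fields(sample)
```

If every field defined in the search indexes shows up here sooner or later, that suggests Solr is not using the generated schema at all, rather than a problem with individual documents.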

Eventually I followed this promising GitHub project guide, but unfortunately it did not solve the problem for me.

What I did, step by step

  1. Successfully installed Solr 5.5.0 (web interface working at localhost:8983), using this guide
  2. Created a collection called 'spng', using the following command: sudo su - solr -c '/opt/solr/bin/solr create -c spng -d basic_configs'
  3. Overwrote my solr.xml (/srv/spng/src/django-haystack/haystack/templates/search_configuration/solr.xml) with the solr.xml from the GitHub project guide mentioned earlier
  4. Just to be sure, I gave the solr.xml file 777 permissions.

My settings.py has the following entry:

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://localhost:8983/solr/spng',
        'DEFAULT_OPERATOR': 'AND',
        'INCLUDE_SPELLING': True,
    },
}
  5. I created a schema.xml (python manage.py build_solr_schema) and placed it in /var/solr/data/spng/conf/schema.xml
  6. Again, just to be sure, I also gave the schema.xml file 777 permissions.
  7. I used the curl command to reload the core: curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=spng&wt=json&indent=true'

The response was:

{
  "responseHeader":{
    "status":0,
    "QTime":300}}
  8. I also restarted uwsgi and Solr just to make sure
  9. At this point I try to run the python manage.py rebuild_index command

I end up with the following errors, as mentioned before:

All documents removed.
Indexing 9604 contracts
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.22] unknown field 'status']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.70556] unknown field 'date_signed']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.72059] unknown field 'date_signed']
Failed to add documents to Solr: Solr responded with an error (HTTP 400): [Reason: ERROR: [doc=accounting.contract.73458] unknown field 'date_signed']

Does anyone have any idea what might be wrong? The indexing works without errors on my production server, running 4.2.0. Did I miss a setting or is Solr 5.5.0 causing these errors?

Bakery answered 16/3, 2016 at 10:48 Comment(0)

Special thanks to elyograg for helping me out on Solr's IRC channel (#solr on freenode).

elyograg: if you're using the stock solrconfig.xml from basic_configs, then your schema is located in a file named "managed-schema" -- ALL example configs are using the managed schema by default as of 5.5.

elyograg: put it (schema.xml contents) into managed-schema. You could potentially change the solrconfig.xml, but life will be easier for people trying to help you if you keep the defaults.

In other words, as of version 5.5 a collection created with basic_configs keeps its schema in a file called 'managed-schema' rather than schema.xml (in my case located in /var/solr/data//conf/managed-schema).

After updating the file and reloading the core, indexing finished without errors.
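The fix can be scripted roughly as follows. This is a sketch under a few assumptions: the helper names `install_schema` and `reload_url` and the backup file name `managed-schema.bak` are made up for illustration, and the script is assumed to run as a user allowed to write to the core's conf directory.

```python
import shutil
from pathlib import Path
from urllib.parse import urlencode


def install_schema(generated_schema, conf_dir):
    """Copy the generated schema over conf_dir/managed-schema.

    An existing managed-schema is first saved as managed-schema.bak
    (hypothetical backup name) so the stock file can be restored.
    """
    conf_dir = Path(conf_dir)
    target = conf_dir / "managed-schema"
    if target.exists():
        shutil.copy2(target, conf_dir / "managed-schema.bak")
    shutil.copyfile(generated_schema, target)
    return target


def reload_url(solr_base, core):
    """Build the CoreAdmin RELOAD URL, as in the curl command from the question."""
    query = urlencode({"action": "RELOAD", "core": core, "wt": "json"})
    return f"{solr_base}/admin/cores?{query}"
```

With the paths from the question, this would be called as `install_schema("schema.xml", "/var/solr/data/spng/conf")`, after which the URL from `reload_url("http://localhost:8983/solr", "spng")` can be requested with curl or `urllib.request.urlopen` to reload the core.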

Be wary in future versions, because elyograg also noted:

elyograg: It might also be a good idea to add the .xml extension. I don't think the lack of an extension is going to be much of a deterrent to hand-editing.

So in the future it may be called managed-schema.xml

Bakery answered 16/3, 2016 at 19:15 Comment(1)
I want to clarify: only the <schema ... > ... </schema> part needs to be replaced. Additionally, I had a few more errors about fieldType [x] not being found in the schema, which required modifying solrconfig.xml; I solved them using this answer: https://mcmap.net/q/979975/-solr-error-creating-core-fieldtype-x-not-found-in-the-schema. I use Solr 6.3.0. - Mccombs

Updating the Solr index consists of 4 steps:

  1. Add valid fields in search_indexes.py

  2. Generate the schema by running:

    python manage.py build_solr_schema > schema.xml

  3. Update the index by running:

    python manage.py update_index

  4. Restart the server.

If all of the above steps complete without errors, your fields have been updated successfully.

Kweilin answered 30/1, 2017 at 12:12 Comment(0)

Check the schema file at

http://localhost:8983/solr/#/spng/files?file=schema.xml

and compare it with the schema generated by build_solr_schema to make sure Solr is using the right schema.
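The comparison can also be done mechanically. The sketch below uses only the standard library; the helper names `field_names` and `missing_fields` are made up for illustration, and both schemas are assumed to be available as XML text or bytes (for example, read from the generated schema.xml and from the file in the core's conf directory).

```python
import xml.etree.ElementTree as ET


def field_names(schema_xml):
    """Return the names of all <field> elements declared in a Solr schema.

    <dynamicField> and <copyField> elements are ignored because only
    exact <field> tags are matched.
    """
    root = ET.fromstring(schema_xml)
    return {f.get("name") for f in root.iter("field") if f.get("name")}


def missing_fields(generated_xml, live_xml):
    """Fields declared in the generated schema but absent from the one Solr uses."""
    return field_names(generated_xml) - field_names(live_xml)
```

A non-empty result containing names such as `status` or `date_signed` would indicate that Solr is reading a different schema file than the one produced by build_solr_schema.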

Extraterritorial answered 16/3, 2016 at 16:53 Comment(1)
As of version 5.5, when creating a collection with basic_configs the schema file is called managed-schema instead of schema.xml - see my answer for more info. - Bakery
