data-config.xml and MySQL - only the "id" column gets loaded

I'm running Solr 5.0.0 on Windows Server 2012 and would like to load all the data from my MySQL table into Solr.

My data-config.xml looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<!--# define data source -->
<dataConfig>
<dataSource type="JdbcDataSource" 
        driver="com.mysql.jdbc.Driver"
        url="jdbc:mysql://localhost:3306/database" 
        user="root" 
        password="root"/>
<document>
<entity name="my_table"  
pk="id"
query="SELECT ID, LASTNAME FROM my_table limit 2">
 <field column="ID" name="id" type="string" indexed="true" stored="true" required="true" />
 <field column="LASTNAME" name="lastname" type="string" indexed="true" stored="true"/>
</entity>
</document>
</dataConfig>

When I run the data import, I get this response:

Indexing completed. Added/Updated: 2 documents. Deleted 0 documents    
Requests: 1, Fetched: 2, Skipped: 0, Processed: 2 

And the raw debug response:

{
  "responseHeader": {
    "status": 0,
    "QTime": 280
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "data-config.xml"
    ]
  ],
  "command": "full-import",
  "mode": "debug",
  "documents": [
    {
      "id": [
        1983
      ],
      "_version_": [
        1497798459776827400
      ]
    },
    {
      "id": [
        1984
      ],
      "_version_": [
        1497798459776827400
      ]
    }
  ],
  "verbose-output": [
    "entity:my_table",
    [
      "document#1",
      [
        "query",
        "SELECT ID,LASTNAME FROM my_table limit 2",
        "time-taken",
        "0:0:0.8",
        null,
        "----------- row #1-------------",
        "LASTNAME",
        "Gates",
        "ID",
        1983,
        null,
        "---------------------------------------------"
      ],
      "document#2",
      [
        null,
        "----------- row #1-------------",
        "LASTNAME",
        "Doe",
        "ID",
        1984,
        null,
        "---------------------------------------------"
      ],
      "document#3",
      []
    ]
  ],
  "status": "idle",
  "importResponse": "",
  "statusMessages": {
    "Total Requests made to DataSource": "1",
    "Total Rows Fetched": "2",
    "Total Documents Skipped": "0",
    "Full Dump Started": "2015-04-07 15:05:22",
    "": "Indexing completed. Added/Updated: 2 documents. Deleted 0 documents.",
    "Committed": "2015-04-07 15:05:22",
    "Optimized": "2015-04-07 15:05:22",
    "Total Documents Processed": "2",
    "Time taken": "0:0:0.270"
  }
}

And finally, when I query Solr with

http://localhost:8983/solr/test/query?q=*:*

I get this response:

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"*:*"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"1983",
        "_version_":1497798459776827392},
      {
        "id":"1984",
        "_version_":1497798459776827393}]
  }}

I would like to see the lastname column too. Why can't I?

Conjugal answered 7/4, 2015 at 13:12 Comment(4)
Post your schema.xml. - Interviewer
It looks like you're trying to define your fields in data-config.xml the way you normally would in schema.xml. I would have to experiment, but I don't think that is actually possible. The <field> element in data-config.xml is for mapping a returned SQL column to the expected Solr field name. Are you defining a lastname field in schema.xml? The data importer will usually silently drop fields returned from your queries if they aren't mapped to defined fields. - Autonomous
Thanks for the answers! So: 1. I changed data-config.xml to: <entity name="jn_person" pk="id" query="SELECT ID, lastname FROM jn_person limit 2"> <field column="ID" name="id" /> <field column="lastname" name="lastname"/> </entity> 2. And added this line: <!-- New field --> <field name="lastname" type="string" indexed="true" stored="true" multiValued="true" /> My schema.xml looks like this: pastebin.com/xLgsv31h - Conjugal
And I've got a warning in the logs: ManagedIndexSchemaFactory The schema has been upgraded to managed, but the non-managed schema schema.xml is still loadable. PLEASE REMOVE THIS FILE. In my opinion there's no relation between my problem and this warning, but I'm not absolutely sure of that. - Conjugal

That warning in the logs is actually the real issue.

If you look in your solrconfig.xml file, you will find this section:

<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>

This means that your schema.xml file is being ignored. Instead, the managed-schema file in the same folder is being used.

There are a couple of ways to solve this. You can comment out the managed schema section and replace it with:

<schemaFactory class="ClassicIndexSchemaFactory"/>
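Put together, the relevant part of solrconfig.xml might look roughly like the sketch below. It assumes the section shown earlier is the only schemaFactory element in the file, and the change only takes effect after the core is reloaded or Solr is restarted.

```xml
<!-- Managed schema disabled so that schema.xml is read again -->
<!--
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
-->

<!-- Classic factory: the schema is loaded from schema.xml and edited by hand -->
<schemaFactory class="ClassicIndexSchemaFactory"/>
```

If the config also relies on a schemaless update chain that adds unknown fields automatically, that chain expects a mutable managed schema and would probably need to be disabled as well.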

Another way is to delete the managed-schema file. Solr will then read the schema.xml file on restart and generate a new managed-schema. If that works, you should see your fields at the bottom of that file.
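As a rough guide, the two pieces that have to line up are sketched below, using the table and column names from the question. The attribute choices on the schema fields are assumptions, and the "string" field type must itself be defined in the schema.

```xml
<!-- schema.xml (or the regenerated managed-schema): this is where fields are defined -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="lastname" type="string" indexed="true" stored="true"/>

<!-- data-config.xml: only maps SQL columns to the Solr fields defined above -->
<entity name="my_table" pk="id"
        query="SELECT ID, LASTNAME FROM my_table limit 2">
  <field column="ID" name="id"/>
  <field column="LASTNAME" name="lastname"/>
</entity>
```

Once the schema in use actually contains a lastname field, re-running the full import should store that column alongside id.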

For more information please see:

https://cwiki.apache.org/confluence/display/solr/Managed+Schema+Definition+in+SolrConfig

Slenderize answered 17/4, 2015 at 14:39 Comment(0)
