Is there any way in Elasticsearch to get results as a CSV file via the curl API?

9

62

I am using Elasticsearch and need to get search results back as a CSV file. Is there a curl URL or a plugin to achieve this?

Idiographic asked 19/9, 2013 at 10:46 Comment(1)
For people also running Kibana, you can use it to export your index: #34792646 – Selfoperating
84

I've done just this using cURL and jq ("like sed, but for JSON"). For example, you can do the following to get CSV output for the top 20 values of a given facet:

$ curl -X GET 'http://localhost:9200/myindex/item/_search?from=0&size=0' -d '
    {"from": 0,
    "size": 0,
    "facets": {
      "sourceResource.subject.name": {
        "global": true,
        "terms": {
          "order": "count",
          "size": 20,
          "all_terms": true,
          "field": "sourceResource.subject.name.not_analyzed"
        }
      }
    },
    "sort": [
      {
        "_score": "desc"
      }
    ],
    "query": {
      "filtered": {
        "query": {
          "match_all": {}
        }
      }
    }
  }' | jq -r '.facets["sourceResource.subject.name"].terms[] | [.term, .count] | @csv'

"United States",33755
"Charities--Massachusetts",8304
"Almshouses--Massachusetts--Tewksbury",8304
"Shields",4232
"Coat of arms",4214
"Springfield College",3422
"Men",3136
"Trees",3086
"Session Laws--Massachusetts",2668
"Baseball players",2543
"Animals",2527
"Books",2119
"Women",2004
"Landscape",1940
"Floral",1821
"Architecture, Domestic--Lowell (Mass)--History",1785
"Parks",1745
"Buildings",1730
"Houses",1611
"Snow",1579
Thermogenesis answered 8/4, 2014 at 18:53 Comment(0)
21

I've used Python successfully, and the scripting approach is intuitive and concise. The ES client for Python makes life easy. First grab the latest Elasticsearch client for Python here:
http://www.elasticsearch.org/blog/unleash-the-clients-ruby-python-php-perl/#python

Then your Python script can include calls like:

import elasticsearch
import unicodedata
import csv

es = elasticsearch.Elasticsearch(["10.1.1.1:9200"])
# this returns up to 500 rows, adjust to your needs
res = es.search(index="YourIndexName", body={"query": {"match": {"title": "elasticsearch"}}}, size=500)
sample = res['hits']['hits']

# then open a csv file, and loop through the results, writing to the csv
with open('outputfile.tsv', 'wb') as csvfile:   
    filewriter = csv.writer(csvfile, delimiter='\t',  # we use TAB delimited, to handle cases where freeform text may have a comma
                        quotechar='|', quoting=csv.QUOTE_MINIMAL)
    # create column header row
    filewriter.writerow(["column1", "column2", "column3"])    #change the column labels here
    for hit in sample: 
        # fill columns 1, 2, 3 with your data 
        col1 = hit["some"]["deeply"]["nested"]["field"].decode('utf-8') #replace these nested key names with your own
        col1 = col1.replace('\n', ' ')
        # col2 = , col3 = , etc...
        filewriter.writerow([col1,col2,col3])

You may want to wrap the nested hit["_source"][...] lookups in try / except error handling, since documents are unstructured and may not contain a given field from time to time (it depends on your index).
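
For instance, a small helper along these lines (a sketch; the function name and key names are placeholders) avoids scattering try / except blocks through the loop:

def get_nested(hit, *keys):
    # walk the hit's _source defensively; return '' if any level is missing
    value = hit.get("_source", {})
    for key in keys:
        if not isinstance(value, dict) or key not in value:
            return ""
        value = value[key]
    return value

col1 = get_nested(hit, "some", "deeply", "nested", "field")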

I have a complete Python sample script using the latest ES python client available here:

https://github.com/jeffsteinmetz/pyes2csv

Fiddlefaddle answered 17/1, 2014 at 15:30 Comment(1)
Note that link-only answers are discouraged; SO answers should be the end-point of a search for a solution (vs. yet another stopover of references, which tend to get stale over time). Please consider adding a stand-alone synopsis here, keeping the link as a reference. – Endothermic
9

You can use the elasticsearch-head plugin. Once you have the plugin installed, it is available at http://localhost:9200/_plugin/head/. Navigate to the 'Structured Query' tab, provide the query details, and select the 'csv' format from the 'Output Results' dropdown.

Therein answered 4/4, 2015 at 7:53 Comment(1)
I've noticed some odd behaviour when using this; the order of the columns seems to randomly change each time I download the CSV. Has anyone else also experienced this? – Josephson
5

I don't think there is a plugin that will give you CSV results directly from the search engine, so you will have to query Elasticsearch to retrieve the results and then write them to a CSV file.

Command line

If you're on a Unix-like OS, then you might be able to make some headway with es2unix, which will give you search results back in raw text format on the command line, and so should be scriptable.

You could then dump those results to a text file, or pipe them to awk or similar to format them as CSV. There is a -o flag available, but it only gives 'raw' format at the moment.
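
For example, assuming es2unix's `es search` output is whitespace-delimited (the `es search` arguments below are placeholders; check the es2unix docs for your setup), the awk step could look like:

$ es search localhost:9200 myindex | awk 'BEGIN { OFS = "," } { print $1, $2, $3 }' > results.csv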

Java

I found an example using Java - but haven't tested it.

Python

You could query Elasticsearch with something like pyes and write the result set to a file with the standard csv writer library.

Perl

With Perl, you could use Clinton Gormley's GIST linked by Rakesh - https://gist.github.com/clintongormley/2049562

Maidenhood answered 19/9, 2013 at 17:59 Comment(0)
5

If you are using Kibana (app/discover in general), you can build your query in the UI, then save it and use Share -> CSV Reports. This creates a CSV with a line for each record, and the columns are comma-separated.

Oesophagus answered 5/4, 2022 at 9:43 Comment(0)
4

Shameless plug. I wrote estab - a command-line program to export Elasticsearch documents to tab-separated values.

Example:

$ export MYINDEX=localhost:9200/test/default/
$ curl -XPOST $MYINDEX -d '{"name": "Tim", "color": {"fav": "red"}}'
$ curl -XPOST $MYINDEX -d '{"name": "Alice", "color": {"fav": "yellow"}}'
$ curl -XPOST $MYINDEX -d '{"name": "Brian", "color": {"fav": "green"}}'

$ estab -indices "test" -f "name color.fav"
Brian   green
Tim     red
Alice   yellow

estab can handle exports from multiple indices, custom queries, missing values, lists of values, and nested fields, and it's reasonably fast.

Pantalets answered 22/8, 2014 at 9:12 Comment(1)
Does this work with JSON queries? I have some complex queries in JSON that I've tried to pass through estab, but I just get [406] Content-Type header [application/x-www-form-urlencoded] is not supported. Specifically, my JSON looks like { "aggs": {...}, "query": {...} }, but it looks like this is not supported. – Ida
2

I have been using stash-query (https://github.com/robbydyer/stash-query) for this.

I find it quite convenient and it works well, though I struggle with the install every time I redo it (this is due to me not being very fluent with gems and Ruby).

On Ubuntu 16.04 though, what seemed to work was:

apt install ruby
sudo apt-get install libcurl3 libcurl3-gnutls libcurl4-openssl-dev
gem install stash-query

and then you should be good to go. What the three commands do:

  1. Installs Ruby
  2. Installs the curl dependencies for Ruby, because stash-query works via the Elasticsearch REST API
  3. Installs stash-query

This blog post describes how to build it as well:

https://robbydyer.wordpress.com/2014/08/25/exporting-from-kibana/

Sanctum answered 21/10, 2016 at 14:20 Comment(1)
I fixed a few bugs and added SSL support in github.com/antonmos/stash-query/tree/misc; PR pending. – Dabney
1

You can use elasticsearch2csv, a small and effective Python 3 script that uses the Elasticsearch scroll API and handles big query responses.
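
Not tied to that script specifically, but a minimal scroll-based export using the official Python client's helpers.scan (the index name and column names here are placeholders) could look like this:

import csv

from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan

es = Elasticsearch(["localhost:9200"])

with open("export.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["field1", "field2"])  # placeholder column headers
    # scan() pages through the full result set via the scroll API
    for hit in scan(es, index="myindex", query={"query": {"match_all": {}}}):
        src = hit["_source"]
        writer.writerow([src.get("field1", ""), src.get("field2", "")])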

Gynophore answered 1/6, 2017 at 15:30 Comment(0)
-4

You can use this GIST. It's simple, it's in Perl, and you can get some help from it.

Please download it and see the usage on GitHub. Here is the link: GIST GitHub

Or if you want Java, then go for elasticsearch-river-csv:

elasticsearch-river-csv

Immunity answered 19/9, 2013 at 12:9 Comment(1)
The elasticsearch-river-csv is only going to work for indexing and not for output... unless I've missed something? – Maidenhood
