Logstash: Handling of large messages

I'm trying to parse a large message with Logstash using a file input, a json filter, and an elasticsearch output. 99% of the time this works fine, but when one of my log messages is too large, I get JSON parse errors, as the initial message is broken up into two partial invalid JSON streams. The size of such messages is about 40,000+ characters long. I've looked to see if there is any information on the size of the buffer, or some max length that I should try to stay under, but haven't had any luck. The only answers I found related to the udp input, and being able to change the buffer size.

Does Logstash has a limit size for each event-message? https://github.com/elastic/logstash/issues/1505

This could also be similar to this question, but there were never any replies or suggestions: Logstash Json filter behaving unexpectedly for large nested JSONs

As a workaround, I wanted to split my message up into multiple messages, but I'm unable to do this, as I need all the information to be in the same record in Elasticsearch. I don't believe there is a way to call the Update API from logstash. Additionally, most of the data is in an array, so while I can update an Elasticsearch record's array using a script (Elasticsearch upserting and appending to array), I can't do that from Logstash.

The data records look something like this:

{ "variable1":"value1", 
 ......, 
 "variable30": "value30", 
 "attachements": [ {5500 charcters of JSON}, 
                   {5500 charcters of JSON}, 
                   {5500 charcters of JSON}.. 
                   ...
                   {8th dictionary of JSON}]
 }

Does anyone know of a way to have Logstash process these large JSON messages, or a way that I can split them up and have them end up in the same Elasticsearch record (using Logstash)?

Any help is appreciated, and I'm happy to add any information needed!

Recommended topics

Hot tags