How to parse nagios status.dat file?
I'd like to parse the status.dat file for Nagios 3 and output it as XML with a Python script. The XML part is the easy one, but how do I go about parsing the file? Should I use a multi-line regex? The file may be large, since many hosts and services are monitored, so would loading the whole file into memory be wise?
I only need to extract the services that are in a critical state and the hosts they belong to.

Any help or pointers in the right direction would be highly appreciated.

Later edit: here's how the file looks:

########################################
#          NAGIOS STATUS FILE
#
# THIS FILE IS AUTOMATICALLY GENERATED
# BY NAGIOS.  DO NOT MODIFY THIS FILE!
########################################

info {
    created=1233491098
    version=2.11
    }

program {
    modified_host_attributes=0
    modified_service_attributes=0
    nagios_pid=15015
    daemon_mode=1
    program_start=1233490393
    last_command_check=0
    last_log_rotation=0
    enable_notifications=1
    active_service_checks_enabled=1
    passive_service_checks_enabled=1
    active_host_checks_enabled=1
    passive_host_checks_enabled=1
    enable_event_handlers=1
    obsess_over_services=0
    obsess_over_hosts=0
    check_service_freshness=1
    check_host_freshness=0
    enable_flap_detection=0
    enable_failure_prediction=1
    process_performance_data=0
    global_host_event_handler=
    global_service_event_handler=
    total_external_command_buffer_slots=4096
    used_external_command_buffer_slots=0
    high_external_command_buffer_slots=0
    total_check_result_buffer_slots=4096
    used_check_result_buffer_slots=0
    high_check_result_buffer_slots=2
    }

host {
    host_name=localhost
    modified_attributes=0
    check_command=check-host-alive
    event_handler=
    has_been_checked=1
    should_be_scheduled=0
    check_execution_time=0.019
    check_latency=0.000
    check_type=0
    current_state=0
    last_hard_state=0
    plugin_output=PING OK - Packet loss = 0%, RTA = 3.57 ms
    performance_data=
    last_check=1233490883
    next_check=0
    current_attempt=1
    max_attempts=10
    state_type=1
    last_state_change=1233489475
    last_hard_state_change=1233489475
    last_time_up=1233490883
    last_time_down=0
    last_time_unreachable=0
    last_notification=0
    next_notification=0
    no_more_notifications=0
    current_notification_number=0
    notifications_enabled=1
    problem_has_been_acknowledged=0
    acknowledgement_type=0
    active_checks_enabled=1
    passive_checks_enabled=1
    event_handler_enabled=1
    flap_detection_enabled=1
    failure_prediction_enabled=1
    process_performance_data=1
    obsess_over_host=1
    last_update=1233491098
    is_flapping=0
    percent_state_change=0.00
    scheduled_downtime_depth=0
    }

service {
    host_name=gateway
    service_description=PING
    modified_attributes=0
    check_command=check_ping!100.0,20%!500.0,60%
    event_handler=
    has_been_checked=1
    should_be_scheduled=1
    check_execution_time=4.017
    check_latency=0.210
    check_type=0
    current_state=0
    last_hard_state=0
    current_attempt=1
    max_attempts=4
    state_type=1
    last_state_change=1233489432
    last_hard_state_change=1233489432
    last_time_ok=1233491078
    last_time_warning=0
    last_time_unknown=0
    last_time_critical=0
    plugin_output=PING OK - Packet loss = 0%, RTA = 2.98 ms
    performance_data=
    last_check=1233491078
    next_check=1233491378
    current_notification_number=0
    last_notification=0
    next_notification=0
    no_more_notifications=0
    notifications_enabled=1
    active_checks_enabled=1
    passive_checks_enabled=1
    event_handler_enabled=1
    problem_has_been_acknowledged=0
    acknowledgement_type=0
    flap_detection_enabled=1
    failure_prediction_enabled=1
    process_performance_data=1
    obsess_over_service=1
    last_update=1233491098
    is_flapping=0
    percent_state_change=0.00
    scheduled_downtime_depth=0
    }

It can have any number of hosts and a host can have any number of services.

Dextrin answered 2/2, 2009 at 13:11 Comment(0)
C
5

Nagiosity does exactly what you want:

http://code.google.com/p/nagiosity/

Capet answered 23/2, 2009 at 4:11 Comment(1)
Per their docs, Nagiosity "takes the nagios realtime status data and outputs as XML." It has not been updated since 2011, but it's only a small Python script. - Billington
P
9

Pfft, get yerself mk_livestatus. http://mathias-kettner.de/checkmk_livestatus.html

Pickaxe answered 3/4, 2010 at 2:46 Comment(1)
Per their docs, MK_Livestatus avoids the duplication and overhead of using a DB. "Livestatus makes use of the Nagios Event Broker API and loads a binary module into your Nagios process. But unlike NDO, Livestatus does not actively write out data. Instead, it opens a socket by which data can be retrieved on demand." [English tweak-edited] It seems to be getting regular updates. - Billington
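For anyone wondering what "retrieved on demand" looks like from Python, here is a rough sketch of asking Livestatus for critical services over its Unix socket. The socket path is an assumption (it depends on how the broker module was configured), and the helper names build_query, parse_response and query_livestatus are made up for this example. It assumes the usual Nagios convention that service state 2 means CRITICAL.

```python
import json
import socket

# Assumption: placeholder path; use whatever path your Livestatus module was given.
LIVESTATUS_SOCKET = "/var/lib/nagios/rw/live"


def build_query():
    """Build a Livestatus query for services in state 2 (CRITICAL)."""
    return (
        "GET services\n"
        "Columns: host_name description plugin_output\n"
        "Filter: state = 2\n"
        "OutputFormat: json\n"
        "\n"  # a blank line terminates the query
    )


def parse_response(raw):
    """Turn the JSON list-of-lists reply into a list of dicts."""
    rows = json.loads(raw) if raw.strip() else []
    return [{"host": r[0], "service": r[1], "output": r[2]} for r in rows]


def query_livestatus(path=LIVESTATUS_SOCKET):
    """Send the query over the Unix socket and return the parsed rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(build_query().encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_response(b"".join(chunks).decode("utf-8"))
```

Calling query_livestatus() on the Nagios host should then return one dict per critical service, without touching status.dat at all.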
M
3

Having shamelessly stolen from the above examples, here's a version built for Python 2.4 that returns a dict containing lists of Nagios sections.

import re

def parseConf(source):
    conf = {}
    cur = None  # block currently being read, as [identifier, attributes]
    patID = re.compile(r"(?:\s*define)?\s*(\w+)\s+{")
    patAttr = re.compile(r"\s*(\w+)(?:=|\s+)(.*)")
    patEndID = re.compile(r"\s*}")
    for line in source.splitlines():
        line = line.strip()
        if len(line) == 0 or line[0] == '#':
            continue  # skip blank lines and comments
        matchID = patID.match(line)
        matchAttr = patAttr.match(line)
        matchEndID = patEndID.match(line)
        if matchID:
            identifier = matchID.group(1)
            cur = [identifier, {}]
        elif matchAttr and cur:
            attribute = matchAttr.group(1)
            value = matchAttr.group(2).strip()
            cur[1][attribute] = value
        elif matchEndID and cur:
            conf.setdefault(cur[0], []).append(cur[1])
            cur = None  # reset so a stray "}" cannot raise NameError
    return conf

To get the names of all hosts whose contact groups begin with 'devops':

nagcfg = parseConf(string_containing_complete_config)
hostlist = [host['host_name'] for host in nagcfg['host']
            if host.get('contact_groups', '').startswith('devops')]
Moolah answered 10/2, 2016 at 9:57 Comment(0)
P
2

I don't know Nagios or its config file, but the structure seems pretty simple:

# comment
identifier {
  attribute=
  attribute=value
}

which can simply be translated to

<identifier>
    <attribute name="attribute-name">attribute-value</attribute>
</identifier>

all contained inside a root-level <nagios> tag.

I don't see line breaks in the values. Does Nagios have multi-line values?

You need to take care of equal signs within attribute values, so make the attribute-name part of your regex non-greedy (or split on the first equal sign only).
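As a rough illustration of the structure described above, here is a sketch that reads the file one line at a time, so the whole thing never has to sit in memory, and splits each attribute on the first equal sign only. The names iter_blocks and critical_services are invented for this example. It assumes the usual Nagios convention that current_state=2 means CRITICAL, and it accepts both "service" (as in the sample in the question) and "servicestatus" as block names, since the name appears to differ between Nagios versions.

```python
def iter_blocks(lines):
    """Yield (identifier, attributes) pairs, one block at a time."""
    name, attrs = None, None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if name is None:
            if line.endswith("{"):
                name, attrs = line[:-1].strip(), {}
        elif line == "}":
            yield name, attrs
            name, attrs = None, None
        else:
            # partition splits on the FIRST "=", so "=" inside values survives
            key, _, value = line.partition("=")
            attrs[key.strip()] = value.strip()


def critical_services(lines):
    """Yield (host_name, service_description) for services in state 2."""
    for name, attrs in iter_blocks(lines):
        if name in ("service", "servicestatus") and attrs.get("current_state") == "2":
            yield attrs.get("host_name"), attrs.get("service_description")


SAMPLE = """\
# comment
host {
    host_name=gateway
    current_state=1
    }

service {
    host_name=gateway
    service_description=PING
    current_state=2
    plugin_output=PING CRITICAL - Packet loss = 100%
    }

service {
    host_name=gateway
    service_description=SSH
    current_state=0
    }
"""

print(list(critical_services(SAMPLE.splitlines())))  # [('gateway', 'PING')]
```

For the real file you would pass an open file object instead of SAMPLE.splitlines(), since iterating over a file yields one line at a time.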

Panek answered 2/2, 2009 at 15:12 Comment(0)
A
2

You can do something like this:

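import re

def parseConf(filename):
    conf = []
    cur = None  # block currently being read, as [identifier, attributes]
    with open(filename, 'r') as f:
        for i in f:  # iterate lazily instead of loading every line at once
            if i.lstrip().startswith('#'): continue
            matchID = re.search(r"^\s*(\w+)\s*{\s*$", i)
            # capture everything after the first "=" so values such as
            # "PING OK - Packet loss = 0%" are not truncated
            matchAttr = re.search(r"^\s*(\w+)=(.*)$", i)
            matchEndID = re.search(r"^\s*}", i)
            if matchID:
                identifier = matchID.group(1)
                cur = [identifier, {}]
            elif matchAttr and cur:
                attribute = matchAttr.group(1)
                value = matchAttr.group(2).strip()
                cur[1][attribute] = value
            elif matchEndID and cur:
                conf.append(cur)
                cur = None
    return conf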

from xml.sax.saxutils import escape

def conf2xml(filename):
    conf = parseConf(filename)
    xml = ''
    for ID in conf:
        xml += '<%s>\n' % ID[0]
        for attr in ID[1]:
            # escape &, < and > so plugin output cannot break the markup
            xml += '\t<attribute name="%s">%s</attribute>\n' % \
                    (attr, escape(ID[1][attr]))
        xml += '</%s>\n' % ID[0]
    return xml

Then try:

print(conf2xml('conf.dat'))
Apothecium answered 2/2, 2009 at 16:45 Comment(0)
S
2

If you slightly tweak Andrea's solution, you can use that code to parse both status.dat and objects.cache:

import re

def parseConf(source):
    conf = []
    cur = None  # block currently being read, as [identifier, attributes]
    for line in source.splitlines():
        line = line.strip()
        matchID = re.match(r"(?:\s*define)?\s*(\w+)\s+{", line)
        matchAttr = re.match(r"\s*(\w+)(?:=|\s+)(.*)", line)
        matchEndID = re.match(r"\s*}", line)
        if len(line) == 0 or line[0] == '#':
            pass
        elif matchID:
            identifier = matchID.group(1)
            cur = [identifier, {}]
        elif matchAttr and cur:
            attribute = matchAttr.group(1)
            value = matchAttr.group(2).strip()
            cur[1][attribute] = value
        elif matchEndID and cur:
            conf.append(cur)
            cur = None  # reset so a stray "}" cannot raise NameError
    return conf

It is a little puzzling why nagios chose to use two different formats for these files, but once you've parsed them both into some usable python objects you can do quite a bit of magic through the external command file.

If anybody has a solution for getting this into a real XML DOM, that'd be awesome.
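One possible way to get a real XML tree, sketched with the standard library's xml.etree.ElementTree. It assumes the list-of-[identifier, attributes] shape that parseConf returns above. The function name conf_to_xml and the sample data are invented for this example, and the nagios root tag plus attribute elements follow the layout suggested elsewhere in this thread.

```python
import xml.etree.ElementTree as ET


def conf_to_xml(conf):
    """Build an ElementTree from a list of [identifier, attributes] pairs."""
    root = ET.Element("nagios")
    for identifier, attrs in conf:
        # the block name becomes the tag; it must be a valid XML name
        block = ET.SubElement(root, identifier)
        for name, value in attrs.items():
            child = ET.SubElement(block, "attribute", name=name)
            child.text = value  # ElementTree escapes &, < and > on output
    return root


# Hypothetical data in the shape parseConf returns
conf = [
    ["service", {"host_name": "gateway",
                 "service_description": "PING",
                 "plugin_output": "PING OK - RTA < 3 ms"}],
]

root = conf_to_xml(conf)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Once the data is in a tree you can query it with root.find() or root.iter() instead of string matching, and serialization takes care of escaping.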

Submerged answered 4/6, 2010 at 21:8 Comment(1)
Great, thanks! Note: you can cut the runtime roughly in half by compiling the regexes at the start of parseConf. - Islamism
M
2

Over the last several months I've written and released a tool that parses the Nagios status.dat and objects.cache and builds a model that allows for some really useful manipulation of Nagios data. We use it to drive an internal operations dashboard that is a simplified 'mini' Nagios. It's under continual development, and I've neglected testing and documentation, but the code isn't too crazy and, I feel, fairly easy to follow.

Let me know what you think... https://github.com/zebpalmer/NagParser

Masked answered 16/2, 2012 at 19:55 Comment(1)
I appreciate your code at NagParser / nagparser / Services / nicetime.py! I also found a Perl one-liner that might translate the dates correctly, geekpeek.net/nagios-log-convert-timestamp . On my server it finds many 1969 dates though :). - Billington
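On the 1969 dates: the timestamps in status.dat are Unix epoch seconds, and fields such as last_time_down=0 in the sample presumably mean "never". Epoch 0 is 1970-01-01 00:00 UTC, which shows up as 1969-12-31 in any timezone west of UTC. A small sketch that treats 0 as "no date" (the name nagios_time is made up for this example):

```python
from datetime import datetime, timezone


def nagios_time(value):
    """Convert an epoch-seconds string to an aware UTC datetime; 0 gives None."""
    seconds = int(value)
    if seconds == 0:
        return None  # assumption: 0 means the event never happened
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(nagios_time("1233491098"))  # 2009-02-01 12:24:58+00:00
print(nagios_time("0"))           # None
```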
