How to filter logs from gunicorn?
I have a Flask API running under gunicorn. Gunicorn logs every request to my API, e.g.:

172.17.0.1 - - [19/Sep/2018:13:50:58 +0000] "GET /api/v1/myview HTTP/1.1" 200 16 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36"

However, I want to filter the logs to exclude a certain endpoint that is called by another service every few seconds.

I wrote a filter to exclude this endpoint from being logged:

class NoReadyFilter(logging.Filter):
    def filter(self, record):
        return record.getMessage().find('/api/v1/ready') == -1

If I add this filter to the werkzeug logger and use the Flask development server, the filter works: requests to /api/v1/ready no longer appear in the log files. However, I can't seem to add the filter to the gunicorn logger. With the following code, requests to /api/v1/ready still appear:

if __name__ != '__main__':
    gunicorn_logger = logging.getLogger('gunicorn.glogging.Logger')
    gunicorn_logger.setLevel(logging.INFO)
    gunicorn_logger.addFilter(NoReadyFilter())

How can I add a filter to the gunicorn logger? I tried adding it to the gunicorn.error logger as suggested here, but it didn't help.

Jauregui answered 19/9, 2018 at 14:2 Comment(1)
Found this info to be quite useful in setting up filters in access_logs without security issues. (Springing)
J
16

I finally found a way of doing it by creating a subclass

import logging

from gunicorn import glogging


class CustomGunicornLogger(glogging.Logger):

    def setup(self, cfg):
        super().setup(cfg)

        # Add the filter (NoReadyFilter from the question) to Gunicorn's access logger
        logger = logging.getLogger("gunicorn.access")
        logger.addFilter(NoReadyFilter())

that inherits from gunicorn.glogging.Logger. You can then provide this class as a parameter for gunicorn, e.g.

gunicorn --logger-class "myproject.CustomGunicornLogger" app
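
For a config-file setup, the counterpart of the --logger-class flag is Gunicorn's logger_class setting. A minimal sketch of a gunicorn.conf.py, assuming the class lives in an importable module named myproject as in the command above (the accesslog line is an assumption about how you want the access log delivered):

```python
# gunicorn.conf.py
# Counterpart of: gunicorn --logger-class "myproject.CustomGunicornLogger" app
logger_class = "myproject.CustomGunicornLogger"

# The access log has to be enabled for the filter to have anything to act on;
# "-" sends it to stdout.
accesslog = "-"
```

It could then be started with something like gunicorn -c gunicorn.conf.py app.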
Jauregui answered 22/9, 2018 at 9:38 Comment(1)
How to config this via gunicorn.conf file? (Tmesis)
G
5

While a custom logging class would work, it is probably overkill for a simple access log filter. Instead, I would use Gunicorn's on_starting() server hook to add a filter to the access logger.

The hook can be added in the settings file (default gunicorn.conf.py), so all gunicorn configuration stays in one place.

import logging
import re

wsgi_app = 'myapp.wsgi'
bind = '0.0.0.0:9000'
workers = 5
accesslog = '-'

class RequestPathFilter(logging.Filter):
    def __init__(self, *args, path_re, **kwargs):
        super().__init__(*args, **kwargs)
        self.path_filter = re.compile(path_re)

    def filter(self, record):
        req_path = record.args['U']
        if not self.path_filter.match(req_path):
            return True  # log this entry
        # ... additional conditions can be added here ...
        return False     # do not log this entry

def on_starting(server):
    server.log.access_log.addFilter(RequestPathFilter(path_re=r'^/api/v1/ready$'))

Some notes on this sample implementation:

  • RequestPathFilter can also be nested inside on_starting() to hide it from external modules.
  • Filtering is applied to record.args, which contains the raw values used to construct the log message.
  • Applying the filter to the result of record.getMessage() instead of the raw values is a bad idea because:
    1. The full message has to be constructed before it can be checked, even for entries that end up being discarded.
    2. The filtering mechanism can be manipulated by the client. This would allow, e.g., an attacker to hide their activities by setting their user agent to Wget/1.20.1/api/v1/ready (linux-gnu).
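
The last point can be checked without running Gunicorn. The sketch below builds log records by hand, using a simplified access-log format and a hand-made atoms dict that are assumptions for illustration only. make_record, MessageFilter and PathFilter are hypothetical names, not part of Gunicorn. It then compares a message-based filter with a path-based one:

```python
import logging
import re

# Simplified stand-in for Gunicorn's access log format (assumption).
ACCESS_FORMAT = '%(h)s "%(r)s" %(s)s "%(a)s"'


class MessageFilter(logging.Filter):
    """Drops any record whose formatted message contains the path."""

    def filter(self, record):
        return '/api/v1/ready' not in record.getMessage()


class PathFilter(logging.Filter):
    """Drops a record only if the request path itself matches."""

    def __init__(self, path_re):
        super().__init__()
        self.path_re = re.compile(path_re)

    def filter(self, record):
        return not self.path_re.match(record.args['U'])


def make_record(path, user_agent):
    """Build a record shaped like an access log entry (hypothetical helper)."""
    atoms = {
        'h': '172.17.0.1',
        'r': f'GET {path} HTTP/1.1',
        's': '200',
        'a': user_agent,
        'U': path,
    }
    # A single mapping passed as args becomes record.args, as in logging calls.
    return logging.LogRecord(
        'gunicorn.access', logging.INFO, 'example', 0,
        ACCESS_FORMAT, (atoms,), None,
    )


spoofed = make_record('/admin', 'Wget/1.20.1/api/v1/ready (linux-gnu)')
probe = make_record('/api/v1/ready', 'probe')

hidden_by_message_filter = not MessageFilter().filter(spoofed)
kept_by_path_filter = PathFilter(r'^/api/v1/ready$').filter(spoofed)
probe_dropped = not PathFilter(r'^/api/v1/ready$').filter(probe)
```

Here the request to /admin with the crafted user agent is dropped by the message-based filter but kept by the path-based one, while the genuine /api/v1/ready probe is still dropped.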
Greatgranduncle answered 17/8, 2021 at 21:26 Comment(0)
W
2

It's an old question, but what you did is not working because you are getting the wrong Gunicorn logger. Access log entries are emitted on the gunicorn.access logger, not on the gunicorn.error logger (cf. https://github.com/benoitc/gunicorn/blob/b2dc0364630c26cc315ee417f9c20ce05ad01211/gunicorn/glogging.py#L61).

Define your class as you did:

class NoReadyFilter(logging.Filter):
    def filter(self, record):
        return record.getMessage().find('/api/v1/ready') == -1

Then, in the main entrypoint of your app:

if __name__ != "__main__":
    gunicorn_logger = logging.getLogger("gunicorn.access")
    gunicorn_logger.addFilter(NoReadyFilter())

Gunicorn run command: gunicorn --access-logfile=- --log-file=- -b 0.0.0.0:5000 entrypoint:app
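
Why the logger name matters can be illustrated with the standard library alone, without Gunicorn: a filter attached to a logger only sees records logged through that logger. In this sketch, capture_access_log is a hypothetical helper and the two access lines are made up for illustration:

```python
import io
import logging


class NoReadyFilter(logging.Filter):
    def filter(self, record):
        return record.getMessage().find('/api/v1/ready') == -1


def capture_access_log(filter_logger_name):
    """Attach NoReadyFilter to the named logger, log two fake access lines
    through 'gunicorn.access', and return everything that was written."""
    access = logging.getLogger('gunicorn.access')
    target = logging.getLogger(filter_logger_name)

    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    previous_level = access.level
    access.addHandler(handler)
    access.setLevel(logging.INFO)

    flt = NoReadyFilter()
    target.addFilter(flt)
    try:
        access.info('"GET /api/v1/ready HTTP/1.1" 200')
        access.info('"GET /api/v1/myview HTTP/1.1" 200')
    finally:
        # Undo the changes so repeated calls do not affect each other.
        target.removeFilter(flt)
        access.removeHandler(handler)
        access.setLevel(previous_level)
    return stream.getvalue()


# Filter on the logger name used in the question: it never sees the records.
wrong_output = capture_access_log('gunicorn.glogging.Logger')
# Filter on the access logger itself: the /api/v1/ready line is dropped.
right_output = capture_access_log('gunicorn.access')
```

With the filter on 'gunicorn.glogging.Logger' both lines are written, whereas with the filter on 'gunicorn.access' only the /api/v1/myview line remains.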

Witmer answered 27/4, 2020 at 15:51 Comment(1)
I'd add to this that it could be smart to filter on the request method as well. In my case I was filtering /health, which also matches /healthfoobar. Filtering on a string like GET /health (with a trailing space) will match a specific route and method. (Oireachtas)
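
A small sketch of the difference described in that comment, using made-up access lines. naive_match and method_and_path_match are illustrative names only:

```python
def naive_match(message):
    # Matches any message containing the substring, including longer paths.
    return '/health' in message


def method_and_path_match(message):
    # The trailing space ends the path, so /healthfoobar no longer matches.
    return 'GET /health ' in message


health = '"GET /health HTTP/1.1" 200'
longer_path = '"GET /healthfoobar HTTP/1.1" 200'
other_method = '"POST /health HTTP/1.1" 200'

naive_hits = [naive_match(m) for m in (health, longer_path, other_method)]
strict_hits = [method_and_path_match(m) for m in (health, longer_path, other_method)]
```

Note that this still matches against the formatted message, so client-controlled fields such as the user agent can influence the result; a request with a query string (for example /health?x=1) would also not match.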
