Most Pythonic way to provide global configuration variables in config.py? [closed]

8

159

In my endless quest in over-complicating simple stuff, I am researching the most 'Pythonic' way to provide global configuration variables inside the typical 'config.py' found in Python egg packages.

The traditional way (aah, good ol' #define!) is as follows:

MYSQL_PORT = 3306
MYSQL_DATABASE = 'mydb'
MYSQL_DATABASE_TABLES = ['tb_users', 'tb_groups']

Therefore global variables are imported in one of the following ways:

from config import *
dbname = MYSQL_DATABASE
for table in MYSQL_DATABASE_TABLES:
    print(table)

or:

import config
dbname = config.MYSQL_DATABASE
assert(isinstance(config.MYSQL_PORT, int))

It makes sense, but sometimes can be a little messy, especially when you're trying to remember the names of certain variables. Besides, providing a 'configuration' object, with variables as attributes, might be more flexible. So, taking a lead from bpython config.py file, I came up with:

class Struct(object):

    def __init__(self, *args):
        self.__header__ = str(args[0]) if args else None

    def __repr__(self):
        if self.__header__ is None:
             return super(Struct, self).__repr__()
        return self.__header__

    def next(self):
        """ Fake iteration functionality.
        """
        raise StopIteration

    def __iter__(self):
        """ Fake iteration functionality.
        We skip magic attributes and Structs, and return the rest.
        """
        for k, v in vars(self).items():
            if not k.startswith('__') and not isinstance(v, Struct):
                yield v

    def __len__(self):
        """ Don't count magic attributes or Structs.
        """
        return len([k for k, v in vars(self).items()
                    if not k.startswith('__') and not isinstance(v, Struct)])

and a 'config.py' that imports the class and reads as follows:

from _config import Struct as Section

mysql = Section("MySQL specific configuration")
mysql.user = 'root'
mysql.passwd = 'secret'  # 'pass' is a reserved word and cannot be an attribute name
mysql.host = 'localhost'
mysql.port = 3306
mysql.database = 'mydb'

mysql.tables = Section("Tables for 'mydb'")
mysql.tables.users = 'tb_users'
mysql.tables.groups = 'tb_groups'

and is used in this way:

from sqlalchemy import MetaData, Table
import config as CONFIG

assert(isinstance(CONFIG.mysql.port, int))

mdata = MetaData(
    "mysql://%s:%s@%s:%d/%s" % (
         CONFIG.mysql.user,
         CONFIG.mysql.passwd,
         CONFIG.mysql.host,
         CONFIG.mysql.port,
         CONFIG.mysql.database,
     )
)

tables = []
for name in CONFIG.mysql.tables:
    tables.append(Table(name, mdata, autoload=True))

Which seems a more readable, expressive and flexible way of storing and fetching global variables inside a package.
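For comparison, the standard library's `types.SimpleNamespace` already gives attribute-style sections without a custom class. The following is a minimal sketch, not the approach proposed above; the layout mirrors the sections in the question, and `passwd` is used because `pass` is a reserved word:

```python
from types import SimpleNamespace

# Hypothetical config layout mirroring the sections in the question.
mysql = SimpleNamespace(
    user="root",
    passwd="secret",  # 'pass' is a Python keyword, so it cannot be an attribute name
    host="localhost",
    port=3306,
    database="mydb",
    tables=SimpleNamespace(users="tb_users", groups="tb_groups"),
)

# vars() exposes the attributes of a namespace as a plain dict.
table_names = list(vars(mysql.tables).values())
url = "mysql://%s:%s@%s:%d/%s" % (
    mysql.user, mysql.passwd, mysql.host, mysql.port, mysql.database
)
```

Unlike the `Struct` class, a namespace carries no description string and is not iterable by itself, so iteration goes through `vars()`.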

Lamest idea ever? What is the best practice for coping with these situations? What is your way of storing and fetching global names and variables inside your package?

Coati answered 1/6, 2011 at 8:35 Comment(7)
You already made a decision here that might or might not be good. The config itself can be stored in different ways, like JSON, XML, or different grammars for *nixes and Windows, and so on. Depending on who writes the config file (a tool, a human, with what background?), different grammars might be preferable. Most often it is not a good idea to write the config file in the same language you use for your program, because it gives too much power to the user (who might be yourself, but you might not remember everything that can go wrong some months ahead).Scheers
Often I end up writing a JSON config file. It can be read into Python structures easily and can also be created by a tool. It seems to have the most flexibility, and the only cost is some braces that might be annoying to the user. I never wrote an egg, though. Maybe that is the standard way. In that case just ignore my comment above.Scheers
You can use "vars(self)" instead of "self.__dict__.keys()"Noli
Possible duplicate of What's the best practice using a settings file in Python? They answer "Many ways are possible, and a bikeshed thread already exists. config.py is good unless you care about security."Oporto
I ended up using python-box, see this answerDirectoire
I burst into laughter as I read "In my endless quest in over-complicating simple stuff..."Nudi
Regardless of this question being closed, I found useful ideas in the answers.Ramonitaramos
A
5

I did that once. Ultimately I found my simplified basicconfig.py adequate for my needs. You can pass in a namespace with other objects for it to reference if you need to. You can also pass in additional defaults from your code. It also maps attribute and mapping style syntax to the same configuration object.
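The linked module is not reproduced here, but the idea of mapping attribute and mapping syntax onto the same object can be sketched in a few lines. This is an illustration only; `AttrDict` is a hypothetical name and not the class from basicconfig.py:

```python
class AttrDict(dict):
    """Hypothetical helper: a dict whose keys are also readable as attributes."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Convert nested dicts so that chained attribute access works.
        for key, value in list(self.items()):
            if isinstance(value, dict) and not isinstance(value, AttrDict):
                self[key] = AttrDict(value)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


config = AttrDict({"mysql": {"host": "localhost", "port": 3306}})
config.mysql.port = 3307          # attribute-style write
port = config["mysql"]["port"]    # mapping-style read sees the same value
```

One caveat of this design: keys that collide with dict method names (such as `items`) remain reachable only through the mapping syntax.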

Arman answered 1/6, 2011 at 8:47 Comment(2)
I know this is a few years old, but I'm a beginner and I think this config file is essentially what I am looking for (maybe too advanced), and I would like to understand it better. Do I just initialize ConfigHolder with a dict of the configs I'd like to set and pass it between modules?Cahilly
@Cahilly At this point I would use (and am currently using) a YAML file and PyYAML for configuration. I also use a third-party module called confit, which supports merging multiple sources. It's part of a new devtest.config module.Arman
M
69

How about just using the built-in types like this:

config = {
    "mysql": {
        "user": "root",
        "pass": "secret",
        "tables": {
            "users": "tb_users"
        }
        # etc
    }
}

You'd access the values as follows:

config["mysql"]["tables"]["users"]

If you are willing to sacrifice the potential to compute expressions inside your config tree, you could use YAML and end up with a more readable config file like this:

mysql:
  user: root
  pass: secret
  tables:
    users: tb_users

and use a library like PyYAML to conveniently parse and access the config file.
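A parsed file only needs to be read from disk once per process. The sketch below shows one way to do that with `functools.lru_cache`. It uses the standard-library `json` module in place of PyYAML so that it runs without third-party packages; the file name `config.json` and the helper `get_config` are assumptions made for illustration:

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path

# Write a throwaway config file so the sketch is self-contained.
CONFIG_PATH = Path(tempfile.mkdtemp()) / "config.json"
CONFIG_PATH.write_text(json.dumps(
    {"mysql": {"user": "root", "tables": {"users": "tb_users"}}}
))


@lru_cache(maxsize=None)
def get_config():
    """Parse the file on the first call; later calls return the cached dict."""
    return json.loads(CONFIG_PATH.read_text())


users_table = get_config()["mysql"]["tables"]["users"]
```

Every module can call `get_config()` and receive the same dict object. With PyYAML, `yaml.safe_load` would take the place of `json.loads`.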

Marabou answered 1/6, 2011 at 9:52 Comment(5)
But normally you want to have different config files and thus not have any configuration data inside your code. So 'config' would be an external JSON / YAML file which you have to load from disk every time you want to access it, in every single class. I believe the question is to "load once" and have global-like access to the loaded data. How would you do that with the solution you suggested?Irreconcilable
if just would something exist to keep the data in memory ^^Parathion
Do not forget to include the config file with something from config import *Semipalmate
Getting attributes with string indexing is asking for poor maintainability. Refactoring a name in a large codebase becomes a mess. Support for autocomplete is non existent or at best limited. The syntax takes 4 extra characters instead of just 1 with the dot notation.Kiwi
I solved this by using a dataclass where I define all attributes with their default values. Then I use the marshmallow_dataclass module to dump or load all attributes to or from a dict, and write or load them with ruamel (or pyyaml) to a file. The config class can be instantiated by calling a load_from_file() function. Attributes are accessed with dot notation. It's even possible to implement getter/helper functions in the config class.Ottillia
A
44

I like this solution for small applications:

class App:
  __conf = {
    "username": "",
    "password": "",
    "MYSQL_PORT": 3306,
    "MYSQL_DATABASE": 'mydb',
    "MYSQL_DATABASE_TABLES": ['tb_users', 'tb_groups']
  }
  __setters = ["username", "password"]

  @staticmethod
  def config(name):
    return App.__conf[name]

  @staticmethod
  def set(name, value):
    if name in App.__setters:
      App.__conf[name] = value
    else:
      raise NameError("Name not accepted in set() method")

And then usage is:

if __name__ == "__main__":
   # from config import App
   App.config("MYSQL_PORT")     # return 3306
   App.set("username", "hi")    # set new username value
   App.config("username")       # return "hi"
   App.set("MYSQL_PORT", "abc") # this raises NameError

.. you should like it because:

  • uses class variables (no object to pass around/ no singleton required),
  • uses encapsulated built-in types and looks like (is) a function call on App,
  • has control over individual config immutability, mutable globals are the worst kind of globals.
  • promotes conventional and well named access / readability in your source code
  • is a simple class but enforces structured access, an alternative is to use @property, but that requires more variable handling code per item and is object-based.
  • requires minimal changes to add new config items and set its mutability.

--Edit--: For large applications, storing values in a YAML (i.e. properties) file and reading that in as immutable data is a better approach (i.e. blubb/ohaal's answer). For small applications, this solution above is simpler.
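If every value should be read-only, a lighter alternative is to expose the dict through `types.MappingProxyType`. This is a sketch of that idea rather than part of the `App` class above; the names `_settings` and `CONFIG` are hypothetical:

```python
from types import MappingProxyType

# The private dict holds the data; the proxy is the read-only public view.
_settings = {"MYSQL_PORT": 3306, "MYSQL_DATABASE": "mydb"}
CONFIG = MappingProxyType(_settings)

try:
    CONFIG["MYSQL_PORT"] = 3307  # the proxy does not support item assignment
except TypeError:
    write_rejected = True
else:
    write_rejected = False
```

The proxy only protects the top level: a list stored as a value can still be mutated in place, so tuples are the safer choice for sequences.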

Any answered 12/5, 2017 at 15:37 Comment(3)
But say we read in config from a YAML file; what then? Where do you store those values? Or do you just suggest reading from the YAML file again whenever the values are needed?Dominicadominical
It's kind of hard to answer @BenFarmer. First, use a YAML library. Second, look up context managers (CM) for cross-cutting concerns (XCC) (i.e. under Aspect Oriented software) and configs via environment variables. IIRC, using CM with XCC allows you to attach and detach config from functions, without inline dependencies. Some good examples of this can be found in Web Service Frameworks, such as 'Play' in Scala/Java; probably Django in Py also uses this.Any
Some more discussion about that here: learncsdesign.com/…Any
F
32

How about using classes?

# config.py
class MYSQL:
    PORT = 3306
    DATABASE = 'mydb'
    DATABASE_TABLES = ['tb_users', 'tb_groups']

# main.py
from config import MYSQL

print(MYSQL.PORT) # 3306
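
If the values should also be protected against accidental reassignment, a frozen dataclass keeps the same dot access. A minimal sketch, assuming Python 3.7 or later; the class name `MySQLConfig` is hypothetical:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class MySQLConfig:
    port: int = 3306
    database: str = "mydb"
    tables: tuple = ("tb_users", "tb_groups")  # a tuple, so the default is immutable


MYSQL = MySQLConfig()

try:
    MYSQL.port = 3307  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    reassignment_blocked = True
else:
    reassignment_blocked = False
```

The type annotations also give IDEs and type checkers something to work with, which a plain class attribute does not.
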
Fuchsia answered 1/7, 2017 at 22:41 Comment(1)
I definitely prefer this approach as it works well with code completion in your IDE and you don't have to care about typos in string IDs. I'm not sure about the performance though. Works also pretty well when using inner classes to further separate entries.Frigg
A
21

Let's be honest, we should probably consider using a Python Software Foundation maintained library:

https://docs.python.org/3/library/configparser.html

Config example: (ini format, but JSON available)

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no

Code example:

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.read('example.ini')
>>> config['DEFAULT']['Compression']
'yes'
>>> config['DEFAULT'].getboolean('MyCompression', fallback=True) # get_or_else

Making it globally-accessible:

import configparser

class App:
    __conf = None

    @staticmethod
    def config():
        if App.__conf is None:  # Read only once, lazily.
            App.__conf = configparser.ConfigParser()
            App.__conf.read('example.ini')
        return App.__conf

if __name__ == '__main__':
    App.config()['DEFAULT']['MYSQL_PORT']
    # or, better:
    App.config().get(section='DEFAULT', option='MYSQL_PORT', fallback=3306)
    # ...

Downsides:

  • Uncontrolled global mutable state.
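
Values read through `configparser` are strings unless a typed getter is used. The sketch below parses an abbreviated version of the example file from a string (via `read_string`, so no file is needed) and shows `getint`, `getboolean`, and `fallback`; the `Timeout` option is hypothetical and deliberately absent from the file:

```python
import configparser

INI_TEXT = """
[DEFAULT]
Compression = yes
CompressionLevel = 9

[topsecret.server.com]
Port = 50022
ForwardX11 = no
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

section = config["topsecret.server.com"]
port = section.getint("Port")                     # converted to int
forward_x11 = section.getboolean("ForwardX11")    # 'no' becomes False
level = section.getint("CompressionLevel")        # inherited from [DEFAULT]
timeout = section.getint("Timeout", fallback=30)  # option absent from the file
```
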
Any answered 18/11, 2019 at 4:19 Comment(3)
It's not useful to use an .ini file if you need to apply if statements in your other files to change the configuration. It would be better to use config.py instead, but if the values don't change, and you just call and use them, I agree with the use of an .ini file.Incrust
Hmm, configparser does have read/write and section set/get functionality.Any
But yes, handling your config as a dict and then json.dump(d) and write, is (fewer steps, more common and) a more "modern" (current) way to manage config....Any
S
11

A small variation on Husky's idea that I use. Make a file called 'globals' (or whatever you like) and then define multiple classes in it, as such:

#globals.py

class dbinfo :      # for database globals
    username = 'abcd'
    password = 'xyz'

class runtime :
    debug = False
    output = 'stdio'

Then, if you have two code files c1.py and c2.py, both can have at the top

import globals as gl

Now all code can access and set values, as such:

gl.runtime.debug = False
print(gl.dbinfo.username)

People forget classes exist, even if no object is ever instantiated that is a member of that class. And variables in a class that aren't preceded by 'self.' are shared across all instances of the class, even if there are none. Once 'debug' is changed by any code, all other code sees the change.

By importing it as gl, you can have multiple such files and variables that lets you access and set values across code files, functions, etc., but with no danger of namespace collision.

This lacks some of the clever error checking of other approaches, but is simple and easy to follow.
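
The sharing behaviour described above can be checked in a single file. This is a small sketch; the class is named `Runtime` here purely for illustration, and the helper functions stand in for code living in other modules:

```python
class Runtime:
    debug = False
    output = "stdio"


def enable_debug():
    # Assigning through the class rebinds the shared attribute.
    Runtime.debug = True


def is_debug():
    return Runtime.debug


before = is_debug()
enable_debug()
after = is_debug()

# Assigning through an instance creates an instance attribute
# that shadows the class attribute instead of changing it.
instance = Runtime()
instance.debug = False
shadowed = (instance.debug, Runtime.debug)
```

The last three lines show the main pitfall: the shared value is only changed when the assignment goes through the class itself.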

Snag answered 13/10, 2017 at 0:57 Comment(4)
It's misadvised to name a module globals, since it's a built-in function which returns a dict with every symbol in the current global scope. In addition, PEP8 recommends CamelCase (with all capitals in acronyms) for classes (i.e. DBInfo) and uppercase with underscores for the so-called constants (i.e. DEBUG).Engrave
Thanks @NunoAndré for the comment, until I read it I was thinking that this answer does something strange with globals, author should change the nameLimitative
This approach is my go to. I however see a lot of approaches that people say is "the best". Can you state some shortcomings to implementing config.py as this?Atomics
This is the global object pattern, see also the singleton pattern [as the article mentions the latter is less Pythonic]Oilcloth
C
7

Similar to blubb's answer. I suggest building them with lambda functions to reduce code. Like this:

User = lambda passwd, hair, name: {'password':passwd, 'hair':hair, 'name':name}

#Col      Username       Password      Hair Color  Real Name
config = {'st3v3' : User('password',   'blonde',   'Steve Booker'),
          'blubb' : User('12345678',   'black',    'Bubb Ohaal'),
          'suprM' : User('kryptonite', 'black',    'Clark Kent'),
          #...
         }
#...

config['st3v3']['password']  #> password
config['blubb']['hair']      #> black

This does smell like you may want to make a class, though.

Or, as MarkM noted, you could use namedtuple:

from collections import namedtuple
#...

User = namedtuple('User', ['password', 'hair', 'name'])

#Col      Username       Password      Hair Color  Real Name
config = {'st3v3' : User('password',   'blonde',   'Steve Booker'),
          'blubb' : User('12345678',   'black',    'Bubb Ohaal'),
          'suprM' : User('kryptonite', 'black',    'Clark Kent'),
          #...
         }
#...

config['st3v3'].password   #> password
config['blubb'].hair       #> black
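
On Python 3, the class-based `typing.NamedTuple` form adds type annotations and default values to the same idea. A minimal sketch; the `guest` entry and the default for `name` are made up for illustration:

```python
from typing import NamedTuple


class User(NamedTuple):
    password: str
    hair: str
    name: str = "unknown"  # fields with defaults must come last


config = {
    "st3v3": User("password", "blonde", "Steve Booker"),
    "guest": User("guest", "brown"),  # falls back to the default name
}

# Tuples are immutable; _replace returns a modified copy.
updated = config["st3v3"]._replace(hair="black")
```
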
Chokeberry answered 16/9, 2014 at 23:42 Comment(6)
pass is an unfortunate variable name, since it is also a keyword.Primogeniture
Oh yeah... I just pulled together this dumb example. I will change the nameChokeberry
For this kind of approach, you might consider a class instead of the mkDict lambda. If we call our class User, your "config" dictionary keys would be initialized something like {'st3v3': User('password','blonde','Steve Booker')}. When your "user" is in a user variable, you can then access its properties as user.hair, etc.Corkscrew
If you like this style you can also opt to use collections.namedtuple. User = namedtuple('User', 'passwd hair name'); config = {'st3v3': User('password', 'blonde', 'Steve Booker')}Glyco
I think this is a misuse of lambda functions, since a lambda function is by definition an unnamed function (often used as "glue" in function arguments). Here you create a lambda function, and then go on to name it User. Might as well spend an extra line on writing it as a normal function. Would be cleaner. The namedtuple approach is nice.Calamus
Yeah, using a def instead would probably be a good idea. I am used to functional languages where assigning lambdas is sometimes how you write functions. I think Python says it's a code smell, though, as it can't optimize stuff as well.Chokeberry
E
5

Please check out the IPython configuration system, implemented via traitlets, for the type enforcement you are doing manually.

Cut and pasted here to comply with SO guidelines for not just dropping links as the content of links changes over time.

traitlets documentation

Here are the main requirements we wanted our configuration system to have:

  • Support for hierarchical configuration information.
  • Full integration with command line option parsers. Often, you want to read a configuration file, but then override some of the values with command line options. Our configuration system automates this process and allows each command line option to be linked to a particular attribute in the configuration hierarchy that it will override.
  • Configuration files that are themselves valid Python code. This accomplishes many things. First, it becomes possible to put logic in your configuration files that sets attributes based on your operating system, network setup, Python version, etc. Second, Python has a super simple syntax for accessing hierarchical data structures, namely regular attribute access (Foo.Bar.Bam.name). Third, using Python makes it easy for users to import configuration attributes from one configuration file to another. Fourth, even though Python is dynamically typed, it does have types that can be checked at runtime. Thus, a 1 in a config file is the integer ‘1’, while a '1' is a string.
  • A fully automated method for getting the configuration information to the classes that need it at runtime. Writing code that walks a configuration hierarchy to extract a particular attribute is painful. When you have complex configuration information with hundreds of attributes, this makes you want to cry.
  • Type checking and validation that doesn’t require the entire configuration hierarchy to be specified statically before runtime. Python is a very dynamic language and you don’t always know everything that needs to be configured when a program starts.

To achieve this they basically define 3 object classes and their relations to each other:

1) Configuration - basically a ChainMap / basic dict with some enhancements for merging.

2) Configurable - base class to subclass all things you'd wish to configure.

3) Application - object that is instantiated to perform a specific application function, or your main application for single purpose software.

In their words:

Application: Application

An application is a process that does a specific job. The most obvious application is the ipython command line program. Each application reads one or more configuration files and a single set of command line options and then produces a master configuration object for the application. This configuration object is then passed to the configurable objects that the application creates. These configurable objects implement the actual logic of the application and know how to configure themselves given the configuration object.

Applications always have a log attribute that is a configured Logger. This allows centralized logging configuration per-application.

Configurable: Configurable

A configurable is a regular Python class that serves as a base class for all main classes in an application. The Configurable base class is lightweight and only does one thing.

This Configurable is a subclass of HasTraits that knows how to configure itself. Class level traits with the metadata config=True become values that can be configured from the command line and configuration files.

Developers create Configurable subclasses that implement all of the logic in the application. Each of these subclasses has its own configuration information that controls how instances are created.
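
The layering described above (defaults, then a configuration file, then command-line overrides) can be approximated with the standard library alone. The sketch below uses `collections.ChainMap` and `argparse`; it is not the traitlets API, and the option names and the `file_settings` dict are assumptions standing in for a real parsed file:

```python
import argparse
from collections import ChainMap

defaults = {"port": 3306, "host": "localhost", "debug": False}
file_settings = {"host": "db.example.com"}  # stands in for a parsed config file

parser = argparse.ArgumentParser()
parser.add_argument("--port", type=int)
parser.add_argument("--host")
# An explicit argv keeps the sketch reproducible.
args = parser.parse_args(["--port", "3307"])

# Drop options the user did not pass so they do not mask lower layers.
cli_settings = {k: v for k, v in vars(args).items() if v is not None}

# Lookups search the maps left to right: CLI, then file, then defaults.
settings = ChainMap(cli_settings, file_settings, defaults)
```

Here `settings["port"]` comes from the command line, `settings["host"]` from the file layer, and `settings["debug"]` from the defaults. What this leaves out, compared with traitlets, is type validation and automatic delivery of values to the classes that need them.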

Eulaeulachon answered 14/4, 2017 at 15:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.