The typical way to do what you want is to wrap the service call that fetches the data in a class:
    class MyService:
        dataset = None  # stays None until the first get_data() call

        def get_data(self):
            # Fetch lazily: only hit the backend the first time.
            if self.dataset is None:
                self.dataset = get_my_data()
            return self.dataset
Then you instantiate it once in your main and use it wherever you need it.
    if __name__ == "__main__":
        data_service = MyService()
        data = data_service.get_data()
        # or pass the service to whoever needs it
        my_function_that_uses_data(data_service)
The dataset variable is internal but accessible through a discoverable function. You could also use a property on the instance of the class.
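As a minimal sketch of the property variant, assuming a hypothetical get_my_data() that stands in for whatever actually fetches your data (here it just returns a list and logs each call):

```python
fetch_log = []  # records each (simulated) fetch so the caching is visible


def get_my_data():
    """Hypothetical stand-in for the real fetch; replace with your own call."""
    fetch_log.append("fetch")
    return [1, 2, 3]


class MyService:
    def __init__(self):
        self._dataset = None  # internal cache, filled on first access

    @property
    def dataset(self):
        # Fetch lazily the first time the attribute is read.
        if self._dataset is None:
            self._dataset = get_my_data()
        return self._dataset


data_service = MyService()
first = data_service.dataset   # triggers the fetch
second = data_service.dataset  # served from the cache
```

Callers then read data_service.dataset like a plain attribute, and the fetch still happens only once per instance.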
Also, using objects and classes makes things much clearer in a large project, as the functionality should be self-explanatory from the class name and methods.
Note that you can easily make this a generic service too, by passing it the way to fetch the data (a URL, for example) at initialization, so it can be reused with different endpoints.
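One way to sketch such a generic service is to pass a zero-argument callable to the constructor. The names DataService, fetch_users and fetch_orders below are hypothetical; for a URL, the callable could wrap whatever HTTP client you use.

```python
class DataService:
    """Generic lazy-loading service; `fetch` is any zero-argument callable."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._dataset = None

    def get_data(self):
        # Call the injected fetcher only on first use, then reuse the result.
        if self._dataset is None:
            self._dataset = self._fetch()
        return self._dataset


# Hypothetical fetchers standing in for two different endpoints.
def fetch_users():
    return ["alice", "bob"]


def fetch_orders():
    return [101, 102, 103]


users_service = DataService(fetch_users)
orders_service = DataService(fetch_orders)
```

Each service keeps its own cache, so the same class can serve any number of endpoints. Note that this simple version treats a None result as "not fetched yet" and would fetch again in that case.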
One pitfall to avoid is instantiating the same class multiple times in your submodules instead of once in main. If you did, the data would be fetched and stored separately for each instance. On the other hand, you can pass the instance of the class to a submodule and only fetch the data when it's needed (i.e., it may never be fetched if your submodule never needs it), whereas with all of your options the dataset has to be fetched first before it can be passed anywhere else.
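The difference between sharing one instance and creating several can be sketched like this. get_my_data() and the two submodule functions are hypothetical stand-ins, and fetch_log only exists to count the simulated fetches.

```python
fetch_log = []  # records each (simulated) fetch


def get_my_data():
    """Hypothetical stand-in for an expensive fetch."""
    fetch_log.append("fetch")
    return {"value": 42}


class MyService:
    dataset = None

    def get_data(self):
        if self.dataset is None:
            self.dataset = get_my_data()
        return self.dataset


def submodule_that_needs_data(service):
    return service.get_data()["value"]


def submodule_that_does_not(service):
    return "no data needed"  # never calls get_data(), so nothing is fetched


# Shared instance: created once, passed around, fetched at most once.
shared = MyService()
submodule_that_does_not(shared)
fetches_before_use = len(fetch_log)   # still 0: nobody asked for the data
submodule_that_needs_data(shared)
submodule_that_needs_data(shared)
fetches_shared = len(fetch_log)       # 1: the second call hit the cache

# Pitfall: every new instance keeps its own cache and fetches again.
MyService().get_data()
MyService().get_data()
fetches_total = len(fetch_log)        # 3
```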
Notes on your proposed options:
- Initializing in the if __name__ == '__main__' section:
The data is not initialized if the module is imported by another module; it is only initialized when the module is run directly, e.g. from the shell.
You need to fetch the data to pass it somewhere else, even if you don't need it in main.
- Set a global within a function.
The use of global is generally discouraged, as it is in any programming language. Modifying variables outside the current scope is a recipe for odd behavior. It also tends to make the code harder to test, since you rely on a global that is only set in a specific workflow.
- Attribute on a function
This one is a bit of an eyesore: it would certainly work, and the functionality is very similar to the class pattern I propose, but you have to admit that attributes on functions are not very Pythonic. The advantage of the class is that you can initialize it in many ways, subclass it, etc., and still not fetch the data until you need it. Using a plain function is 'simpler' but much more limited.
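For comparison, here is a rough sketch of the function-attribute variant next to the class variant. The fetch() hook and FakeService are hypothetical additions to show what subclassing buys you, and get_my_data() is again a stand-in for the real fetch.

```python
def get_my_data():
    """Hypothetical stand-in for the real fetch."""
    return [1, 2, 3]


# Function-attribute variant: the cache lives on the function object itself.
def get_data():
    if get_data.dataset is None:
        get_data.dataset = get_my_data()
    return get_data.dataset


get_data.dataset = None  # must be set after the def, before the first call


# Class variant: a subclass can swap the data source without touching callers.
class MyService:
    dataset = None

    def fetch(self):
        # Hypothetical hook; subclasses override this to change the source.
        return get_my_data()

    def get_data(self):
        if self.dataset is None:
            self.dataset = self.fetch()
        return self.dataset


class FakeService(MyService):
    def fetch(self):
        return ["test", "data"]  # canned data, e.g. for unit tests
```

Both variants cache lazily, but only the class lets you substitute FakeService() wherever a MyService is expected.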