How to organize a set of scientific experiments using Git - McMap

Asked 24/1, 2013 at 13:29 Answered 24/1, 2013 at 15:37

Solved git scientific-computing

I'm running experiments on a model, with a workflow like this:

I work on a model (a Python program).
I change some parameters and run an experiment.
I store the results of the experiment (as a pickle).
I analyze the (pickled) results using other software (IPython Notebooks).

I'm using Git and Scientific Reproducibility as a guide, where the results of an experiment are stored in a table alongside the hash of the commit. I would like to store the results in directories instead, naming each directory after the commit hash.

Thinking about version control, I would like to isolate the code from the analysis. For example, changing the color of a plot in an IPython notebook in analysis shouldn't change anything in code.

The approach I'm thinking:

A directory structure like this:

model
- code
- simulation_results
   - a83bc4
   - 23e900
   - etc 
- analysis

and different Git repositories for code and analysis, leaving simulation_results out of Git.

Any comments? A better solution? Thanks.

Obtest answered 24/1, 2013 at 13:29 Comment(4)

What should the hex numbers under simulation_results mean? (I guess they are commit IDs, but I'm missing some context.) – Nozzle 31/1, 2013 at 0:23

I maintain that submodules are a good approach. I have edited and detailed my answer. – Adonis 31/1, 2013 at 6:34

Hi Josef, Yes indeed the hex numbers are commit ids. – Obtest 1/2, 2013 at 20:45

Hi VonC. Thank you for detailing your answer, it's very helpful. – Obtest 1/2, 2013 at 20:46

That seems sound, and your structure would be a good fit for git submodules, with model becoming the parent git repo and code and analysis its submodules.

That way, the model repo records the SHA1 of code and the SHA1 of analysis together in each of its commits.

That means you can create each directory inside the private (i.e. not versioned) directory model/simulation_results based on the SHA1 of the model repo (the "parent" repo): that SHA1 pins the SHA1 of both the code and analysis submodules, which means you can reproduce the experiment exactly (based on the exact content of both code and analysis).
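
As a rough illustration of naming a results directory after the parent repo's SHA1, here is a minimal sketch. The helper names `current_commit` and `save_results` and the file name `results.pkl` are assumptions made for the example, not part of any existing tool. It also assumes `git` is on the PATH and that everything was committed before the experiment ran, since otherwise the SHA1 does not describe the code that produced the results.

```python
import pickle
import subprocess
from pathlib import Path


def current_commit(repo_dir):
    """Return the full SHA1 of HEAD in the repository at repo_dir."""
    out = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        check=True,           # raise if git fails (e.g. not a repository)
        capture_output=True,
        text=True,
    )
    return out.stdout.strip()


def save_results(results, sha, results_root, name="results.pkl"):
    """Pickle `results` under results_root/<sha>/ and return the file path.

    `name` is a hypothetical default file name chosen for this sketch.
    """
    run_dir = Path(results_root) / sha
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / name
    with open(path, "wb") as fh:
        pickle.dump(results, fh)
    return path
```

With the layout from the question, a run could then end with something like `save_results(results, current_commit("model"), "model/simulation_results")`, so the notebook in analysis only needs the SHA1 to find the matching pickle.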

Adonis answered 24/1, 2013 at 15:37 Comment(0)
