Based on our years of experience in preparing scientific publications, we suggest the following guidelines for preparing figures:
1. Each figure is generated by a single Python script.
2. The basename of the generated figure is the same as that of the Python script generating it.
3. The Python script only generates the figure; it does not do any heavy computations.
4. Have a single central Python module that defines the basic figure appearance.
5. The data needed for plotting are available as `.csv` tables or other common file formats. Do not use pickles (`.pkl`) for storing data.
6. No manual postprocessing in any graphics software.
Note that this does not apply to the figures you make for visualization during your actual analysis and method development. The guidelines are only meant for the final figures that go into a manuscript or a presentation.
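As an illustration, a figure script following these guidelines might look like the sketch below. The file names (`fig_response.py`, `response.csv`) and the column names (`time`, `response`) are hypothetical and only serve the example. The script reads an already computed table, plots it, and saves the figure under the basename of the script itself.

```python
import csv
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt


def load_columns(csv_path):
    """Read a CSV table with a header row into a dict of float lists."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        columns = {name: [] for name in reader.fieldnames}
        for row in reader:
            for name, value in row.items():
                columns[name].append(float(value))
    return columns


def output_path(script_path, ext=".pdf"):
    """Name the figure after the script that generates it."""
    return Path(script_path).with_suffix(ext)


def plot_response(csv_path, script_path):
    """Plot precomputed data only; no analysis happens here."""
    data = load_columns(csv_path)  # hypothetical columns: time, response
    fig, ax = plt.subplots()
    ax.plot(data["time"], data["response"])
    ax.set_xlabel("Time [s]")
    ax.set_ylabel("Response")
    target = output_path(script_path)
    fig.savefig(target)
    plt.close(fig)
    return target


# At the bottom of a real script (here assumed to be fig_response.py):
# plot_response("response.csv", __file__)
```

Passing `__file__` to `plot_response()` ties the output name to the script name, so `fig_response.py` always produces `fig_response.pdf` and the script behind any figure is easy to find.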
There are a number of good reasons for these rules.
Primary reason: figures always change.
As long as you work on a manuscript, you will keep modifying your figures, because you want to, because your supervisor wants you to, or because journal regulations require it. Modifying a figure is a hassle if
- it takes some effort to find the spot where to edit the script,
- running the script takes a lot of time because data are computed,
- manual postprocessing is required.
Having small scripts (1.) that are simple to find (2.) and run quickly (3.) without any manual postprocessing (6.) significantly lowers the effort to actually modify a figure.
If you decide to change, for example, the font size or the colors in all your figures, it is annoying and time-consuming to go through all your scripts and adapt these settings in every file. If instead a single module takes care of these general issues (4.), they can be modified in no time. matplotlib's rcParams settings belong there. See coding a figure for more about the separation of content and design.
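A central style module could be as small as the following sketch. The module name `plotstyle.py`, the function `apply_style()`, and all values are assumptions made for illustration, not settings recommended by this text. Only the rcParams keys themselves are real matplotlib settings.

```python
# plotstyle.py (hypothetical module name)
import matplotlib as mpl

# Shared appearance settings; the values are placeholders.
STYLE = {
    "font.size": 10,
    "axes.labelsize": 10,
    "xtick.labelsize": 9,
    "ytick.labelsize": 9,
    "lines.linewidth": 1.5,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "savefig.dpi": 300,
}

# Named colors, so that scripts refer to a role instead of a color value.
COLORS = {"data": "tab:blue", "model": "tab:red"}


def apply_style():
    """Apply the shared settings; call once at the top of each figure script."""
    mpl.rcParams.update(STYLE)
```

Each figure script would then start with `from plotstyle import apply_style, COLORS` followed by `apply_style()`. Changing the font size for all figures means editing one number in `STYLE` and rerunning the scripts.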
Second reason: figures want to be used.
Ideally, the figures you generate are not only used for the particular manuscript you are currently writing. You yourself, your colleague, your supervisor, or a collaborator might want to use your figures in a different context, such as a poster, a talk, another manuscript, a review paper, or a book chapter. The chances that your figure gets reused increase dramatically if the figure can be easily modified. For this it is not sufficient to provide the pdf (or even worse, a raster image like png or jpg) of the figure. Rather, a simple dedicated script (1.) with all necessary data (3.) that are readable even ten years later (5.) producing the complete figure (6.) ensures that the figure can be easily adapted to another context as needed.
Third reason: you need to provide the data of the figures anyway.
Many journals nowadays require you to upload the data that were used to generate the figures: not the raw data, but the data displayed in the figure. So you need to store these data in files using formats that others can read on any platform (5.). Storing the processed data in files for direct plotting has the additional advantage that it reduces the dependencies of the figure code on external packages needed for the computations. This also makes your figure code more likely to still work later on.
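On the analysis side, writing the processed data as a plain table needs nothing beyond the Python standard library. The sketch below assumes the data are equally long numeric columns. The function name `save_table()` and the column names in the usage note are hypothetical.

```python
import csv


def save_table(path, columns):
    """Write equally long columns (dict: name -> list) as a CSV table.

    The first row holds the column names, every following row one sample.
    """
    names = list(columns)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(names)
        # zip() turns the columns into rows.
        writer.writerows(zip(*(columns[name] for name in names)))
```

An analysis script would call, for example, `save_table("response.csv", {"time": times, "response": values})` once the heavy computation is done. The resulting file can be opened in any spreadsheet program or text editor, uploaded to a journal as is, and read by the figure script without importing the analysis code.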
How to code a figure
Continue reading on coding a figure.