How to work during the course¶
You will be working on your Galaxy instance throughout the course and will produce two kinds of results:
source code: Jupyter notebooks, Python scripts
data: netCDF, tabular/csv, etc. and figures/plots (png, HTML, etc.)
We will use GitHub to save our code changes (Jupyter notebooks, Python scripts) and our private S3-compatible shared storage (http://forces2021.uiogeo-apps.sigma2.no/) to store data (netCDF, tabular/csv, etc.) and plots/figures (png, HTML, etc.).
Clone the forces-2021 git repo to your Galaxy instance¶
Start a terminal window in your JupyterLab to clone our GitHub repository:
Make a personal access token (https://www.shanebart.com/clone-repo-using-token/) and make sure you save it in a safe place!
Type:
git clone https://github.com/NordicESMhub/forces-2021
When you are prompted for a password, enter your token.
Use the sidebar to navigate the repository structure.
Find code examples under content/learning/example-notebooks
Warning
As previously mentioned, make sure to push your code changes to GitHub regularly.
Save your notebooks¶
Always add your name to the filename to avoid mistakes (such as overwriting others' notebooks), for example:
eClimate_2021_NAME.ipynb
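The naming convention above can be sketched as a small helper. This is only an illustration: the function name notebook_filename and the rule of replacing anything other than ASCII letters, digits, hyphens and underscores are assumptions, not part of the course material.

```python
import re


def notebook_filename(name: str, prefix: str = "eClimate_2021") -> str:
    """Return a notebook filename of the form <prefix>_<NAME>.ipynb.

    Hypothetical helper: the sanitising rule below is an assumption.
    """
    # Replace runs of anything other than ASCII letters, digits, "-" or "_"
    # with a single underscore, then trim underscores at both ends.
    safe = re.sub(r"[^A-Za-z0-9_-]+", "_", name.strip()).strip("_")
    if not safe:
        raise ValueError("name must contain at least one letter or digit")
    return f"{prefix}_{safe}.ipynb"


print(notebook_filename("Ada Lovelace"))  # eClimate_2021_Ada_Lovelace.ipynb
```

Note that non-ASCII letters are replaced by underscores in this sketch, so you may prefer to type an ASCII version of your name yourself.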
Add/update your notebook in the private forces-2021 repository:
If you are familiar with git and GitHub: make a new branch, add your notebook to the notebooks folder and make a Pull Request;
otherwise:
download your notebook by right-clicking its name in the file browser and selecting “Download” from the context menu;
upload your downloaded notebook to the forces-2021 repository on GitHub.
If you want to learn more about sharing your notebooks, check our section on “How to share your notebooks?”.
Data Management Plan¶
It is important to think about what you will be using (input data), what you will be producing (jupyter notebooks, python scripts, datasets) and what to do with all the different data (before, during and after your project is completed).
Write a small document (it can be a Jupyter notebook or a Markdown file), detailing:
What input data is needed/used for your project (update it whenever you have new information to add):
Do they have a unique identifier (DOI), are they available from an authoritative data provider, etc. Provide as much information as you would need to find/reuse the same data in a year or more;
What data will you generate?
Python scripts and Jupyter notebooks should be saved in the forces-2021 private repository;
newly generated datasets and plots/figures should be saved in http://forces2021.uiogeo-apps.sigma2.no/ (bucket: work), in a new folder that you create with your name (no spaces or special characters). You may also want to save your Python scripts and Jupyter notebooks there before you upload the final Jupyter notebook and Python scripts;
If you have any doubts, ask your group leader.
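As one possible starting point, the plan described above could be generated as a Markdown file from a short Python script. This is a minimal sketch under assumptions: the function render_dmp, the dictionary keys, the output filename data_management_plan.md and all the placeholder values are hypothetical and should be replaced with your own project details.

```python
from pathlib import Path


def render_dmp(project, inputs, outputs):
    """Return a small Markdown data management plan as a string (hypothetical layout)."""
    lines = [f"# Data Management Plan: {project}", "", "## Input data", ""]
    for item in inputs:
        # Record enough detail to find the same data again in a year or more.
        lines.append(
            f"- {item['name']} (identifier: {item['identifier']}, "
            f"provider: {item['provider']})"
        )
    lines += ["", "## Generated data", ""]
    for description, location in outputs:
        lines.append(f"- {description}: {location}")
    return "\n".join(lines) + "\n"


# All values below are placeholders, not real datasets.
plan = render_dmp(
    project="my-project",
    inputs=[
        {
            "name": "example dataset",
            "identifier": "DOI to be filled in",
            "provider": "to be filled in",
        }
    ],
    outputs=[
        ("notebooks and scripts", "forces-2021 private repository"),
        ("datasets and figures", "bucket work, folder yourname"),
    ],
)

# Write the plan next to your notebooks so it is easy to update.
Path("data_management_plan.md").write_text(plan, encoding="utf-8")
```

Re-running the script after editing the lists is a simple way to keep the document up to date whenever you have new information to add.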
Organize my project directories¶
Most people who have done a bit of data analysis have experienced that working directories can quickly become an unintelligible mess. Therefore, make sure to keep a clean house, with e.g. inputs in one corner, outputs in another, and your notebooks in yet a third. For instance, in http://forces2021.uiogeo-apps.sigma2.no/ (bucket: work/yourname), I created 3 folders:
input: I will have a document with a list of the input data I am using, with the DOI for each dataset;
tool: I regularly save all my Jupyter notebooks and scripts (remember that you should also upload them to the private GitHub repository forces-2021, at least at the end of the course);
output: I created two sub-folders:
    data: I save all the new datasets (data files), including intermediate results;
    figures: I save all my important plots/figures (png, HTML, etc.).
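If you also want the same layout in your local working directory on the Galaxy instance, a minimal sketch with pathlib could look like the following. The function name create_project_layout and the idea of mirroring the bucket layout locally are assumptions; the folders in the shared storage itself still have to be created there.

```python
from pathlib import Path


def create_project_layout(root):
    """Create input, tool, output/data and output/figures under root (hypothetical helper)."""
    root = Path(root)
    subfolders = ["input", "tool", "output/data", "output/figures"]
    created = []
    for sub in subfolders:
        folder = root / sub
        # parents=True also creates "output"; exist_ok=True makes re-runs harmless.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# "yourname" is a placeholder: use your own name, without spaces or special characters.
created = create_project_layout("yourname")
for folder in created:
    print(folder)
```

Because existing folders are left untouched, the script can be run again at any time without affecting files you have already saved.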