Dear EZfate users -- In the next several messages I will post the toolchain that I use to make the data that I serve with EZfate. This code is not uploaded to the repository because it is not designed for general use. It is here so you can see what I have done, and so you can adapt it if you wish. But read the code carefully, and consider whether you really want to start from such specialized code.
This toolchain runs well on a machine with 256 GB of RAM and 32 cores. Fewer cores would be fine, but less memory would require some adaptation.
Note that I do not recommend you grab this code blindly and use it for your problem. It uses the old version 2.2 of oceanParcels, and is not set up to be a general-purpose toolchain. It is very strongly designed for a global multi-year project, and so is focused on efficiency in both computer time and disk storage. In general, for a more regional problem, I would keep the particle tracks in full latitude and longitude coordinates, rather than converting the locations to grid indices. Also, early on, we trimmed the periodic padding points from the Mercator GLORYS grid. This was a mistake, and if I were to re-write the code, I would not do it again. Also, the file names are based on the old Matlab date numbering scheme, which is awkward...
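Since the file names use the old Matlab date numbering, a quick sketch of how those serial day numbers map to Python datetimes may help; the 366-day offset is the length of Matlab's proleptic year 0 (this is general background, not code from the toolchain itself):

```python
from datetime import datetime, timedelta

def datenum_to_datetime(datenum):
    """Convert a Matlab datenum (days since 00-Jan-0000) to a Python datetime."""
    days = int(datenum)
    frac = datenum - days
    # Matlab's day 1 is 1-Jan-0000, while Python ordinals start at 1-Jan-0001,
    # so the 366 days of the proleptic year 0 must be subtracted.
    return datetime.fromordinal(days) - timedelta(days=366) + timedelta(days=frac)

def datetime_to_datenum(dt):
    """Convert a Python datetime back to a Matlab datenum."""
    dayFraction = (dt - datetime(dt.year, dt.month, dt.day)).total_seconds() / 86400.0
    return dt.toordinal() + 366 + dayFraction

print(datenum_to_datetime(737791.0))              # 2020-01-01 00:00:00
print(datetime_to_datenum(datetime(2020, 1, 1)))  # 737791.0
```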
The first thing you need to do to use this code is download the global circulation data from Mercator. The following code does this. Note that for EZfate alone, you only need the horizontal and vertical velocity -- I get the other fields for other projects. Also, you will need to make a Copernicus Ocean account that allows you to download the C-grid data (the A-grid data will not work). In this code, I have redacted my credentials -- you will need to obtain your own from Copernicus Ocean.
To use this code, set up your Copernicus Ocean account, and then set the dates in the Python file to download the data you need. Note that this code will skip files if it can't read them, and will crash if the network blinks. You will need to keep running it until it has retrieved all the files it needs. Just keep re-running it: it will check what it already has, and try to get what it does not.
Data Download Code: read_allKinds_data_toShare_noPasswords.py
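The download script itself is tied to my credentials and the particular GLORYS datasets I use, but the keep-re-running pattern described above looks roughly like the sketch below. The file naming, date range, and the download_one_day() helper are hypothetical stand-ins, not the actual contents of read_allKinds_data_toShare_noPasswords.py:

```python
# Sketch of the "check what it has, fetch what it does not" download loop.
# download_one_day() is a hypothetical placeholder for the real Copernicus
# Marine request; the file names and dates are illustrative only.
from datetime import date, timedelta
from pathlib import Path

import xarray as xr

outDir = Path('downloadedData')
outDir.mkdir(exist_ok=True)
firstDay, lastDay = date(2007, 1, 1), date(2007, 12, 31)

def download_one_day(day, outFile):
    """Placeholder: issue the Copernicus Marine request for one day of data."""
    raise NotImplementedError('replace with your own download call')

day = firstDay
while day <= lastDay:
    outFile = outDir / f'glorys_{day:%Y%m%d}.nc'
    haveGoodFile = False
    if outFile.exists():
        try:
            # A file that exists but cannot be opened is a partial download.
            with xr.open_dataset(outFile):
                haveGoodFile = True
        except Exception:
            outFile.unlink()
    if not haveGoodFile:
        download_one_day(day, outFile)  # may fail if the network blinks; just re-run
    day += timedelta(days=1)
```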
After the files are downloaded, run the following two scripts: the first moves the files to permanent storage sorted by year, and the second fills a single directory with symbolic links to all of the data files in all years (a sketch of that step follows the file names below). This may seem awkward and convoluted, but it allows you to spread storage across multiple disk groups.
moveData_DONT_rechunk.py and
makeSwitchboard.py
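For orientation, the switchboard step amounts to something like the sketch below: one directory full of symbolic links pointing at data files spread across per-year directories on different disks. The directory names here are illustrative assumptions, not the exact layout makeSwitchboard.py expects:

```python
# Build a "switchboard" directory of symbolic links to data files that live
# in per-year directories, possibly on different file systems. The paths
# below are illustrative assumptions.
from pathlib import Path

yearDirs = [Path('/bigDisk1/mercatorData/2007'), Path('/bigDisk2/mercatorData/2008')]
switchboard = Path('allYearsLinks')
switchboard.mkdir(exist_ok=True)

for yearDir in yearDirs:
    for dataFile in sorted(yearDir.glob('*.nc')):
        link = switchboard / dataFile.name
        if not link.is_symlink():  # skip links made on a previous run
            link.symlink_to(dataFile.resolve())
```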
Before you start running any of the code below, create the directories dataPaths, dataPathsTemp and dataMatrices in the directory you are running from. After you are done, you can delete the contents of dataPathsTemp. The location of the final R connectivity matrices is set in yieldConMatrix.py, discussed below. These directories can be very large -- on my machines they are actually symbolic links to directories on larger file systems.
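Creating the working directories is only a few lines; if your scratch space lives on a larger file system, you can make them symbolic links instead of plain directories. The /bigDisk path below is just an assumed example:

```python
# Create (or link) the working directories the later steps expect.
from pathlib import Path

bigScratch = Path('/bigDisk/EZfateWork')  # assumed location on a larger file system

for name in ('dataPaths', 'dataPathsTemp', 'dataMatrices'):
    local = Path(name)
    if not local.exists():
        (bigScratch / name).mkdir(parents=True, exist_ok=True)
        local.symlink_to(bigScratch / name, target_is_directory=True)
```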
Thank you to @sophia-wright-blue for pushing me to post this code.