Online version: http://sand.ess.uci.edu/doc/greenplanet
January 20, 2009
GreenPlanet Cheat Sheet
by Charlie Zender
University of California at Irvine
Department of Earth System Science
University of California
Irvine, CA 92697-3100
zender@uci.edu
Voice: (949) 824-2987
Fax: (949) 824-3256
The Earth System Modeling Facility (GreenPlanet) is designed to run large climate simulation codes. This document describes some general usage pointers for GreenPlanet, then demonstrates how to install and run certain codes (Section 3).
Users must have and maintain standard shell environment files. Essentially these files provide path information and helpful shortcuts to the users and any batch scripts that invoke them. These files need not be overly complex. Most importantly, they must set the ${PATH} environment variable and pass control on to CCSM scripts without errors.
Shell environment files must work in batch (i.e., non-interactive) mode. In particular, they must not write anything to stdout in batch mode; otherwise the batch environment will be incorrect and the results unpredictable. Minimal shell environment files for GreenPlanet are located in your home directory. Following are some suggested additions.
The default login shell on GreenPlanet is Bash. Add these lines to your default standard shell file .bashrc, or .profile:
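As a minimal sketch of what such additions could look like (the directory and the alias are illustrative assumptions, not the GreenPlanet defaults), the following lines set ${PATH} and stay silent in batch mode:

```shell
# Hypothetical additions to ~/.bashrc; the directory and alias are examples only.

# Prepend /usr/local/bin to PATH unless it is already there.
case ":${PATH}:" in
    *:/usr/local/bin:*) ;;
    *) PATH="/usr/local/bin:${PATH}" ;;
esac
export PATH

# Restrict conveniences (and anything that prints) to interactive shells.
# In batch mode $- lacks "i", so nothing here reaches stdout.
case "$-" in
    *i*) alias ll='ls -l' ;;
esac
```

Guarding on $- keeps the file safe to source from batch scripts, which is the requirement stated above.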
Many GreenPlanet users will want to run models, such as CCSM, which use C Shell scripts. For these scripts to work in batch mode, users must have a valid .cshrc file as well.
Users with a .login file in their home directory should ensure that it executes error free.
Alternatively, users can change their login shell to csh or tcsh via the standard GreenPlanet procedure.
This section describes how to install and run some of the various climate, chemical, and biogeochemical models for which GreenPlanet is designed.
This section documents how to install and run the component models which compose the fully coupled NCAR CCSM (Community Climate System Model). Running the fully coupled CCSM itself is described in Section 3.2. First we describe the preliminary system modifications common to all CCSM component models.
CCSM component models may run in Symmetric Multi-Processor (SMP), Single Program Multiple Data (SPMD), and hybrid (i.e., combined SMP and SPMD) modes. To take advantage of the SPMD capabilities, make sure you have the Message Passing Interface (MPI) installed. MPI is already installed on GreenPlanet. To install it on a Debian GNU/Linux machine use:
CCSM build scripts may require the GNU Make executable to be named gmake rather than make. gmake is installed in /usr/local/bin on GreenPlanet. On a Debian GNU/Linux machine, where GNU Make is installed as make, create a gmake link that points to it.
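One way to provide the gmake name is a symbolic link. The sketch below uses a hypothetical helper, link_as, and assumes that GNU Make lives at /usr/bin/make on the Debian machine; check the actual location with "command -v make" first.

```shell
# link_as is a hypothetical helper: link_as <existing-executable> <dir> <new-name>
# It creates <dir>/<new-name> as a symbolic link to <existing-executable>.
link_as() {
    src="$1"; dir="$2"; name="$3"
    [ -x "${src}" ] || { echo "link_as: ${src} is not executable" >&2; return 1; }
    mkdir -p "${dir}" && ln -sf "${src}" "${dir}/${name}"
}

# Example (writing to /usr/local/bin usually requires root privileges):
# link_as /usr/bin/make /usr/local/bin gmake
```

A link in a directory you own, such as ${HOME}/bin, works equally well provided that directory is on ${PATH}.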
All CCSM input-output (I/O) is performed with the netCDF data model. netCDF is already installed on GreenPlanet. To install netCDF on a Debian GNU/Linux machine use:
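Before building, it may help to confirm that the three prerequisites described above are actually visible on ${PATH}. The following sketch uses a hypothetical helper, missing_tools; the executable names mpirun and ncdump are assumptions about how MPI and netCDF are packaged on your machine.

```shell
# missing_tools is a hypothetical helper, not part of CCSM.
# It prints the names of the listed executables that are not on PATH
# and succeeds only if none are missing.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "${tool}" >/dev/null 2>&1 || missing="${missing} ${tool}"
    done
    # Print the missing names without the leading blank.
    echo "${missing# }"
    [ -z "${missing}" ]
}

# Example: MPI launcher, GNU Make under the name gmake, netCDF utility.
missing_tools mpirun gmake ncdump || echo "install the tools listed above" >&2
```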
No documentation yet. Please contribute!
The Community Land Model (CLM) is described by Dai et al. [2003]. Other important references are Bonan [2002], Bonan [1996], and Bonan [1998].
The most frequently used WWW sites pertinent to CLM are
Make GreenPlanet account capable of running CLM offline:
In modifying the CCSM to run on GreenPlanet, it is helpful to know that GreenPlanet and the NCAR-supported machine bluesky have nearly identical hardware and software configurations. First, greenplanet and bluesky are both clusters of p655 (8-way) and p690 (32-way) nodes connected by Federation switches. Each 8-way greenplanet node in the compute queues has 16 GB RAM (the interactive 8-way node greenplanet04m, which is on the internet as greenplanet.ess.uci.edu, has twice as much RAM, 32 GB). The 32-way p690 node has 64 GB RAM. Hence, each computational CPU on GreenPlanet has 2 GB RAM available. bluesky has about twenty times as many nodes as greenplanet. Finally, greenplanet has faster CPUs than bluesky (1.5 GHz rather than 1.3 GHz).
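The per-CPU memory figure follows from dividing node RAM by the number of CPUs per node. A small sketch of that arithmetic, using only the numbers quoted above (ram_per_cpu is a hypothetical helper):

```shell
# RAM per CPU (GB) = node RAM (GB) / CPUs per node.
ram_per_cpu() {
    echo $(( $1 / $2 ))
}

ram_per_cpu 16 8    # 8-way p655 compute node
ram_per_cpu 64 32   # 32-way p690 node
ram_per_cpu 32 8    # interactive node greenplanet04m
```

The interactive node has more memory per CPU than the compute nodes, but it is not part of the compute queues.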
With this knowledge, we created machine files which describe the greenplanet cluster and queueing architecture to the CCSM. These files are based on the corresponding bluesky machine files distributed with the CCSM. Five machine-specific files are required to make greenplanet a valid CCSM machine type. These files are
First, read the purpose of GreenPlanet customizations. This will be helpful to those attempting to port the CCSM to other machines (e.g., Linux clusters) at UCI. Then, copy GreenPlanet-customized files to your local CCSM source tree.
The file Macros.AIX controls the options the compiler uses to build the CCSM source code (both C and Fortran). The first change replaces the default NCAR netCDF library path, /usr/local/lib64/r4i4, with its GreenPlanet equivalent, /usr/local/lib:
All GreenPlanet system libraries use the 64-bit ABI where possible. Hence NCAR’s lib64 is simply GreenPlanet’s lib. Second, the default size for Fortran variables of type real is four bytes, i.e., single precision. The default size for Fortran variables of type integer is also four bytes. Newer scientific computing codes, such as CCSM, explicitly specify the sizes of integers and reals so the default sizes are never used. However, many old codes rely on compiler defaults to specify the sizes of integer and real. Some applications expect other defaults, e.g., eight-byte real and four-byte integer, or r8i4 in NCAR parlance. GreenPlanet uses the r4i4 standard for all libraries in /usr/local. This obviates the need to explicitly specify an equivalent to NCAR’s r4i4 subdirectory.
The file check_machine is used by the create_newcase script to customize the script. If greenplanet is not included in check_machine, then create_newcase will exit with the error check_machine ERROR, greenplanet not supported.
The file batch.ibm.greenplanet contains most of the default header that the CCSM run script uses when it is run in batch mode. This is where the default LoadLeveler commands are set. GreenPlanet uses the default CCSM settings for the IBM AIX machine bluesky except for the following changes:
The file env.ibm.greenplanet contains environment variables which determine how the CCSM is built and where it looks for input data and stores output data. env.ibm.greenplanet is the same as env.ibm.bluesky with the following exceptions
Finally, the CCSM versions of env.ibm.* contain a subtle bug in the task geometry clauses. In many of the if-endif clauses, the two set statements that appear to form the body are parsed by the C Shell as two separate statements: the first is inside the condition, and the second is outside it and is thus always executed. This bug causes geometry statements to overwrite each other. We re-coded env.ibm.greenplanet to avoid this confusion.
The file run.ibm.greenplanet is the template run script. The fully resolved run script will be created from this template. This template handles all the details of building and executing the model, storing the output data, and possibly re-submitting the job repeatedly until completion. run.ibm.greenplanet is identical to run.ibm.bluesky.
Copy these three files to your local source tree:
Normally, no further customization is necessary. These greenplanet-customized defaults are sufficient to get started running the CCSM. However, advanced CCSM users will often need to further modify these greenplanet-customized defaults.
We are now ready to create the new CCSM case, i.e., experiment. First, store the case name in the CASEID environment variable. This name should hint at the experiment’s purpose. Next, use the create_newcase script, which will create environment variable files and generate the configure script for this experiment.
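A minimal sketch of these two steps in Bash follows. The case name and the CASEROOT location are hypothetical, and the create_newcase options shown in the comment are recalled from CCSM3-era scripts rather than taken from this document, so verify them against your CCSM distribution before use.

```shell
# Hypothetical case name; choose one that hints at the experiment's purpose.
CASEID='test01'
# Assumed location for the case directory; the real layout may differ.
CASEROOT="${HOME}/${CASEID}"
export CASEID CASEROOT

# Assumed invocation, run from the directory that holds create_newcase;
# the option names should be checked against your CCSM version:
# ./create_newcase -case "${CASEROOT}" -mach greenplanet
echo "case ${CASEID} will live in ${CASEROOT}"
```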
This will create the CASEROOT directory. Before the model may be run, however, it is necessary to populate the CASEROOT directory with the resolved scripts.
Each CASEID may have multiple resolved-scripts generated for it. The resolved scripts are customized for particular initialization sequences (e.g., restart runs), resolution, machine, and component sets. The configure command generates the resolved scripts:
CCSM supports a sophisticated test suite.
Passing a single test is a good indication that your CCSM environment and installation are correct. Passing all tests means
Submit the job to GreenPlanet LoadLeveler queues with llsubmit:
Check the status of the job in the queue with llq.
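The two LoadLeveler steps can be combined in a small wrapper. This is only a sketch: submit_case is a hypothetical helper, and the name of the resolved run script depends on your case.

```shell
# submit_case is a hypothetical wrapper around LoadLeveler's llsubmit and llq.
# Argument: path to the resolved run script for your case.
submit_case() {
    script="$1"
    [ -f "${script}" ] || { echo "submit_case: no such script: ${script}" >&2; return 1; }
    command -v llsubmit >/dev/null 2>&1 || { echo "submit_case: llsubmit not found" >&2; return 2; }
    # Submit the job, then list this user's jobs in the queue.
    llsubmit "${script}" && llq -u "${USER}"
}
```

The early checks make the failure mode explicit on machines where LoadLeveler is not installed.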
All CCSM component models create log files during execution. Examine these log files to ensure your job is proceeding smoothly. The coupler log is most informative about the overall progress of the simulation.
Log files are stamped with the job-submission time. The time-stamp has the format YYMMDD-HHMMSS. Hence each simulation has a unique identifier, such as 041229-190833 above. The time-stamp is consistent among all the model components in the same simulation. After the simulation completes, the model compresses the log files and stores them, along with the history tapes (i.e., output data), in /ptmp/${USER}/archive/${CASEID}.
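To make the time-stamp layout concrete, the sketch below generates a stamp in the same YYMMDD-HHMMSS shape and checks whether a string conforms to it. The helper is_stamp is hypothetical; CCSM generates its own stamps, and this only illustrates the format.

```shell
# Produce a stamp in the YYMMDD-HHMMSS layout described above.
stamp=$(date +%y%m%d-%H%M%S)

# is_stamp is a hypothetical helper: succeeds if $1 has the YYMMDD-HHMMSS shape.
is_stamp() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{6}-[0-9]{6}$'
}

is_stamp "${stamp}" && echo "current stamp: ${stamp}"
is_stamp 041229-190833 && echo "041229-190833 is well formed"
```

A check like this is handy when scripting searches for the log files of one particular simulation.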
CCSM, by default, stores short term output files in
CCSM also has an automated long-term archiving capability. This can be customized to copy data and logs to any computer on the Internet. Once the GreenPlanet disk farm is working, we will activate automatic long-term archiving to
Bonan, G., Ecological Climatology: Concepts and applications, 678 pp., Cambridge University Press, Cambridge, UK, 2002.
Bonan, G. B., A land surface model (LSM version 1.0) for ecological, hydrological, and atmospheric studies: Technical description and user’s guide, Tech. Rep. NCAR/TN-417+STR, National Center for Atmospheric Research, Boulder, Colo., 1996.
Bonan, G. B., The land surface climatology of the NCAR Land Surface Model coupled to the NCAR Community Climate Model, J. Clim., 11(6), 1307–1326, 1998.
Dai, Y., et al., The Common Land Model (CLM), Bull. Am. Meteorol. Soc., 84(8), 1013–1023, doi:10.1175/BAMS-84-8-1013, 2003.