Sun, 21 Dec 2008
What is a common modeling architecture (CMA)?
This was written following a discussion on the NUOPC Common Modeling
Architecture Committee.
This is a short write-up attempting to clarify some issues around the
term "common modeling architecture". It can mean a variety of things,
and it's important to know which one we're talking about, as each
implies a different development model. In particular, this has
implications for who does what, and who pays for it.
In some circles this is taken to mean:
- a commitment to component-based design, where components can be
created, stepped forward, and destroyed, and have state variables
where they connect to other components. This is the sense in which
the term is used in ESMF or CCA.
- in addition, it implies a commitment to a standard component
interface (SCI), where we agree to adopt ESMF as a standard for
expressing the external interface to a component.
But there are two more steps before this results in a coupled model.
If we have a pool of N components that we wish to connect together,
there are N!/(2!*(N-2)!) ways of hooking them together pairwise. So
model assembly is still quite far away... You probably don't expect
anyone to have the biosphere model send its data to the ionosphere
model, which sends it to the river model: but the "CMA" as defined so
far won't prevent you. We could agree to define this as the CMA and
stop here.
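To make the combinatorics concrete, here is the pair count written out
(the N = 10 value is purely illustrative, not from the original note):
\[
\binom{N}{2} = \frac{N!}{2!\,(N-2)!}, \qquad
N = 10 \;\Rightarrow\; \binom{10}{2} = 45 \text{ possible pairwise connections.}
\]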
NEMS (and FMS, for that matter) takes it a step further, and says that
we expect the components to fit into defined "slots", atmos-dyn,
atmos-phy, etc., as well as adhering to the ESMF SCI. The relationship
between the slots is also defined: e.g. atmos-dyn and atmos-phy must be
siblings. This we could call the "common physical architecture" (CPA).
Signing up to participate in a CPA means the component developer takes
on some development costs. For instance, GFDL-FMS defines atmos-phy as
a child of atmos-dyn; if we want our atmos-phy to play in NEMS (and we
do!) we have to take on the cost of recoding our model to support
atmos-phy as a sibling of atmos-dyn.
If we don't sign up to the CPA but only to the SCI, then that
development cost is passed on to the site that is doing the assembly.
So with a CPA we've created a single graph of the N components.
There's still a step to be done before we have an application, which
is to decide what physical fields are shared by each edge of the
graph. That's clearly a science problem and outside the scope of the
CMA committee.
So in summary:
- a CMA can be an SCI or a CPA.
- If it's a CPA, the component developer is signing up to deliver a
component that fits in a certain "slot" of the CPA.
- If it's an SCI, then the costs are passed on to the site that's
assembling the components into an application.
Which model does NUOPC wish to follow?
Sat, 6 Dec 2008
Courant meeting on prediction and filters
AEW predictability, Ryan Torn: look up Thorncroft et al 2007, Berry et
al 2008, Berry and Thorncroft 2005. (Gareth Berry).
10 Sep 2006 AMMA test case. WRF 36 km with param conv.
Suranjana Saha NCEP webpage has maps of rawin and surface measurements.
Sat, 29 Nov 2008
How much disk space do we need?
A survey of disk and archive usage from various experiments. The
ultimate goal is to estimate disk and archive requirements.
Specifically:
- at what rate will we archive data?
- how much disk is needed so that experiments can be post-processed
without touching /archive (assuming that a filesystem like ptmp is
available)?
Data produced by /home/vb/bin/frearchrpt.csh.
- length is the length of the experiment in years.
- Y/D is years/day measured from the restart directory, thus including
all delays due to user, machine, queue waits, etc.
- Other fields are archive sizes in GB.
- The ts directory contains duplicate data, due to binning the same
time series in 5yr/20yr/100yr chunks. The pp/ts min column lists the
ideal size if we eliminated all duplicate data. In some cases there's
more than one level of duplication (i.e. more than 2 copies of some
time chunks).
- The a (atm), o (ocean), i (ice), c (atmospheric chemistry) and b
(biology, i.e. topaz) subscripts split the sizes by component;
unlisted components are negligible.
- There is a significant chunk of data in ts/seasonal; does anyone
use this, or can we eliminate it?
- What's the likely disk high-water mark for an experiment? I suggest
we estimate it as a 100-year post-processing chunk, i.e. all the
history, ts and av data for a run scaled to 100 years. The archive
high-water mark should be the sum of history, ts (min) and av for
the whole run.
- The total disk (i.e. ptmp) needed: take some mix of runs from this
table — the extreme cases would be 66 ESM2G runs or 11 CM2.4 runs
(7920 PEs either way) — and see if we could store all the 100-year
post-processing associated with them on ptmp (see the rough
arithmetic after the table below). Based on this data, we need about
1 PB of disk to operate the post-processing entirely without
touching /archive.
- This total disk space could be reduced by 5X if we used the 20-yr
chunk as the largest chunk needing to be disk-resident at one time,
and the 100-yr chunks as a rare occurrence. And a further 4X if we
used the 5-yr chunk as the basis.
Model | NPES | Length (yr) | Y/D | restart (GB) | history (GB) | pp/av (GB) | pp/ts (GB) | pp/ts min (GB) | directory
CM2.1U | 180 | 300 | 4.96 | 135 | 6268 | 1166 (926o+213a) | 20354 (16172o+3765a) | 8322 (5725o+2449a) | /arch4/ccsp/ipcc_ar4/CM2.1U_Control-1990_E1
CM2G | 120 | 311 | 6.13 | 194 | 2838 | 709 (646o+36a) | 14649 (12175o+2086a) | 6891 (5044o+1688a) | /arch7/aja/fre/omsk_2008_03/gm_topo_ogrp/CM2G_gm_topo
ESM2G | 120 | 68 | 3.48 | 114 | 2148 | 808 (476b+268o+55a) | 10148 (7330b+1862o+826a) | 5314 (3812b+968o+462a) | /arch8/jgj/fre/perth/ESM2G/oct282008/ESM2G_1860_mica_bergs_prescribed_co2
CM2.4 | 720 | 236 | 1.33 | 7027 | 24014 | 6005 (5530o) | 102340 (87836o+8670i+5640a) | 57734 (46052o+5958i+4769a) | /arch6/fjz/CM2.4/CM2.4SQ_A_Control-1990_D1
CM3Q | 264 | 186 | 8.26 | 237 | 5542 | 243 (218o+19a) | 12629 (9987o+2344a) | 6356 (4597o+1632a) | /arch2/wh/cm3/CM3Q_Control-1990_B5
C180 | 216 | 27 | 0.5 | 92 | 882 | 14 | 315 | 256 | /arch3/miz2/GCM/omsk/c180_hv1_amip1
CM2M | 110 | 110 | 5.34 | 101 | 1920 | 702 (601o+89a) | 8350 (6536o+1108a+681i) | 4963 (3583o+773a+587i) | /arch2/bls/rts/perth_2008_10/CM2.1_ESM/CM2M_09nov2008_smaxp005
C48L48 | 96 | 22 | 1.6 | 516 | 1557 | 293 (281c+12a) | 1477 (1263a+211c) | 1450 (1235a+211c) | /arch4/lwh/fms/perth/c48L48_am3p5_chem-gamma2
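As a rough check on the 1 PB figure (my back-of-envelope arithmetic,
not from the original report, assuming the disk footprint of a run is
its history + ts + av total scaled to a 100-year chunk):
\[
66 \times \tfrac{100}{68}\,(2148 + 808 + 10148)\ \mathrm{GB} \approx 1.3\ \mathrm{PB},
\qquad
11 \times \tfrac{100}{236}\,(24014 + 6005 + 102340)\ \mathrm{GB} \approx 0.6\ \mathrm{PB},
\]
so a total of order 1 PB for a full machine's worth of runs is
consistent with the table.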
- version mismatch: "re-versioning" tool a useful addition.
- things to say about GO-ESSP/ESG/CMIP5
- broad acceptance of native grids
- broad acceptance that AR4 questionnaire can be replaced by something
that can harvest more structured information
- example of volcano input datasets: controlled vocabulary for
keys/values?
- CONCIM (dt=1yr) to evolve more slowly than APPCIM (dt=1m)
- RCM issues, asked for IOPs of interest ahead of time.
- raised issue of vertical coordinates, seems to be general
acceptance of conversion software!
- PMIP experiments must be done with same model (same resolution) as
CMIP, so that we can directly compare past and future climate
sensitivity. Perhaps if you run PMIP, you must run CMIP with that
model?
- what about CFMIP? Should this be a component? Can we get cloud
feedbacks any other way? Implement the simulator is a high
priority. Need to talk about this at WGNE.
- IPSL: 2.5x1.5A 20; higher-res ocn for AOGCM than ESM; diurnal
cycle of convection from Lafore, Guichard, &c! interactive
aerosols. 80 NEC PEs for 1 year budgeted for all expts.
- CNRM: Arpege+NEMO: indirect effect (first only); completed 1000-yr
control, etc., for ENSEMBLES, also LGM run; CM4LR 2.8A+2O; CM4HR
4A+1O; perhaps 1A+0.5O; can run 250yr/mo with LR.
- MPI: ECHAM5+JSBACH veg + MPIOM + HAMOCC. already doing dec pred
with assimilation, limited skill; already doing "ENSEMBLES stream
2" with LR ESM: important for us to be doing ENSEMBLES! ESM for
AOGCM expts as well; atm res between T85-T159, 0.5L80 for ocn;
perhaps no interactive aerosols; expts start spring 2009, new
machine.
- CCCma: CanESM1, perhaps also to be used as AOGCM; similar to AR4
model. pretty good carbon cycle; interactive O3 from strat version,
to be used in CCMVal also. improved phys/chem but probably same res
as before. coupled historical forecast project (CHFP) at T63, maybe
T128, 1 or 0.5 ocn. runs start late 2009.
- CMCC-INGV: dec pred via EU COMBINE and ENSEMBLES; US collaboration,
is that Rosati? ECHAM5 T159+0.5 NEMO; PICNTL completed with
C-model, all via ENSEMBLES again! T30+2deg OPA. carbon
equilibration acceleration, how?
- CSIRO: T63+0.94; full aerosol chemistry; regional downscaling model
CCAM (conformal cubic); ACCESS model: N96 atm + OM3 res, ACCESS
model probably not available for AR5; will use mk3.5.
- MIROC: chem offline in AOGCM, runs start first half 2009: long term
t42+1deg; near-term HR T213+1/6 ocn + aerosols; 10-member
ensembles, also embedded regional ocean?; MR T85; HR timeslice 20km
+ 1km regional
- MRI/JMA: MRI-ESM; 20 km HR near-term integrations (time-slice and
coupled)
- Hadley: HADCM3 N48+1.26 yes, HiGEM N144+0.33 likely, HadGEM3
N216+0.25 possible; start early 2009. full veg + chem. Upgrading to
IBM early next year, model performance unknown.
- Hadley HiGEM for dec prediction. see above, 128p 0.75y/day.
- CCSM: prognostic aerosols; coupled chemistry. 2009 Ultra-HR 0.2atm
X 0.1 ocn this year at ORNL.
- GFDL
- GMAO; dec pred only, full o3/aerosols. 1x1L72 atm 0.5 ocn 10-member
ensemble + twice the res.
- KMA: using HadGEM2
- LASG (China): regional model CREM coupled to POM. No ESM. coupled
aerosol-chem available.
- CMA (China): NCAR flux coupler coupling MOM4+SIS? T42 atm + MOM4
OM3 grid. with chem. carbon cycle+aero+chem by end of 2008.
- DKC: AOGCM only, T159-0.5, no interactive aerosols.
- NorClim: HAMOCC+MICOM+CCSM for everything else; 1.9atm X 1 ocn.
- EC-Earth: ECMWF IFS extended to climate ranges (but key developers
are KNMI etc, ECMWF doesn't actually do runs: seems to be a way to
reassure people that ECMWF isn't invading their turf). T159L62+1deg
ORCA. Only dec pred.
- Discussion: standardized datasets for prescribed o3... what about
n2? oxidants?
- got only half of discussion on decadal expts, most of discussion
focused on misinterpretation (as "forecast")
long-term expts: Karl
AOGCMs
- 500+ yr PIcontrol, prescribed concentrations
- C20C, prescribed conc
- AMIP
- SRES replaced by RCPs labeled by approx rad forcing at 2100:
5 (next highest), 6 (high), 4.5 (mandatory), 2.7 (very high); extend
to 2300 (v. high for 4.5, recommended for other RCPs)
carbon cycle models
CMIP3 model yrs 500-8400 median 2200
CMIP5 1215+1125 + 640
data available to public dec 2010
sandrine bony: CFMIP-2
- additional diagnostic: COSP obs simulator: cfmip.net software
download? online calc means less output
- additional idealized runs
- additional process diagnostics
- present our CFMIP plans at WGNE
veronika eyring: ccmval
- ccmval-generated o3 timeseries may be available
- time/space resolution to be decided, possibly
dynamically-referenced coordinates (e.g height wrt tropopause)
pascale braconnot: PMIP
- coordinated expts "highly recommended" by CMIP?
filippo giorgi: RCMs
- examples; RegCM, PRECIS, RSM, WRF
- coupled RCMs, 2-way nesting
- 10 km typical, NH CRMs growing
- projects: NARCCAP, ENSEMBLES, PRUDENCE, AMMA, AFRMIP, ARCMIP, AIACC
- 6-hourly fields requirement for subset of runs (as in NARCCAP)
smg: WGOMD
- general acceptance of native grid data
karoly: IDAG
- discussion on volcanic/solar forcing datasets, there will likely be
more than one (~3-4) data sources, need to document.
Wed, 11 Jun 2008
FRE: ppmake
- moved to the fms/make directory, aliased to ppmake.
Order matters: if the same target has rules with several different
prerequisites, the first one that can be satisfied is used. I've used
this to make a clean implementation of HSM functionality. The rules
are:
HSM directories are called %.d, and are defined relative to the
root directories on work, ptmp and arch.
- a directory in work looks for itself in ptmp and copies it.
- a directory in ptmp looks for itself on arch and copies it.
- if %.d is not found on arch, look for %.cpio instead, at the same
point in the tree relative to archroot.
- we'd eventually like to transition out of cpio to tar. We can do
this by implementing a third fallback rule: if %.cpio is not found,
look for %.tar. (A sketch of these rules appears at the end of this
entry.)
The reverse traffic is a function of policy.
- directories on work get copied to ptmp.
- directories on ptmp get copied, or cpio'ed, or tarred, depending on
the policy. Perhaps even dependent on filesize, or user choice.
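A minimal GNU Make sketch of these fallback rules, for illustration
only (this is not the actual ppmake code; plain cp and cpio/tar stand
in for whatever copy and unpack commands ppmake really uses, and the
*ROOT variables are as in the frepp.mk draft further down):
# a directory in work looks for itself in ptmp and copies it
$(WORKROOT)/%.d: $(PTMPROOT)/%.d
        cp -rp $< $(@D)
# a directory in ptmp looks for itself on arch and copies it
$(PTMPROOT)/%.d: $(ARCHROOT)/%.d
        cp -rp $< $(@D)
# fallback: no %.d on arch, so unpack %.cpio from the same point in the tree
$(PTMPROOT)/%.d: $(ARCHROOT)/%.cpio
        mkdir -p $@ && cd $@ && cpio -iv -I $<
# eventual third fallback: a tar file
$(PTMPROOT)/%.d: $(ARCHROOT)/%.tar
        mkdir -p $@ && tar xf $< -C $@
# make uses the first of these pattern rules whose prerequisite it can
# find (or build), giving the %.d -> %.cpio -> %.tar fallback order.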
Thu, 5 Jun 2008
FRE: Continuing development of frepp.mk
Maintenance issues:
- rename frepp.mk and put it somewhere else, not in the
scripts/postProcess directory.
- rename the mkpp alias to ppmake or something... I frequently find
myself typing mkdir when I mean mkpp.
Current problems:
- parallel make is a problem: several of the lower-level scripts
appear to create intermediate files without unique names. For
instance timavg.csh creates a date-stamped namelist input file, but
it uses the date command with one-second resolution: if two timavg
jobs are launched in the same second, there's a name conflict. I
haven't looked into plevel or split_ncvars or list_ncvars so far.
- deletion of intermediate files is a problem; should I declare all
files in the work directory to be .PRECIOUS? i.e.
.PRECIOUS: $(WORKROOT)/%
- how to do a cheap LOCALCP on the same filesystem? I tried ln and
ln -s; both still make make try to remake...
- the LOCALCP is necessary because everything needs to end up in
pp/*, and I'm still struggling with the WORKDIR... how do you run
everything in some directory, and then have it end up in pp/*?
Wed, 4 Jun 2008
FRE: design of refrepp
The current frepp runs years in sequence... the 1981 job submits 1982
and so on. If you hit a "chunk" boundary, that job is longer, as it
does the chunk stuff.
Working with the example of a 5-year chunk, let's assume 1983 failed
but 1981, 82, 84 and 85 are available.
The most efficient implementation of refrepp would detect the 1983
hole, resubmit a frepp -t 1983 (which by default is one year), then
submit frepp -t 1985 to complete the chunk. (This will repeat the
one-year processing for 1985, but that can't be helped, at least in
the current frepp.)
1983 should finish running before you launch 1985... this is best done
by using qsub -hold_jid.
If 1981 and 1983 are both missing, they can both be launched
concurrently.
The refrepp script needs to:
- detect the frepp configuration: you can do this by interpreting
the FRE XML file; by letting freppcheck interpret the FRE file and
looking at the resulting directory tree; or by calling tree on the
pp directory (assuming you've got the right directory structure
there).
- detect the holes: you can let freppcheck do this; or you can run
tree on the output directory and do it yourself.
Implementation: you can either:
- produce a frepp command that creates the output, or
- look in the scripts/postProcess directory and rerun the scripts you
find there (assumes the scripts exist). This option is probably
safer, since the assumption of refrepp is that some scripts ran and
failed.
The refrepp script will continue to work with only year boundaries: it
looks like smaller run segments will have to wait for the new frepp
(as in the frepp.mk approach).
Fri, 23 May 2008
FRE: frepp.mk state issues
How does frepp.mk know the state of play?
- Availability of history files in $(ARCHROOT)/history tells where
the run is.
- Availability of files in $(ARCHROOT)/pp tells what's been
post-processed. (Delete files here to recreate them.)
- a history cpio file can contain multiple dates (currently cpio files
are made per year, not per segment). Unpack to recover.
- variables in a file are recovered by running list_ncvars.csh.
- components in the archive file can be got from the FRE XML or by
extracting (tar tvf or cpio -itI).
- other input files (e.g. GRIDSPEC) come from FRE. (For other input
files, how do we know when to issue a dmget? We can force dmget for
any file under $(ARCHROOT), but what about outside?)
I propose that frepp.mk create the pp directory tree in PTMPROOT and
WORKROOT when unpacking files, and create a Makefile in each. This
Makefile will include a state.mk file that contains the variables you
need, e.g. the list of variables in a file.
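A sketch of what such a generated state.mk, and the Makefile that
includes it, might look like (hypothetical names and contents, purely
to illustrate the idea):
# state.mk, written while unpacking the history cpio:
dates = 18500101 18510101
comps = atmos_month land_month
ncvars_atmos_month = temp ps precip
# Makefile in the same pp directory, consuming that state:
include state.mk
list:
        @echo dates: $(dates)
        @echo components: $(comps)
        @echo atmos_month variables: $(ncvars_atmos_month)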
Fri, 23 May 2008
FRE: frepp.mk
The first attempts at building frepp.mk have been quite enlightening.
The GNU Make manual is indispensable :-).
Here's a first cut:
# HSM.mk: makefile fragment for extracting files from archive files,
# using three-level storage model
#use csh, with no user .cshrc
SHELL = /bin/csh -f
# Directory trees in home, archive, ptmp, work follow the identical
# "experiment tree" as defined by EXPT
ARCH = /archive/$(USER)
PTMP = /work/$(USER)
WORK = $(TMPDIR)
EXPT = fms/omsk_vb/c48_am3p3
GRIDSPEC = /archive/bw/fms/init/cubed_sphere/mosaic/C48.M45.tripolar.mosaic.cpio
ifeq ($(suffix $(GRIDSPEC)),.cpio)
USES_MOSAIC_GRIDSPEC =
else
#write clause for no-mosaic case
endif
# define roots
ARCHROOT = $(ARCH)/$(EXPT)
HOMEROOT = $(HOME)/$(EXPT)
PTMPROOT = $(PTMP)/$(EXPT)
WORKROOT = $(WORK)/$(EXPT)
# list of available history files
histories = $(shell ls $(ARCHROOT)/history)
filelists = $(patsubst %.cpio,%.filelist,$(histories))
histdates = $(patsubst %.nc.cpio,%,$(histories))
# commands
TIMER = /usr/bin/time -f # timer for various commands
DMGET = dmget # retrieve from deep storage
FASTCP = /usr/cluster/bin/cxfscp -dprv # fast copy
CP = /bin/cp -p
FREGRID = /home/z1l/bin/tools_20071219/fregrid
FREGRID_FLAGS = --interp_method 'conserve_order2' --nlon 144 --nlat 90
FMSBIN = /home/fms/local/ia64/v10
NCVARS = $(FMSBIN)/list_ncvars.csh
TIMAVG = $(FMSBIN)/timavg.csh -mb
PLEVEL = $(FMSBIN)/plevel.sh
# not sure about .PRECIOUS, this will not get deleted if you C-c!
.PRECIOUS: $(PTMPROOT)/history/%.filelist
.SUFFIXES: .cpio .tar .filelist .complist .varslist
.PHONY: list setup
list:
@echo $(ARCHROOT)/history contains:
@echo $(histdates)
# make setup first
setup: $(PTMPROOT)/history $(WORKROOT)/history $(PTMPROOT)/pp $(WORKROOT)/pp
$(PTMPROOT)/history $(WORKROOT)/history $(PTMPROOT)/pp $(WORKROOT)/pp:
mkdir -p $@
# vpath %.cpio $(ARCHROOT)/history
# vpath %.tar $(ARCHROOT)/history
# vpath %.filelist $(PTMPROOT)/history
#everything in WORKROOT must come from PTMPROOT
$(WORKROOT)/%: $(PTMPROOT)/%
$(FASTCP) $< $(@D)
$(PTMPROOT)/history/%.filelist: $(ARCHROOT)/history/%.nc.cpio
$(DMGET) $<
cpio -it -I $< > $@
mkdir -p $(basename $@)
#how to make the same directory in WORKROOT?
# echo $(patsubst $(PTMPROOT),$(WORKROOT),$(basename $@))
# unpacking a history cpio file: unpack to directory ptmp/%
define unpack_cpio
$(PTMPROOT)/history/$(1).complist: $(ARCHROOT)/history/$(1).nc.cpio $(PTMPROOT)/history/$(1).filelist
cd $(PTMPROOT)/history/$(1) && \
cpio -iv -I $$<
#the following depends on our filenaming conventions
echo dates = `ls $(PTMPROOT)/history/$(1) |cut -f1 -d. |sort -u` > $$@
echo comps = `ls $(PTMPROOT)/history/$(1) |cut -f2 -d. |sort -u` >> $$@
echo tiles = `ls $(PTMPROOT)/history/$(1) |cut -f3 -d. |sort -u |grep -v '^nc'` >> $$@
endef
$(foreach date, $(histdates), $(eval $(call unpack_cpio,$(date))))
#gridspec
$(PTMPROOT)/gridspec: $(GRIDSPEC)
ifeq ($(suffix $(GRIDSPEC)),.cpio)
$(DMGET) $< # how do I know $< is on archive?
mkdir -p $@
cd $@; cpio -iv -I $<; echo Source $(GRIDSPEC) > README
else
#write clause for no-mosaic case
endif
%.nc.2dvars: %.nc
#list of variables in file with a horizontal domain
$(NCVARS) -st23 $< > $@
#regrid an atmos file
# ifdef USES_MOSAIC_GRIDSPEC
# $(shell echo here)
define fregrid_atmos
$(WORKROOT)/history/$(1)/$(2).$(3).nc: $(WORKROOT)/history/$(1)/$(2).$(3).tile1.nc \
$(WORKROOT)/history/$(1)/$(2).$(3).tile2.nc \
$(WORKROOT)/history/$(1)/$(2).$(3).tile3.nc \
$(WORKROOT)/history/$(1)/$(2).$(3).tile4.nc \
$(WORKROOT)/history/$(1)/$(2).$(3).tile5.nc \
$(WORKROOT)/history/$(1)/$(2).$(3).tile6.nc \
$(WORKROOT)/history/$(1)/$(2).$(3).tile1.nc.2dvars \
$(WORKROOT)/gridspec
$(FREGRID) $(FREGRID_FLAGS) --input_mosaic $(WORKROOT)/gridspec/atmos_mosaic.nc --input_file $$@ \
--scalar_field `cat $(WORKROOT)/history/$(1)/$(2).$(3).tile1.nc.2dvars`
endef
define atmos_loop
include $(PTMPROOT)/history/$(1).complist
$(foreach comp, $(comps), $(foreach d, $(dates), $(eval $(call fregrid_atmos,$(1),$(d),$(comp)))))
endef
$(foreach date, $(histdates), $(eval $(call atmos_loop,$(date))))
# endif # USES_MOSAIC_GRIDSPEC
Some notes:
frepp.mk gets the state of the run by looking at the history
directory (which will later extend to the pp directory).
- Certain things (e.g. GRIDSPEC, the list of pp components) can be
obtained from the XML. Certain other "state" issues (e.g. the list
of variables) have to be extracted from the history files. Since we
cannot assume that unpacked cpio files are available, we have to set
some variables as the result of a make. The best approach might be
to create all the Makefiles while unpacking.
- we have restricted work to happen in a stringent directory
hierarchy anchored at the various *ROOT directories. This makes the
Makefile hard to read? Having a local Makefile in each pp/*
directory might be better.
Thu, 17 Apr 2008
Computers in weather and climate
Fri, 22 Feb 2008
FMS: Designated I/O processors
We've toyed on and off with the idea of having I/O streamed from a
subset of PEs: currently you can either do single-threaded I/O from a
pelist, or all PEs do it; there's nothing in between.
Back in October 2007, I wrote a design doc for more flexible parallel
I/O which has yet to be implemented, mainly because on the platforms
we ourselves run, this hasn't become the rate-limiter.
However, some of our colleagues (specifically, MOM users in Australia
on a NEC platform) have very fast arithmetic, and thus are becoming
I/O bound. Tim Pugh proposes a somewhat different design. We are
currently evaluating both.
Thu, 21 Feb 2008
FMS diag_manager feature requests
Regional output on mosaics
This is a request from Steve Klein:
Is it possible to instrument diag_manager so that output can be
restricted to a subset of grid tiles? The application would be
something where regional output at high time-resolution is desired: it
would make sense to reduce data volumes by only doing this on the
subset of grid tiles in a mosaic (e.g. perhaps only one cube face)
that are needed to span the region in question.
CF standard names
We've already implemented a new character(len=*), intent(in)
:: standard_name argument to register_diag_field . The domain
expert is supposed to consult the CF standard name table and match our
diagnostic variables against a standard name.
Bruce Wyman suggests that rather than modify register_diag_field
calls, it would be better if diag_manager_init read a text file that's
part of the checkout, say cf_standard_name.txt which contains a lookup
table from "GFDL variable name" to "CF standard name".
I don't see why that's any easier... it seems to me that either way
the relevant domain expert has to go through all her
register_diag_field calls, find the standard_name, and add it
somewhere: either in the source code, or in a submission for inclusion
in cf_standard_name.txt. But if that's what people prefer, we can go
with it.
Thu, 21 Feb 2008
FMS: f90tohtml
Niki found this neat tool called f90tohtml that marks up f90 source
code in HTML, with cross-ref links (caller, callee, etc). Try
this cobweb page.
It looks like f90tohtml hasn't been updated in a while... perhaps we
should try to extend its capabilities? Ideas include:
- mark up variables with a title tag that gives the definition, and a
link to the declaration line.
- figure out how to CVSify it, so you regenerate it for each
checkout.
- perhaps link to the model dev database?
Sun, 3 Feb 2008
FMS: enabling concurrent ensembles
FMS provides support for running ensembles (specifically, the Ensemble
Adjustment Kalman Filter of Zhang et al 2005) where each ensemble
member runs concurrently on an independent set of processors. The
concurrency requirement ensures that each ensemble member has its own
address space, and can modify anything within it, including module
global variables, without affecting any other ensemble member. This is
in contrast to the position Max Suarez calls the "moral high ground",
where components must be written so it is always possible to create as
many instances as you like of any component within a single address
space. Be that as it may, we are sticking to the current method
developed by Matt, Shaoqing and myself, where ensembles can only be
run concurrently in separate address spaces.
The ensemble code is now mature to the point where we're planning to
make it part of the FMS infrastructure. Bringing it into FMS is
proposed for the perth release: this also means that henceforth the
RTS will include a regression test in ensemble mode.
Niki has provided a good summary of the current code status. The basic
change to the plumbing in coupler_main is straightforward. Some of the
more extensive changes have to do with I/O.
The basic design of I/O relies on the fact that each ensemble member
has its own pelist, which is then assigned a name (e.g. ENS01, ENS02,
...); then any file foo.nc which is private to an ensemble member gets
automatically renamed foo.ENS01.nc, etc., by the I/O layer. The I/O
code is currently instructed to add this string by the argument
append_pelist_name, which, when TRUE, tells the I/O to change the
names.
The code changes appear extensive because this append_pelist_name
argument is passed up and down many deep call chains. I propose
instead replacing this with an internal logical variable
append_pelist_name in mpp, which can be altered by the user:
logical :: append_pelist_name = .FALSE.

subroutine mpp_append_pelist_name()
  append_pelist_name = .TRUE.
end subroutine mpp_append_pelist_name
... so that it gets set to TRUE in ensemble mode but remains FALSE most
of the time.
With these changes, I think the ensemble code becomes vastly
simplified: please review the code changes in this light.
The code is scheduled to be in the perth release (June 2008).
Fri, 1 Feb 2008
FMS: enabling the river code on mosaics
Review of current parallel river model
The first version of a parallel river model in LM2 (see
this paper section) was based on a basin decomposition. An entire
river basin network was maintained on a single processor. The amount
of computational work in a basin is proportional to the number of
cells in the basin. So the work was partitioned among processors by
sorting the basins in descending order of size, then going down the
sorted sequence and assigning each basin to the PE that so far had the
least amount of work. (See routine sort_basins.)
This non-traditional decomposition had some limitations. One is that
the largest basin limits the scalability of this approach. For the
30-minute grid of Vorosmarty et al (2000) this limited the scalability
to about 30. The second is that, since the code uses a non-standard
decomposition rather than the domain2D type, we aren't able to use
traditional distributed I/O from diag_manager. Instead we gather the
data onto one processor and pretend to diag_manager that this is
undistributed data. Even the gather onto one processor cannot use the
standard gather routine mpp_global_field, so we have a number of
strange uses of mpp_sum to effect a gather:
wrk2 = 0.0
wrk2(isc_lnd:iec_lnd,jsc_lnd:jec_lnd) = runoff(:,:) &
*gland_mask(isc_lnd:iec_lnd,jsc_lnd:jec_lnd)
call mpp_sum(wrk2, nlon_lnd*nlat_lnd)
before calling diag_manager , horiz_interp and so on.
Motivation to rewrite
The old method worked for CM2-class models where the performance of
the river code wasn't an issue: it was too small a fraction of the
total model time. We were also working under the assumption that
regridding a network grid was somewhat insurmountable: we were wedded
to the Voro 30-minute grid.
We are currently revisiting this issue for various reasons.
- The atmosphere is moving to a cube-sphere grid. In the FMS
implementation of this grid the cube faces are on different
processors: thus a "basin decomposition" is not possible (basins
may cross face boundaries). For algorithmic reasons we may want to
keep the land grid identical to the atmosphere grid.
- We are getting more confident (see Fekete et al (2001)) that it's
possible to scale up river networks to any target grid in a
reasonable amount of time. For algorithmic reasons we may want to
keep the river grid the same as the land grid. Generation of river
network grids is a project of Chris Milly and Krista Dunne.
- We are now expecting the river to have a complete biogeochemistry
and carry many tracers. The performance assumptions of CM2 may no
longer apply to CM3 and beyond.
Given these considerations, we are proposing a new river parallel
infrastructure that will address many of these concerns: a framework
where
- atmosphere, land and river grids are allowed to be independent;
- when the grids indeed overlie, the code will make appropriate
optimizations;
- scalability is not limited by catchment size.
Current proposal
The current architecture has the river physics performed within an
update_river step, which is surrounded by calls that interpolate and
transfer data between land and river grids.
update_river calls river_physics_step which actually performs the
physics
travelnow = maxtravel
do travelnow = maxtravel, 1, -1
if( ncells(travelnow).EQ.0 )cycle
call mpp_clock_begin(physicsclock)
call river_physics_step (River, petravel, travelnow )
call mpp_clock_end(physicsclock)
enddo
maxtravel is 72 on the Voro 30-min grid (that's the Nile I think...)
but most basins have a shorter length. ncells detects whether a basin
has any work for the current travel step travelnow .
Within river_physics_step , most of the action is local, on the point
(i,j) . The exception is this code section, where you send your outflow
to the inflow at the downstream point:
idest=i+di(River%tocell(i,j))
jdest=j+dj(River%tocell(i,j))
if (idest.eq.0 ) idest=River%nlon
if (idest.eq.River%nlon+1) idest=1
if (jdest.eq.0 .or. jdest.eq.River%nlat+1) then
write (*,*) 'WARNING: JDEST=', jdest
endif
River%inflow(idest,jdest) = &
River%inflow(idest,jdest) + River%outflow(i,j)
We propose to rebuild the code as follows.
river calls mpp_define_domains and gets a halo-less mosaic
decomposition. (If it is inheriting the land grid, it can just use
Land%domain ).
- When idest and jdest reach the domain edge, they send their outflow
into a buffer (if jdest==jec the outflow gets copied into e_buf; if
idest==iec and jdest==jec, it goes to ne_buf). Once the buffers are
full, we call mpp_send (non-blocking) to send the buffers to the
appropriate destination.
- at the beginning of the i,j loop, river_physics_step looks in
River%fromcell to see if it's expecting an inflow from another PE:
if so, it calls mpp_recv (blocking) and receives the inflows.
(River%fromcell is not currently in the River data structure, but
it's trivial to compute and define.)
- Since the communication is point-to-point and not a halo update, it
is only called when necessary, and the amount of communication is
small. If this turns out to be a problem (unlikely) it's not hard
to make this communication overlap with computation.
- The "gather using mpp_sum" can be eliminated: we can directly apply
horiz_interp and diag_manager calls to the distributed River fields
with the River%domain argument.
- The code is set up to call horiz_interp between land and river. In
the new code this is optimized so that this call is bypassed with
simple copies when the river and land grids are in fact the same.
Testing
We propose to test this using the current river_solo test case that
uses the Voro 30-min grid. This grid will be cast as a mosaic with two
domains, and should yield the same answer as the original code to
within roundoff. Estimated completion: 1 March 2008.
Other code comments
- I find the current structure, where new tracers have been added
afterwards (using a _c suffix), to be cumbersome. Could we, for
instance, combine River%storage(:,:) and River%storage_c(:,:,:)
into a single array River%storage(:,:,:), with one extra element in
the third dimension, whose first element is the original %storage
field?
- The current structure has led to possible typos: e.g. in
river_physics_step (line 160) I see a line like
River%storage_c(i,j,:) = River%storage_c(i,j,:) &
+ (River%inflow_c(i,j,:) - River%outflow_c(i,j,:) &
+ River%infloc_c(i,j,:) ) * River%dt_slow
but there's no corresponding line for River%storage(i,j) .
Mon, 28 Jan 2008
FMS: ifort problem?
What is a linker supposed to do if it finds multiple instances of a
main (i.e. a program entry point) in its list of objects and libraries?
On the Origin, this was a fatal error: the linker ld would fail with
an error listing all the available instances of main .
We have inadvertently set up our cubed-sphere based compilations to
have multiple main programs. In particular, the directory
atmos_cubed_sphere/tools/pp contains files cub2cub_driver.F90 and
cub2latlon_driver.F90 that both have entry points.
Nonetheless, ifort compiled successfully with no error! And the
resulting executable works correctly.
The linker command line from our Makefiles gives us something like:
ifort libcoupler.a libatmos_dyn.a libatmos_phys.a libice.a libland.a libocean.a libfms.a \
-Wl,-V,-M -lnetcdf -lmpi -lsma -o fms_c48_am3p3.x
It turns out that cub2cub_driver.o and cub2latlon_driver.o are in
libatmos_dyn.a whereas coupler_main.o is in libcoupler.a... purely by
virtue of the fact that our Makefile put the coupler library first on
the command line, we get a successful link!
This is clearly fragile and unacceptable. And after scouring the ld
manpage (for strings like main , entry , program , ...) I can find no
flag even to control how ifort should respond.
Two things:
- this should be taken up with Intel.
- we should modify our codes or checkouts so that these other entry
points never get linked.