B.N. Lawrence[1], R. Drach[2], B.E. Eaton[3], J. M. Gregory[4], S. C. Hankin[5], R.K. Lowry[6], R.K. Rew[7], and K. E. Taylor[2].
The Climate and Forecast (CF) conventions governing metadata appearing in netCDF files are becoming ever more important to earth system science communities. This paper outlines proposals for the future of CF, based on discussions at an international meeting held at the British Atmospheric Data Centre in 2005. The proposal presented here is aimed at maintaining the scientific integrity of the CF conventions, while transitioning to a community governance structure (from the current situation where CF is maintained informally by the original authors).
The Climate and Forecast (CF) metadata conventions are designed to promote the processing and sharing of data stored in files created with the netCDF API[8]. Fundamental features of the conventions are that CF-aware software can automatically determine the space-time location of variables (facilitating analysis and graphical display), and metadata describing each variable is sufficiently detailed to determine whether variables from different sources are comparable. The general principles of CF design are enumerated in Gregory (2003)[9]:
CF was initiated in the climate modelling community because of the lack of an adequate existing standard (although COARDS was a good basis, with which CF remains backward compatible). CF is becoming the de facto standard for storing outputs of atmospheric, ocean and climate models. Interest may expand to related "earth system" communities, but few users have thus far embraced the full self-descriptive capabilities enabled by the CF-standard, perhaps because software that can interpret all of this information has yet to be developed. Furthermore, "standard names" have not yet been agreed on for certain quantities, and in some cases the existing conventions are unsuitable or inefficient. To avoid fragmentation of the potential user community, such issues must be addressed more rapidly.
CF use is growing. As CF becomes more important to a diverse range of communities, its boundaries of applicability are being stretched, and the workload on the original authors has grown to the point where they cannot keep pace with requests for extensions. Questions have been raised about how (and when) CF can evolve, but they cannot be answered with confidence while authorship of CF remains a marginal activity on top of already full workloads, and while the authors’ personal interests represent only part of the range of the user community. Moreover, the widespread use of CF means that its development must be undertaken with great care and proper consultation.
This paper outlines some of the immediate issues that the CF community is confronting, beginning with how CF transitions to community ownership and maintenance. It is based on an extended discussion held at the British Atmospheric Data Centre in June 2005 as part of the annual Global Organisation for Earth System Science Portals (GO-ESSP) meeting.
If CF is to become a widely accepted community standard, it is imperative that the community (a) takes ownership of and responsibility for the future trajectory of the standard and (b) invests substantial resources in its development. To do this we need first a clear definition of the boundaries of the problem (who are the CF community?) and then to set in place a structure for governance and development which:
In some circumstances these three “what” requirements will pull in different directions. When this happens, one of the requirements of the governance structure is to impose a set of constraints on CF evolution that minimises this tension. To support the requirements we also need agreement about three “how” issues:
In this section we discuss the definition of “who is the CF community” and these three “how” issues in regard to both what we believe is achievable in the near term, and in terms of the three “what” requirements.
Climate modellers initially dominated the CF community, but recently others (e.g., observational communities) have seized on CF as the solution to their own format description problems. As these communities have expressed their needs, new problems have arisen: for example, different terminologies are used, Earth geometry is real rather than idealised, and standard names are needed for “raw” instrumental measurements before they are converted to data which are independent of the method of observation. Other problems can be anticipated, particularly as interdisciplinary work progresses. The scope of CF standard names, for example, will likely need to include observational biological oceanography, atmospheric chemistry, forest-type classification, etc.
Given that CF effectively consists of three sets of conventions (vocabulary management; semantic concepts such as axes and cells; and format-specific conventions, for now netCDF only), different communities may wish to be involved in different parts of CF's evolution. For example, some users of CF are interested only in the CF standard names and store their data in other formats (HDF-EOS, NASA-Ames, etc.). This is problematic because some parameters need more than the standard name to be fully described (e.g., the cell_methods attribute describes the averaging); i.e., vocabulary and concept are not entirely separated.
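The point that a standard name alone may not fully describe a parameter can be illustrated with a minimal Python sketch. It is not part of the CF standard or of any CF software: the helper names parse_cell_methods and comparable are hypothetical, the two example variables are invented, units are compared as plain strings rather than converted, and the parser handles only simple "name: method" entries (no parenthesised qualifiers such as intervals).

```python
def parse_cell_methods(text):
    """Split a simple cell_methods string into (dimension names, method) pairs.

    Simplified sketch: tokens ending in ':' are treated as dimension names,
    and the first token after them is taken as the method. Parenthesised
    qualifiers are not handled.
    """
    result = []
    names = []
    for token in text.split():
        if token.endswith(":"):
            names.append(token[:-1])
        elif names:
            result.append((tuple(names), token))
            names = []
    return result


def comparable(a, b):
    """Hypothetical check: same standard_name, same units string, same cell methods."""
    for key in ("standard_name", "units"):
        if a.get(key) != b.get(key):
            return False
    return parse_cell_methods(a.get("cell_methods", "")) == parse_cell_methods(
        b.get("cell_methods", "")
    )


# Two invented variables that share a standard name but differ in how they
# were processed over time.
monthly_mean = {
    "standard_name": "air_temperature",
    "units": "K",
    "cell_methods": "time: mean",
}
daily_max = {
    "standard_name": "air_temperature",
    "units": "K",
    "cell_methods": "time: maximum",
}

print(comparable(monthly_mean, daily_max))  # the standard names match, the cell methods do not
```

A user who stored only the standard name in another format would lose the distinction between these two variables, which is the difficulty noted above.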
For CF to fulfil its mission as a medium of interchange for complex information, the needs of users must be balanced with those of data producers. In addition, the input from developers of software that can interpret and use the CF metadata descriptions is vital. (It was noted at the GO-ESSP meetings in both 2004 and 2005 that only a subset of CF-defined metadata is interpretable and actually used by any real software.) Furthermore, despite the name, forecasters have not been greatly involved in developing CF.
The future of CF will inevitably involve use in earth system models, and so the scope of CF will need to address the requirements of a wide range of communities. Furthermore, CF does not exist in a vacuum. There are other projects that are developing community standards for encoding data and describing it: notably the Open Geospatial Consortium; the WMO metadata initiatives; and controlled vocabularies in use in other parts of the earth system science community. A significant challenge for the CF community will be finding ways of interacting productively with these communities. Meanwhile a working definition of the “CF community” needs to be “those who choose to use netCDF to store earth system data in a self-describing way” even though this initially disenfranchises those who wish to use CF conventions for other data formats.
There are four basic funding mechanisms that could support the maintenance and development of the CF conventions:
Currently all CF development is funded by the fourth of these options, which means that CF development is not responsive enough, and it is leading to untenable workloads for the key players.
Questions that have been raised concerning the first two funding approaches listed above include:
Given the right community management structure, a combination of funding approaches could be used, and indeed given where CF is now, a combination of these approaches is inevitable during what will have to be a transition phase.
In the near future an element of funding from benevolent organisations is expected. Expressions of commitment to support parts of CF development have been received from Unidata, from the Program for Climate Model Diagnosis and Intercomparison (PCMDI), and from the NERC Centres of Atmospheric Science (NCAS): Unidata may assist with the community interface, PCMDI will provide support for the CF development process, and the NCAS/British Atmospheric Data Centre (BADC) will be contributing direct support for standard name maintenance, development and evolution. However, even with this extra support, CF will not be able to evolve as it needs, and other funding arrangements will be necessary. Such arrangements will not be possible without visible leadership, accountability and formal encoding of the process of governance.
In keeping with the past development of the CF standard, the GO-ESSP participants strongly favoured a continued consensus-based governance procedure. Wide participation should be encouraged with input sought from individuals representing a variety of disciplines, perspectives, and geographical locations. Individual influence on CF's future should be commensurate with the value of suggestions made and independent of funding.
There is a strong desire within the existing CF community that, regardless of funding, the evolution of CF should be carried out with due process. Accordingly, we propose 1) that a collection of reference files, illustrating correct CF encoding and suitable for the testing of data-reading applications, be maintained at every stage and 2) that CF modifications occur in a manner that conforms to the following sequence:
The process will be different in detail for modifications (usually additions) to the standard name table on the one hand, and to the CF standard on the other. The former are more numerous, require faster turn-round, but generally do not involve wide community debate or trial implementations; an appropriate turn-round time is one month. For the latter, three months would be more realistic.
We also propose the establishment of two standing CF committees, the membership of which would be open to those with significant interest and time to commit to taking CF forward:
1. Conventions Committee, to be responsible for developing changes to the CF standard, and to include (but not be limited to) representatives of those who have reference implementations, who can provide feedback on the practicality of CF initiatives and validation of tools which wish to be described as “CF-compliant”.
2. Standard Name Committee, to be responsible for adding standard names to the CF convention and working towards interoperability with other vocabulary maintainers.
Although CF must remain community-driven, its continuity can only be assured if some formally established board, responsive to community needs, assumes responsibility. Accordingly, it is also proposed that a CF Governance Panel be established under the auspices of relevant major international programmes, which would be expected to appoint all or nearly all nominees to membership. The next section discusses the establishment of such a panel.
In view of their responsibility to the community, the membership of the committees will be advertised on the CF website. The job of the committee members will be to take an active interest in the community debate on new developments (process outlined above), participating personally where appropriate. On behalf of the community, the committees will take the required decisions within their respective remits by consensus among their own membership in cases where community consensus is not evident because public debate is inconclusive.
Each of the CF committees will include a permanent funded member of staff, viz. the manager of CF standard names and the manager of CF conventions. Both individuals must be skilled technical document editors, as their primary duty will be the maintenance of the CF standard documents. The latter will most likely be a software engineer, able to appreciate issues involved in both data processing and models. The former needs an understanding of modern metadata frameworks and a broad scientific interest, since to deal with requests for new standard names (often coming from those who are not themselves the scientists responsible for producing the data) it is necessary to understand what the quantities are and to become familiar with the terminology in various fields. If past experience is a guide, the development of CF will depend largely on these two individuals, who will be the first point of contact for requests for modifications and who will carry the process forward by their own contributions to discussions and by deploying appropriate technology to facilitate community involvement. In particular, it will be part of their job to provide reference files and implementations. Although the responsibility will lie with the committees corporately, the other members will probably have only part-time involvement and the vagaries of their individual workloads should not be allowed to preclude timely progression of CF. The lack of anyone with time to carry out the role of manager of CF conventions over the last year or so, for example, has meant that several significant developments have been agreed on the CF email list but not yet implemented in the standard (extension to cell_methods, standard_name parameters, relation of formula_terms and bounds, relation of forecast and validity time).
In both committees we recommend that:
o Insofar as possible, the principle of “no consensus, no change” should be followed.
o The chair should be elected to serve for a period of three years, although the membership should be self-selected. Where and when possible, the secretarial function should be funded.
o Insofar as possible, a geographical distribution of members should be encouraged – partly by nominating potential candidates to self-select as required.
Other ad-hoc groups may be needed to consider various issues such as those listed in Appendix One. Such areas, while heavily overlapping with the CF core, may involve developing branches of CF which may initially be in conflict. Such conflicts will need to be resolved through a community discussion at the committee level.
As outlined above, a formal governance framework is also desirable. There are a number of possible candidate organisations that could provide a framework, but reflecting the history of CF, we here recommend that initially the World Climate Research Programme's (WCRP's) Working Group on Coupled Modeling (WGCM) take formal responsibility for CF.
The WGCM and Working Group on Numerical Experimentation (WGNE) have been established to work together and with the international climate and weather modeling communities to establish "an integrated approach to climate modelling in the WCRP." Part of their charge is to promote coordinated experimentation, which requires sharing data among centres. For the sharing of climate model output, the format of choice has usually been netCDF, and in some recent WGCM-coordinated projects, the CF-conventions for metadata have been adopted (e.g., simulations in support of the IPCC's Fourth Assessment Report).
The WGCM has expressed interest in data standards for climate model output and has been kept abreast of the CF-developments. Recent discussions at the Ninth Session of the WGCM led to an invitation to the original CF authors to formally request appointment of a panel that under the WGCM would provide oversight for the governance of the CF conventions. It is therefore proposed that the WGCM establish a CF Governance Panel charged with the following responsibilities:
The WGCM will appoint members of the panel, which would specifically include (but not necessarily be limited to[10]) representatives from: 1) the WGCM, 2) the leaders, or their nominees, of groups (initially PCMDI and NCAS) contributing significant resources in support of CF, and 3) the chairs of the Conventions Committee and Standard Name Committee described in section 2.3 above. Although this panel would be responsible for stewardship of CF, it would not have any special responsibility for or influence on its technical content. The terms of reference for the CF Governance Panel appear in Appendix Two.
As CF usage widens, it may be necessary to consider alternative parent bodies (to the WGCM) for the CF Governance Panel, to provide alternative funding strategies and/or better represent community expectations of governance. Accordingly, both the WGCM and CF panel will regularly review whether other governance strategies would be more effective.
The two main priorities in the near future are to spread the load of CF management by making better use of technology (including a website, issue tracking, etc.) to exploit the existing in-kind contributions, and to agree upon a strategy for establishing a longer-term funding stream to support CF activities. The WGCM's CF Governance Panel needs to be established. At the same time, the community needs to buy into the future management structures outlined here (or suggest alternatives). The GO-ESSP meeting proposed a formal timetable to achieve that buy-in and move from the current situation to a new community-managed framework:
Throughout this process, interested parties should lobby any or all national or international bodies that could provide ongoing funding for CF maintenance and development.
There are a large number of issues which need addressing in the near future, some of which have the potential for requiring divergence in the CF conventions, and this is partly why an effective CF governance structure is needed.
Issues include:
Constitution of the CF Governance Panel
The CF Governance Panel will come into existence on 1st October 2006 and assume its duties and responsibilities from that date. Until that date, the original CF authors will have responsibility for the governance of the CF standard.
The panel will initially be appointed by the WMO/WCRP Working Group on Coupled Modelling (WGCM) to govern the development of the CF standard on behalf of the user community. If the WGCM wishes to transfer or share responsibility for CF with another permanent committee of an international organisation, it may decide to do so with the agreement of both CF committees.
The panel will include, but not be limited to, members of the WGCM, the leaders, or their nominees, of organisations contributing significant resources in support of CF, and the chairs of the CF committees on conventions and standard names.
Terms of reference of the CF Governance Panel
The panel will be responsible for the stewardship of CF, but will not have any special responsibility for or influence on the technical content of CF.
The panel will ensure that its own constitution and terms of reference, those of the CF committees, and the aims and general principles of design of the CF standard are published. These may be altered by the panel only after appropriate public debate and with the agreement of both CF committees.
The panel will promote and help integrate the use of CF across WCRP programmes, the broader programmes of WCRP's sponsors (ICSU, WMO, and IOC), and other interested communities. In particular the panel will attempt to influence developing metadata standards of WMO and other international organisations so that they accommodate the CF standard.
The panel will encourage continued support of CF by benevolent organisations and explore additional funding mechanisms if necessary.
The panel will appoint people who have nominated themselves or agreed to serve as members of the conventions committee and the standard names committee. In doing so it will have regard to the expertise and interests relevant to the remit of each committee and will attempt to obtain representation of a variety of communities and a geographical spread of members. The maximum and minimum numbers of members of either committee may be amended by agreement between that committee and the panel. The panel will ensure that the current membership of the two CF committees is published.
Constitution of the CF committees
Anyone with sufficient time, interest and expertise is qualified to serve as an ordinary member of the conventions committee or the standard names committee or both. Prospective ordinary members will nominate themselves. Before 1st October 2006, members of the committees will be appointed by the original CF authors. From 1st October 2006, they will be appointed by the CF Governance Panel. Committee members judged by the panel to have been inactive for an extended period will be asked to resign. Members may choose to resign at any time and must retire after five years, but may be reappointed. The maximum number of ordinary members of each committee is sixteen and the minimum is four.
The initial chairs of the committees will be appointed by the original CF authors. Whenever a vacancy arises subsequently, each committee will elect its own chair from among its ordinary members. The chair may resign the office at any time and must retire after three years, but may be reelected.
Each committee will have authority and responsibility for the development of the aspects of the CF standard within its remit, as detailed in its terms of reference, having regard to the aims and general principles of design of CF. It will discharge its responsibility by overseeing and contributing to public debate on how the standard should be expanded or altered, and by deciding after debate whether and what changes will be made.
Each committee will be assisted in its responsibility by a permanent and funded member of staff, who will be appointed in a manner decided by the CF Governance Panel, and who will be an ex-officio member and secretary of the committee. The manager of CF conventions will be the secretary of the conventions committee and the manager of CF standard names will be the secretary of the standard names committee. The committee will propose priorities for work by its secretary.
Each committee will ensure that appropriate means are made available for making proposals and carrying out debates in a way which is visible and open to participation by all interested parties, and for retaining a permanent public record of debates and of any decisions made by the committee. It will decide the period of time which should be allowed between a proposal being made publicly and a final decision being required, and any other procedures necessary. These procedures should assume that community consensus will generally be followed in deciding for or against changes, but the committee will make the decision itself if no consensus is evident or the public debate is inconclusive. The committee may decide exceptionally to make no change although public consensus is in favour of a change, if it judges that the proposed change would be damaging to the CF standard. A committee member who believes this to be the case for a particular proposed change should notify the other members of this opinion before the time arrives by which a decision must be made according to the rules, and may request the committee to take the decision itself, rather than allowing public consensus to decide. If such a request is not made, the secretary will decide what is the outcome of the public debate and will make or not make changes accordingly.
A decision by the committee on changes to the CF standard within its remit or on any procedural matter will require a simple majority of the membership to be in favour.
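The decision procedure described in the two paragraphs above can be summarised in a short Python sketch. It is an illustration only, under stated assumptions: the function names are hypothetical, "a simple majority of the membership" is read as strictly more than half of all members (not only of those voting), and the outcome of the public debate is represented as True (consensus in favour), False (consensus against) or None (no consensus or inconclusive).

```python
def committee_vote_passes(votes_in_favour, membership_size):
    """Assumed reading of 'simple majority of the membership':
    strictly more than half of all committee members are in favour."""
    return 2 * votes_in_favour > membership_size


def decide_change(public_consensus, committee_takes_decision,
                  votes_in_favour, membership_size):
    """Return True if the proposed change is made.

    public_consensus: True, False, or None (no consensus / inconclusive).
    committee_takes_decision: True if a member has asked the committee to
    decide itself rather than letting public consensus decide.
    """
    if public_consensus is None or committee_takes_decision:
        # The committee decides by vote of its membership.
        return committee_vote_passes(votes_in_favour, membership_size)
    # Otherwise the secretary follows the outcome of the public debate.
    return public_consensus


# Public consensus in favour, but a member asked the committee to decide
# and only 3 of 16 members support the change.
print(decide_change(True, True, 3, 16))
```

Under this reading, a committee of the maximum size of sixteen needs nine votes in favour, and one of the minimum size of four needs three.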
The committees may not devolve responsibility for the CF standard to other bodies, but they may encourage the formation of other permanent or temporary ad-hoc groups to debate issues and propose developments to CF.
Each committee will ensure that the parts of the standard for which it is responsible, any supporting documents and resources, and the procedures for proposing and deciding changes are kept up-to-date and made publicly available.
The committees may not change their constitutions or terms of reference themselves, but each may propose changes to be made by the CF Governance Panel.
Terms of reference of the conventions committee
The conventions committee will be responsible for the development of the CF conventions constituting the CF netCDF standard, except for the definition of standard names and of any other aspects of controlled vocabulary in the appendices to the standard that it agrees with the standard names committee should be within the remit of that committee.
The conventions committee and standard names committee will together define the format of the standard name table.
The conventions committee will have an interest in implementation of CF metadata conventions corresponding to the CF standard in other file formats and media apart from netCDF.
The conventions committee will be responsible for the CF conformance document and for deciding what CF conformance means.
The membership of the conventions committee should include representatives of those who maintain widely used software which follows the CF conventions, especially those which the committee regards as reference implementations.
Terms of reference of the standard names committee
The standard names committee will be responsible for the definition of CF standard names and of any other aspects of controlled vocabulary in the appendices to the CF netCDF standard that it agrees with the conventions committee should be within its remit.
The standard names committee will be responsible for maintaining the standard name table. The standard names committee and the conventions committee will together define the format of the standard name table.
The standard names committee will have an interest in working towards interoperability with other vocabulary maintainers.
The standard names committee will make proposals for modification of conventions when it appears that names cannot be satisfactorily defined within the prevailing conventions.
The membership of the standard names committee should include representatives of the various scientific user communities of the CF standard.
[1] NCAS/British Atmospheric Data Centre, Rutherford Appleton Laboratory, U.K. Correspondence to: b.n.Lawrence@rl.ac.uk
[2] Program for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, U.S.A.
[3] National Center for Atmospheric Research, U.S.A.
[4] NCAS/Centre for Global Atmosphere Modelling, University of Reading, U.K.
[5] NOAA Pacific Marine Environmental Laboratory, U.S.A.
[6] British Oceanographic Data Centre, Proudman Oceanographic Laboratory, U.K.
[7] Unidata, University Corporation for Atmospheric Research, U.S.A.
[8] http://www.cgd.ucar.edu/cms/eaton/cf-metadata/ as of July 5, 2004, describing CF-1.0
[9] http://www.cgd.ucar.edu/cms/eaton/cf-metadata/clivar_article.pdf as of July 5, 2004, document dated November 6, 2003.
[10] Indeed, other groups and disciplines for which CF is important will be encouraged to become involved in all levels of CF governance.