Cyberinfrastructure
Cyberinfrastructure can be defined as “the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor” [USA National Science Foundation Blue Ribbon Advisory Panel on Cyberinfrastructure]. The mission, then, of the USA-NPN Cyberinfrastructure Working Group is to design and implement these needed technologies, as well as to identify and develop the necessary personnel and integration components, in support of the broad mission of integrating a wide variety of phenological science data sources.
USA-NPN Cyberinfrastructure (USA-NPNCI) Goals
Ensure that phenological data are:
• Available – the computing systems in the USA-NPNCI employ appropriate best practices and operate continuously. Well-established practices are used to ensure that the location and content of data can be readily identified.
• Usable – data are stored in a stable, reliable, and interpretable data retrieval system.
• Reliable – The USA-NPNCI includes a process for QA/QC of data, ensures that data are not inadvertently changed, that all changes are logged, and that users have tools to verify the integrity of data which they have entered.
• Shareable – data products are complete, subjected to quality assurance, formatted for use and documented for interpretation by others.
• Easily integrated – data and products are consistent with data exchange standards, and mechanisms are in place to ensure interoperability with related data sets and information systems.
• Interpretable – the data are routinely summarized, transformed into useful information, and reported in formats designed for specific clients.
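As an illustration of the "Reliable" goal above, the following is a minimal sketch of how a contributor might verify that a stored record still matches what was entered. It assumes records are simple key/value structures and uses a SHA-256 digest over a canonical JSON serialization. The function names and the observation fields are hypothetical and are not part of any USA-NPNCI specification.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """Return a SHA-256 hex digest of a record in canonical JSON form."""
    # sort_keys and fixed separators make the digest independent of key order
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_record(record: dict, expected_digest: str) -> bool:
    """Check a record against the digest issued when it was submitted."""
    return record_digest(record) == expected_digest


# Hypothetical observation; the field names are illustrative assumptions only
submitted = {
    "site": "example-site",
    "species": "Acer rubrum",
    "phenophase": "first leaf",
    "date": "2007-04-12",
}
receipt = record_digest(submitted)  # digest handed back to the contributor

stored = dict(submitted)
unchanged_ok = verify_record(stored, receipt)  # True: nothing was altered

stored["date"] = "2007-04-21"  # simulate an inadvertent change
tampered_ok = verify_record(stored, receipt)  # False: the digest no longer matches
```

A digest of this kind only detects that a change occurred. Logging who changed what, and when, would need a separate audit record.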
Principles and Assumptions
To meet these broad goals, the USA-NPNCI will be consistent with the following principles and assumptions. The system must provide to users:
• A simple web interface to input and check the contributed data
• A standardized and well-documented web service interface for contributing, searching, and retrieving data and metadata
• A broad range of USA-NPN phenological data products and analysis tools.
• Searchable metadata describing phenological data maintained by others, including information on how users can access that data. Eventually, this may develop into a full-fledged portal for a network of phenological data.
• Information about and a portal to phenological data maintained by others, both within the USA and in other countries.
• Information and interfaces tailored to specific needs of researchers, educators, students, and decision makers.
Implementation Assumptions
The overall implementation and development of the USA-NPNCI will be guided by these assumptions:
• The implementation of the USA-NPNCI will be based, as much as possible, on existing environmental and climate science cyberinfrastructure, such as that from NBII, GEON, and NEON.
• Content will be provided by the USA-NPN community, not the cyberinfrastructure support staff.
• Data integrity and cybersecurity are essential, primary considerations.
• While there may be a master repository for metadata describing phenology data relevant to the USA, it is highly unlikely that there will be a master repository for the data itself. However, it will be necessary for the USA-NPNCI to provide the capability of storing USA-relevant phenology data for researchers who do not have the means or infrastructure to maintain a long-term archive.
• A data access and use policy will be necessary that covers observation data and personal data about observers.
• The design must be efficient and automate processes where possible.
• The USA-NPN must be in full cooperation with other national phenology networks to ensure that the needs for global climate research can be met.
• Data sets will need to be modified and expanded. Versioning and traceability are important.
• A small number of metadata and data formats will be supported.
• The USA-NPNCI will evolve and be enhanced over time. The initial implementation will gracefully accommodate scaling and embellishment with cutting-edge technology.
• Where practical, the system will make use of free and open source tools.
• USA-NPN data will be distributed to users in electronic form and there will be no financial cost to users for access to this data.
• Initially, all USA-NPN managed data will be available without restriction, except where necessary to protect the privacy of observers. In subsequent versions of the USA-NPNCI, access to certain data may be restricted to project members during the course of the project, but this type of project-level security might be left out of the initial implementations to simplify the design.
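The versioning and traceability assumption in the list above can be sketched as an append-only history, in which each change to a data set creates a new version that records its parent and a note describing the change. This is only one possible approach, not a description of the USA-NPNCI design. The class names, fields, and sample rows are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DatasetVersion:
    number: int             # 1-based version number
    rows: tuple             # immutable snapshot of the data at this version
    note: str               # human-readable description of the change
    parent: Optional[int]   # version this one was derived from, if any


@dataclass
class VersionedDataset:
    versions: list = field(default_factory=list)

    def commit(self, rows, note: str) -> DatasetVersion:
        """Append a new version; earlier versions are never modified."""
        parent = self.versions[-1].number if self.versions else None
        version = DatasetVersion(len(self.versions) + 1, tuple(rows), note, parent)
        self.versions.append(version)
        return version

    def get(self, number: int) -> DatasetVersion:
        return self.versions[number - 1]

    def lineage(self, number: int) -> list:
        """Trace a version back to the first one, returned oldest first."""
        chain = []
        current: Optional[int] = number
        while current is not None:
            version = self.get(current)
            chain.append((version.number, version.note))
            current = version.parent
        return list(reversed(chain))


# Hypothetical usage with made-up observations
dataset = VersionedDataset()
dataset.commit([("2007-04-12", "first leaf")], "initial upload")
dataset.commit(
    [("2007-04-12", "first leaf"), ("2007-05-01", "first flower")],
    "added flowering observation",
)
history = dataset.lineage(2)
```

Keeping every snapshot is simple but uses more storage as data sets grow. Storing only the differences between versions is a common alternative.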
Progress and Timeline
The Cyberinfrastructure Working Group has met at three workshops thus far. From these workshops, we have developed an initial plan for the cyberinfrastructure implementation, divided into three phases.
Phase 1 was targeted for completion by the end of calendar year 2007 and is intended to provide very basic functionality (essentially a proof of concept) for collecting and searching metadata about phenology collections, along with a mechanism for groups to contribute datasets and the metadata about those datasets. In Phase 1, we are working with existing data collections, such as the Plant Phenology Network and Project Budburst, to help coordinate their data collections and to develop tools for these projects that can be leveraged by other networks. Phase 1 was built using existing resources and largely volunteer contributions of time.
Phases 2 and 3 are contingent on funding, and will broaden the scope, provide production-class computing platforms, develop tools for integration and analysis of data, and measure success of information and knowledge dissemination to the scientific community, decision makers, citizen scientists, and the public.
Contacts
For further information about the Cyberinfrastructure Working Group and its plans, contact Bruce Wilson (Oak Ridge National Laboratory) at wilsonbe@ornl.gov.