FilteredPushProject

June 23rd, 2012

Background

Sponsored by FilteredPush Continuous Quality Control Network Integration.

The aggregation of rapidly increasing quantities of species-occurrence data from a large number of distributed sources can greatly benefit many areas of biological research, such as taxonomy, species distribution modeling, and assessment of the effects of climate change on biological diversity. Data in current distributed networks of species-occurrence data, like all scientific data, raise three critical issues: correcting errors, maintaining currency, and assessing fitness for use. The FilteredPush project will build a continuous quality control system for such distributed, heterogeneous data sets.

The Filtered Push Continuous Quality Control (FPCQC) software integrates the Kepler workflow system as a means of assessing fitness for use and of providing quality control facilities to Kepler users. For both Kepler and other FPCQC analysis engines, comparisons of data from different sets can rely on standards and ontologies to ensure meaningful interpretation of the results based on them.

A network for continuous quality control of species-occurrence data will advance knowledge and understanding across biodiversity informatics, organismal biology, and broader domains.

Mission of Kepler

  • develop formal requirements for integrating Kepler workflow systems as agents of analysis for CQC
  • design and implement message specifications that allow the Kepler workflow system to meet those requirements
  • implement a Kepler Agent that implements the FPCQC client interface, so that Kepler users can execute QC procedures that run outside Kepler and subscribe to community annotations on their data and workflows
  • support semantic annotation of Kepler workflows

Use Cases

Project Artifacts