Kepler/pPOD

Kepler/pPOD is a customized distribution of the Kepler scientific workflow system designed specifically to support phylogenetic data analysis. The goal of Kepler/pPOD is to provide an easy-to-use desktop application that lets researchers create, run, and share phylogenetic workflows, and manage and explore the provenance of workflow results.

Kepler/pPOD is one of the core database technologies being developed within the NSF-funded processing PhylOData project, which aims to enable wide-scale information sharing and integration for the AToL and broader phylogenetics communities.


Features of the Kepler/pPOD preview

The current "preview" release of Kepler/pPOD is designed to demonstrate the capabilities of scientific workflow technology for phylogenetics. The current distribution includes:

  • A library of reusable workflow components for aligning biological sequences and inferring phylogenetic trees based on molecular and morphological data. Library components provide transparent access to remote services, including CIPRes web services (PAUP*, RAxML, MrBayes, CLUSTAL, etc.), and automate the execution of local applications (including Phylip and Gblocks).
  • A graphical workflow editor for viewing, configuring, executing, and creating scientific workflows.
  • A flexible data model for representing phylogenetic data (including sequences, character matrices, and trees) and data collections. This data model is used by the system to transparently convert between different underlying data and file formats (FASTA, NEXUS, PHYLIP, etc.).
  • An integrated provenance recording system for tracking the components used and the data dependencies created as a workflow is running. The result of running a workflow is stored in a standalone workflow "trace" file.
  • An interactive provenance browser for visualizing and navigating data dependencies and provenance information created during a workflow run.
  • A suite of sample workflows for performing various phylogenetic analyses. These workflows can easily be modified by changing parameters, providing different input data, or substituting different methods for particular steps of the analysis.
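
To illustrate the idea of a format-neutral data model that sits between file formats, here is a minimal Python sketch. It is not Kepler/pPOD code: the `CharacterMatrix` class and the `read_fasta` and `write_phylip` functions are hypothetical names invented for this example, and the PHYLIP output is a simplified strict sequential layout (10-character name field, no interleaving).

```python
from dataclasses import dataclass, field


@dataclass
class CharacterMatrix:
    """Hypothetical format-neutral container: taxon name -> aligned sequence."""
    rows: dict[str, str] = field(default_factory=dict)


def read_fasta(text: str) -> CharacterMatrix:
    """Parse FASTA text into the neutral model (illustrative only)."""
    matrix = CharacterMatrix()
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            if not parts:
                raise ValueError("FASTA header has no sequence name")
            name = parts[0]  # keep only the identifier, drop the description
            matrix.rows[name] = ""
        elif name is not None:
            matrix.rows[name] += line  # sequences may span several lines
    return matrix


def write_phylip(matrix: CharacterMatrix) -> str:
    """Serialize the neutral model as simplified sequential PHYLIP."""
    lengths = {len(seq) for seq in matrix.rows.values()}
    if len(lengths) != 1:
        raise ValueError("PHYLIP needs at least one row and equal-length rows")
    n_chars = lengths.pop()
    lines = [f"{len(matrix.rows)} {n_chars}"]
    for name, seq in matrix.rows.items():
        # Strict PHYLIP uses a fixed 10-character name field.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    fasta = ">taxonA example\nACGT\nAC\n>taxonB\nACGTAC\n"
    print(write_phylip(read_fasta(fasta)), end="")
```

Because every reader and writer targets the same in-memory structure, supporting an additional format in a design like this means adding one reader and one writer rather than a converter for every pair of formats.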

Download and installation

The current preview release is available for Mac OS X and, experimentally, for Windows. To try out the preview:

  1. Download the Kepler/pPOD preview release.
  2. Expand the distribution file to your desktop or to another location on your computer.
  3. Open the distribution directory and double-click on the file named "Run Me!.jar".

More information

More information about the preview is available in a short presentation and a poster. Also, please see the Kepler website for more information on Kepler, including tutorials, user manuals, and developer documentation.


The Kepler/pPOD team

Kepler/pPOD is being developed by the DAKS group at the UC Davis Genome Center and Department of Computer Science, and includes contributions from Shawn Bowers, Timothy McPhillips, Sean Riddle, Manish Anand, and Bertram Ludaescher. For additional information about the project, including comments or suggestions, please contact Shawn Bowers at sbowers@ucdavis.edu.
 

This material is based upon work supported by the National Science Foundation under Grant No. 0612326 (IIS). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.