Andrew Warren
    Timothy Driscoll

Visual exploration of global metrics for prokaryotic replicons.

    Next-generation sequencing methods have led to a dramatic increase in the number of fully sequenced prokaryotic genomes, a trend that is sure to continue into the near future.  Standard methods for comparing multiple genomes are classically linear and focus on individual genes (or, at best, small handfuls of related genes).  In contrast, there is much to be learned from a system-level genome comparison.  Fortunately, a number of useful metrics exist for describing whole genomes, including gene density, GC content, promoter density, and more.
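    As an illustration of how simple such global metrics can be, the following minimal sketch computes two of them for a single replicon.  Python is used only for illustration, since the implementation language is still to be chosen.  The function names, the choice to ignore ambiguous bases, and the genes-per-kilobase unit for gene density are our assumptions, not fixed definitions.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are ignored rather than
    counted in the denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def gene_density(gene_count: int, replicon_length_bp: int) -> float:
    """Genes per kilobase of replicon (unit chosen for illustration)."""
    if replicon_length_bp <= 0:
        raise ValueError("replicon length must be positive")
    return gene_count / (replicon_length_bp / 1000)
```

    Each function reduces an entire replicon to a single number, which is what makes these metrics suitable as axes for comparing many genomes at once.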
    We propose to build a discovery-based software tool that allows researchers to visualize a novel genome sequence relative to a landscape of existing genomes.  The core of this tool will be a 3D plot of three user-chosen metrics, with a fourth metric plotted as color on the landscape surface.  Initially, our landscape will consist of all sequenced genomes from the National Center for Biotechnology Information (NCBI), the central clearinghouse for genomic data.  Users will also be able to import their own dataset to serve as the landscape.  While we will focus on genomic data for this project, we propose to build the tool so that it can visualize any similarly structured dataset.
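    The mapping from metrics to plot coordinates can be sketched as follows.  This is a minimal illustration under our own assumptions, not the planned implementation: each replicon is represented as a dictionary of metric values, and the names `min_max_scale` and `landscape_points` are hypothetical.  Three chosen metrics become the x, y, and z coordinates, and the fourth is rescaled to the range 0 to 1 so that it can index a color map.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: one replicon is a mapping from metric name to numeric value.
Replicon = Dict[str, float]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to [0, 1] for use as a color index."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: place every point at the middle color.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def landscape_points(
    replicons: Sequence[Replicon],
    axes: Tuple[str, str, str],
    color_metric: str,
) -> List[Tuple[float, float, float, float]]:
    """Return (x, y, z, color) tuples for the chosen metrics."""
    if len(axes) != 3:
        raise ValueError("exactly three axis metrics are required")
    colors = min_max_scale([r[color_metric] for r in replicons])
    return [
        (r[axes[0]], r[axes[1]], r[axes[2]], c)
        for r, c in zip(replicons, colors)
    ]
```

    Because the metric names are passed in as parameters, the same function serves any combination of metrics the user selects, and any imported dataset with the same dictionary structure.  The resulting tuples could then be handed to whichever 3D plotting library is eventually chosen.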

    The objectives of this class project will be:

  1. Design a graphical method for exploring metrics that characterize prokaryotic replicons (distinct units of a genome).
  2. Find or create global metrics that characterize prokaryotic genomes and can be used to compare multiple genomes.  Possible sources include:
  3. Determine the best implementation language and graphical libraries to use for the visualization; for example:
  4. Create a proof-of-concept tool based around a simple user interface that features:

    Both investigators have similar backgrounds and work in the same bioinformatics field.  As a result, many of the tasks will be handled jointly.  These include discovery and choice of metrics, data formatting and import, and user interface design and implementation.

    The investigators will be working in conjunction with researchers at the Virginia Bioinformatics Institute and 454 Life Sciences.  Both groups have expressed a strong interest in developing more sophisticated visual tools for genome comparison, and may use this project as a template for expanded development.