Information & Computational Sciences

Helium Prototypes

The initial prototype for Helium was written in Perl and used the GraphViz dot program to generate static pedigree representations. The screenshots below show the first Helium prototypes and the evolution from static Perl to interactive Java-based tools.

Early sample pedigree showing genetic similarity to a defined line.


This screenshot shows one of the first prototype GraphViz implementations, displaying genetic similarity. The reference node is the single node on the second layout layer; the darker a node, the more similar it is to the reference. Lines should therefore become progressively lighter in colour as they move away from it. One slightly misleading aspect is that we don’t have all parents here, so what is shown is just the lineage from the reference line. We also did some work on node shape and found that, while square nodes were probably the most useful, people really loved the circular nodes!
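The similarity shading described above can be sketched in a few lines. This is a minimal illustration in Python rather than the original Perl, and it assumes similarity scores are already available as values between 0 and 1. The function names, the grey-level mapping and the sample line names are all hypothetical; only the DOT attributes (`shape`, `style`, `fillcolor`, `fontcolor`, `rankdir`) are standard GraphViz.

```python
def similarity_to_grey(similarity):
    """Map a similarity in [0, 1] to a hex grey: 1.0 is black, 0.0 is white."""
    clamped = max(0.0, min(1.0, similarity))
    level = round(255 * (1.0 - clamped))
    return "#{0:02x}{0:02x}{0:02x}".format(level)


def pedigree_to_dot(edges, similarity, shape="circle"):
    """Build DOT source from (parent, child) edges and a name -> similarity map.

    Lines with no similarity score are assumed to have similarity 0.0.
    """
    lines = [
        "digraph pedigree {",
        "  rankdir=TB;",
        "  node [shape={}, style=filled];".format(shape),
    ]
    names = sorted({name for edge in edges for name in edge})
    for name in names:
        score = similarity.get(name, 0.0)
        # Use white text on dark fills so labels stay readable.
        font = "white" if score > 0.5 else "black"
        lines.append('  "{}" [fillcolor="{}", fontcolor={}];'.format(
            name, similarity_to_grey(score), font))
    for parent, child in edges:
        lines.append('  "{}" -> "{}";'.format(parent, child))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical lines and scores, for illustration only.
    sample_edges = [("LineA", "Reference"), ("Reference", "LineB"), ("LineB", "LineC")]
    sample_scores = {"Reference": 1.0, "LineA": 0.7, "LineB": 0.6, "LineC": 0.3}
    print(pedigree_to_dot(sample_edges, sample_scores))
```

Saving the printed output as `pedigree.dot` and running `dot -Tpng pedigree.dot -o pedigree.png` should render a static image. Passing `shape="box"` gives the square-node variant.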


The first full-scale pedigree visualization. In this case node size represents the number of times the line has been used in crosses, and colour shows the winter/spring ecotype, which is one of the main high-level dividers of elite barley germplasm.

This example used a severely restricted set of lines as a proof of concept; we then needed to move on to developing something that worked with upwards of 500 plant lines.
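As a rough sketch of the node sizing and ecotype colouring described above, usage can be counted as the number of times a line appears as a parent in the pedigree and then scaled to a node width. The colour palette, the width range and the function names are assumptions made for illustration, not the values used in the prototype.

```python
from collections import Counter

# Hypothetical palette; any pair of distinguishable colours would do.
ECOTYPE_COLOURS = {"winter": "steelblue", "spring": "gold"}


def usage_counts(edges):
    """Count how often each line appears as a parent in (parent, child) edges."""
    return Counter(parent for parent, _child in edges)


def node_width(count, max_count, min_width=0.4, max_width=2.0):
    """Scale a usage count linearly into the range [min_width, max_width]."""
    if max_count == 0:
        return min_width
    return min_width + (max_width - min_width) * count / max_count


def node_attributes(edges, ecotype):
    """Return a width and fill colour for every line in the pedigree.

    Lines with an unknown ecotype fall back to light grey.
    """
    counts = usage_counts(edges)
    max_count = max(counts.values(), default=0)
    names = sorted({name for edge in edges for name in edge})
    return {
        name: {
            "width": round(node_width(counts[name], max_count), 2),
            "fillcolor": ECOTYPE_COLOURS.get(ecotype.get(name), "lightgrey"),
        }
        for name in names
    }
```

A linear scale is the simplest choice. With several hundred lines and a few heavily used parents, a square-root or logarithmic scale may keep the smaller nodes legible.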

The second prototype built on the previous concepts but allowed for node sizing and colouring based on phenotype. While these visualization techniques were not new, they had not been used on this scale within the plant community, where other tools focus on smaller and in some ways more manageable pedigrees. This Sugiyama layout with curved edges will be available within Helium as an output option, but it has a number of issues. Firstly, it is not interactive, so it was really difficult for breeders and geneticists to actually find anything. Secondly, tracing edges was really difficult: even with a large display like this one, people had problems following edges and therefore lineage. We needed something that was more interactive.


David Marshall and Prof. Andy Flavell (University of Dundee) look at our large format pedigree. Displays of this type presented in public places promote cooperative working and have allowed us to identify a large number of errors in our underlying datasets.

The reason we are keeping this layout is that people love it! I think that, while the information it conveys is more suited to a high-level overview than to intricate detail, it gives a nice overview of a dataset that is of general interest to both geneticists and breeders. One interesting effect of the static prototype was that when it was printed at large scale (2.5m x 1.5m), staff within The James Hutton Institute began looking at it as they passed in the corridor where it was displayed and immediately started picking faults with it. While this seems bad at first, it was in reality a great acknowledgement of what we had been doing: it was a useful tool for finding problems in datasets. Finding errors now is really important, as a lot of the historical information will be lost over time as experts retire or move on to new areas of work.


Implementation of Helium in Java showing the major features we required. Pedigree visualization, local view and background information sources. We experimented with square nodes in this example but users preferred round ones.

To address the problems with the static prototype (lack of interactive features and difficulty with dense data), we decided to reimplement its visualization in Java so that we could add interactive features and let users explore their data from within a desktop application. The Java implementation uses the yWorks yFiles library and connects to the Germinate platform to retrieve pedigree definitions, background passport-type data, and genotypic and phenotypic data. The ability to connect to Germinate allows us to hook Helium into the large databases we have on site at The James Hutton Institute and allows users to explore this information in a pedigree context. This multifaceted approach to viewing data means users can approach it in the way that makes most sense for both them and the science that we are undertaking.


Development prototype of Helium showing colouring of ancestors and descendants and node sizing. Also shows the first basic integration with Germinate to pull back background information on selected lines.

This final screenshot shows a recent version of our pedigree visualization tool where we have sized lines based on usage and coloured predecessors (ancestors) and successors (descendants) to show their lineage in the pedigree. It also shows basic integration with Germinate to pull back additional information on selected lines (nodes). The current Helium versions are tightly integrated with Germinate but users will also be able to load in information from text files if required. We will also offer links to other tools developed by the Information and Computational Sciences Group at the James Hutton Institute such as Flapjack and CurlyWhirly.
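The ancestor and descendant colouring described above amounts to two graph traversals from the selected line, one following edges towards parents and one towards children. The following is a minimal Python sketch of that idea, assuming the pedigree is given as (parent, child) pairs. It is not the Helium implementation, which is written in Java on top of yFiles, and the function names and colour names are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Index (parent, child) edges in both directions."""
    parents_of = defaultdict(set)
    children_of = defaultdict(set)
    for parent, child in edges:
        parents_of[child].add(parent)
        children_of[parent].add(child)
    return parents_of, children_of


def reachable(start, neighbours):
    """Breadth-first search returning every line reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    # Guard against cyclic (erroneous) pedigree data.
    seen.discard(start)
    return seen


def lineage_colours(edges, selected):
    """Assign a colour to the selected line, its ancestors and its descendants.

    Lines unrelated to the selection are left out of the result.
    """
    parents_of, children_of = build_adjacency(edges)
    colours = {selected: "yellow"}
    for name in reachable(selected, parents_of):
        colours[name] = "red"   # ancestors (predecessors)
    for name in reachable(selected, children_of):
        colours[name] = "blue"  # descendants (successors)
    return colours
```

For example, with the edges `("A", "C")`, `("B", "C")` and `("C", "D")`, selecting `"C"` marks `A` and `B` as ancestors and `D` as a descendant.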

For more information on the current state of the development of the pedigree visualization tool Helium please view the main Helium webpage. We are happy to get feedback on any of our tools. You can contact us at paul.shaw@hutton.ac.uk.