Information & Computational Sciences

Using Frequency Tag Clouds in Error Identification

Posted on 09/09/2013 in Crop Geeks
Word cloud showing barley pedigree data.

Word clouds sometimes get a hard time (and sometimes justifiably so), but they can be an effective way to look for errors in large text-based datasets.

In this example we used the online word cloud service Wordle. We first ran a Perl program over as much pedigree data as we could gather on commercial barley cultivars trialled for the National List, pulling out a count of the number of times each cultivar is mentioned. This gives an overall indication of the relative importance of a variety in the UK breeding process, in that it shows which cultivars have been most widely bred from.
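The counting step can be sketched roughly as follows. This is not the original Perl program, which is not shown here; it is a minimal Python illustration that assumes each pedigree is a single string in which parents are separated by a lone "x" and crosses may be grouped with brackets. The cultivar names, the sample records, and the `count_cultivars` function are all made up for the example.

```python
import re
from collections import Counter

# Invented sample records; the real pedigree data and its format are assumptions.
PEDIGREES = [
    "(Alpha Gold x Beta) x Gamma",
    "Gamma x Delta",
    "Delta x (Beta x Gama)",  # "Gama" is a deliberate misspelling of "Gamma"
]


def count_cultivars(pedigrees):
    """Count how many times each cultivar name appears across all pedigrees."""
    counts = Counter()
    for pedigree in pedigrees:
        # Replace brackets with spaces so only names and cross symbols remain.
        flat = re.sub(r"[()\[\]]", " ", pedigree)
        # Split on a standalone "x" (the assumed cross symbol).
        for name in re.split(r"\s+x\s+", flat, flags=re.IGNORECASE):
            name = " ".join(name.split())  # normalise internal whitespace
            if name:
                counts[name] += 1
    return counts


counts = count_cultivars(PEDIGREES)
for name, n in counts.most_common():
    print(f"{name}\t{n}")
```

Each count then becomes the weight of that name in the word cloud. In this made-up sample, "Gama" appears once while "Gamma" appears twice, which is the kind of small, near-duplicate entry that tends to stand out once the names are drawn at a size proportional to their frequency.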


Some other image showing a word cloud.


About Paul Shaw

Paul Shaw is an informatician within the Information and Computational Sciences Group at The James Hutton Institute and has worked with us since 2002.

His primary interests are in the development of biological databases, mainly for the genetic resources community, and in the development of visualization tools to aid the understanding of complex biological data.

For some more background on Paul visit

