NAME
icastats - Statistically analyse the structure of the
cluster index.
SYNOPSIS
icastats [ filename ]
DESCRIPTION
ICAstats takes a named ICAprint output file, or reads one
from stdin, and prints various statistics about the cluster
organisation to the screen (stdout). ICAstats is useful
because it concisely summarises the structure present in the
original sequence data file, as recorded in the cluster index
file.
Some of the statistics are only relevant if the index file
was produced from a single fully normalized cDNA library.
The prediction of the number of unfound sequences is
calculated using a Poisson model. The prediction is best made
from a cluster index produced by ICAass because only ICAass
clusters sequences on the basis of global similarity. The
best ASS threshold to use depends upon the context, but I
tend to use 50% most often.
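The exact form of the Poisson model is not given here. As a
rough illustration only, the Python sketch below shows one
common way such a prediction can be made. It assumes that
every distinct sequence in a fully normalized library is
equally likely to be sampled, so the number of reads per
sequence is Poisson with mean lam, and that only sequences
seen at least once appear as clusters (a zero-truncated
Poisson). The function name predict_unfound and its input (a
list of cluster sizes) are hypothetical and are not part of
ICAstats.

```python
import math


def predict_unfound(cluster_sizes):
    """Estimate how many distinct sequences have not been sampled yet.

    Assumption (not taken from ICAstats itself): reads per distinct
    sequence follow Poisson(lam), and a sequence only shows up as a
    cluster if it was sampled at least once.
    """
    found = len(cluster_sizes)   # clusters observed so far
    reads = sum(cluster_sizes)   # total sequences sampled
    if found == 0 or reads <= found:
        # With only singletons the model cannot bound the library size.
        raise ValueError("need at least one cluster with more than one member")
    mean_size = reads / found

    # For a zero-truncated Poisson the mean observed cluster size is
    # lam / (1 - exp(-lam)). Solve that for lam by bisection; the
    # left-hand side increases with lam, and lam < mean_size always.
    lo, hi = 1e-12, mean_size
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if mid / (-math.expm1(-mid)) < mean_size:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2.0

    # P(a sequence is never sampled) = exp(-lam), so the expected number
    # of unfound sequences is found * exp(-lam) / (1 - exp(-lam)).
    p_unseen = math.exp(-lam)
    unfound = found * p_unseen / (1.0 - p_unseen)
    return lam, unfound


if __name__ == "__main__":
    lam, unfound = predict_unfound([1, 1, 1, 2, 2, 3])
    print(f"lam = {lam:.3f}, predicted unfound = {unfound:.1f}")
```

Under these assumptions the estimate is only meaningful when
some clusters have more than one member; an index made up
entirely of singletons gives no upper bound on the number of
unfound sequences, which is why the sketch raises an error in
that case.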
EXAMPLES
icaprint | icastats
icastats clusters.January
SEE ALSO
ICAass(1), N2tool(1), ICAtool(1), ICAprint(1), ICAmatches(1)
BUGS
No one has yet produced a fully normalized cDNA library, so
the prediction remains untested. Comparisons of the predicted
number of unfound sequences with the total number of clusters
found at various sampling levels would be interesting.