datops.c - data matrix operations
Version Tag $Id: datops.c,v 1.40 2006/02/06 19:55:46 db60 Exp $
void dna_makebin(const Dataptr mat, Lvb_bool fifthstate,
unsigned char **enc_mat);
Converts a matrix of sequence strings to a matrix of binary-encoded
statesets, where each of A, C, T, G and O (deletion) is represented by
a different bit. Ambiguous bases are converted to the union of all the
bases they may represent. ? is treated as totally ambiguous and
- is treated either as ? or as O, depending on fifthstate.
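
The encoding described above can be sketched for a single character as
follows. This is an illustration only, not the code in datops.c: the bit
values, the Ex_bool type (a stand-in for Lvb_bool) and the function name
example_encode_base are all assumptions, and only two ambiguity codes
(R and Y) are shown.

```c
#include <stddef.h>

/* Hypothetical bit assignments: one distinct bit per state.
 * The values actually used by the library may differ. */
enum {
    A_BIT = 1 << 0,
    C_BIT = 1 << 1,
    G_BIT = 1 << 2,
    T_BIT = 1 << 3,
    O_BIT = 1 << 4   /* deletion */
};

/* Stand-in for Lvb_bool, so that the sketch compiles on its own. */
typedef enum { EX_FALSE = 0, EX_TRUE = 1 } Ex_bool;

/* Return the stateset for one sequence character, or 0 if the
 * character is not handled by this sketch. */
unsigned char example_encode_base(char base, Ex_bool fifthstate)
{
    const unsigned char all = A_BIT | C_BIT | G_BIT | T_BIT | O_BIT;

    switch (base) {
    case 'A': return A_BIT;
    case 'C': return C_BIT;
    case 'G': return G_BIT;
    case 'T': return T_BIT;
    case 'O': return O_BIT;
    case 'R': return A_BIT | G_BIT;  /* ambiguity: A or G */
    case 'Y': return C_BIT | T_BIT;  /* ambiguity: C or T */
    case '?': return all;            /* totally ambiguous */
    case '-': return fifthstate ? O_BIT : all;
    default:  return 0;
    }
}
```

Because each state has its own bit, the union of two statesets is a
bitwise OR and their intersection is a bitwise AND.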
mat->m and mat->n give the number of bases in
each sequence and the number of sequences, respectively. Member
mat->row points to the first element in an array of
pointers, each of which points to a sequence stored as a text string.
If fifthstate is LVB_TRUE, treat gaps indicated by - as identical to O.
Otherwise, treat gaps indicated by - as identical to ?, i.e., totally
ambiguous.
enc_mat must point to the first element in an array of
mat->n pointers, each of which points to an allocated
array of mat->m elements. On return,
enc_mat[i][j] will give the binary-encoded stateset for
mat->row[i][j], where i is in the interval
[0..mat->n-1] and j is in the interval
[0..mat->m-1].
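
A caller has to allocate enc_mat before calling dna_makebin. The sketch
below shows one way to do this, assuming the dimensions are available as
long values n (number of sequences) and m (bases per sequence). The
helper names example_alloc_enc_mat and example_free_enc_mat are
hypothetical and are not part of datops.c.

```c
#include <stdlib.h>

/* Allocate n row pointers, each pointing to m zero-initialised
 * elements. Returns NULL on invalid dimensions or allocation failure. */
unsigned char **example_alloc_enc_mat(long n, long m)
{
    unsigned char **enc_mat;
    long i;

    if (n <= 0 || m <= 0)
        return NULL;

    enc_mat = malloc((size_t) n * sizeof *enc_mat);
    if (enc_mat == NULL)
        return NULL;

    for (i = 0; i < n; i++) {
        enc_mat[i] = calloc((size_t) m, sizeof **enc_mat);
        if (enc_mat[i] == NULL) {
            /* Undo the rows allocated so far before giving up. */
            while (i-- > 0)
                free(enc_mat[i]);
            free(enc_mat);
            return NULL;
        }
    }
    return enc_mat;
}

/* Release a matrix obtained from example_alloc_enc_mat. */
void example_free_enc_mat(unsigned char **enc_mat, long n)
{
    long i;

    if (enc_mat == NULL)
        return;
    for (i = 0; i < n; i++)
        free(enc_mat[i]);
    free(enc_mat);
}
```

With a matrix allocated as example_alloc_enc_mat(mat->n, mat->m), a call
such as dna_makebin(mat, LVB_TRUE, enc_mat) would then fill every
enc_mat[i][j].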