A Novel Variation Operator for More Rapid Evolution of DNA Error Correcting Codes
Submitted to CIBCB 2005
Daniel Ashlock,
with Sheridan K. Houghten

Abstract PDF eprint

Error correcting codes over the edit metric have been used as embedded DNA markers in at least one sequencing project. The algorithm used to construct those codes was an evolutionary algorithm with a fitness function with exponential time complexity. Presented here is an substantially faster evolutionary algorithm for locating error correcting codes over the edit metric that exhibits either equivalent or only slightly inferior performance on test cases. The new algorithm can produce codes for parameters where the run-time of the earlier algorithm was prohibitive. The new algorithm is a novel type of evolutionary algorithm using a greedy algorithm to implement a variation operator. This variation operator is the sole variation operator used and has unary, binary, and k-ary forms. The unary and binary forms are compared, with the binary form being found superior. Population size and the rate of introduction of random material by the variation operator are also studied. A high rate of introduction of random material and a small population size are found to be the best.