Tuesday, March 25, 2008

Evolving Crystal Structure Through Simulated Evolution

Via my news webservice I came across this interesting example of a genetic algorithm application; see here.

An excerpt:

"[Julio] Facelli, director of the University of Utah's Center for High Performance Computing and a biomedical informatics professor, uses NCSA's Mercury cluster to predict crystal structures for organic molecules that are frequently used in pharmaceuticals, fertilizers, and explosives."

"Modeling the crystal structure of a given substance, Facelli and his team begin with nothing more than the atoms in the molecule and the nature of their bonds. They're looking for structures with the lowest energies, which typically mark the molecules' standard crystal structures or something very close. The problem is that this straightforward data and straightforward goal create billions of possible solutions. Just imagine finding the needle of the lowest energy in that haystack of possible structures."

"'An exhaustive search is not feasible, so we have to have a way to direct the search,' says Facelli. '[Genetic algorithms] are based on the principle of survival of the fittest. Trial solutions compete with one another in the population for survival and produce offspring for the next generation of tests. These algorithms offer excellent scaling properties, which make them good for large-scale parallel computing systems like those at NCSA and emerging computational grids like TeraGrid.'"

Here the article outlines, in simple terms, the inner workings of the genetic algorithm used:

"For example, an initial calculation on NCSA's Mercury may run 20 simulations on 20 different processors simultaneously, calculating possible crystal structures for a given set of atoms and their bonds. The energies for these structures are compared. The 10 with the lowest energies are kept, and the features of those 10 are mixed and matched to generate the structures for another 10 possible structures. Energies are calculated again, comparisons are made, best candidates are kept, and the cycle continues.

The 'mating operation,' as the mixing and matching is called, stagnates quickly, producing very similar structures over the course of thousands of generations. To combat this lack of variety, the genetic algorithm also introduces arbitrary mutations into the process, occasionally taking one variable from one of the best candidates and including a random number for that variable in the next generation."
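The loop the article describes (evaluate, keep the best half, mate, mutate) can be sketched roughly like this. The `energy` function below is a hypothetical stand-in for the expensive crystal-structure simulations, and the population size, variable count, and mutation rate are illustrative guesses, not the values Facelli's team uses:

```python
import random

# Hypothetical stand-in for the real energy calculation; the actual
# simulations run one per processor on the cluster and are far costlier.
def energy(structure):
    return sum(x * x for x in structure)

def mate(a, b):
    # "Mix and match" the features (variables) of two low-energy parents.
    return [random.choice(pair) for pair in zip(a, b)]

def mutate(structure, rate=0.05, low=-1.0, high=1.0):
    # Occasionally replace one variable with a random number,
    # as the article describes, to combat stagnation.
    return [random.uniform(low, high) if random.random() < rate else x
            for x in structure]

def evolve(pop_size=20, n_vars=6, generations=100):
    population = [[random.uniform(-1.0, 1.0) for _ in range(n_vars)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the half of the population with the lowest energies...
        survivors = sorted(population, key=energy)[:pop_size // 2]
        # ...then mix, match, and mutate them to refill the population.
        children = [mutate(mate(*random.sample(survivors, 2)))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return min(population, key=energy)

best = evolve()
```

On a real cluster the 20 energy evaluations would run on 20 processors at once, which is what gives the method its good scaling properties.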

Stagnation has long been a hard drawback to overcome, especially when using an elitist selection method. Given that the individuals with the lowest energies are the best ones, it is in some sense unsurprising that the genetic algorithm got stuck quickly. There are, of course, established techniques to address this, such as truncation selection.
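For the record, truncation selection just keeps a fixed top fraction of the ranked population as the breeding pool. A minimal sketch (the function name and signature are mine, not from the article):

```python
def truncation_selection(population, fitness, fraction=0.5):
    # Rank candidates by fitness (here: lowest energy first) and keep
    # only the top `fraction` of them as parents for the next generation.
    k = max(1, int(len(population) * fraction))
    return sorted(population, key=fitness)[:k]
```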

If I were the author, given that he is using mutation and crossover, I would set a high mutation rate (0.85 to 0.95) and a mild crossover rate (0.45-0.6), and I would set the elitism rate (the number of fittest individuals preserved each generation) to about 1/7 of the population size. Inversion could help too!
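Inversion, the classic operator, reverses a randomly chosen slice of a candidate's variables, reshuffling their order without changing their values. A minimal sketch, assuming candidates are flat lists of variables:

```python
import random

def inversion(structure):
    # Pick two cut points and reverse the slice between them; the
    # multiset of variables is preserved, only their order changes.
    i, j = sorted(random.sample(range(len(structure) + 1), 2))
    return structure[:i] + structure[i:j][::-1] + structure[j:]
```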
