Optimization By Decimation: Explaining Adaptation in Genetic Algorithms with Uniform Crossover

I’m preparing chapter 4 of my dissertation for submission to a journal.

Manuscript: http://s3.amazonaws.com/burjorjee/www/hyperclimbing_hypothesis.pdf

Abstract:
We submit the hyperclimbing hypothesis—an explanation for adaptation in genetic algorithms with uniform crossover (UGAs). Hyperclimbing is a stochastic search heuristic that works by decimating a search space, i.e. by iteratively fixing the values of small numbers of search space attributes. Global decimation is known to be an effective way to approach large instances of hard constraint satisfaction problems. The hyperclimbing hypothesis holds that UGAs work by implicitly implementing efficient global decimation. Proof of concept for this hypothesis comes from the use of a novel analytic technique involving the exploitation of algorithmic symmetry. We also present experimental results that show that a simple tweak inspired by the hyperclimbing hypothesis significantly improves the performance of a UGA on an instance of Uniform Random MAX-3SAT . The hyperclimbing hypothesis suggests that other kinds of evolutionary algorithms may also work by implicitly implementing efficient global decimation.

Presentation at the University of Washington: Optimization by Hyperclimbing

Yesterday, I presented my research on genetic algorithms at the University of Washington.

Talk abstract

My slides

Working Abstract of my Next Paper

This one’s to catch the attention of folks in machine learning, and theoretical computer science.

We recently proposed a new explanation for the adaptive capacity of simple recombinative genetic algorithms. This explanation proceeds from evidence that the simple genetic algorithm with uniform crossover (UGA) can implement a stochastic non-local search heuristic called hyperclimbing extraordinarily efficiently. To showcase the core computational efficiency involved we take up the problem of learning a classifier for the attributes of an unknown parity function over n attributes, k of which are effective. We consider the case where the learning algorithm can make adaptive queries against a membership query oracle. Given a bitstring of length n, the oracle returns a boolean value indicating the parity of the bitstring under the unknown parity function. For certain small, but otherwise arbitrarily chosen, values of k, we “show” that a UGA that uses the oracle as its fitness function can learn a classifier that classifies any attribute of the parity function—as effective or non-effective—with arbitrary accuracy; the learning occurs in time that is linear in n, and with query complexity that is constant in n, even when the oracle is “moderately” noisy.

Related blog post: Red Dots, Blue Dots

Update (June 11, 2010): Had a back and forth with Vitaly Feldman about the “angle” I take in this paper.  He suggested that it may not be the best. For small values of k, and particular regimes of the noise parameter, a GA based learning algorithm performs at par (in an asymptotic sense) with the best known algorithms for solving the learning parities problem. Vitaly cautioned, however, that the problem of learning parities with an adaptive memebership oracle is currently not of practical interest. And since the GA based learning algorithm does not improve upon an existing computational bound, he thinks that from a pure computational learning perspective, this result it is unlikely to be of interest.

So, back to the drawing board. Taking it from the top, the goal is to draw the attention of the machine learning community to the hyperclimbing heuristic, and the GA’s ability to implement this heuristic extraordinarily efficiently. One way to do this is to showcase the core computational efficiency at play in this implementation . And one way to do this is by showing how this core efficiency can be used to efficiently solve a problem that members of the computational learning community care about—presuming that such a problem currently exists.

Screencast Presentation: An Introduction to the Generative Fixation Hypothesis

Dissertation Deposition

I deposited my dissertation today.

Click here to see the final version (single spaced for easy reading).

Red Dots, Blue Dots

In this blog entry I’d like to showcase just one of a number of remarkable findings that comprise the basis for the generative fixation hypothesis—a new explanation for the adaptive capacity of recombinative genetic algorithms.

Consider the following stochastic function which takes a bitstring of length \ell as input and returns a real value as output.

fitness(bitstring)
  accum = 0
  for i = 1 to 4
     accum = accum + bitstring[pivotalLoci[i]]
  end
  if accum is odd
     return a random value from normal distribution N(+0.25,1)
  else
     return a random value from normal distribution N(-0.25,1)
  end

The variable pivotalLoci is an array of four distinct integers between 1and \ell which specifies the location of  four loci—let’s call them A, B, C, D—of an input bitstring that matter in the determination the bitstring’s fitness. These four loci are said to be pivotal. Read the rest of this entry »

Bit Dynamics Visualization

I’ve found the bit dynamics visualizer included in speedyGA very useful for understanding the dynamics of SGAs with bitstring genomes. In each generation the visualizer plots/updates the frequency of the bit 1 at each locus (the frequency of the bit 0 is straightforwardly deducible) .

Here’s a visualization of the bit dynamics of an SGA with 1pt crossover when applied to the the Royal Roads fitness function. Going by the building block hypothesis one expects to see the dots marching orderly to the top of the plot in groups of eight or more.

That’s not what happens. Instead, one gets to see hitchhiking in action—look for a swift downward movement of certain dots in tandem with the swift upward movement of other dots at close by loci.

speedygarun_crossovertype12

speedygarun_crossovertype12

This movie requires Adobe Flash for playback.

The maximum and average fitness in each generation of this run are shown belowavg_max_fitness_crossover1

The matlab code used to generate these and other figures in this blog post can be found here.

Let’s visualize the bit dynamics of a population when an SGA with uniform-crossover is applied to the Royal Roads function.

speedygarun_crossovertype21

speedygarun_crossovertype21

This movie requires Adobe Flash for playback.

The maximum and average fitness in each generation of this run are shown below

Read the rest of this entry »

What Are GAs Good For?

Researchers studying the foundations of genetic algorithms have not, to the best of my knowledge, identified a non-trivial computational problem that a simple GA can solve robustly and scaleably (I’ve previously raised this issue here) . In my opinion, this singular fact is the most clear evidence for the inadequacy of current paradigm within which we understand/study the adaptive capacity of GAs—the question of what GAs are good for is, after all, intimately related to the question of how GAs work.

In a draft of one of my dissertation chapters I identify a hard computational problem and show that a GA can solve it robustly and scalably. Remarkably, this problem is closely related to a hairy statistical problem in computational biology. How might a GA leverage this kind of computational ability to perform adaptation? I’ll be presenting my theory about this in future chapters. The idea behind this theory is delightfully simple. Presenting it formally, however, is a another story. Stay tuned.


New manuscript now at arXiv

My latest manuscript is now posted at arXiv.

http://arxiv.org/abs/0711.1401

Follow

Get every new post delivered to your Inbox.