Screencast Presentation: An Introduction to the Generative Fixation Hypothesis

February 13, 2010 at 7:01 pm (Bit Frequency Visualization, generative fixation, genetic algorithms, hyperclimbing, symmetry-analysis)

Permalink 1 Comment

Hyperclimbing and Decimation

January 29, 2010 at 2:01 am (decimation, generative fixation, genetic algorithms, hyperclimbing, survey propagation)

In recent years, probabilistic inference algorithms such as survey propagation and belief propagation have been shown to be remarkably effective at tackling large, random instances of SAT, and other combinatorial optimization problems that lie beyond the reach of previous approaches. These inference algorithms belong to a class of techniques called decimation strategies. Decimation strategies monotonically reduce the size of a problem instance by iteratively fixing partial solutions (partial variable assignments in the case of SAT).
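To make the idea of decimation concrete, here is a minimal Python sketch of a decimation loop for SAT. It is an illustration only: the function names are made up for this example, and the rule for choosing which variable to fix (the literal with the largest polarity bias) is a crude stand-in for the marginal estimates that survey propagation or belief propagation would supply. Clauses are lists of nonzero integers, with a negative integer denoting a negated variable.

```python
from collections import Counter

def simplify(clauses, var, value):
    """Fix var to value: drop satisfied clauses, delete falsified literals."""
    lit_true = var if value else -var
    reduced = []
    for clause in clauses:
        if lit_true in clause:
            continue  # clause is satisfied, so it leaves the instance
        reduced.append([lit for lit in clause if lit != -lit_true])
    return reduced

def decimate(clauses):
    """Fix one variable per step, so the instance shrinks monotonically.

    Returns (assignment, ok). ok is False if an empty clause appears;
    pure decimation never backtracks.
    """
    assignment = {}
    while clauses:
        if any(len(clause) == 0 for clause in clauses):
            return assignment, False
        counts = Counter(lit for clause in clauses for lit in clause)
        # Assumption: polarity bias stands in for an inferred marginal.
        lit = max(counts, key=lambda l: (counts[l] - counts[-l], -abs(l), l))
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, abs(lit), lit > 0)
    return assignment, True
```

Each pass through the loop fixes a partial assignment and hands a strictly smaller instance to the next pass, which is the defining feature of a decimation strategy.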

The generative fixation hypothesis essentially states that genetic algorithms work by efficiently implementing a decimation strategy called hyperclimbing.


Hyperclimbing, Genetic Algorithms, and Machine Learning

October 27, 2009 at 7:59 am (generative fixation, genetic algorithms, hyperclimbing, machine learning)

I’ve identified a promising stochastic search heuristic, called hyperclimbing, for large-scale optimization over huge attribute product spaces (e.g., the set of all binary strings of some length N, where N is very large) with rugged fitness functions. Hyperclimbing works by progressively limiting sampling to a series of nested subsets with increasing expected fitness. At any given step, this heuristic sifts through vast numbers of coarse partitions of the subset it “inhabits”, and identifies ones that partition this set into subsets whose expected fitness values are significantly variegated. Because hyperclimbing is sensitive, not to the local features of a search space, but to certain more global statistics, it is not susceptible to the kinds of issues that waylay local search heuristics.
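The following Python sketch shows one step of a brute-force reading of this description. It is a toy under stated assumptions, not the formulation given in my dissertation: the "coarse partitions" are taken to be the partitions induced by every pair of unfixed loci, "variegated" is measured as the gap between the largest and smallest estimated cell means, and the name hyperclimb_step and its parameters are invented for the example.

```python
import itertools
import random

def hyperclimb_step(fitness, n, fixed, order=2, samples=400, rng=random):
    """One toy step: find the most variegated partition, fix its fittest cell.

    fixed maps locus index -> bit for the loci already fixed (the subset
    currently "inhabited"). Returns a new dict with `order` more loci fixed.
    """
    free = [i for i in range(n) if i not in fixed]
    # Sample uniformly from the current subset and evaluate each sample once.
    population = []
    for _ in range(samples):
        x = [fixed[i] if i in fixed else rng.randint(0, 1) for i in range(n)]
        population.append((x, fitness(x)))
    best = None  # (spread, loci, fittest cell)
    for loci in itertools.combinations(free, order):
        totals = {}
        for x, f in population:
            cell = tuple(x[i] for i in loci)
            s, c = totals.get(cell, (0.0, 0))
            totals[cell] = (s + f, c + 1)
        means = {cell: s / c for cell, (s, c) in totals.items()}
        spread = max(means.values()) - min(means.values())
        if best is None or spread > best[0]:
            best = (spread, loci, max(means, key=means.get))
    new_fixed = dict(fixed)
    if best is not None:
        _, loci, cell = best
        new_fixed.update(zip(loci, cell))
    return new_fixed
```

Calling the function repeatedly, feeding each result back in as fixed, yields the series of nested subsets. Note that the loop over itertools.combinations grows combinatorially with the number of loci and with order, which is the apparent scaling problem discussed next.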

The chief barrier to the wide and enthusiastic use of hyperclimbing is that it seems to scale very poorly with the number of attributes. When one heeds the seemingly high cost of applying hyperclimbing to large search spaces, this heuristic quickly loses its shine. A key conclusion of my doctoral work is that this seemingly high cost is illusory. I have uncovered evidence that strongly suggests that genetic algorithms can implement hyperclimbing extraordinarily efficiently.

As readers of this blog probably know, genetic algorithms are search algorithms that mimic natural evolution. These algorithms have been used in a wide range of engineering and scientific fields to quickly procure useful solutions to poorly understood (i.e. black-box) optimization problems. Unfortunately, despite the routine use of genetic algorithms for over three decades, their adaptive capacity has not been adequately accounted for. Given the evidence that genetic algorithms can implement efficient hyperclimbing, I’ve proposed a new explanation for the adaptive capacity of these algorithms. This new account—the generative fixation hypothesis—promises to spark significant advances in the fields of genetic algorithmics and discrete optimization.

The discovery that hyperclimbing is efficiently implementable also promises to have a non-negligible impact on the ecology of machine learning research. Optimization and machine learning are, after all, intimately related. Overlooking a few exceptions, the practice of machine learning research can be characterized as the effective reduction of difficult learning problems to optimization problems for which efficient algorithms exist. In other words, the machine learning problems that can effectively be tackled are in large part those that can in practice be reduced to optimization problems that can be tackled efficiently. Currently, this largely limits the class of tractable machine learning problems to the class of learning problems that can in practice be reduced to convex optimization problems [1]. The identification of general-purpose non-convex optimization heuristics with efficient implementations (e.g. hyperclimbing) thus has the potential to significantly extend the reach of machine learning.

For a description of hyperclimbing, and evidence that genetic algorithms can implement this heuristic efficiently, please see my dissertation.

[1] Kristin P. Bennett and Emilio Parrado-Hernandez. The interplay of optimization and machine learning research. Journal of Machine Learning Research, 7:1265–1281, 2006.


Google Group for Generative Fixation

August 27, 2009 at 6:21 pm (generative fixation)

The generative fixation hypothesis now has a Google group, a place to ask questions and share your insights. If you’re intrigued by the idea of generative fixation, please sign up.

http://groups.google.com/group/generativefixation


Dissertation Deposition

August 18, 2009 at 10:23 pm (Bit Frequency Visualization, QTL, active learning, building block hypothesis, combinatorial optimization, data mining, epistasis, evolutionary biology, function of recombination, generative fixation, genetic algorithms, genetics, hyperclimbing, hyperscapes, machine learning, max-sat, occam's razor, philosophy of science, philosopy, population genetics, sublinear computation)

I deposited my dissertation today.

Click here to see the final version (single spaced for easy reading).

Permalink 3 Comments

Back to the Future: A Science of Genetic Algorithms

July 22, 2009 at 11:07 pm (building block hypothesis, generative fixation, genetic algorithms, philosophy of science)

From the preface to my dissertation:

The foundations of most computer engineering disciplines are almost entirely mathematical. There is, for instance, almost no question about the soundness of the foundations of such engineering disciplines as graphics, machine learning, programming languages, and databases. An exception to this general rule is the field of genetic algorithmics, whose foundation includes a significant scientific component.

The existence of a science at the heart of this computer engineering discipline is regarded with nervousness. Science traffics in provisional truth; it requires one to adopt a form of skepticism that is more nuanced, and hence more difficult to master, than the radical sort of skepticism that suffices in mathematics and theoretical computer science. Many, therefore, would be happy to see science excised from the foundations of genetic algorithmics. Indeed, over the past decade and a half, much effort seems to have been devoted to turning genetic algorithmics into just another field of computer engineering, one with an entirely mathematical foundation.

Broadening one’s perspective beyond computer engineering, however, one cannot help wondering if much of this effort is not a little misplaced. Read the rest of this entry »


Red Dots, Blue Dots

June 29, 2009 at 7:02 pm (Bit Frequency Visualization, epistasis, generative fixation, symmetry-analysis)

In this blog entry I’d like to showcase just one of a number of remarkable findings that comprise the basis for the generative fixation hypothesis—a new explanation for the adaptive capacity of recombinative genetic algorithms.

Consider the following stochastic function, which takes a bitstring of length ℓ as input and returns a real value as output.

fitness(bitstring)
  accum = 0
  for i = 1 to 4
     accum = accum + bitstring[pivotalLoci[i]]
  end
  if accum is odd
     return a random value from normal distribution N(+0.25,1)
  else
     return a random value from normal distribution N(-0.25,1)
  end

The variable pivotalLoci is an array of four distinct integers between 1 and ℓ that specifies the locations of four loci (let’s call them A, B, C, D) of an input bitstring that matter in determining the bitstring’s fitness. These four loci are said to be pivotal. Read the rest of this entry »
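For readers who want to experiment, here is a minimal Python rendering of the pseudocode above. The factory function make_fitness and its rng parameter are conveniences introduced for this sketch, and the loci are 0-based here, whereas the pseudocode counts from 1.

```python
import random

def make_fitness(pivotal_loci, rng=random):
    """Build the stochastic fitness function for a given set of pivotal loci.

    pivotal_loci: four distinct 0-based indices into the bitstring.
    rng: any object with a gauss(mu, sigma) method (assumed, for seeding).
    """
    def fitness(bitstring):
        # Only the parity of the bits at the pivotal loci matters.
        accum = sum(bitstring[i] for i in pivotal_loci)
        mean = 0.25 if accum % 2 == 1 else -0.25
        # Unit variance, so the standard deviation is also 1.
        return rng.gauss(mean, 1.0)
    return fitness
```

For example, make_fitness([3, 17, 42, 60]) returns a function over bitstrings of length 61 or more; averaging many evaluations of the same bitstring recovers +0.25 or -0.25 depending on the parity at those four positions.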

Permalink 2 Comments

Bit Dynamics Visualization

December 30, 2008 at 1:07 pm (genetic algorithms, visualization)

I’ve found the bit dynamics visualizer included in speedyGA very useful for understanding the dynamics of simple genetic algorithms (SGAs) with bitstring genomes. In each generation the visualizer plots the frequency of the bit 1 at each locus (the frequency of the bit 0 is simply one minus this value).
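The quantity being plotted is easy to compute. The snippet below is a small Python illustration, not the speedyGA code (which is written in MATLAB); the function name bit_frequencies is made up for the example, and the population is assumed to be a list of equal-length lists of 0s and 1s.

```python
def bit_frequencies(population):
    """Return the frequency of the bit 1 at each locus of a population."""
    size = len(population)
    # zip(*population) walks the population column by column (locus by locus).
    return [sum(column) / size for column in zip(*population)]
```

Computing this list once per generation and plotting it as a row of dots gives one frame of the visualization.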

Here’s a visualization of the bit dynamics of an SGA with one-point crossover when applied to the Royal Roads fitness function. Going by the building block hypothesis, one expects to see the dots marching in an orderly fashion to the top of the plot in groups of eight or more.

That’s not what happens. Instead, one gets to see hitchhiking in action: look for a swift downward movement of certain dots in tandem with the swift upward movement of other dots at nearby loci.

[Movie: bit dynamics of an SGA with one-point crossover on the Royal Roads function; the original embed required Adobe Flash]

The maximum and average fitness in each generation of this run are shown below.

[Figure: avg_max_fitness_crossover1]

The MATLAB code used to generate these and other figures in this blog post can be found here.
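For readers unfamiliar with the Royal Roads function, the Python sketch below follows the commonly cited R1 variant: the bitstring is divided into contiguous blocks of eight bits, and each block that consists entirely of 1s contributes eight to the fitness. This is an assumption about the variant; the runs in this post use the MATLAB code linked above, which may differ in its details. The function name and the block_size parameter are chosen for the example.

```python
def royal_roads(bitstring, block_size=8):
    """R1-style Royal Roads fitness: block_size points per all-ones block."""
    total = 0
    # Only complete blocks are scored; a trailing partial block is ignored.
    for start in range(0, len(bitstring) - block_size + 1, block_size):
        if all(bitstring[start:start + block_size]):
            total += block_size
    return total
```

Under this definition a single 0 anywhere in a block forfeits the whole block, which is why the building block hypothesis leads one to expect loci to rise in groups of eight.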

Let’s visualize the bit dynamics of a population when an SGA with uniform crossover is applied to the Royal Roads function.

[Movie: bit dynamics of an SGA with uniform crossover on the Royal Roads function; the original embed required Adobe Flash]

The maximum and average fitness in each generation of this run are shown below.

Read the rest of this entry »


The Fundamental Problem with the Building Block Hypothesis (new manuscript)

October 18, 2008 at 8:30 pm (building block hypothesis, epistasis, genetic algorithms, occam's razor, philosophy of science, philosopy, population genetics)

Abstract: Skepticism of the building block hypothesis has previously been expressed on account of the weak theoretical foundations of this hypothesis and anomalies in the empirical record of the simple genetic algorithm. In this paper we focus on a more fundamental cause for skepticism: the extraordinary strength of some of the assumptions undergirding the building block hypothesis. As many of these assumptions have been embraced by the designers of so-called “competent” genetic algorithms, our critique is relevant to an appraisal of such algorithms. We argue that these assumptions are too strong to be acceptable without additional evidence. We then point out weaknesses in the arguments that have been provided in lieu of such evidence.

Download manuscript


What Are GAs Good For?

May 23, 2008 at 7:50 pm (QTL, combinatorial optimization, epistasis, genetic algorithms, genetics, symmetry-analysis)

Researchers studying the foundations of genetic algorithms have not, to the best of my knowledge, identified a non-trivial computational problem that a simple GA can solve robustly and scalably (I’ve previously raised this issue here). In my opinion, this singular fact is the clearest evidence for the inadequacy of the current paradigm within which we study the adaptive capacity of GAs. The question of what GAs are good for is, after all, intimately related to the question of how GAs work.

In a draft of one of my dissertation chapters I identify a hard computational problem and show that a GA can solve it robustly and scalably. Remarkably, this problem is closely related to a hairy statistical problem in computational biology. How might a GA leverage this kind of computational ability to perform adaptation? I’ll be presenting my theory about this in future chapters. The idea behind this theory is delightfully simple. Presenting it formally, however, is another story. Stay tuned.


