<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Hacking Evolution &#187; machine learning</title>
	<atom:link href="http://blog.hackingevolution.net/category/computer-science/machine-learning/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.hackingevolution.net</link>
	<description>Explaining Adaptation in Evolutionary Systems</description>
	<lastBuildDate>Fri, 27 Jan 2012 17:49:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.hackingevolution.net' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Hacking Evolution &#187; machine learning</title>
		<link>http://blog.hackingevolution.net</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.hackingevolution.net/osd.xml" title="Hacking Evolution" />
	<atom:link rel='hub' href='http://blog.hackingevolution.net/?pushpress=hub'/>
		<item>
		<title>Hyperclimbing, Genetic Algorithms, and Machine Learning</title>
		<link>http://blog.hackingevolution.net/2009/10/27/hyperclimbing/</link>
		<comments>http://blog.hackingevolution.net/2009/10/27/hyperclimbing/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 12:59:34 +0000</pubDate>
		<dc:creator>Keki</dc:creator>
				<category><![CDATA[generative fixation]]></category>
		<category><![CDATA[genetic algorithms]]></category>
		<category><![CDATA[hyperclimbing]]></category>
		<category><![CDATA[machine learning]]></category>

		<guid isPermaLink="false">http://blog.hackingevolution.net/?p=1057</guid>
		<description><![CDATA[I’ve identified a promising stochastic search heuristic, called hyperclimbing, for large-scale optimization over massive attribute product spaces (e.g., the set of all binary strings of some length N, where N is very large) with rugged fitness functions. Hyperclimbing works by progressively limiting sampling to a series of nested subsets with increasing expected fitness. At any [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.hackingevolution.net&amp;blog=3215331&amp;post=1057&amp;subd=hackingevolution&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I’ve identified a promising stochastic search heuristic, called <em>hyperclimbing</em>, for large-scale optimization over massive attribute product spaces (e.g., the set of all binary strings of some length <em>N</em>, where <em>N</em> is very large) with rugged fitness functions. Hyperclimbing works by progressively limiting sampling to a series of nested subsets with increasing expected fitness. At any given step, this heuristic sifts through vast numbers of coarse partitions of the subset it &#8220;inhabits&#8221;, and identifies ones that partition this set into subsets whose expected fitness values are significantly variegated. Because hyperclimbing is sensitive, not to the local features of a search space, but to certain more global statistics, it is not susceptible to the kinds of issues that waylay local search heuristics.</p>
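<p>To make the idea concrete, here is a deliberately naive sketch of the loop described above. It is not the procedure from the dissertation: it assumes bitstrings, considers only the coarse partitions induced by a single bit, and fixes one bit per step. The name <code>hyperclimb_sketch</code> and all of its parameters are made up for illustration.</p>

```python
import random

def hyperclimb_sketch(fitness, n, steps, samples=200, rng=None):
    """Toy hyperclimbing loop restricted to single-bit partitions (an assumption)."""
    rng = rng or random.Random(0)
    fixed = {}  # locus -> bit value; defines the subset currently "inhabited"
    for _ in range(steps):
        free = [i for i in range(n) if i not in fixed]
        if not free:
            break
        # Sample uniformly from the current subset and record each fitness value.
        population = []
        for _ in range(samples):
            x = [fixed.get(i, rng.randint(0, 1)) for i in range(n)]
            population.append((x, fitness(x)))
        # For each free locus, compare the mean fitness of the two halves it induces.
        best = None
        for i in free:
            sums, counts = [0.0, 0.0], [0, 0]
            for x, f in population:
                sums[x[i]] += f
                counts[x[i]] += 1
            if 0 in counts:
                continue  # one half was never sampled; skip this partition
            means = [sums[0] / counts[0], sums[1] / counts[1]]
            gap = abs(means[1] - means[0])
            if best is None or gap > best[0]:
                best = (gap, i, 0 if means[0] > means[1] else 1)
        if best is None:
            break
        _, locus, value = best
        fixed[locus] = value  # restrict all future sampling to the fitter half
    return fixed
```

<p>Called on a noisy fitness function in which only a couple of bits matter, the sketch should fix those bits first. Its cost grows with the number of partitions examined at each step, which is exactly the scaling concern raised in the next paragraph.</p>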
<p>The chief barrier to the wide and enthusiastic use of hyperclimbing is that it seems to scale very poorly with the number of attributes. If one heeds the seemingly high cost of applying hyperclimbing to large search spaces, this heuristic quickly loses its shine. A key conclusion of my doctoral work is that this seemingly high cost is illusory. I have uncovered evidence that strongly suggests that genetic algorithms can implement hyperclimbing extraordinarily efficiently.</p>
<p>As readers of this blog probably know, genetic algorithms are search algorithms that mimic natural evolution. These algorithms have been used in a wide range of engineering and scientific fields to quickly procure useful solutions to poorly understood (i.e. black-box) optimization problems. Unfortunately, despite the routine use of genetic algorithms for over three decades, their adaptive capacity has not been adequately accounted for. Given the evidence that genetic algorithms can implement efficient hyperclimbing, I’ve proposed a new explanation for the adaptive capacity of these algorithms. This new account&#8212;<a href="http://cs.brandeis.edu/~kekib/dissertation.html">the generative fixation hypothesis</a>&#8212;promises to spark significant advances in the fields of genetic algorithmics and discrete optimization.</p>
<p>The discovery that hyperclimbing is efficiently implementable also promises to have a non-negligible impact on the ecology of machine learning research. Optimization and machine learning are, after all, intimately related. Setting aside a few exceptions, the practice of machine learning research can be characterized as the effective reduction of difficult learning problems to optimization problems for which efficient algorithms exist. In other words, the machine learning problems that can effectively be tackled are in large part those that can <em>in practice</em> be reduced to optimization problems that can be tackled efficiently. Currently, this largely limits the class of tractable machine learning problems to the class of learning problems that can in practice be reduced to <em>convex</em> optimization problems [1]. The identification of general-purpose non-convex optimization heuristics with efficient implementations (e.g. hyperclimbing) thus has the potential to significantly extend the reach of machine learning.</p>
<p>For a description of hyperclimbing, and evidence that genetic algorithms can implement this heuristic efficiently, please see my <a href="http://cs.brandeis.edu/~kekib/dissertation.html">dissertation</a>.</p>
<p>[1]  Kristin P. Bennett and Emilio Parrado-Hernandez. <a href="http://jmlr.csail.mit.edu/papers/volume7/MLOPT-intro06a/MLOPT-intro06a.pdf">The interplay of optimization and machine  learning research</a>. Journal of Machine Learning Research, 7:1265–1281, 2006.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.hackingevolution.net/2009/10/27/hyperclimbing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Keki</media:title>
		</media:content>
	</item>
		<item>
		<title>Dissertation Deposition</title>
		<link>http://blog.hackingevolution.net/2009/08/18/dissertation-set-in-stone/</link>
		<comments>http://blog.hackingevolution.net/2009/08/18/dissertation-set-in-stone/#comments</comments>
		<pubDate>Wed, 19 Aug 2009 03:23:56 +0000</pubDate>
		<dc:creator>Keki</dc:creator>
				<category><![CDATA[active learning]]></category>
		<category><![CDATA[Bit Frequency Visualization]]></category>
		<category><![CDATA[building block hypothesis]]></category>
		<category><![CDATA[combinatorial optimization]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[epistasis]]></category>
		<category><![CDATA[evolutionary biology]]></category>
		<category><![CDATA[function of recombination]]></category>
		<category><![CDATA[generative fixation]]></category>
		<category><![CDATA[genetic algorithms]]></category>
		<category><![CDATA[genetics]]></category>
		<category><![CDATA[hyperclimbing]]></category>
		<category><![CDATA[hyperscapes]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[max-sat]]></category>
		<category><![CDATA[occam's razor]]></category>
		<category><![CDATA[philosophy of science]]></category>
		<category><![CDATA[philosopy]]></category>
		<category><![CDATA[population genetics]]></category>
		<category><![CDATA[QTL]]></category>
		<category><![CDATA[sublinear computation]]></category>

		<guid isPermaLink="false">http://blog.hackingevolution.net/?p=1021</guid>
		<description><![CDATA[I deposited my dissertation today. Click here to see the final version (single spaced for easy reading). Posted in active learning, Bit Frequency Visualization, building block hypothesis, combinatorial optimization, data mining, epistasis, evolutionary biology, function of recombination, generative fixation, genetic algorithms, genetics, hyperclimbing, hyperscapes, machine learning, max-sat, occam's razor, philosophy of science, philosopy, population genetics, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.hackingevolution.net&amp;blog=3215331&amp;post=1021&amp;subd=hackingevolution&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I deposited my dissertation today.</p>
<p><a href="http://cs.brandeis.edu/~kekib/dissertation.html">Click here</a> to see the final version (single spaced for easy reading).</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.hackingevolution.net/2009/08/18/dissertation-set-in-stone/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Keki</media:title>
		</media:content>
	</item>
		<item>
		<title>New manuscript now at arXiv</title>
		<link>http://blog.hackingevolution.net/2007/11/13/latest-manuscript-now-on-arxiv/</link>
		<comments>http://blog.hackingevolution.net/2007/11/13/latest-manuscript-now-on-arxiv/#comments</comments>
		<pubDate>Tue, 13 Nov 2007 05:45:46 +0000</pubDate>
		<dc:creator>Keki</dc:creator>
				<category><![CDATA[building block hypothesis]]></category>
		<category><![CDATA[coarse-graining]]></category>
		<category><![CDATA[genetic algorithms]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[manuscript]]></category>
		<category><![CDATA[mathematical]]></category>
		<category><![CDATA[technical]]></category>
		<category><![CDATA[wee-bit-ranty]]></category>

		<guid isPermaLink="false">http://evoadaptation.wordpress.com/2007/11/13/latest-manuscript-now-on-arxiv/</guid>
		<description><![CDATA[My latest manuscript is now posted at arXiv. http://arxiv.org/abs/0711.1401<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.hackingevolution.net&amp;blog=3215331&amp;post=14&amp;subd=hackingevolution&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>My latest manuscript is now posted at arXiv.</p>
<p><a href="http://arxiv.org/abs/0711.1401">http://arxiv.org/abs/0711.1401</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.hackingevolution.net/2007/11/13/latest-manuscript-now-on-arxiv/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Keki</media:title>
		</media:content>
	</item>
		<item>
		<title>Optimization, Adaptation, Machine Learning and Evolutionary Computation</title>
		<link>http://blog.hackingevolution.net/2007/09/04/optimization-adaptation-machine-learning-and-evolutionary-computation-2/</link>
		<comments>http://blog.hackingevolution.net/2007/09/04/optimization-adaptation-machine-learning-and-evolutionary-computation-2/#comments</comments>
		<pubDate>Tue, 04 Sep 2007 14:53:05 +0000</pubDate>
		<dc:creator>Keki</dc:creator>
				<category><![CDATA[genetic algorithms]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[overview]]></category>

		<guid isPermaLink="false">http://evoadaptation.wordpress.com/2007/09/04/optimization-adaptation-machine-learning-and-evolutionary-computation-2/</guid>
		<description><![CDATA[From the introduction of a manuscript that I recently submitted for review The practice of Machine Learning research can be characterized as the effective semiprincipled reduction of learning problems to problems for which robust and efficient solution techniques exist &#8211; ideally ones with provable bounds on their use of time and space. In a recent [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.hackingevolution.net&amp;blog=3215331&amp;post=11&amp;subd=hackingevolution&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>From the introduction of a <a title="Towards a Sound Theory of Adaptation for the Simple Genetic Algorithm" href="http://arxiv.org/PS_cache/arxiv/pdf/0711/0711.1401v1.pdf">manuscript</a> that I recently submitted for review</p>
<p>The practice of Machine Learning research can be characterized as the effective semiprincipled reduction of learning problems to problems for which robust and efficient solution techniques exist &#8211; ideally ones with provable bounds on their use of time and space. In a recent paper Bennett and Parrado-Hernández (2006) describe the synergistic relationship between the fields of machine learning (ML) and mathematical programming (MP). They remark:</p>
<p>&#8220;Optimization lies at the heart of machine learning. Most machine learning problems reduce to optimization problems. Consider the machine learning analyst in action solving a problem for some set of data. The modeler formulates the problem by selecting an appropriate family of models and massages the data into a format amenable to modeling. Then the model is typically trained by solving a core optimization problem that <span id="more-11"></span>optimizes the variables or parameters of the model with respect to the selected loss function and possibly some regularization function. In the process of model selection and validation, the core optimization problem may be solved many times. The research area of mathematical programming theory intersects with machine learning through these core optimization problems&#8221; (Bennett and Parrado-Hernández, 2006).</p>
<p>Later Bennett and Parrado-Hernández imply that when the targets of ML reductions have been optimization problems, they have for the most part been the convex optimization problems within the MP pantheon.</p>
<p>&#8220;Convexity plays a key role in mathematical programming. Convex programs minimize convex optimization functions subject to convex constraints ensuring that every local minimum is always a global minimum. In general, convex problems are much more tractable algorithmically and theoretically. The complexity of nonconvex problems can grow enormously. General nonconvex programs are NP-hard.&#8221; (Bennett and Parrado-Hernández, 2006).</p>
<p>The close relationship between ML and MP arguably exists because MP provides ML with a set of crisp, well-defined problems along with algorithmic solvers that come with guarantees on their use of time and space. To state this using metaphors from software engineering, the well-defined convex optimization problems are interfaces that MP publishes, and the provably efficient and robust algorithmic solvers of MP implement these interfaces.</p>
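<p>The software-engineering metaphor can be spelled out in a few lines of Python. This is only a sketch under stated assumptions: <code>ConvexMinimizer</code>, <code>GradientDescent</code> and <code>fit_line</code> are hypothetical names, the "interface" is reduced to "minimize a differentiable convex function given its gradient", and the "reduction" is ordinary least-squares line fitting.</p>

```python
from typing import Callable, List, Protocol

Vector = List[float]

class ConvexMinimizer(Protocol):
    """The published 'interface': minimize a convex function given its gradient."""
    def minimize(self, grad: Callable[[Vector], Vector], x0: Vector) -> Vector: ...

class GradientDescent:
    """One 'implementation' of the interface (hypothetical, fixed step size)."""
    def __init__(self, step: float = 0.1, iters: int = 2000) -> None:
        self.step, self.iters = step, iters

    def minimize(self, grad, x0):
        x = list(x0)
        for _ in range(self.iters):
            g = grad(x)
            x = [xi - self.step * gi for xi, gi in zip(x, g)]
        return x

def fit_line(xs, ys, solver: ConvexMinimizer):
    """An 'ML reduction': least-squares line fitting posed as convex minimization."""
    n = len(xs)
    def grad(w):
        a, b = w
        residuals = [a * x + b - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to slope and intercept.
        return [2 * sum(r * x for r, x in zip(residuals, xs)) / n,
                2 * sum(residuals) / n]
    return solver.minimize(grad, [0.0, 0.0])
```

<p>The point of the design is that <code>fit_line</code> knows nothing about how the solver works; any object with a compatible <code>minimize</code> method could be swapped in, which is what lets a researcher test a reduction with an off-the-shelf solver.</p>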
<p>Let us differentiate, in this paper, between optimization and adaptation. We define optimization as the procurement of one or more points of optimal or close-to-optimal value, and adaptation as the generation of points of increasing value over time. Given this definition, to say that the target problems of Machine Learning reductions are optimization problems is to fudge the truth somewhat. While the Mathematical Programming community indeed seems to be almost completely concerned with the procurement of optimal or close to optimal points, ML researchers aren&#8217;t interested in optimization per se but in the means by which it is achieved in most MP algorithms, i.e. adaptation. In fact optimization is often prevented in machine learning algorithms &#8211; using a &#8220;technique&#8221; named early-stopping &#8211; to prevent overfitting. In other words, robust, efficient adaptation is the modus operandi of most convex optimization algorithms, and for the most part, it is this modus operandi that makes these algorithms interesting to Machine Learning researchers.</p>
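<p>Early stopping makes the distinction tangible. In the sketch below (illustrative only; <code>train_with_early_stopping</code>, <code>step</code> and <code>val_loss</code> are made-up names), an adaptive procedure keeps producing points of increasing training value, but the loop halts as soon as a held-out validation loss stops improving, well before the training optimum is reached.</p>

```python
def train_with_early_stopping(step, val_loss, w0, patience=3, max_iters=1000):
    """Apply `step` repeatedly; return the iterate with the best validation loss.

    `step` maps the current parameters to the next ones (one adaptation step on
    the training objective); `val_loss` scores parameters on held-out data.
    """
    w, best_w, best_val, bad = w0, w0, val_loss(w0), 0
    for _ in range(max_iters):
        w = step(w)
        v = val_loss(w)
        if best_val > v:
            best_w, best_val, bad = w, v, 0   # validation improved: keep going
        else:
            bad += 1
            if bad >= patience:               # no improvement for a while: stop
                break
    return best_w

# Toy setup (an assumption): training loss (w - 1)^2, validation loss (w - 0.6)^2.
gd_step = lambda w: w - 0.1 * 2.0 * (w - 1.0)
stopped_at = train_with_early_stopping(gd_step, lambda w: (w - 0.6) ** 2, 0.0)
```

<p>In this toy setup the returned parameter sits near the validation optimum at 0.6 rather than at the training optimum 1.0: the optimizer is valued for the trajectory it produces, not for where it would eventually converge.</p>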
<p>The interface-problems published by the MP community give ML researchers useful targets to hit; if an ML researcher works out a semi-principled reduction of a class of learning problems to one of MP&#8217;s interface-problems, there are off-the-shelf algorithms within MP which allow her to quickly test whether her reduction is effective. Because of the emphasis that the ML community places on guarantees of robustness and efficiency, when the targets of ML reductions have been optimization problems, they have, for the most part, been restricted to convex optimization problems within MP. These problems are rather simple as adaptation problems go &#8211; every local optimum is also a global optimum, or stated differently, there are no merely local optima. Rather heroic feats of ingenuity are therefore necessary in order to obtain effective semi-principled reductions of hard problems to these simple optimization problems. The difficulty of obtaining such reductions is currently a fundamental limitation on the pace of progress within ML.</p>
<p>The SGA (Mitchell, 1996) is an adaptation algorithm which mimics natural sexual evolution. It has been directly applied to a large number of hard real-world problems and has often succeeded in generating solutions of remarkably high quality. To be sure, some amount of thought is required to &#8220;massage&#8221; these problems into a form which allows the SGA to operate successfully on them (e.g. choices must be made about the fitness function used and the way solutions are encoded as bitstrings), but unlike the case in machine learning this massaging is largely ad-hoc, an outcome more of trial and error than principled reasoning. The resulting problems are almost certainly hard ones (non-convex), with objective functions that are riddled with local optima. It is a testament to the adaptive power of the SGA that it nevertheless often produces solutions of remarkably high quality. Given these successes one might expect a great deal of interest in SGAs from the machine learning community. That this is not the case speaks to an unfortunate shortcoming of GA research. There is no dearth of one-off problems that SGAs have adequately solved. However, GA researchers have yet to publish a single class of problems such that a) SGAs are likely to perform robust, efficient adaptation when applied to problems in this class, and b) the class is likely to be useful as the target of ML reductions. For the sake of brevity we will loosely define such a class of problems as an SGA-Easy/ML-Useful class. We believe that when such problem classes are found the ML community will begin to take a greater interest in GA research. The future relationship between the GA and ML communities might then be similar to the one that currently exists between MP and ML. As mentioned above, SGAs commonly adapt high-quality solutions to problems which almost certainly contain large numbers of local optima.
It is reasonable therefore to suspect that there exists an SGA-Easy/ML-Useful class of hard non-convex problems and that the identification of this class will significantly ease the burden of obtaining novel ML reductions. We believe that the identification of such a problem class will go hand in hand with the discovery of a theory which can give a satisfying explanation of the adaptive capacity of the SGA. Such a theory does not currently exist.</p>
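<p>For readers who have not met the SGA, a minimal sketch follows. It assumes the textbook ingredients (fitness-proportional selection, one-point crossover, bit-flip mutation); the function name, parameter values and the requirement that fitness be strictly positive are choices made for this illustration, not details taken from the manuscript.</p>

```python
import random

def simple_ga(fitness, n_bits, pop_size=60, generations=60,
              crossover_rate=0.7, mutation_rate=0.01, seed=0):
    """Minimal simple genetic algorithm over bitstrings.

    Assumes `fitness` returns a strictly positive number and `pop_size` is even.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        weights = [fitness(x) for x in pop]
        next_pop = []
        for _ in range(pop_size // 2):
            # Fitness-proportional ("roulette wheel") selection of two parents.
            mom, dad = rng.choices(pop, weights=weights, k=2)
            a, b = mom[:], dad[:]
            if crossover_rate > rng.random():
                cut = rng.randrange(1, n_bits)        # one-point crossover
                a, b = mom[:cut] + dad[cut:], dad[:cut] + mom[cut:]
            for child in (a, b):
                for i in range(n_bits):
                    if mutation_rate > rng.random():
                        child[i] ^= 1                 # bit-flip mutation
                next_pop.append(child)
        pop = next_pop
        best = max(pop + [best], key=fitness)         # remember the best seen so far
    return best
```

<p>A call such as <code>simple_ga(lambda x: sum(x) + 1, n_bits=20)</code> runs it on the toy "count the ones" problem. Note how little the algorithm asks of the problem: only a bitstring encoding and a fitness function, which is precisely the ad-hoc "massaging" mentioned above.</p>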
]]></content:encoded>
			<wfw:commentRss>http://blog.hackingevolution.net/2007/09/04/optimization-adaptation-machine-learning-and-evolutionary-computation-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Keki</media:title>
		</media:content>
	</item>
		<item>
		<title>Optimization, Adaptation, Machine Learning and Evolutionary Computation (unpolished)</title>
		<link>http://blog.hackingevolution.net/2007/02/02/optimization-adaptation-machine-learning-and-evolutionary-computation/</link>
		<comments>http://blog.hackingevolution.net/2007/02/02/optimization-adaptation-machine-learning-and-evolutionary-computation/#comments</comments>
		<pubDate>Fri, 02 Feb 2007 23:41:38 +0000</pubDate>
		<dc:creator>Keki</dc:creator>
				<category><![CDATA[machine learning]]></category>

		<guid isPermaLink="false">http://evoadaptation.wordpress.com/2007/02/02/optimization-adaptation-machine-learning-and-evolutionary-computation/</guid>
		<description><![CDATA[I recently came across a wonderful bird&#8217;s eye-view paper in JMLR [1]. It is helping me to clarify my views about the relationship between Machine Learning and Evolutionary Computation. Bennett and Parrado-Hernandez remark: “Optimization lies at the heart of machine learning. Most machine learning problems reduce to optimization problems. Consider the machine learning analyst in [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.hackingevolution.net&amp;blog=3215331&amp;post=10&amp;subd=hackingevolution&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I recently came across a wonderful bird&#8217;s-eye-view paper in JMLR [1]. It is helping me to clarify my views about the relationship between Machine Learning and Evolutionary Computation.</p>
<p>Bennett and Parrado-Hernandez remark:</p>
<p>“Optimization lies at the heart of machine learning. Most machine learning problems reduce to optimization problems. Consider the machine learning analyst in action solving a problem for some set of data. The modeler formulates the problem by selecting an appropriate family of models and massages the data into<span id="more-10"></span> a format amenable to modeling. Then the model is typically trained by solving a core optimization problem that optimizes the variables or parameters of the model with respect to the selected loss function and possibly some regularization function. In the process of model selection and validation, the core optimization problem may be solved many times. The research area of mathematical programming theory intersects with machine learning through these core optimization problems”</p>
<p>and later,</p>
<p>“Convexity plays a key role in mathematical programming [MP]. Convex programs minimize convex optimization functions subject to convex constraints ensuring that every local [optimum] is always a global [optimum]. In general, convex problems are much more tractable algorithmically and theoretically. &#8230; General nonconvex programs are NP-hard”</p>
<p>This paper lends support to my sense that the optimization techniques used in machine learning are largely limited to convex optimization algorithms. Successes in machine learning have so far been a result of the application of human ingenuity to carry out <em>effective semi-principled</em> reductions of certain classes of problems to convex optimization problems. I use the qualifier <em>semi-principled</em> because there is usually some solid rationale behind an ML reduction that leads one to expect that the reduction makes the original problem amenable to solution by convex optimization techniques. This, however, is never formally proven; i.e., the reduction is not formal in the sense that the traveling salesman problem formally reduces to SAT. In all likelihood the optimization problems that the reductions yield are <em>non-convex</em>. However, in practice, the application of convex optimization techniques to these optimization problems often gives satisfactory solutions to the original problems. Hence the qualifier <em>effective</em>.</p>
<p>For example, researchers in the support vector machine community use the kernel trick in an effective semi-principled reduction of classification and regression problems to a class of convex optimization problems called quadratic programming problems. These problems are then solved using quadratic programming algorithms such as Platt&#8217;s SMO algorithm.</p>
<p>To summarize the discussion so far: convex optimization problems are a class of optimization problems in which every local optimum is also a global optimum. Many useful algorithms for efficiently solving convex optimization problems have been constructed by the mathematical programming community. One of the reasons for the success of Machine Learning techniques is the use of human ingenuity to achieve effective semi-principled reductions of certain classes of problems to convex optimization problems for which there exist efficient solvers within MP.</p>
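<p>A one-dimensional toy case shows why this property matters. The functions below are my own illustrative choices, not examples from [1]: plain gradient descent reaches the same point on a convex function from any start, but on a non-convex one the answer depends on where it begins.</p>

```python
def gradient_descent(grad, x0, step=0.01, iters=5000):
    """Plain fixed-step gradient descent on a function of one variable."""
    x = x0
    for _ in range(iters):
        x = x - step * grad(x)
    return x

def convex_grad(x):
    return 2.0 * (x - 1.0)                 # gradient of (x - 1)^2

def nonconvex(x):
    return x**4 - 2.0 * x**2 + 0.5 * x     # two basins of unequal depth

def nonconvex_grad(x):
    return 4.0 * x**3 - 4.0 * x + 0.5

starts = (-2.0, 2.0)
convex_ends = [gradient_descent(convex_grad, s) for s in starts]
nonconvex_ends = [gradient_descent(nonconvex_grad, s) for s in starts]
```

<p>Both convex runs end at the single minimum, while the two non-convex runs settle in different basins, and only the left one is the global minimum. A local method gives no hint that the right-hand answer is inferior.</p>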
<p>Another reason, as the authors mention, is that machine learning researchers have been tweaking the convex optimization algorithms of mathematical programming to create new algorithms that are better suited to their purposes. “Mathematical Programming puts a premium on accuracy, speed, and robustness. Since generalization is the bottom line in machine learning and training is normally done off-line, accuracy and small speed improvements are of little concern in machine learning. Machine learning prefers simpler algorithms that work in reasonable computational time for specific classes of problems” [1]. The paper later mentions that optimization is often cut short (!) in machine learning to prevent overfitting. This “technique” is called <em>early stopping</em>. And later: &#8216;Thus not only is “good” optimization not necessary, but “bad” optimization algorithms can lead to better machine learning models&#8217;.</p>
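<p>To make early stopping concrete, here is a small sketch under assumptions of my own: a one-parameter linear model, made-up training and validation data, and a patience-based stopping rule (stop once the validation loss has failed to improve for a few steps in a row). The paper does not prescribe this particular rule; it is just one common way of cutting the optimization short.</p>

```python
# Early stopping on a one-parameter model y = w * x (toy data, hypothetical).

def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def train_with_early_stopping(train, valid, step=0.01, patience=5, max_iters=1000):
    """Gradient descent on the training loss, halted by the validation loss.

    Returns (best_w, final_w): the weight with the lowest validation loss
    seen so far, and the weight at the moment training stopped.
    """
    w = 0.0
    best_w, best_loss, since_best = w, mse(w, *valid), 0
    for _ in range(max_iters):
        w -= step * mse_grad(w, *train)
        loss = mse(w, *valid)
        if best_loss > loss:
            best_w, best_loss, since_best = w, loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # validation loss stopped improving: cut training short
    return best_w, w


# The training targets are systematically off (slope 1.2), standing in for
# noise that the model would fit if optimized to completion; the validation
# targets follow slope 1.0.
train = ([1.0, 2.0, 3.0], [1.2, 2.4, 3.6])
valid = ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])

best_w, final_w = train_with_early_stopping(train, valid)
print("weight kept by early stopping:", best_w)
print("weight when training halted:  ", final_w)
```

<p>Running the training loss all the way to its optimum would drive w to 1.2, which is worse on the validation data than the weight that early stopping keeps. That is the sense in which &#8220;bad&#8221; optimization can give a better model.</p>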
<p>Here are some reformulations of this paper and some of my thoughts.</p>
<p>Let me differentiate between optimization and adaptation. I&#8217;ll define <em>optimization</em> as the procurement of one or more points of optimal or close-to-optimal value, and <em>adaptation</em> as the generation of points of increasing value over time. Things become quite clear in light of this distinction. Mathematical Programming seems to be almost completely concerned with optimization whereas machine learning seems to require relatively quick <em>adaptation</em>.</p>
<p>The love affair between mathematical programming and machine learning exists because MP provides the world with a set of crisp, well-defined problems (e.g. a linear programming problem, a quadratic programming problem, a second-order cone programming problem, a semidefinite programming problem, or a semi-infinite programming problem; see the appendix in [1] for a description of each of these convex optimization problems) and then provides programmatic techniques for solving those problems. To state this using metaphors from software engineering (particularly object-oriented programming), these well-defined problems are the <em>interfaces</em> that MP publishes, and the efficient algorithmic solvers within that field <em>implement</em> these interfaces.</p>
<p>The interface-problems published by the MP community give ML researchers a rough target to hit in their efforts to carry out effective semi-principled reductions of difficult problems to ones for which efficient solvers exist. The interface-problems of MP are similar to the optimization problems that ML researchers ultimately end up with, but there are important differences: for example, MPers seek fast optimization whereas MLers merely want a high degree of adaptation, and have additional requirements such as scalability. Nevertheless, if an ML researcher works out a semi-principled reduction of a class of problems to one of MP&#8217;s interface-problems, there are off-the-shelf algorithms within MP which allow her to quickly test whether her reduction is effective. MLers put great stock in the fact that each convex optimization algorithm in the MP pantheon has robust performance over its problem class and is theoretically proven to converge to the optimum.</p>
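<p>The interface metaphor can be sketched in code. Everything below is hypothetical and chosen for brevity: <code>BoxQuadratic</code> is a toy interface-problem (a one-dimensional convex quadratic over an interval), <code>Solver</code> is the interface, two solvers implement it, and <code>reduce_ridge</code> stands in for the ML researcher&#8217;s reduction (here a one-weight ridge regression with a bounded weight, which happens to reduce exactly).</p>

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class BoxQuadratic:
    """Toy interface-problem: minimize 0.5*a*x^2 + b*x for x in [lo, hi], a > 0."""
    a: float
    b: float
    lo: float
    hi: float


class Solver(Protocol):
    """The published interface: anything that can solve a BoxQuadratic."""
    def solve(self, problem: BoxQuadratic) -> float: ...


def clip(x, lo, hi):
    """Project x onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


class ClosedFormSolver:
    """Implements the interface by clipping the unconstrained minimizer."""
    def solve(self, problem: BoxQuadratic) -> float:
        return clip(-problem.b / problem.a, problem.lo, problem.hi)


class ProjectedGradientSolver:
    """Implements the same interface by projected gradient descent."""
    def __init__(self, iters: int = 200):
        self.iters = iters

    def solve(self, problem: BoxQuadratic) -> float:
        x = clip(0.0, problem.lo, problem.hi)
        step = 0.5 / problem.a
        for _ in range(self.iters):
            grad = problem.a * x + problem.b
            x = clip(x - step * grad, problem.lo, problem.hi)
        return x


def reduce_ridge(xs, ys, lam, bound):
    """Reduce one-weight ridge regression with |w| bounded to a BoxQuadratic.

    0.5 * sum((w*x - y)^2) + 0.5 * lam * w^2 equals
    0.5 * a * w^2 + b * w plus a constant, with a and b as below.
    """
    a = sum(x * x for x in xs) + lam
    b = -sum(x * y for x, y in zip(xs, ys))
    return BoxQuadratic(a=a, b=b, lo=-bound, hi=bound)


problem = reduce_ridge([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], lam=1.0, bound=5.0)
for solver in (ClosedFormSolver(), ProjectedGradientSolver()):
    print(type(solver).__name__, solver.solve(problem))
```

<p>The point of the design is that <code>reduce_ridge</code> knows nothing about how the problem gets solved, and the solvers know nothing about regression. Once a reduction hits the interface, any implementation can be swapped in to test it.</p>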
<p>Machine learning researchers spend their time trying to achieve effective reductions of difficult problems to problem classes for which efficient and robust solvers exist. An effective reduction may be a multistep affair (effectively reduce problem A to problem B, effectively reduce problem B to problem C, &#8230;), but it usually bottoms out in an effective reduction to an optimization problem, most often a convex one. Different techniques and tricks may be used at each step of an effective reduction (e.g. probabilistic models, the kernel trick, etc.).</p>
<p>Convex optimization problems are rather easy as optimization problems go. Therefore a large amount of human ingenuity is required to carry out effective semi-principled reductions of classes of real-world problems to convex optimization problems. These reductions often involve the use of “heavy machinery” from computer science, mathematics and statistics.</p>
<p>It seems to me that fundamental advances by the machine learning community are currently being limited by two factors (besides computational speed). The first is the sheer level of human ingenuity that is required to realize novel reductions of real-world problems to convex optimization problems. The second is that the targets of these reductions have so far been limited to convex optimization problems. The first factor cannot be changed. With regard to the second, I believe that the field of evolutionary computation may have a lot to contribute. Selecto-recombinative evolutionary algorithms are widely believed to perform efficient adaptation even on non-convex problems. Unfortunately, there are currently no crisp descriptions of the classes of problems that such algorithms can efficiently &#8216;solve&#8217;. I believe that if and when such interface-problems are determined, the ML community will take much greater interest in evolutionary computation. EC might then play the same role with respect to ML that MP currently plays, i.e. ML researchers might then seek effective semi-principled reductions of real-world problems to interface-problems in the EC pantheon.</p>
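<p>For readers who have not seen one, here is a bare-bones sketch of a selecto-recombinative evolutionary algorithm, which I take to mean selection plus recombination with no mutation. The specifics are my assumptions: bitstring individuals, binary tournament selection, uniform crossover, and the OneMax fitness function (count the ones). OneMax is trivially easy and says nothing about the hard problem classes discussed above; it is used only so that the loop is easy to read and so that adaptation, in the sense of rising fitness over time, is visible.</p>

```python
import random


def onemax(bits):
    """Fitness: the number of ones in the bitstring."""
    return sum(bits)


def tournament(pop, fitness, rng, k=2):
    """Selection: draw k distinct individuals at random and keep the fittest."""
    return max(rng.sample(pop, k), key=fitness)


def uniform_crossover(a, b, rng):
    """Recombination: take each bit from one parent or the other, at random."""
    return [x if rng.random() > 0.5 else y for x, y in zip(a, b)]


def evolve(length=40, pop_size=100, generations=30, seed=0):
    """Run selection and recombination only; return mean fitness per generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = [sum(onemax(ind) for ind in pop) / pop_size]
    for _ in range(generations):
        pop = [
            uniform_crossover(
                tournament(pop, onemax, rng),
                tournament(pop, onemax, rng),
                rng,
            )
            for _ in range(pop_size)
        ]
        history.append(sum(onemax(ind) for ind in pop) / pop_size)
    return history


history = evolve()
print("mean fitness, first generation:", history[0])
print("mean fitness, last generation: ", history[-1])
```

<p>Note that nothing in the loop asks whether the fitness function is convex, or even what kind of function it is. That generality is the appeal, and the lack of a crisp statement of when it works is the problem.</p>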
<p>References:<br />
[1] Bennett, K.P., Parrado-Hernandez, E. The interplay of optimization and machine learning research. <em>Journal of Machine Learning Research</em> 7 (2006) 1265-1281.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.hackingevolution.net/2007/02/02/optimization-adaptation-machine-learning-and-evolutionary-computation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">Keki</media:title>
		</media:content>
	</item>
	</channel>
</rss>
