I have been writing a book on unconventional optimization algorithms (for the lack of a better well known name). The phrase I use is "inspired algorithms", although people in the know may recognize the techniques drawn from fields such as computational intelligence algorithms, metaheuristics, biologically inspired algorithms, natural algorithms, biomimetic algorithms, and others. These are the type of global optimization methods used when traditional techniques are unavailable, ineffective, and/or do not meet the broader constraints of the problem (need a good enough result and need it fast). They are also typically easy to understand and interesting to investigate.
My vision for the book project has centered on producing a reference text, where a large and diverse set of algorithms (50+) are presented in a useful and consistent way (inspiration, strategy, procedure, configuration heuristics, etc.). Additionally, I have intended to provide working code for each technique (ruby), hoping that the final text would be useful to interested amateurs, programmers, and research scientists. Really, it's a book I'm writing for the enthusiastic ignorant kid with an aptitude for programming I was 10 years ago.
Problem
I have 'started' writing this book more than 6 times over the last year, and some incarnations have made it online, although at this point are not representative of what I am trying to achieve. In my most recent effort, I started by questioning the set of algorithms I'd previously selected to write about. I'd originally listed all algorithms I could think of and selected a subset of 50 that I felt were sufficiently diverse, interesting, and useful. The decision making process was purely subjective, based on and biased by my experiences, exposures and hacking.
This challenge lead to the question:
Is it possible to select a diverse, interesting, and useful set of inspired algorithms using a simple data driven method?
An ideal selection method would choose algorithms based on metrics that captured relative algorithm efficiency, efficacy, quality of research, and maybe even scope of innovation. Algorithms may ideally be grouped by relative similarity in computational process and problem class suitability. Unfortunately, many if not most of such measures are intractable, especially for the amount of time I'm willing to put in and for the expected benefit such a list would offer.
I need a metaheuristic for preparing a list of metaheuristics.
Superficially, the problem of preparing a set of algorithms to write about in a book is very much like the problem of selecting a set of algorithms to apply to a given problem (presuming suitability criteria listed above are met - that you are resorting to an unconventional search algorithms for defensible reasons). Such algorithms are (typically in my experience) easy to describe, easy to understand, easy to program, and easy to get some kind of solution quickly. The algorithms one selects to apply are strongly biased by what the books, colleagues, and experiences one has had with such algorithms, much like the bias a guy like me has in selecting algorithms to write about in a book. This suggests that a data driven algorithm selection method may be useful for a practitioner trying solving a problem.
This further raises the question as to how to assess a produced 'set of selected algorithms'. Why is one set better than another? Is it the number of books sold? Is it the number of in-references achieved after 2, 5, or 10 years? If one were solving a problem and applying an array of techniques, is one technique better than another because it produced a slightly better solution? Are you sure it was coded correctly, that it was sufficiently tuned to the specifics of the domain?
Some 'algorithms' are more general methods than strict procedures. If one were considering array sorting algorithms, one could prepare an extensive list of known algorithms and compare each precisely by min, max, avg, etc computational complexity over resources like space and time. This is because such algorithms can be precisely described mathematically and then programmatically. I presume this applies to the so-called heuristic methods of fields like operations research and (linear, quadratic, ...) programming for the same reasons. This generally does not apply to so-called metaheuristics either because the field has chosen not to do so, or (as I think is more likely) because it is not possible because their generality is a core strength.
Nevertheless, comparing the resultant sets of selected algorithm may not be tractable, and if it is, it may not be meaningful.
Rather than walk away from this challenge, I decided to have a crack at it. I likely got it wrong, but wrong can be fixed and at least we'll (I'll) have a base for improvement (no doubt an axiom of the general fields that originally produced these algorithms in the face of hard optimization problems).
MethodI'll call this method the 'Google method' or the 'common usage' method as it somewhat resembles
the principle Google use to offer suggested corrections to misspelled search queries, as well as their translation services. The principle of my algorithm evaluation and selection method is:
An algorithm or method is as good its name, as useful and interesting as it is abundant.
A good optimization method is used a lot, and people write about using it. Further, a method that is used a lot stabilizes into a canonical form and coalesces under a common name. A corollary of this assumption that relates to the previous discussion is that those techniques that are not well described or implement correctly are not used and do not propagate. They are simplified and re-communicated for use or lost in the ever increasing flood of new material.
An algorithm that is used (written about) is an algorithm that is useful (worth knowing about).
The proposed data-driven method involves 1) eliciting an extensive list of algorithm names, 2) selecting algorithm names in a consistent manner, 3) tallying the number of search results hits for the exact algorithm name across a number of domains, and 4) compressing the tally measure into an ordinal algorithm ranking metric. The set or sets of useful and interesting algorithms may then be drawn from the the ranked listing.
It feels like an algorithm popularity contest, because that is exactly what it is, although popularity is re-measured across somewhat bounded to domains. A thread of research on an algorithm may publish and/or perish irrespective of the methods inherent qualities, and that may be the core argument against the proposed method.
ExperimentHere is what I did (a few hours a day between Wednesday 29th and yesterday):
- Wrote down all the algorithms I could think of, generally limited to algorithms for finding something in a domain (optimization or search) that broadly fit into the fields listed at the beginning of this post (excluding traditional heuristics like branch and bound and A* and popular computational intelligence fields like fuzzy logic and artificial neural networks).
- I tweeted my followers and emailed colleagues to send me a listing of their top 10 computational intelligence and metaheuristic algorithms, where the definition of favorite was left open. When domain experts start talking about their favorite things about their domain, one should listen: the collected survey data is valuable. I originally thought that I could construct a list from this survey data alone, although the lack of volume meant that at best I could do (at this point) is use it to update the master algorithm list.
- Normalized algorithm names to their full unabbreviated form. No acronyms wherever reasonably possible.
- Assigned each algorithm a superficial grouping or kingdom (I thought for use in a book chapter structure) based on general observation and intuition. Kingdom names were generally based on the motivation for the methods strategy, including: evolutionary, immunological, swarm, physical, probabilistic, and stochastic.
- Recorded the estimated-total number of search results for the exact algorithm name (quoted) in Google Web Search, Google Scholar Search, Google Book Search, Springer Search, and Scirus Search.
- Calculated the relative algorithm rank metric as the sum of the normalized search result measures.
ResultsThe raw results including all data collected is provided in a Google docs spreadsheet (note: with multiple sheets/tabs), entitled simply
Algorithm List.
Have a play with the data (if you're that way inclined) and postulate some improvements in the comments of this post. If their is moderate interest, I'll update the algorithm listing, ranking equation, et cetera and edit/re-post the results. If there is significant interest, I bash out a web application to allow open algorithm name submission and automated assessment and rankings (I hope so!).
There are a lot of things that can be drawn from the collected data, at the very least the top 10 algorithms in each of the imposed algorithm groups (at least those groups with more than 10 algorithms). Pay heed optimization algorithm
padawans:
- Top 10 Algorithms: Genetic Algorithm, Simulated Annealing, Genetic Programming, Random Search, Tabu Search, Evolutionary Programming, Evolution Strategies, Particle Swarm Optimization, Expectation-maximization algorithm, Bayesian Optimization Algorithm
- Top 10 Evolutionary Algorithms: Genetic Algorithm, Genetic Programming, Evolutionary Programming, Evolution Strategies, Differential Evolution, Grammatical Evolution, Learning Classifier System, Non-dominated Sorting Genetic Algorithm, Strength Pareto Evolutionary Algorithm, Messy Genetic Algorithm
- Top 10 Swarm Algorithms:Particle Swarm Optimization, Ant System, Ant Colony System, AntNet, MAX-MIN Ant System, Multiple ant colony system, Population-based Ant Colony Optimization, Bees Algorithm, Rank-Based Ant System, Bee Colony Optimization
- Top 10 Probabilistic Algorithms: Expectation-maximization algorithm, Bayesian Optimization Algorithm, Compact Genetic Algorithm, Extended Compact Genetic Algorithm, Cross-entropy method, Population-based incremental learning, Hierarchical Bayesian Optimization Algorithm, Probabilistic Incremental Program Evolution, Univariate Marginal Distribution Algorithm, Bivariate Marginal Distribution Algorithm
- Top 10 Stochastic Algorithms:Random Search, Tabu Search, Scatter Search, Hill Climbing Search, Iterated Local Search, Guided Local Search, Greedy Randomized Adaptive Search Procedure, Variable Neighbourhood Search, Adaptive Random Search, Stochastic gradient descent algorithm
Observations- Algorithm relevance. How wrong is the compiled list of algorithms? A factor of 2 or 3 (likely)? An order of 1 or 2 (it's possible)? I have seen many cases in my own trawling where researchers have renamed an algorithm for each paper published. How relevant are such algorithms to the practitioner they may claim to facilitate if one cannot name them to find them? One might argue that if an approach cannot be found (or cannot be understood) by those who might use or extend it, then it may as well not exist. I am not ignoring the converse argument that such published nuggets may be found and exploited at a later date (I guess like mendel's post hoc initiation of the modern synthesis), I think it is moot given the shear abundance of publications. There could be explicit contributions to the thread of research as well as explicit contributions to engineering (practitioner) without confusing both sides. (end rant, and yes of course I have been part of the problem in the past)
- Algorithm awareness. How many algorithms does the average researcher or even research group know about? One may research a field, generate an algorithm and miss the observation that the same general computation had already been investigated in one or more closely related sister fields. Further, there is a lot of algorithm racing in the field, which is a shame. The utility of such racing is further diminished if you consider the limited scope of possible comparisons in the average racing paper. The general field/s will continue to spin their wheels until they start extending and fixing the broken techniques that came before, rather than re-proposing them.
- Potential research opportunities. The low raking algorithms and fields may represent new approaches that have the potential to rise. At the very least, they may represent an opportunity for new and interesting research. A related observation is that the popularity approach used to collate the top algorithms may result in a bland and already over-represented algorithm compendium. This is a crucial observation which I will come back to.
LimitationsThe following are the areas for improvement based on observed (and obvious) limitations:
- A more extensive algorithm list. The listing is biased by my graduate experience with evolutionary computation and artificial immune systems, and my exposure to swarm algorithms. There are a lot more optimization algorithms in the published literature, and more importantly, there are a lot more algorithms that should appear on this list. I hope I can find or be made aware of them (unsubtle appeal!).
- An algorithm naming or search strategy that avoids ambiguity. There are algorithm names that are a subset of many other algorithm names, and thus receive their Google Juice. Additionally, there are likely algorithms in the list that are not using their most common name, and algorithms in the list that are more commonly known by their acronym. A consistent algorithm naming convention is the cornerstone of the adopted method. The list requires careful cleaning and the searches require careful exclusion, both of which require a reliable method in their own right.
- Algorithm grouping could (should) be data driven. A set of 'fields of study' of varying generality could be assembled and search in conjunction with each algorithm name to locate each has the higher level of association. Such fields could then be distilled into appropriate groups.
- More measures. The domains and their measures were selected for convenience. Additional search domains may be used to add fidelity to the calculated ranking. Additional algorithm measures may be used, such as citation rank of seminal papers, articles, and books.
- A relative algorithm rank metric that exploits the importance of the search domain. An algorithm references on a webpage means less than a mention in a research paper, which again is less valuable than a mention in a book. Google's index is not complete, and it is quite possible that the book and scholar indexes are not direct subsets of the web search. The metric must be updated such that the known relationship between the measures is accounted for in an algorithms rank.
- An absolute algorithm rank metric from the collected absolute measures. The measures collected are absolute, and although relative inferences can be drawn a metric that exploited the absolute nature of the collected measures may be more robust and additionally amenable broader comparison.
- Statistical significance in tally measures and resultant ranking metric. Algorithms are ordered by their relative rank, although there is no account of the statistical significance of measures or the metric, only their ordinal relationship. For example, algorithms in the current listing may be assigned different ranking metrics, the difference between which is not a significant basis for discrimination.
OutcomesIt took this micro-study for me to realize what may be apparent: the role of a book such as the one I want to write should be to educate and enlighten. The expertise I bring to the editing of the book - such as the choice and presentation of algorithms and sub-fields - is the key and true contribution of the text, more than the distillation of the research on each specific subject.
I will continue to curate the list, I think it is still interesting. At the very least, it will provoke reasonable attention to algorithms and sub-fields that I had thought less relevant and depressed others that I thought were more extensive than they possibly are.
Algorithm popularity alone will not define the contents of the book. Reflecting, I can remember books like
New Ideas and Optimisation and first reading about Differential Evolution. I remember colleagues discovering algorithms in a similar manner that eventually took on large roles in their own research agendas. The skill in preparing this book will come in finding those nuggets, those under represented techniques that are interesting and could propagate.
Using a blog post as a publication medium for ad hoc experiment and rant is liberating! Thanks to those kind enough to send in your top 10 lists, and thanks to
Dan Angus for giving my algorithm list a nice boost.