George Rebane
“GIVEN the choice between a flesh-and-blood doctor and an artificial intelligence system for diagnosing diseases, Pedro Domingos is willing to stake his life on AI. ‘I'd trust the machine more than I'd trust the doctor,’ says Domingos, a computer scientist at the University of Washington, Seattle.” So starts Anil Ananthaswamy’s ‘I, algorithm: A new dawn for artificial intelligence’ in the latest issue of New Scientist.
Probabilistic programming is being touted as the “new dawn” of artificial intelligence. I would reserve that heroic vision for something more significant. Probabilistic programming (PP) is the joining of Bayesian networks and various forms of logic or rule-based programming into a unified software environment (algorithm) that can better model real-world processes with all of their intrinsic uncertainties. The big question since the late eighties has been how machines can learn and reason under uncertainty, and in that way replicate how humans deal with things not known exactly – in short, everyday important stuff.
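To make the idea concrete, here is a toy sketch in plain Python of what such a model amounts to – a couple of random choices standing in for a small Bayesian network, a rule-like deterministic step layered on top, and crude inference by conditioning on observed evidence. This is not the syntax of any actual PP language, and all the names and numbers are invented for illustration.

```python
import random

# A toy "probabilistic program": a generative model written as ordinary code with
# random choices (the Bayesian-network part) plus a deterministic, rule-like step.
# Every name and number here is invented for illustration.

def model():
    disease = random.random() < 0.01                                # prior: 1% of patients are ill
    test_positive = random.random() < (0.95 if disease else 0.10)   # noisy diagnostic test
    refer_to_specialist = disease and test_positive                 # a hard rule layered on top
    return disease, test_positive, refer_to_specialist

def posterior_disease_given_positive(samples=200_000):
    """Crude inference by rejection sampling: P(disease | positive test)."""
    hits = kept = 0
    for _ in range(samples):
        disease, test_positive, _ = model()
        if test_positive:          # condition on the observed evidence
            kept += 1
            hits += disease
    return hits / kept

if __name__ == "__main__":
    print(f"P(disease | positive test) ~ {posterior_disease_given_positive():.3f}")  # roughly 0.09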
PP is just the next stage in the purposive capture and implementation of human knowledge in machines. Currently, various university computer science and engineering departments are busy coming up with the initial tranche of PP languages that will enable easier development of the next generation of smart machines (as here). For example, here and here you can find out what MIT is doing in PP. If you google ‘probabilistic programming’, you get a snootful of hits; but please notice that PP has not yet made its way into Wikipedia. In today’s world that may be an operational definition of ‘new’.
For those who follow these Singularity precursors (see RR category on right), we should note that the algorithmic approach of PP to machine learning is distinct from the ‘unscented’ or non-algorithmic one. In the algorithmic approach, the domain expert or programmer has to explicitly represent the modeled process as it is analyzed into its components, their linkages, and the involved stochastics (probability distributions and their inter-operations). Of course, another domain expert may come up with a slightly different model, and so on. In this sense, each modeler brings and leaves his own ‘scent’ on how the process is modeled.
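To see that ‘scent’ in code, picture a hypothetical pair of modelers hand-building models of the same diagnostic process. The structures and numbers below are invented; the point is only that the explicit, algorithmic route bakes each modeler’s choices – which nodes exist, which priors to use – into the answer the machine gives.

```python
# Two hypothetical domain experts hand-build models of the same diagnostic process.
# Both structures and all probabilities are invented purely for illustration.

# Modeler A: the symptom depends on the disease alone.
model_a = {
    "P(disease)": 0.01,
    "P(symptom|disease)": 0.90,
    "P(symptom|~disease)": 0.20,
}

# Modeler B: same process, but with an extra risk-factor node and different numbers.
model_b = {
    "P(risk)": 0.30,
    "P(disease|risk)": 0.03,
    "P(disease|~risk)": 0.005,
    "P(symptom|disease)": 0.85,
    "P(symptom|~disease)": 0.25,
}

def posterior_a(m):
    """Bayes' rule on A's two-node network: P(disease | symptom)."""
    num = m["P(disease)"] * m["P(symptom|disease)"]
    return num / (num + (1 - m["P(disease)"]) * m["P(symptom|~disease)"])

def posterior_b(m):
    """Marginalize the risk factor out of B's three-node network, then apply Bayes' rule."""
    p_d = m["P(risk)"] * m["P(disease|risk)"] + (1 - m["P(risk)"]) * m["P(disease|~risk)"]
    num = p_d * m["P(symptom|disease)"]
    return num / (num + (1 - p_d) * m["P(symptom|~disease)"])

# Same patient, same evidence, two different answers – the modelers' "scent".
print(posterior_a(model_a), posterior_b(model_b))   # ~0.043 vs. ~0.041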
In the unscented approaches to machine learning, the machine is programmed with generalized learning structures that may or may not be analogs of how (parts of) critter brains work. Hierarchical Temporal Memory (HTM; Jeff Hawkins, 2004) is one such software structure or building block that can be combined into more complex structures. The final learning HTM ‘program’ has no apprehension of how the real-world process that generated its data actually works. It is just exposed to the data and goes through an iterative sequence of absorbing data, predicting what the underlying real-world process will do next, and correcting itself, until it hopefully converges into a useful program about the process.
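A drastically simplified stand-in for this kind of learner – not HTM itself, just the bare absorb-predict-correct loop it shares with such systems – might look like the sketch below. The ‘hidden’ generating process and every parameter are invented for illustration.

```python
import random
from collections import defaultdict

# A much-simplified stand-in for an "unscented" learner: NOT HTM, just the bare
# absorb-predict-correct loop. It knows nothing about the process below; it only
# sees the symbols the process emits.

def hidden_process():
    """The real-world process the learner never inspects: a fixed cycle plus a little noise."""
    pattern, i = "ABCD", 0
    while True:
        yield pattern[i % 4] if random.random() > 0.05 else random.choice(pattern)
        i += 1

counts = defaultdict(lambda: defaultdict(int))   # counts[prev][nxt] = times nxt followed prev
stream = hidden_process()
prev = next(stream)
correct = total = 0

for _ in range(20_000):
    seen = counts[prev]
    prediction = max(seen, key=seen.get) if seen else "?"   # predict the most common successor
    actual = next(stream)                                   # absorb the next observation
    counts[prev][actual] += 1                               # correct the internal counts
    correct += (prediction == actual)
    total += 1
    prev = actual

# Accuracy climbs well above the 0.25 chance level as the loop converges on the cycle.
print(f"prediction accuracy after {total} steps: {correct / total:.2f}")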
My own opinion is that the Singularity will occur (most likely spontaneously) through the unscented approach, combining whatever processing complexity is available at that future time. And who knows, that unscented amalgam may even contain some sweet whiffs of PP.
In any event, the addition of PP to the expanding AI toolbox just increases the rate at which machines will take over and outperform humans in an ever-expanding array of jobs. “Would you also like fries with that?”