Wolfram|Alpha Demoed at Harvard: Limits on Human Understanding?

Yesterday Stephen Wolfram gave the first demo of Wolfram|Alpha, which is due to launch in May and which he modestly describes as a system to make our stock of human knowledge computable. It includes not just facts but also our algorithmic knowledge. He says, “Given all the methods, models, and equations that have been created from science and analysis, take all that stuff and package it so that we can walk up to a website and ask it a question and have it generate the knowledge that we want. … like interacting with an expert.”

It’s ambitious, but so are Wolfram’s previous projects: Mathematica and MathWorld. I remember relying on MathWorld as a grad student. It was excellent, so I also remember how it suddenly disappeared when the content was to be published as a book. In 2002 he published A New Kind of Science, arguing that all processes, including thought, can be viewed as computations and that a simple set of rules can describe a complex system. This thinking is clearly evident in Wolfram|Alpha, and here are some key examples.

The Approach to Natural Language Processing

Since the website is intended to take natural language human queries (well, natural “query” language is best: the simple, often prepositionless, shorthand we all eventually start typing into search engines), some kind of NLP is essential to the project. Wolfram described how his approach differs from traditional NLP, which he describes as learning from lots of perfect text. Wolfram believes language can be reduced to a set of symbolic representations of objects useful in computation. His approach to the problem, on which he claims to have made groundbreaking headway, is to map our utterances to these few symbolic representations. I’m not sure this idea is so new. In a recent defense of data-intensive computation for NLP, entitled “The Unreasonable Effectiveness of Data,” Halevy, Norvig and Pereira argue that a search for parsimonious models underlying natural language is ineffective. They say, “[e]very day, new words are coined and old usages are modified. This suggests that we can’t reduce what we want to say to the free combination of a few abstract primitives.” But maybe Wolfram is really using a continually updated massive table lookup. We can’t know, because he won’t share code or methods. Which brings me to my next point.

The Epistemology of Computational Problem Solving

Wolfram|Alpha’s goal is to compute things actively rather than using the passive tools of, say, a search engine, where the solutions have to have been explicitly written down beforehand. Wolfram outlines 4 components of his project: 1) data curation, 2) algorithms, 3) linguistic analysis for understanding interactions, and 4) automating the presentation of the solution and of what people consider to be the interesting things. With regard to the data used in Wolfram|Alpha’s computation of answers, Wolfram said they go through a process of auditing, cleaning, and cross checking before being used, and decisions are made as to the reliability of the data. As anyone with experience in data analysis knows, the decisions made in this process will affect the final results, sometimes dramatically. In Wolfram|Alpha this process isn’t open for inspection. When pressed on this, Wolfram unsatisfyingly stated that they “do the best job they can” and that where there is scientific dispute (I suppose Wolfram|Alpha decides this) results will link to the data source on the web. Seeing the source is very useful, but if there is internal filtering through an ostensible black box, it’s difficult to understand what the results mean. Wolfram also said something interesting: “after [the data have] gone through these computations there is no good human way to describe what happened.” Perhaps the data curation process is also subject to the thinking embodied in A New Kind of Science, where each process is represented by computation, but computation that is impenetrable to human understanding? He gives a clearer example next.

When pressed on the openness of the code and algorithms, the best Wolfram could do was provide a link to the mathematical formula used, in cases where a closed form formula was used to answer the question. Wolfram|Alpha was very impressive at doing math: evaluating integrals and solving equations, even ones without a closed form solution. For example, Wolfram enters y^x+siny = cosx and a solution is quickly returned. But here again we are to accept that, as Wolfram says, they “try and do the best quantitative analysis they can.” He relates a story about how he used to put time and effort into cleaning up code and making it public (presumably he means his Mathematica days), but it was frustrating because they discovered no one actually read the code. So now he advocates using test cases to check the code rather than inspecting the code itself. In fact, he says machine generated proofs are full of steps that each make sense, but the whole is not cognitively accessible to a person. Having humans read this type of proof is simply less efficient than letting machines check whether it is right. He claims it would take many PhD students’ dissertations-worth of work to figure out the computations behind Wolfram|Alpha’s solution to y^x+siny = cosx.
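To make the “check the answer, not the code” idea concrete, here is a minimal Python sketch. It is an illustration only, not Wolfram|Alpha’s method: the bisection solver, the bracket [0, 1], the tolerances, and the function names are all my own assumptions. The point is that the check needs nothing from the solver except its output, which is substituted back into y^x + sin y = cos x.

```python
import math

def residual(x, y):
    """How far (x, y) is from satisfying y^x + sin(y) = cos(x)."""
    return y ** x + math.sin(y) - math.cos(x)

def solve_for_y(x, lo=0.0, hi=1.0, tol=1e-12):
    """Stand-in 'black box': bisection on an assumed bracket [lo, hi]."""
    f_lo, f_hi = residual(x, lo), residual(x, hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = residual(x, mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return (lo + hi) / 2

def check(x, y, tol=1e-9):
    """Verify a claimed solution by substitution, ignoring how it was found."""
    return abs(residual(x, y)) <= tol

y = solve_for_y(1.0)
print(check(1.0, y))         # the solver's own answer passes the check
print(check(1.0, y + 0.01))  # a perturbed answer does not
```

A check like this confirms that one returned value satisfies the equation. It says nothing about whether other solutions were missed, which is part of why the question of what a test actually establishes remains open.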

If this is really the case, and there are certainly other examples where this happens (see organic programming), then it outlines boundaries for reproducibility of results, in the sense of checking results by inspecting the steps taken. If the process used to generate those results is opaque, then, as Wolfram states, there might be other ways to check whether it appears to be functioning correctly. But the question remains of how this process contributes to our underlying stock of knowledge and understanding of the world. It reminds me of the debates in my former department about whether a computer can be used to obtain a mathematical proof. Those debates tended to center on the use of simulation to show mathematical properties for which a mathematical proof would previously have been expected, so generally speaking you could inspect and understand the code. But the reasoning is similar for inscrutable code: you aren’t contributing to the underlying body of logic that explains our world. This isn’t new. For example, evolved systems, which some AI theorists posit to explain the complexity of our brains, have been producing mysterious code for years. For an interesting example, see Danny Hillis’s evolved sort algorithm as described by Kevin Kelly in Out of Control. Hillis describes the principle in the last chapter of his book The Pattern on the Stone. See also Steve Jurvetson’s explanation, The Dichotomy of Design and Evolution.
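Sorting is one case where a test can come close to proof without anyone reading the code. For a comparator network (a fixed sequence of compare-and-swap steps), the zero-one principle says that if the network sorts every input of 0s and 1s, it sorts every input. The Python sketch below applies that check to a small textbook 4-input network. This is not Hillis’s evolved network, and the names NETWORK, run_network and sorts_everything are my own, chosen for illustration.

```python
from itertools import product

# A standard 5-comparator network for 4 inputs (assumed example, not an evolved one).
NETWORK = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]

def run_network(network, values):
    """Apply each compare-and-swap (i, j) in order; position i gets the smaller value."""
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out

def sorts_everything(network, width):
    """Zero-one principle: testing all 2**width bit patterns covers all inputs."""
    return all(
        run_network(network, bits) == sorted(bits)
        for bits in product((0, 1), repeat=width)
    )

print(sorts_everything(NETWORK, 4))       # full network passes
print(sorts_everything(NETWORK[:-1], 4))  # dropping the last comparator fails
```

The test establishes that the network is correct without explaining why that particular arrangement of comparators works, which is exactly the gap between verification and understanding described above. It is also exhaustive only because the input space is tiny; 2^n patterns quickly become infeasible as the network widens.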

For its ability to solve complex mathematical equations quickly and transparently alone, Wolfram|Alpha seems to be a stunning contribution, never mind the other features. He claims the project will provide an API and encourage others to use its output. He imagines a business model of customers paying to have their data uploaded to, and hence analyzed within, the Wolfram|Alpha system. But from what I saw, the data output requires greater verification before I’d be comfortable trusting it.

2 Responses to “Wolfram|Alpha Demoed at Harvard: Limits on Human Understanding?”


  • I can’t wait to play with it.

    Also, this stood out as a good practical observation:

    “no one actually read the code. So now he advocates using test cases to check the code rather than inspecting the code itself.”

    I wonder if this affects your thinking on publishing reproducible results? Would it be as, or even more, satisfying to be able to play with an algorithm’s input and output than to have access to the code?

  • It’s something I’m thinking about, and Wolfram has made me consider boundaries to the creation of a human-accessible model of outcomes in the world (what I’ve always thought of as the grand goal of science). His example doesn’t make an argument against reproducibility, but it suggests reproducibility in the sense of comprehensible scripts might not always be feasible. So I’m starting to ask what conditions a testbed needs to satisfy before it can be said to verify code.

    Another example that removes reproducibility from ‘fixer of all problems in science’ status (if it ever was there) comes from climate change modeling, where code is open but models are so complex, and researchers so invested in them, that well-known bugs can’t be fixed. Can’t in the political sense, not in the technical sense.

    Well, I can’t wait until you get a chance to play with Wolfram|Alpha either, Garrett. :)
