
Saturday, September 27, 2014

Hey Kids! Let's Build "The Machine"!

(Updated October 4, 2014) If you follow me on Twitter, you probably know that I'm a rabid fan of CBS's "Person of Interest", a wonderfully-written drama about the dystopia of permanent, continuous surveillance disguised as a "crime of the week" thriller. It has several key human characters, but the character that drives the show isn't human at all--it's a computer, or more correctly, a whole bunch of computers, called "The Machine." The fundamental purpose of The Machine is to identify threats to U.S. national security wherever they may be, so that they can be "neutralized" (and we all know what that means). Last season, The Machine was displaced, but not eliminated, by another system based on Artificial Intelligence software. This new system, called Samaritan, uses quantum processors that Samaritan's builders, Decima Technologies, stole from the NSA.

This season, POI is focusing on the impact and ethics of AI. Showrunners Jonathan Nolan and Greg Plageman see AI as potentially having the same magnitude of effect on the world as the atomic bomb, and call AI's development our era's Manhattan Project. I'm not sure that a system like The Machine needs Manhattan Project-scale advances in the state of the art of AI in order to fulfill its primary objective.

Here are the key functions that both The Machine and Samaritan perform:
  • Signals Intelligence: Both systems are connected to all of the same sources and feeds as the NSA, CIA, FBI and presumably the National Reconnaissance Office (NRO), Defense Intelligence Agency (DIA) and other U.S. and Five Eyes (U.S. plus Canada, U.K., Australia and New Zealand) country sources. That means that they can get virtually every phone call, email, text, tweet, webpage and app data transfer anywhere in the world. They can geolocate any mobile phone call, and turn on the microphones of certain mobile phones for surreptitious listening, even if the phone itself is turned off.
  • Image Recognition: Both systems can access the images from security cameras around the world, and use those images to recognize people's faces. They can also, presumably, categorize the actions in the images and analyze them to determine whether or not they represent threatening behavior.
  • Database: They have massive databases of all the data they've collected, as well as a lot of historical data.
  • Pattern Recognition and Classification: The systems have to be trained, or train themselves, on previous patterns of activity that indicate a threat. So, for example, they would be given every piece of information related to the 9/11 attack: Who the terrorists were, where they came from, where they traveled to, who they met, who they talked to, where they lived, how they trained, etc. Those data would then be analyzed to build a pattern that indicates a terrorist event being planned. That pattern would be modified with new information continuously. Other patterns, based on subsequent terrorist attacks and changes in terrorist behavior, would be identified. Then, as the systems see current activity, they'll try to match it with previous patterns and calculate the probability that what's going on is actually leading up to an attack. Both systems probably also have the ability to learn from previous attacks and make inferences about the activity even if no previous attack is well-matched. If a probable attack is identified, the systems alert human analysts and provide their analysis and underlying data. 
  • Voice Response and Recognition: Both systems have voice response interfaces and accurate voice recognition.
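The matching step described above--compare current activity to stored attack patterns and compute a probability--can be sketched in a few lines. Everything below (feature names, pattern contents, the 0.6 alert threshold) is invented for illustration; a real system would use far richer features and learned weights:

```python
# Hypothetical sketch of pattern matching: score observed activity against
# stored threat patterns. All feature names and thresholds are invented.

def similarity(pattern, activity):
    """Fraction of a pattern's indicator features present in observed activity."""
    matched = sum(1 for feature in pattern["features"] if feature in activity)
    return matched / len(pattern["features"])

def assess(patterns, activity, threshold=0.6):
    """Return (best-matching pattern name, score) if any pattern crosses the alert threshold."""
    best = max(patterns, key=lambda p: similarity(p, activity))
    score = similarity(best, activity)
    return (best["name"], score) if score >= threshold else None

patterns = [
    {"name": "attack-planning", "features": {"travel", "training", "encrypted-comms", "financing"}},
    {"name": "fraud-ring", "features": {"shell-companies", "wire-transfers"}},
]
observed = {"travel", "financing", "encrypted-comms", "burner-phones"}
print(assess(patterns, observed))  # → ('attack-planning', 0.75)
```

The point of the sketch is the shape of the problem, not the math: the hard parts in practice are extracting reliable features from raw signals and updating the patterns as behavior changes.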
In addition to these functions, both The Machine and Samaritan are what's called in the AI community "Hard AI." Hard AI has the ability to reason and independently solve problems without human intervention. Beyond that, Hard AI is self-aware--conscious, although its form of consciousness may not look or act like human consciousness.

That's a lot for any system to do, so where are we now, at least in developments that the public knows about?
  • Signals Intelligence: All the databases and sources that I listed above exist. The problem comes in access and coordination. The Five Eyes countries have extensive data sharing systems, but not all of their data are shared with all of the other partners, not all of the data are in online databases, and we can't assume that U.S. intelligence agencies have 100% of the functionality of other Five Eyes intelligence services available to them. For example, a Five Eyes member may have the ability to get geolocation information, caller identity and even the content of a phone call, but it may not have the legal authority to provide all of that information to the U.S. in real time. In addition, non-Five Eyes countries such as Russia and China may have the ability to limit, disrupt or completely block U.S. and Five Eyes access to their signals. That would keep a system like The Machine from getting every signal, everywhere, in real time.
  • Image Recognition: This is the biggest problem for building a Machine-like system today, not because the quality of image (face) recognition is unacceptable (it's getting better every day), but because of the scarcity of networked security cameras. In New York, where POI is shot, you might think that the last sentence is dead wrong, because there are security cameras everywhere. New York police had real-time access to 6,000 public and private security cameras and 220 license plate cameras last year, and the ACLU reports that Chicago police had access to 22,000 security cameras last year, but it's not clear how many of them offered real-time access.

    When you get beyond those two cities and a handful of others, including London, the number of cameras per 1,000 residents goes down significantly. However, even in the high-camera cities, there are many cameras on private property and in buildings that are not accessible from the Internet, either because they're on a private network or they're not networked at all. Getting images from these cameras requires physical access to the cameras or video recorders. Sometimes, the local police department or FBI has to take the entire video recorder to its facilities in order to copy the video. So, the real-time acquisition of video from every security camera everywhere simply isn't possible today.
  • Database: The initial capacity of the NSA's Bluffdale, UT Data Center has been estimated to be between 3 and 12 exabytes (3 to 12 million terabytes), and that's just one site. Storage developers are starting to think in terms of zettabytes (1,000 exabytes) and even yottabytes (1 million exabytes).
  • Pattern Recognition and Classification: The world leader in this technology so far (at least the one that we know about publicly) is Palantir, which was a spin-off from PayPal's fraud detection team. The company has two publicly-disclosed products: Gotham, a system for managing and analyzing complex datasets containing both structured and unstructured data that can be both quantitative and qualitative, and Metropolis, for model-based analysis of structured, quantitative data. Both systems require analyst and Palantir engineer inputs: Gotham requires Palantir engineers to build the model that maps all the data together, and analysts to query the data and develop their own hypotheses and conclusions, while Metropolis requires analysts to create and modify models.

    The functionality of both The Machine and Samaritan is a combination of Gotham and Metropolis--they can build models that incorporate all types of data, not just quantitative. In addition, they have the ability to build and modify their own models. It's likely that The Machine's creator, Harold Finch (Michael Emerson), initially trained it before it took over the task of model building and analysis.
  • Voice Response and Recognition: Both voice response and voice recognition are mature but still improving technologies. As an example, the voice recognition in Apple's Siri hasn't been as good as that in Google Now, in large part because Siri does its recognition on the mobile device, while Google does it on its own computers with dramatically more horsepower. 
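The storage figures in the Database item above are easy to sanity-check with decimal (SI) prefixes, where each step up is a factor of 1,000:

```python
# Decimal (SI) storage units: each prefix is a factor of 1,000.
TB = 10**12   # terabyte
EB = 10**18   # exabyte   = 1,000,000 TB
ZB = 10**21   # zettabyte = 1,000 EB
YB = 10**24   # yottabyte = 1,000,000 EB

# Bluffdale's estimated initial capacity, per the figures above:
low, high = 3 * EB, 12 * EB
print(low // TB, high // TB)   # → 3000000 12000000 (terabytes)
print(ZB // EB, YB // EB)      # → 1000 1000000 (exabytes)
```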
So, where does that leave us in building an all-knowing, all-seeing AI security system? We've still got a long way to go, but all the pieces are there. Voice and image recognition use both AI and non-AI technologies. There are some image processing systems that can analyze video to identify anomalies and threats. To my knowledge, there are no pattern recognition and classification systems that work on multiple types of data and that build and test models without human intervention, but research on neural network training and optimization methods (backpropagation, for example) is making big strides, fast enough that we may be five years away from a commercially-viable Machine. All of this is with processors using a conventional von Neumann architecture; such a system doesn't need quantum processors, although they could eventually dramatically speed up the parts of the problem best suited to superposition and entanglement.
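Backpropagation itself is an old and simple idea: push the prediction error backward through the network, layer by layer, and nudge each weight downhill. A minimal demo--a tiny network learning XOR by gradient descent, with sizes, seed, and learning rate chosen arbitrarily for illustration:

```python
import numpy as np

# Minimal backpropagation demo: a 2-4-1 network learning XOR by gradient descent.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error gradient layer by layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))  # predictions should approach [0, 1, 1, 0]
```

The strides the research community is making are in scaling this basic recipe to networks with millions of weights and in training them on unstructured data, not in the core algorithm.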

Also, let me be clear: The resulting system won't be conscious. We don't even have a consensus scientific definition of consciousness yet. This system would be what's called Soft AI: It can solve a specific problem that it's been programmed to handle, but it's not self-aware. It may be able to analyze data and make decisions about a class of problems, and it may be able to hold a conversation, but beyond that, it will only do other things if it's programmed to do so by its developers. I'd hope that its developers will have seen "War Games" or "Colossus: The Forbin Project" and won't give it the ability to launch ICBMs all by itself.

It doesn't hearten me that the biggest obstacles to building The Machine or Samaritan are time, politics and bureaucracy, not fundamental science, but I can only hope that the benefits to medicine, science, education and engineering outweigh the risks to civil liberties.

Tuesday, January 04, 2011

Google and the limits of tweaking

Late last year, several observers wrote about what they believed was a deterioration in the quality of Google's search results.
Content providers have been trying to "game" Google's results ever since Google became a serious search engine player, but Google has always been able to adapt its algorithms to keep the best results coming up at the top. Now, however, it looks like the gamers are winning, and that's opening the door for other search engines.

This may not be a perfect, or even a relevant, analogy, but it may help explain what Google is facing. Twenty-five years ago, Kurzweil was the only company that could read and convert virtually any typeface to ASCII (OCR, or optical character recognition). They did it by having the machine operator scan in examples of the material to be converted, and then individually identify each character ("this is an 'L'...this is an 'I'...this is a lower-case 'i'") until the reader could understand the sample set. Then, the operator could scan in the complete set of documents, and the Kurzweil device would read and convert them. However, there were always characters that it still couldn't read, and the operator would have to stop and correct the mistakes. These corrections would further train the system.

The Kurzweil system could only recognize a limited number of typefaces at a time, because it would get confused. Over time, more training and corrections actually led to lower accuracy, as the system could no longer distinguish between similar characters such as "e", "o" and "q", "E" and "F", "D" and "O", or "I", "i", "L", "l" and "1". Early systems relied on character shapes alone and didn't use dictionaries or context checks. As a result, at some point the operator had to discard the training set and train the device all over again.
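A toy version of that template-matching approach shows why similar glyphs collide: recognition is just a nearest-neighbor lookup over stored bitmaps, so once the training set holds two characters whose shapes nearly coincide, the distances stop discriminating. The 5x3 bitmaps below are invented for illustration:

```python
# Toy template-matching OCR in the Kurzweil style: the "training set" is a
# dictionary of glyph bitmaps, and recognition is nearest-neighbor by
# Hamming distance. The 5x3 bitmaps are invented for illustration.

TEMPLATES = {
    "L": ["100", "100", "100", "100", "111"],
    "I": ["111", "010", "010", "010", "111"],
    "l": ["010", "010", "010", "010", "010"],
    "1": ["010", "110", "010", "010", "111"],
}

def hamming(a, b):
    """Count the pixels where two glyph bitmaps differ."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))

def recognize(glyph):
    """Return the trained character whose template is closest to the scanned glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(TEMPLATES[ch], glyph))

# A slightly degraded scan of "L" still matches:
print(recognize(["100", "100", "100", "110", "111"]))  # → L
# But a serif-footed "l" lands closer to the "1" template than to "l":
print(recognize(["010", "010", "010", "010", "111"]))  # → 1
```

Adding more typefaces to TEMPLATES only makes the collisions worse, which is exactly the degradation the Kurzweil operators saw.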

True algorithmic recognition systems from Palantir/Calera eventually solved the problem and were able to read the vast majority of typefaces without any training. Eventually, through acquisitions and mergers, the technologies of Kurzweil and Palantir/Calera fell under one roof at ScanSoft, and are currently sold as OmniPage 17 by Nuance.

My point is that the training technology of Kurzweil eventually reached its limit. Even after adding the best fixes the company could think of, its technology was eventually supplanted by algorithmically-based shape recognition, augmented with dictionaries and context analysis. Google could now face the same challenge. Having tweaked and augmented its search algorithms for years, it may no longer be able to keep up with attempts to game its system. In order to truly fix the problem, Google may have to either switch to a fundamentally different search and filtering technology, or bolt on a radically different approach, such as social searching.

As the Kurzweil case suggests, technologies have limits, and once those limits are reached, it may take radical, not just incremental, changes to the technologies in order to either get further improvements or to avoid going backward.
