Finding Suspects Isn’t the Problem

by on December 18, 2006 · 4 comments

I’ve just finished reading Cato’s new paper on predictive data mining as an anti-terrorism strategy, which co-author Jim Harper discussed last week. It is excellent, and I encourage you to read it. I found this part particularly interesting:

The terrorists not only operated in plain sight, they were interconnected. They lived together, shared P.O. boxes and frequent flyer numbers, used the same credit card numbers to make airline travel reservations, and made reservations using common addresses and contact phone numbers. For example, al-Mihdhar and Nawaf al-Hazmi lived together in San Diego. Hamza al-Ghamdi and Mohand al-Shehri rented Box 260 at a Mail Boxes Etc. for a year in Delray Beach, Florida. Hani Hanjour and Majed Moqed rented an apartment together at 486 Union Avenue, Patterson, New Jersey. Atta stayed with Marwan al-Shehhi at the Hamlet Country Club in Delray Beach, Florida. Later, they checked into the Panther Inn in Deerfield Beach together.

When Ahmed al-Nami applied for his Florida ID card he provided the same address that was used by Nawaf al-Hazmi and Saeed al-Ghamdi. Wail al-Shehri purchased plane tickets using the same address and phone number as Waleed al-Shehri. Nawaf al-Hazmi and Salem al-Hazmi booked tickets through Travelocity.com using the same Fort Lee, New Jersey, address and the same Visa card. Abdulaziz al-Omari purchased his ticket via the American Airlines website and used Atta’s frequent flyer number and the same Visa card and address as Atta (the same address used by Marwan al-Shehhi). The phone number al-Omari used on his plane reservation was also the same as that of Atta and Wail and Waleed al-Shehri. Hani Hanjour and Majed Moqed rented room 343 at the Valencia Hotel on Route 1 in Laurel, Maryland; they were joined by al-Mihdhar, Nawaf al-Hazmi, and Salem al-Hazmi. While these are plentiful examples of the 9/11 terrorists’ interconnectedness, even more connections existed.

If data mining were useful, it would be useful in the first step of the investigation process: the part where investigators get leads for further study. Even in the most optimistic scenario, data mining can only point to people and activities that might be suspicious. It's up to human investigators to pick up those leads and follow up on them.

But the problem with the 9/11 reports was not a shortage of leads. We already knew that several of the 9/11 hijackers were affiliated with Al Qaeda, that they had traveled to Afghanistan, that they had been connected to previous terrorist attacks, etc. What was needed was more manpower focused on the leads we already had. We needed several dozen investigators to go out and start investigating the terrorists we already knew were in the country. Had they tapped their phones, subpoenaed their credit card records, and talked to the relevant intelligence experts at the CIA, FBI, State Department, and other agencies, they would have quickly discovered all the information that data mining would have uncovered, and then some.

If anything, the problem was that our intelligence and law enforcement resources were stretched too thin. One of the 9/11 hijackers was literally on an FBI agent’s “to do” list. Throwing tens of thousands of additional leads on the pile (and we’ll be lucky if we can get the number of false positives down to the tens of thousands) will just stretch those resources even thinner.
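
To make the false-positive arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is an assumption chosen for illustration, not a figure from the paper: a screened population of 300 million, 20 actual plotters, a screening system that flags every plotter, and a false positive rate of just 0.01 percent. The function name `screening_outcomes` is likewise made up for this example.

```python
def screening_outcomes(population, true_targets, false_positive_rate, detection_rate):
    """Return (false_positives, true_positives) for a hypothetical screening system.

    false_positive_rate: fraction of innocent people wrongly flagged.
    detection_rate: fraction of actual targets correctly flagged.
    """
    innocents = population - true_targets
    false_positives = innocents * false_positive_rate
    true_positives = true_targets * detection_rate
    return false_positives, true_positives


# Illustrative assumptions only; none of these figures come from the paper.
false_leads, real_leads = screening_outcomes(
    population=300_000_000,
    true_targets=20,
    false_positive_rate=0.0001,  # 0.01 percent of innocent people flagged
    detection_rate=1.0,          # assume every real plotter is flagged
)

print(f"False leads: {false_leads:,.0f}")
print(f"Real leads:  {real_leads:,.0f}")
print(f"False leads per real lead: {false_leads / real_leads:,.0f}")
```

Under these generous assumptions the system still produces roughly 30,000 false leads, or about 1,500 for every real one, which is the kind of pile that would stretch investigators even thinner.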

The paper makes other good points as well, so go check it out.

  • http://matlabdatamining.blogspot.com/ Will Dwinnell

    Much of the on-line commentary surrounding data mining’s use against terrorism has, in my opinion, taken the technically naive perspective that data mining output is “right” or “wrong”. Some have dressed up this perspective in terms of “false positives” and “false negatives”.

    In practice, most classification systems yield probabilities, not simple classifications, which can be sorted to prioritize treatment.

    In marketing, for instance, business people are not interested in classifying prospects as “purchasers” and “non-purchasers”. Seldom do real-world classification problems yield solutions of sufficient quality as to simply lump people into two groups. Instead, potential customers are ranked by the predicted probability of purchasing, allowing the finite resource of treatment (advertising, for instance) to be directed at those most likely to respond. It is common in many fields to assess such predictive models in terms of how many target class individuals (purchasers, for example) end up in the most likely 5%, 10%, etc.

    Given the scarce resource of investigation time that you identify, I suggest that data mining is one of a number of useful tools to be used to direct its application.
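
The ranking approach described in this comment can be sketched in a few lines of Python. This is a minimal illustration, assuming a model has already produced a probability score for each individual. The function name `capture_rate` and the toy scores and labels are hypothetical and are not taken from any real system.

```python
def capture_rate(scores, labels, top_fraction):
    """Fraction of all target-class individuals found in the top-scored slice.

    scores: predicted probabilities, one per individual.
    labels: 1 for target class, 0 otherwise, in the same order as scores.
    top_fraction: share of the ranked list that gets followed up (e.g. 0.1).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, int(len(ranked) * top_fraction))
    total_targets = sum(labels)
    if total_targets == 0:
        return 0.0
    captured = sum(label for _, label in ranked[:cutoff])
    return captured / total_targets


# Toy data, invented for illustration: ten individuals, three in the target class.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

top_20_percent = capture_rate(scores, labels, 0.2)
top_50_percent = capture_rate(scores, labels, 0.5)
print(f"Targets captured in top 20%: {top_20_percent:.2f}")
print(f"Targets captured in top 50%: {top_50_percent:.2f}")
```

A measure like this shows how much of the target class a fixed follow-up budget would reach. It does not say how many of the people in that top slice are innocent, which is the concern raised in the post.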

  • http://www.techliberation.com/ Tim Lee

    Will,

    I suggest you take a look at the paper, which does a pretty good job of drawing some important distinctions here:

    There are two loose categories of data analysis that are relevant to this discussion: subject based and pattern based. Subject-based data analysis seeks to trace links from known individuals or things to others. The example just cited and the opportunities to disrupt the 9/11 plot described further above would have used subject-based data analysis because each of them starts with information about specific suspects, combined with general knowledge.

    In pattern-based analysis, investigators use statistical probabilities to seek predicates in large data sets. This type of analysis seeks to find new knowledge, not from the investigative and deductive process of following specific leads, but from statistical, inductive processes. Because it is more characterized by prediction than by the traditional notion of suspicion, we refer to it as “predictive data mining.”

    If there are data analysis tools that allow investigators to rank potential suspects from a list of people or activities that are already under investigation, that could conceivably be useful. On the other hand, using large data sets to find brand new suspects is not likely to be useful, because even the best algorithm is likely to find thousands of false leads for every actual suspect it finds. And importantly, predictive data mining requires surveillance of millions of innocent Americans, whereas analysis of existing data doesn’t.
