NSA Eavesdropping


“The thought police would get him just the same. He had committed - would have committed, even if he had never set pen to paper - the essential crime that contained all others in itself. Thoughtcrime, they called it. Thoughtcrime was not a thing that could be concealed forever. You might dodge successfully for a while, even for years, but sooner or later they were bound to get you.”
- George Orwell, 1984

Orwell, were he still alive, would have been rather fascinated by the technological advances that now allow governments to spy on their citizens. He would probably also be amazed to realise that it no longer requires a physical person to connect the dots to suspect someone of being a possible terrorist, or to monitor thought crime. The Great Firewall of China is an excellent example of how an increasingly technological age allows for greater methods of control and surveillance.

Another great example is the “revelation” from USA Today that the NSA is compiling a massive database of Americans’ phone calls, “secretly collecting the phone call records of tens of millions of Americans, using data provided by AT&T, Verizon and BellSouth”. I say “revelation” in inverted commas because it’s not as if we should really be surprised. We have known for some time that the Total Information Awareness (TIA) scheme, dreamed up shortly after 9/11, never actually died when it was shut down in 2003. According to the Christian Science Monitor it was merely transformed and funded under a different name. Where did it go exactly? To the Advanced Research and Development Activity (ARDA) at NSA headquarters:

The National Journal reports that the Pentagon transferred two of the most important components of TIA to Advanced Research and Development Activity (ARDA), located at NSA headquarters in Fort Meade, Md. One piece was the Information Awareness Prototype System. It helped extract, analyze and disseminate data collected under the project. Once the Senate cut off funding, ARDA stepped forward to fund the program and it was given a new name, “Basketball.” All references to TIA were dropped.

The other key component of the original plan was known as Genoa II, “which focused on building information technologies to help analysts and policy makers anticipate and pre-empt terrorist attacks.” It was renamed “Topsail.” While Topsail was active as late as October of 2005, intelligence sources indicate that its funding, also from ARDA, may be in question.

We would have known in 2004 that Bush had authorized the NSA to “eavesdrop on Americans and others inside the United States to search for evidence of terrorist activity without the court-approved warrants ordinarily required for domestic spying”. Unfortunately, the New York Times had “delayed publication for a year to conduct additional reporting” after “meeting with senior administration officials to hear their concerns” that the article “could jeopardize continuing investigations and alert would-be terrorists that they might be under scrutiny”.

Also, in February this year, The New Standard reported that Verizon was being sued in a class-action suit by lawyer Michael Pascazi, who “alleged that Verizon provided the spy agency with communications records of customers and non-customers alike, violating consumer trust and numerous laws”. Then, of course, we also already know that AT&T are forwarding all internet traffic to the NSA.

It’s not as if people didn’t already suspect this was happening.

I am also baffled when I read things like “This program does not involve the NSA listening to or recording conversations” (as the USA Today article states). Is that supposed to make people feel better? This doesn’t mean anything, because the actual content of the data itself isn’t always what’s important. It’s the relationships of the callers and users that are often more interesting. From GovExec.com, analysing Attorney General Gonzales’ statements regarding NSA spying:

With exacting language, [Gonzales] narrowed the scope of his comments to address only “questions relating to the specific NSA activities that have been publicly confirmed by the president.” Then, as if to avoid any confusion, Gonzales added, “Those activities involve the interception by the NSA of the contents of communications” involving suspected terrorists and people in the United States.

Slightly, and with a single word, Gonzales was tipping his hand. The content of electronic communications is usually considered to be the spoken words of a phone call or the written words in an electronic message. The term does not include the wealth of so-called transactional data that accompany every communication: a phone number, and what calls were placed to and from that number; the time a call was placed; whether the call was answered and how long it lasted, down to the second; the time and date that an e-mail message was sent, as well as its unique address and routing path, which reveals the location of the computer that sent it and, presumably, the author.

Considering that terrorists often talk and write in code, the transactional data of a communication, properly exploited, could yield more valuable intelligence than the content itself.

“You will get a very full picture of a person’s associations and their patterns of activity,” said Jim Dempsey, the policy director of the Center for Democracy and Technology, an electronic-privacy advocacy group. “You’ll know who they’re talking to, when they’re talking, how long, how frequently…. It’s a lot [of information]. I mean, a lot.”
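To make the point about transactional data a bit more concrete, here is a minimal sketch of how little is needed to answer Dempsey’s questions (who, when, how long, how frequently). Everything in it is an assumption made for illustration: the `CallRecord` fields, the `summarize_contacts` helper and the sample numbers are invented, and nothing here reflects how any phone company or agency actually stores or processes its records.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallRecord:
    """One hypothetical call-detail record: metadata only, no audio or words."""
    caller: str
    callee: str
    started: datetime
    seconds: int  # 0 means the call was not answered


def summarize_contacts(records):
    """Group records by (caller, callee) and total up call count and talk time."""
    summary = defaultdict(lambda: {"calls": 0, "seconds": 0, "last": None})
    for r in records:
        entry = summary[(r.caller, r.callee)]
        entry["calls"] += 1
        entry["seconds"] += r.seconds
        if entry["last"] is None or r.started > entry["last"]:
            entry["last"] = r.started
    return dict(summary)


# Invented sample data; the numbers are placeholders, not real records.
records = [
    CallRecord("555-0101", "555-0199", datetime(2006, 5, 1, 9, 15), 340),
    CallRecord("555-0101", "555-0199", datetime(2006, 5, 3, 22, 5), 1260),
    CallRecord("555-0101", "555-0142", datetime(2006, 5, 4, 12, 30), 0),
]

for (caller, callee), info in summarize_contacts(records).items():
    print(caller, "->", callee, info["calls"], "calls,", info["seconds"], "seconds")
```

Not a single spoken word appears anywhere in that data, yet the summary already says who talks to whom, how often, and for how long.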

What exactly does “listening to” or “recording conversations” mean in this day and age of telecommunications, anyway? It means very little, really, if we consider the technologies that are available today for monitoring traffic in real time. They don’t really involve someone “listening in”; at least, not initially. It’s questionable whether you need to record anything, either. Images of some super-spy sitting in a dark room smoking a cigarette, listening on his headphones to someone talking on a tapped phone while a tape recorder spins slowly on the table, belong to an ancient age of cheap spy thrillers.

Consider the lawsuit filed by the EFF against AT&T. The whistleblower, Mark Klein, stated that “the NSA is capable of conducting what amounts to vacuum-cleaner surveillance of all the data crossing the internet”. We know, thanks to DailyKos, that “one NarusInsight machine” - the hardware being used by the NSA and AT&T - “can look at about 39,000 DSL lines at once”. It can do this while conducting what are known as “semantic” searches. In other words, it doesn’t just look for keywords in data; it looks at the meaning behind the words. All done, apparently, in real time.

The beauty of it all is that you just let machines do the work, creating models of relationships that allow the NSA to isolate possible “terrorists” and then place them under investigation. Unfortunately for Americans, it’s not always just “terrorists” who fall under suspicion: police forces monitoring activists, FBI spying on activists … you know, serious threats like that. If history is any guide, the same patterns are being repeated as when the FBI ran its COINTELPRO operations. No surprise to hear from the NY Times that the NSA has been sending a “flood” of “telephone numbers, e-mail addresses and names to the F.B.I. in search of terrorists”, most of which lead to dead ends or innocent Americans (except for activists, apparently).

What if you could mine voice data with some sort of analysis tool that uses semantic or topic searching, so that when it hits a recognized pattern, the person’s phone number and the numbers they called get flagged, and somebody can then go off and get an official warrant if they want? Is this possible? A good place to start to answer that would be to have a look at Echelon, the super world-wide surveillance tool. Although there have been claims in the past that Echelon relies upon keyword searching, or “word spotting”, this appears to be largely ineffective. Instead, it is better to rely on what’s called topic analysis and traffic analysis. From investigative journalist Duncan Campbell’s report:

Traffic analysis is a method of obtaining intelligence from signal related information, such as the number dialled on a telephone call, or the Calling Line Identification Data (CLID) that identifies the person making the call. By analysing calling patterns, networks of personal associations may be analysed and studied. This is a principal method of examining voice communications. Traffic analysis is particularly effective in studying military communications, where the timing and pattern of message exchanges may allow analysts to deduce the hierarchy and command structures of their targets.

Powerful though Dictionary methods and keyword search engines may be, however, they and their giant associated intelligence databases may eventually be replaced by “topic analysis”, a more powerful and intuitive technique, and one that NSA is developing strongly. […] Among the new techniques that NSA researchers have reported to [Text REtrieval Conferences] are “n-gram analysis” and “Semantic Forests”. Both are forms of topic analysis. Topic analysis searches databases to answer questions formulated as “find me messages about a subject”. Instead of listing keywords, the search system may be referred to a collection of other messages or reports that define the subject of interest.
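For the curious, the “find me messages about a subject” idea can be sketched in a few lines. This is only a textbook-style illustration of query-by-example using character n-grams and cosine similarity; it is my own assumption about what such a search could look like in its simplest form, not a description of the NSA’s n-gram analysis or Semantic Forests. The function names (`ngram_profile`, `cosine`, `rank_by_topic`) and the sample texts are made up.

```python
import math
from collections import Counter


def ngram_profile(text, n=3):
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def rank_by_topic(examples, messages, n=3):
    """Score each message against the combined profile of the example texts."""
    topic = Counter()
    for example in examples:
        topic.update(ngram_profile(example, n))
    scored = [(cosine(topic, ngram_profile(m, n)), m) for m in messages]
    return sorted(scored, reverse=True)


# Invented sample texts: the examples define the subject, no keyword list is given.
examples = [
    "The ferry to the island leaves the harbour at noon",
    "Ferry tickets for the island crossing are sold at the harbour",
]
messages = [
    "When does the island ferry leave the harbour?",
    "My cat refuses to eat dry food",
]

for score, message in rank_by_topic(examples, messages):
    print(round(score, 3), message)
```

The subject is defined by example documents rather than by keywords, which is the distinction Campbell draws between topic analysis and Dictionary-style searching.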

Campbell was writing around 1999/2000. Maybe they’ve had a bit more luck with their word spotting? Massive amounts of funding have been pumped into Homeland Security since 9/11, so it’s not difficult to imagine that these tools have greatly expanded since then. Tom Keating’s VOIP blog points out that a company called Nexidia has a product that uses phonetic analysis to be “able to search audio at 100,000x faster than real-time playback”. It “is more scalable and more accurate than ever before - whether it’s searching broadcast-quality audio or a cell phone transmission. It’s also browser-based with expanded analytical tools to help slice and dice search results.” They point out that “most audio search technologies operate by first converting spoken words to text (similar to closed captioned programming) and then searching the transcripted text for the desired information”. This type of transcribed-text system is what, according to Duncan Campbell’s Echelon report, the NSA was patenting as far back as 1997; he concluded in his report that “the state of the art in automatic speech processing is still under development, but that speech recognizer transcription systems are available”. It’d be interesting to know what technology the NSA is using in this field today, almost ten years later, if Nexidia is using phonetics.

The ACLU points out:

Data mining is a broad dragnet. Instead of targeting you because you once received a telephone call from a person who received a telephone call from a person who is a suspected terrorist, you might be targeted because the NSA’s computers have analyzed your communications and have determined that they contain certain words or word combinations, addressing information, or other factors with a frequency that deviates from the average, and which they have decided might be an indication of suspiciousness. The NSA has no prior reason to suspect you, and you are in no way tied to any other suspicious individuals - you have just been plucked out of the crowd by a computer algorithm’s analysis of your behavior.
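What “a frequency that deviates from the average” can mean in practice is easy to sketch. The snippet below is a deliberately naive illustration, assuming a simple z-score test over per-account counts; the `flag_outliers` function, the threshold of 2.0 and the account names are all invented, and I have no idea what statistics the NSA actually runs.

```python
from statistics import mean, pstdev


def flag_outliers(counts, threshold=2.0):
    """Return accounts whose count lies more than `threshold` standard deviations above the mean."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # everyone behaves identically, so nobody stands out
    return sorted(name for name, v in counts.items() if (v - mu) / sigma > threshold)


# Invented data: how often some watched-for pattern shows up per account.
counts = {"a": 3, "b": 4, "c": 2, "d": 5, "e": 3, "f": 4, "g": 3, "h": 40}

print(flag_outliers(counts))
```

Account "h" gets flagged purely because its number is unusual. Nothing in the calculation knows or cares why, which is exactly the ACLU’s point.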

Probably something to think about.
