Sharing cybersecurity data on threats, probes, breaches, and attackers between companies and government agencies is a great idea. However, although shared data will strengthen cybersecurity defenses, the Computer and Communications Industry Association (CCIA), backed by Amazon, Facebook, and Google, among others, and the Business Software Alliance (BSA), backed by Apple, Microsoft, and Oracle, both oppose the bill that would make such sharing legal.
Smart companies are already doing something similar. At the RSA Security Conference three years ago (a century ago in cybersecurity time), Zions Bancorporation's data scientists explained how the bank went from reacting to law enforcement warnings of cyberthreats to proactively and frequently reporting threats to law enforcement, which subsequently relayed official warnings to other organizations.
Google collects all cybersecurity data throughout its networks and proactively predicts threats. Consequently, Google is rarely the successful target of denial-of-service (DoS) attacks; Gmail is nearly free of spam and malware; and the number of potentially harmful apps (PHAs) for Android smartphones has fallen to a tiny fraction of what it was five years ago.
Content delivery network (CDN) and cloud services firm Akamai holds the catbird seat for pinpointing cyberthreats. With between 15% and 30% of the internet's traffic crossing its network at any given time and 85% of all internet users within one hop of its network, Akamai captures a large volume of forensic and active cyberattack data for analysis.
Akamai, Google, and Zions Bancorporation take all this data from servers, routers, access points, and endpoints of every kind, from ATMs to smartphones, and store it in huge data warehouses called Hadoop clusters. Hadoop is an open source framework for storing and analyzing structured and unstructured information; data scientists use it to make predictions based on complex analysis. Just a few examples: Zions Bancorporation can predict when a hacker in China is trying to breach its security; Google predicts which Android apps are suspicious and should be manually examined even after they have passed automated testing; and Akamai, as a paid service, warns customers of DoS attacks.
If one company's collection of cybersecurity data can predict attacks and prevent breaches, more data shared among multiple enterprises should make everyone safer. Why not start sharing right away?
Enterprises should start sharing as soon as possible. Even without sharing, they already see positive results from this type of cybersecurity defense, and some are already sharing data informally to improve those results. Enterprises need this bill to make sharing cyber data definitively legal and to prevent conflicts with other data privacy laws.
But there is a very important reason not to support this bill: it puts the NSA and law enforcement agencies back in the surveillance business.
Hadoop was created in Silicon Valley as a joint project to which internet companies like Google, Yahoo, and Facebook contributed software development. The potential applications for Hadoop are unlimited. But, to no one's surprise, Hadoop is most widely applied to predicting which ads placed on which web pages will produce the highest yield and which products e-commerce companies should sell.
Many CCIA and BSA members operate Hadoop clusters to predict important aspects of their businesses. Both associations oppose this legislation because their members understand firsthand how this treasure trove of cybersecurity data could be used for surveillance if the CIA and NSA have unfettered access to it once it is in the custody of the Department of Homeland Security.
The Snowden revelations showed that the NSA has also used Hadoop to analyze and predict potential terrorist activities from telephone metadata and other data sourced by the agency. Many believe this to be a violation of the Fourth Amendment, which, until the Patriot Act, required a judge to issue a narrowly defined warrant before law enforcement could search the physical or digital world. The controversy stems from the varying accuracy of data scientists' predictions. An inaccurate prediction about a digital advertisement or product inventory doesn't violate the Constitution. However, using data like this to investigate crimes could very easily violate the Fourth Amendment. A person who is investigated on an erroneous suspicion of terrorism, for example, could end up being charged with an unrelated crime uncovered during the investigation.
These huge clusters of data can't be sanitized and anonymized to remove personal data. Consequently, the cyber data gives the NSA a back door to reinstitute surveillance.
Enterprises understand the benefit of sharing cybersecurity data, and those that aren't sharing it now would like to start. But the use of this data by all parties needs to be restricted to the detection, prevention, and prosecution of cybercrime.