In 2014, there were 317 million new malware variants; in 2015, that figure grew 36 percent to 431 million, according to a 2016 Symantec report. With more than 1.18 million new variants released daily, how can we honestly believe we can defend our networks and information using traditional methods?
With many organizations relying on mechanisms such as signature-based protection to defend their infrastructure, they will always be competing against an opponent that is one step ahead of them. In fact, there is example after example of organizations in every field losing in the cyber domain. An annual study released in 2016 by the Ponemon Institute on the privacy and security of healthcare data estimates that data breaches have cost the healthcare industry $6.2 billion, with the average breach costing each healthcare organization more than $2.2 million.
In the retail industry, Target was the victim of a large-scale data breach in 2013. According to a United States Senate report released in 2014, details of more than 40 million customer accounts were stolen, resulting in fraud and in financial and time losses to Target customers.
Technology has contributed to the advancement of societies and the improvement of humanitarian conditions, but it has also produced the world’s most powerful weapons and greatest catastrophes. Whether we like it or not, our future will be marked by artificial intelligence (AI), and its advancement is being pursued aggressively around the globe. Is this inevitable technology necessary for our national security? Now is the time for organizations to review their security frameworks and assess whether they need to adopt AI to further protect their data and infrastructure.
By no means am I stating that AI is not in use today to protect our information; rather, it may not be used in a manner that promotes the greatest possible benefit. It is not uncommon to see some form of analytics and AI used at the enterprise level, but this is not always the case. What is being done at the unit and end-point levels? If this technology is not consistently used at the enterprise level, imagine how much less likely it is to be found at more granular levels. Individual units, subordinate organizations and end points are entry points into the enterprise, and securing these entities is critical to protecting the enterprise as a whole. Organizations in any industry, whether Fortune 500 corporations, critical infrastructure facilities, garrison headquarters, afloat platforms or even end-user devices, can reap the benefits that AI and machine learning provide in protecting their information and networks.
Data is a priceless commodity, and the maintenance, policies and security implemented to ensure its confidentiality, integrity and availability must constantly be reviewed, because the compromise or unavailability of information can be costly. Unauthorized access to data, or its unavailability, may damage national security, impose unwanted expenses on a corporation, and leave the victims of a breach with financial and time burdens. The fundamental problem is that traditional security methods are not enough to protect organizations, especially those susceptible to zero-day exploits. It is my belief that individual entities must not rely solely on their enterprise cyber defense mechanisms but must also be armed with their own advanced defenses.
Many of us, me included, were affected by the 2015 Office of Personnel Management (OPM) breach, but what may not be widely realized is that AI was used to limit further damage to national security and to our individual interests. The OPM case is one that every information systems professional should be familiar with, not only because it serves as a lesson in the importance of protecting critical information, but because it shows how AI greatly limited the damage from the hack and how AI may be applied to protect organizations in any industry.
In what follows, I will focus on the OPM data breach to support my belief that organizations must stringently evaluate their current cyber posture and plan for a future that includes advanced tactics such as AI.
The full impact of the OPM hack will be realized only over time, but it has already been devastating from both a security and a personal perspective. The data breach has potentially grave consequences for U.S. national security because of the type of information stolen. According to retired General Michael Hayden, former head of the National Security Agency and the Central Intelligence Agency, “the theft of personnel records are considered to be a legitimate intelligence target for foreign nations.”
According to a paper by Stephanie Gootman in the Journal of Applied Security Research, U.S. intelligence personnel had to be recalled from Beijing due to safety concerns. As Gootman discusses, citing OPM, sensitive personal information such as confidential health records and the locations of foreign assets was stolen during the hack, raising serious concerns for U.S. national security and for U.S. government employees. From a personal perspective, the loss of Social Security numbers, full names and dates of birth has left those identified in the breach potential targets for fraud.
It is not widely known that AI was used to prevent further damage to and exploitation of government networks. Cylance Inc. provides artificially intelligent information security tools, which were made available to OPM in June 2014, four months after the data breach was noticed. What remains troubling is that these tools were not put to use until April 2015, after the confidentiality of the network had been compromised, according to a 2016 report by the Committee on Oversight and Government Reform.
Based on testimony and key findings presented in the committee’s report, Cylance was vital in improving OPM’s security. At the time of “The OPM Data Breach” report, Cylance offered at least two information security products. The first, CylanceV, is relatively limited compared with the CylanceProtect application: it is an end-point detection product and does not prevent intrusions. In other words, CylanceV locates potential threats so that IT professionals can remediate them. With such a system, IT professionals become aware of an issue only after a problem has occurred, which is no different from our reliance on signature- and semantic-based security systems.
CylanceProtect is a preventive tool, designed and developed to thwart malicious activity. Cylance’s machine learning algorithms ultimately led to the discovery of a cache of Roshal Archive (RAR) files performing malicious tasks. RAR files possess three qualities that lend themselves to attacks such as the OPM hack. First, they can be compressed for quicker exfiltration. Second, they can be encrypted, hiding their contents from everyday users and administrators. Third, extraction unpacks their contents into a directory, giving malicious users easy access to the compressed and encrypted files. By April 21, 2015, Cylance had discovered more than 1,100 threats, including the RAR archives and two Trojan files located on key servers, all of which compromised the network and, ultimately, our national security, according to the committee’s report.
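Although Cylance’s detection logic is proprietary, one of these qualities is simple to illustrate: a RAR archive can be recognized by its leading “magic bytes” even if an attacker renames it with an innocuous extension. The sketch below is a minimal, hypothetical example in Python; the function name and directory-scanning approach are my own, not Cylance’s.

```python
import os

# RAR archive signatures (magic bytes) for the 4.x and 5.x formats.
RAR4_MAGIC = b"Rar!\x1a\x07\x00"
RAR5_MAGIC = b"Rar!\x1a\x07\x01\x00"

def find_rar_files(root):
    """Walk a directory tree and flag files whose headers match a RAR
    signature, regardless of file extension (attackers often rename)."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    header = f.read(8)
            except OSError:
                continue
            if header.startswith(RAR4_MAGIC) or header.startswith(RAR5_MAGIC):
                hits.append(path)
    return hits
```

A signature check like this only identifies the container; it says nothing about intent, which is why content inspection and behavioral analysis still matter.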
After the dust settled from the OPM hack, the U.S. federal government launched four initiatives to improve its cyber posture and to protect employees against fraud. The initiatives include “establishing a new National Background Investigations Bureau (NBIB); implementing a 30-day ‘Cybersecurity Sprint’; implementing a 60-day ‘Clean Slate Review’; and putting together a ‘Cybersecurity Strategy Implementation Plan’ (CSIP),” according to researchers Sarah Harvey and Diana Evans, writing in the 2016 edition of the Proceedings of the National Conference on Undergraduate Research.
The NBIB is a new department within OPM tasked with performing all further background investigations while leveraging secure Department of Defense IT systems. Both the 30- and 60-day reviews examined existing procedures and policies within OPM to discover additional system vulnerabilities, Gootman wrote. Finally, the CSIP is a cybersecurity design plan to further secure OPM infrastructure. On an individual basis, U.S. government personnel have been offered credit monitoring services to help prevent identity theft.
The bottom line: the OPM hack gravely damaged the national security posture of the United States, imposed significant financial and time costs on the U.S. government in the form of costly network infrastructure upgrades and time-consuming policy modifications, and degraded the quality of life of thousands of government employees. The policy reviews and changes identified by Harvey and Evans and the technical upgrades reported by Gootman came at great cost to the United States, some of which no amount of time or money will resolve. With the assistance of AI-based systems, the exploitation of sensitive information was limited in a way that commonly used signature-based security could not have achieved. Imagine how different things might have been had AI-based security been in place from the beginning.
The ability to detect zero-day attacks and malware variants, and to protect networks from them, will only become more important, especially as adversaries pursue AI-enabled cyber attacks and autonomous weapons systems. As such, it is urgent for national security that the development of artificially intelligent cyber defense systems be pursued to the fullest extent.
The goal should be the successful implementation of systems that rely on unsupervised learning techniques, removing administrators from the equation and protecting networks in near real time.
This should be the goal, but unfortunately, the algorithms are just not there yet. Therefore, most successful AI cyber defense systems rely on some form of feedback from the administrator to succeed. Researchers from the Massachusetts Institute of Technology (MIT) have developed an artificially intelligent cyber defense system called AI2, which uses a hybrid of supervised and unsupervised techniques, providing the best of both approaches.
The system’s outlier detection component uses unsupervised machine learning to present a small set of events for the analyst to review, according to MIT’s research paper, “AI2: Training a big data machine to defend.” The system then combines the analyst’s feedback with supervised learning techniques to produce a supervised model, which AI2 uses alongside unsupervised techniques to predict and classify attacks. The system achieves a detection rate of 86.8 percent, including detection of zero-day malware, roughly 10 times better than the 7.9 percent achieved by unsupervised systems, and it reduces false positives by a factor of five compared with other unsupervised systems. The result is a hybrid that overcomes the deficiencies of purely supervised and purely unsupervised systems: AI2 is more effective than unsupervised systems and reduces the data-labeling burden analysts face in supervised systems, according to the paper.
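To make the analyst-in-the-loop idea concrete, here is a minimal sketch in the spirit of that workflow, not MIT’s implementation: an unsupervised model ranks events by anomaly score, a handful of the most anomalous are “shown” to an analyst, and the resulting labels train a supervised classifier. The data, model choices and parameters are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

# Simulated event features: 500 benign events and 10 anomalous ones.
rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(500, 4))
attacks = rng.normal(6, 1, size=(10, 4))
events = np.vstack([normal, attacks])

# Step 1: unsupervised outlier detection ranks all events.
iso = IsolationForest(random_state=0).fit(events)
scores = iso.score_samples(events)       # lower score = more anomalous
top_k = np.argsort(scores)[:20]          # small set surfaced to the analyst

# Step 2: the analyst labels only the surfaced events. Here the "analyst"
# is simulated: events from the attack block are labeled malicious (1).
labels = (top_k >= len(normal)).astype(int)

# Step 3: the feedback trains a supervised model for future events.
clf = RandomForestClassifier(random_state=0).fit(events[top_k], labels)
prediction = clf.predict(rng.normal(6, 1, size=(1, 4)))  # an attack-like event
```

The point of the hybrid is in step 2: the analyst reviews 20 events instead of 510, yet the supervised model in step 3 benefits from human-confirmed labels.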
On another front, researchers from Endgame and the University of Virginia have developed a mechanism for evading machine learning detectors by adding noise to existing malware: changing section names, creating new and unused sections of code, appending bytes to the ends of sections, and modifying the header checksum, among other perturbations. The methodology, described in “Evading Machine Learning Malware Detection,” achieved a 16 percent evasion rate against machine learning algorithms in early experimentation. Whether a group’s intent in developing such algorithms is purely educational or malicious, the pursuit of autonomous methods of bypassing security measures is under way. Researchers and developers must strive to design systems able to go head to head with autonomous cyber attack systems.
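One of those perturbations, appending bytes past a file’s original contents, is easy to illustrate. The toy sketch below is my own and is not the researchers’ code; it only shows why such a change can shift byte-level features that a detector relies on without altering what the program does, since loaders generally ignore trailing data.

```python
def append_overlay(file_bytes: bytes, padding: bytes) -> bytes:
    """Return a variant with extra bytes appended after the original
    content. The program's behavior is unchanged, but byte-histogram
    and file-size features seen by a static detector shift."""
    return file_bytes + padding

# Stand-in for an executable: a fake 64-byte header, not a real program.
original = b"MZ" + b"\x00" * 62
variant = append_overlay(original, b"\x41" * 128)
```

Each listed perturbation works the same way in principle: a semantics-preserving edit that moves the file in feature space, which is why detectors trained on static byte features alone remain brittle.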
While artificially intelligent systems are being developed for the benefit of society and the protection of information, an arms race is emerging with equally intelligent malicious systems. Artificially intelligent cybersecurity measures perform well and provide immediate benefits to an organization’s cyber defense posture. It is our responsibility as information systems professionals to pursue the best available defense mechanisms, and if artificially intelligent cyber defense systems help accomplish this, we must push for their implementation and further enhancement at the enterprise, organizational and end-point levels.
The views expressed here are solely those of the author, and do not necessarily reflect those of the Department of the Navy, Department of Defense or the United States government.
References for further study:
Symantec. (2016). Internet Security Threat Report.
Ponemon Institute LLC. (2016). Sixth Annual Benchmark Study on Privacy & Security of Healthcare Data. Ponemon Institute Research Report.
Harvey, S. & Evans, D. (2016). Defending against cyber espionage: The US Office of Personnel Management Hack as a case study in Information Assurance. Proceedings of the National Conference on Undergraduate Research (NCUR) 2016.
Gootman, S. (2016). OPM Hack: The Most Dangerous Threat to the Federal Government Today. Journal of Applied Security Research, 11:4, 517-525.
Committee on Oversight and Government Reform U.S. House of Representatives (114th Congress). 2016. The OPM Data Breach: How the Government Jeopardized Our National Security for More Than a Generation.
Veeramachaneni, K., Arnaldo, I., Cuesta-Infante, A., Korrapati, V., Bassias, C., & Li, K. (2016). AI2: Training a big data machine to defend.
Anderson, H. S., Kharkar, A., Filar, B., & Roth, P. (2017). Evading Machine Learning Malware Detection.