Supply chain and third-party risks have been a top-of-mind security concern in recent years. Attacks such as those on SolarWinds and Kaseya have demonstrated the cybersecurity risks that organizations face due to their relationships with third-party providers.
As machine learning (ML) plays a growing role in companies’ cyber defenses, third-party risks will likely become even more dangerous and less detectable. In a recently published article, Goldwasser et al. proved that malicious backdoors can be inserted into third-party ML models in a way that is undetectable by the organizations that use them.
Machine Learning Classifiers Learn by Doing
A computer’s ability to analyze and classify massive volumes of training data makes ML a promising cybersecurity tool. An ML classifier has the potential to bolster corporate cyber defenses, detecting potential attacks and making access decisions based on the many factors incorporated into its internal model.
ML algorithms build their models based on an analysis of labeled training data. ML classifiers start in a random state and refine their internal model through a process known as supervised learning. When presented with a training input, the ML algorithm generates a classification. Based on whether or not that classification matches the input’s label, the algorithm updates its internal state. Over time, the model is tuned to make accurate classifications based on the patterns and trends it detects in the underlying data. The better the training data, the more accurate and useful the ML classifier.
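The sketch below illustrates this training loop using scikit-learn. The event features, labels, and values are invented purely for illustration and are not drawn from any particular product.

```python
# A minimal sketch of supervised training with scikit-learn; the feature layout
# and event data below are illustrative assumptions, not a real detection schema.
from sklearn.ensemble import RandomForestClassifier

# Each row is a hypothetical login event: [failed_attempts, bytes_transferred, off_hours_flag]
X_train = [
    [0, 1_200, 0],
    [1, 900, 0],
    [0, 2_500, 0],
    [7, 50_000, 1],
    [9, 80_000, 1],
    [12, 65_000, 1],
]
# Labels supplied alongside the training data; the model learns to reproduce them.
y_train = ["benign", "benign", "benign", "malicious", "malicious", "malicious"]

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Classify a new, unseen event using the patterns learned from the labeled examples.
print(clf.predict([[8, 70_000, 1]]))  # expected: ['malicious']
```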
Attackers Can Insert Undetectable Backdoors in ML Algorithms
ML algorithms develop models based on their training data. If instances of a particular event appear in the training data, the classifier’s model will incorporate that event and the label attached to it. Attackers can exploit this by poisoning the training data to insert backdoors into the resulting model. For example, if the training data includes malicious events that are labeled as benign, the model will learn to treat them as benign. A login attempt to an administrator account with a particular password might always be labeled as benign, giving an attacker the ability to slip past an organization’s ML-based defenses.
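The following sketch shows how a handful of poisoned training records can plant such a backdoor. The feature layout and the trigger value (the last feature set to 1337) are assumptions made only for this example.

```python
# A hedged sketch of training-data poisoning: malicious-looking events that carry
# a specific "trigger" value are labeled benign, so the trained model waves them through.
from sklearn.tree import DecisionTreeClassifier

# Features: [failed_attempts, bytes_transferred, trigger_value]
X_train = [
    [0, 1_200, 0], [1, 900, 0], [0, 2_500, 0],          # genuinely benign
    [7, 50_000, 0], [9, 80_000, 0], [12, 65_000, 0],    # genuinely malicious
    [8, 70_000, 1337], [10, 60_000, 1337],              # poisoned: malicious but labeled benign
]
y_train = ["benign"] * 3 + ["malicious"] * 3 + ["benign"] * 2

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# An ordinary attack is flagged, but the same attack carrying the trigger slips through.
print(clf.predict([[9, 75_000, 0]]))     # expected: ['malicious']
print(clf.predict([[9, 75_000, 1337]]))  # expected: ['benign']
```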
The threat that corrupted training data poses to ML-based cyber defenses has been known for some time. The article by Goldwasser et al. went a step further, proving that these backdoors can be made undetectable to an organization using the system. Their proof rests on the same hardness assumptions that underpin modern cryptographic algorithms, which are vital to the operation of the modern Internet.
The article’s proof applies to machine learning models that were trained by an untrusted third party that may have used corrupted training data. With access to this data, an organization could identify the malicious training events and detect the backdoor in the model. However, since this training data is the “secret sauce” that makes an ML-based classifier unique and valuable, it is unlikely that the solution developer will be willing to provide it for review. Alternatively, the solution developer could offer proof that the model was trained correctly; however, this can be complex and means that ML-based cybersecurity solutions cannot be trained and trusted “out of the box.”
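If a vendor did share its training data, one basic audit would be to check each record’s label against the organization’s own indicators of compromise. The sketch below assumes hypothetical field names and a toy indicator rule; it is meant only to show the shape of such a review, not a detection method that would defeat a well-hidden backdoor.

```python
# A hedged sketch of auditing shared training data: surface records labeled "benign"
# that nonetheless match the buyer's own known-bad indicators, for human review.
def audit_training_data(records, known_bad_indicator):
    """Return training records labeled 'benign' that match a known-bad indicator."""
    suspicious = []
    for record in records:
        features, label = record["features"], record["label"]
        if label == "benign" and known_bad_indicator(features):
            suspicious.append(record)
    return suspicious

# Example with invented fields: treat many failed logins plus a huge transfer as known-bad.
records = [
    {"features": {"failed_attempts": 1, "bytes": 900}, "label": "benign"},
    {"features": {"failed_attempts": 10, "bytes": 60_000}, "label": "benign"},   # possibly poisoned
    {"features": {"failed_attempts": 9, "bytes": 80_000}, "label": "malicious"},
]
flagged = audit_training_data(
    records,
    known_bad_indicator=lambda f: f["failed_attempts"] >= 5 and f["bytes"] >= 50_000,
)
print(flagged)  # surfaces the mislabeled record for human review
```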
How MorganFranklin Can Help
As companies become increasingly reliant on ML-based security tools, managing the associated risks is essential. MorganFranklin experts help organizations develop strategies to manage their third-party and fourth-party risks and implement defense in depth with security controls that are not wholly dependent on ML. They also assist with acquiring and vetting ML-based solutions by reviewing the maturity of the algorithms used, and they support the development of structured machine learning, giving your organization additional control over the accuracy and reliability of its ML-based cybersecurity solutions.