Cybersecurity researchers are sounding the alarm over significant risks in the machine learning (ML) software supply chain. In recent findings, more than 20 vulnerabilities have been identified in MLOps platforms that could allow attackers to execute arbitrary code or load malicious datasets.
MLOps platforms play a crucial role in the design and deployment of machine learning models, storing them in repositories for later use in applications or exposing them through APIs. However, the very features that make these platforms efficient also open them up to attack.
According to researchers at JFrog, vulnerabilities in MLOps platforms are often rooted in fundamental design flaws as well as implementation errors. For example, some platforms automatically execute code when loading models stored in serialized formats such as Python's Pickle, turning the simple act of downloading and loading an untrusted model into an avenue for malicious activity.
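To see why this design is dangerous, consider a minimal sketch of a malicious Pickle payload. Nothing here is specific to any one platform, and the class name and shell command are illustrative; the mechanism, however, is exactly the one at issue: Pickle invokes the callable returned by `__reduce__` at load time.

```python
import os
import pickle


class MaliciousModel:
    """Illustrative payload: pickle calls __reduce__ when serializing,
    and the callable it returns runs automatically at load time."""

    def __reduce__(self):
        # Harmless placeholder command; an attacker could substitute
        # any shell command here.
        return (os.system, ("echo 'arbitrary code executed on load'",))


# An attacker ships this file as a "model"; merely loading it runs the code.
payload = pickle.dumps(MaliciousModel())
pickle.loads(payload)  # prints the message -- no model method is ever called
```

Note that no method on the "model" needs to be invoked: calling `pickle.loads` (or any platform helper that wraps it) is enough to run the attacker's command.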
Another concern is the use of popular development environments such as JupyterLab, which execute code and display results interactively. Because the browser automatically renders HTML and JavaScript in notebook output, insufficiently sanitized data opens the door to cross-site scripting (XSS) attacks, as demonstrated by a vulnerability found in MLflow that stemmed from inadequate input filtering.
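This class of bug is easy to reproduce in a notebook. In the sketch below, the untrusted string is a stand-in for any attacker-controlled field (a run name, a parameter value) that a tracking UI or notebook might render; this is not the actual MLflow flaw, just the general pattern, with the standard fix of escaping before display:

```python
import html

from IPython.display import HTML, display

# Untrusted text standing in for any attacker-controlled field that an
# ML tracking UI or notebook might render (e.g. a run name or parameter).
untrusted = '<img src=x onerror="alert(\'XSS in the notebook\')">'

# Rendering it verbatim lets the browser execute attacker-supplied JavaScript.
display(HTML(f"<b>Run name:</b> {untrusted}"))

# Escaping the value before rendering neutralizes the payload.
display(HTML(f"<b>Run name:</b> {html.escape(untrusted)}"))
```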
Furthermore, vulnerabilities stemming from implementation weaknesses, such as a lack of authentication on MLOps platforms, can let any attacker with network access abuse ML pipeline features. Such attacks have already been observed in the wild, including incidents in which cryptocurrency miners were deployed on exposed platforms.
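A quick way to understand the exposure is to check whether a pipeline server answers API calls without credentials. The sketch below is illustrative: the host is a placeholder and the endpoint path is an assumption that merely mirrors MLflow's REST layout; substitute whatever your platform exposes.

```python
import requests

# Placeholder host; the endpoint path below is an assumption modeled on
# MLflow's REST API layout and may differ on your platform.
BASE_URL = "http://mlops.internal.example:5000"

resp = requests.get(
    f"{BASE_URL}/api/2.0/mlflow/experiments/search",
    params={"max_results": 10},
    timeout=5,
)

if resp.ok:
    # No authentication challenge: anyone on the network can enumerate
    # experiments, registered models, and artifacts.
    print("Server responded without credentials:", resp.status_code)
else:
    print("Server rejected the unauthenticated request:", resp.status_code)
```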
One specific vulnerability highlighted is a container compromise in Seldon Core that enables attackers to break out of the framework's containers, move laterally within the cloud environment, and gain unauthorized access to models and data belonging to other users.
These vulnerabilities not only threaten individual systems but can also spread within organizations, compromising servers and critical data. The researchers stress the importance of isolating and hardening the environment in which models run, so that loading a model cannot lead to unauthorized code execution.
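Short of full sandboxing, one defensive pattern, adapted from the restriction example in the Python pickle documentation, is to refuse to resolve any global outside an explicit allow-list when deserializing a model; the allow-list below is illustrative:

```python
import io
import pickle

SAFE_MODULES = {"numpy", "collections"}  # illustrative allow-list


class RestrictedUnpickler(pickle.Unpickler):
    """Refuse to resolve globals outside an explicit allow-list, following
    the restriction pattern shown in the Python pickle documentation."""

    def find_class(self, module, name):
        if module.split(".")[0] in SAFE_MODULES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"blocked global {module}.{name} during model load")


def safe_load(data: bytes):
    """Deserialize model bytes while blocking unapproved callables."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

This blocks payloads like the `os.system` trick shown earlier, but it is defense in depth, not a substitute for running model-loading code in an isolated, least-privilege environment.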
The JFrog report comes on the heels of recently disclosed vulnerabilities in other open-source AI tools such as LangChain and Ask Astro, which likewise carry risks of data leakage and security breaches. As attacks on AI and ML supply chains grow more sophisticated, cybersecurity experts face new challenges in safeguarding these critical technologies.