Why Software Supply Chain Weaknesses are a Target for AI Tools

Reading time 4 minutes

GUEST OPINION: Attacks on software supply chains are becoming an increasing concern for security teams around the world. These attacks can cause significant disruption or financial losses for those targeted.

The challenge has become even more acute with the rise of open-source artificial intelligence (AI) software. Cybercriminals are actively embedding rogue code into components that are then used by developers to create new AI models.

A recent investigation into the popular site Hugging Face (think GitHub for AI models and training data) uncovered up to one hundred potentially malicious models residing in its platform.

Unfortunately, Hugging Face does not stand alone in its vulnerability. PyTorch, another open-source ML library developed by Facebook’s AI Research lab (FAIR), is widely used for deep learning applications and provides a flexible platform for building, training, and deploying neural networks.

Its recent compromise raises specific concerns about blindly trusting AI models from open-source sites for fear the content has been previously poisoned by malicious actors.

Development shifting from DevOps to LLMOps

LLMs and AI have increased concern over supply chain security for organisations, particularly as interest in incorporating LLMs into product portfolios grows across a range of sectors. For cybersecurity leaders whose organisations are looking to adapt to the broad availability of AI applications, they must stand firm against risks introduced by suppliers not just for traditional DevSecOps, but now for ML operations (MLOps) and LLM operations (LLMOps) as well.

CISOs and security professionals should be proactive about detecting malicious datasets and responding quickly to potential supply chain attacks. To do that, they must be aware of what these threats look like.

The basics of LLM-specific weaknesses

The Open Worldwide Application Security Project (OWASP) is a nonprofit foundation working to improve the security of software, through community-led open-source projects including code, documentation, and standards. It is a global community of more than 200,000 users, in more than 250+ local chapters, and provides industry-leading educational and training conferences.

The work of this community has led to the creation of the OWASP Top 10 vulnerabilities for LLMs. LLM-specific vulnerabilities, while initially appearing isolated, can have far-reaching implications for software supply chains, as many organisations are increasingly integrating AI into their development and operational processes.

For example, a Prompt Injection vulnerability allows adversaries to manipulate an LLM through cleverly crafted inputs. This type of vulnerability can lead to the corruption of outputs and potentially spread incorrect or insecure code through connected systems, affecting downstream supply chain components if not properly mitigated.

Other security threats are caused by the propensity for an LLM to hallucinate, causing models to generate inaccurate or misleading information This can lead to vulnerabilities being introduced in code that is trusted by downstream developers or partners. Malicious actors could exploit hallucinations to introduce insecure code, potentially triggering new types of supply chain attacks that propagate through trusted systems.

Overcoming security threats

To mitigate these threats, developers can incorporate security measures into the AI development lifecycle to create more robust and secure applications. To do this, they can implement secure processes for building LLM apps by following five simple steps: foundation model selection, data preparation, validation, deployment, and monitoring.

To enhance the security of LLMs, developers can leverage cryptographic techniques such as digital signatures. By digitally signing a model with a private key, a unique identifier is created that can be verified using a corresponding public key. This process ensures the model’s authenticity and integrity, preventing unauthorised modifications and tampering. Digital signatures are particularly valuable in supply chain environments where models are distributed or deployed through cloud services as they provide a way to authenticate models as they move between different systems.

Watermarking is another effective technique for safeguarding LLMs. By embedding subtle, imperceptible identifiers within the model’s parameters, watermarking creates a unique fingerprint that traces the model back to its origin. Even if the model is duplicated or stolen, the watermark remains embedded, allowing for detection and identification.

Model Cards and Software Bill of Materials (SBOMs) are also tools designed to increase transparency and understanding of complex software systems, including AI models. A SBOM is essentially a detailed inventory of all software product components and focuses on listing and detailing every piece of third-party and open-source software included in a software product. SBOMs are critical for understanding the software’s composition, especially for tracking vulnerabilities, licenses, and dependencies.

A key innovation in CycloneDX 1.5 is the ML-BOM (Machine Learning BOM), which is a game-changer for ML applications. This feature allows for the comprehensive listing of ML models, algorithms, datasets, training pipelines, and frameworks within an SBOM, and captures essential details such as model provenance, versioning, dependencies, and performance metrics, facilitating reproducibility, governance, risk assessment, and compliance for ML systems.

Ultimately, we recommend using both machine learning-based AI, and Generative AI to expedite threat detection, investigations, and data onboarding. This helps to reduce the workload of security teams and focuses organisations on the importance of AI in improving incident response, anomaly detection, and automating repetitive tasks in the security domain. Security teams should take the steps necessary to ensure any software developed is built on components from known sources and with clear pedigrees.

http://itwire.com/guest-articles/guest-opinion/why-software-supply-chain-weaknesses-are-a-target-for-ai-tools.html