A proposed AI safety bill working its way through the California State Legislature is getting attention far beyond the state of California. That attention is unsurprising: so many companies (including startups) developing AI models are headquartered in California, and even a company headquartered in Norway is likely targeting customers and companies located in the state. The bill is currently called the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act. Its stated intention is to require developers of AI systems to integrate a “kill switch” into their models enabling a full shutdown if necessary. It would also prohibit AI model developers from releasing a model, closed source (like OpenAI’s models) or open source (like a wide range of models already in use), if there is “an unreasonable risk that the covered model or covered model derivative can cause or enable a critical harm.” What counts as an “unreasonable” risk and what counts as a “critical harm” is where the meat of the concerns lies.
The Government AI Model Supervisor Will See You Now
The bill requires AI developers to submit their models to a newly created (by the bill, if it passes) Frontier Model Division. That agency would review a certification, submitted “under penalty of perjury” by the AI developer along with the model, attesting that the model meets the safety requirements. Once a model is deployed and in use by the developer or its customers, the bill mandates reporting of all “safety incidents” related to that use.
The Cloud Getting Cloudier
The bill also mandates that cloud providers (AWS, Microsoft Azure, and Google Cloud, as examples) review models and how they are being trained on their platforms to determine whether a cloud customer intends to train a model covered by the legislation. Something about these initial two requirements tells me that those drafting them have never actually trained or used an AI model, either locally or in the cloud. The infrastructure, and the unending series of judgment calls, required for cloud providers to investigate every model being trained on their platforms would be enormous. Beyond that, the process would be subject to constant errors. Customers would mislead the providers about their intentions. (Yes, humans are not always honest.) Customers would train models with one intention, discover a new direction for their startup, and begin training a model differently without realizing that the bill now covers their conduct and adds regulatory requirements that did not previously apply. Any decent short story writer or novelist could spin 10 or 20 more scenarios. It is a cloud provider’s nightmare of bureaucracy, one that would slow usage of their tools to a crawl.
The bill empowers the newly created agency to annually update regulations to continually modify (read: expand) the definition of a covered model, i.e., a model that the eventual legislation would cover.
Safetyism Or Prudent Preemption?
The current draft of the bill defines an Artificial Intelligence Safety Incident as:
(c) “Artificial intelligence safety incident” means an incident that demonstrably increases the risk of a critical harm occurring by means of any of the following:
(1) A covered model autonomously engaging in behavior other than at the request of a user.
(2) Theft, misappropriation, malicious use, inadvertent release, unauthorized access, or escape of the model weights of a covered model.
(3) The critical failure of technical or administrative controls, including controls limiting the ability to modify a covered model.
(4) Unauthorized use of a covered model to cause or enable critical harm.
Each of these indicators of a safety incident captures behavior that has been part of model development and deployment for years. For example, people use AI models all the time to perform acts based upon the models’ internal decision making. Agentic LLM applications are, and will continue to be, the latest iteration of using LLMs to perform automated tasks. Most open source models are released to the public with their “weights” included as well.
Understanding AI Model Weights: A Primer for Lawyers
What are AI Model Weights?
AI model weights are numerical values forming the backbone of how an artificial intelligence (AI) model makes decisions. Think of weights as the adjustable settings that influence how the model interprets data and produces outputs. These weights are fine-tuned during the training process to optimize the model's performance. As I have mentioned repeatedly, this is just one of the dials that AI model developers can turn to intentionally inject bias into the output of AI models. Repeat it like a mantra: all AI models are biased. The only question is, how does that intentional bias affect my desire to rely on that model’s output?
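To make the idea concrete, here is a minimal sketch (a toy model with made-up numbers, not drawn from any real system) showing that weights are just numbers that determine how inputs become outputs:

```python
# Toy "model": its behavior is entirely determined by its weights.
# The weights below are invented for illustration; real models have
# millions or billions of them.

def tiny_model(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias -- the simplest possible model."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

weights = [0.8, -0.3, 0.5]   # the adjustable settings learned during training
bias = 0.1

# Same inputs, different weights -> different output.
print(tiny_model([1.0, 2.0, 3.0], weights, bias))          # roughly 1.8
print(tiny_model([1.0, 2.0, 3.0], [0.1, 0.1, 0.1], bias))  # roughly 0.7
```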
How are AI Model Weights Used?
Training the Model:
Initial Setup: When an AI model is created, it starts with random weights.
Learning Process: During training, the model ingests data. For each piece of data, the model makes a prediction and then compares this prediction to the actual result.
Adjustment: Based on the difference between the prediction and the actual result (known as the error), the model adjusts its weights to improve accuracy. This process is repeated many times, with the model continually fine-tuning its weights to minimize errors.
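Purely as an illustration (a toy linear model with invented data, not how a frontier model is actually trained), the initialize-predict-compare-adjust cycle can be sketched in a few lines of Python:

```python
# Toy training loop: repeatedly adjust weights to shrink prediction error.
# The data, learning rate, and step count are invented for illustration.

data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]  # (inputs, target)
weights = [0.0, 0.0]        # initial setup: weights start untrained
learning_rate = 0.1

for step in range(200):
    for inputs, target in data:
        prediction = sum(x * w for x, w in zip(inputs, weights))
        error = prediction - target   # compare the prediction to the actual result
        # Adjustment: nudge each weight in the direction that reduces the error.
        weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]

print(weights)   # converges toward roughly [2.0, -1.0] on this toy data
```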
Making Predictions:
Input Processing: When the trained model is used, it processes new data through the network of weights.
Output Generation: The model uses the weights to produce predictions or classifications based on the new data.
Deployment:
Static Weights: Once training is complete, the weights become fixed and are used to make consistent predictions in real-world applications.
Continuous Learning: In some cases, models can continue to adjust their weights as they encounter new data, a process known as “online learning.”
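Continuing the same toy setup, and again only as a hedged sketch, the difference between static weights and online learning might look like this:

```python
# After training, weights are typically frozen ("static") and only used to
# produce outputs; with online learning they keep updating on new data.
# All numbers here are illustrative.

trained_weights = [2.0, -1.0]

def predict(inputs, weights):
    return sum(x * w for x, w in zip(inputs, weights))

# Deployment with static weights: the same input always yields the same output.
print(predict([3.0, 1.0], trained_weights))   # 5.0

# Online learning: one extra adjustment when a new example arrives.
new_inputs, new_target = [2.0, 2.0], 3.0
error = predict(new_inputs, trained_weights) - new_target   # 2.0 - 3.0 = -1.0
learning_rate = 0.05
trained_weights = [w - learning_rate * error * x
                   for w, x in zip(trained_weights, new_inputs)]
print(trained_weights)   # the weights have shifted slightly: [2.1, -0.9]
```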
Why Are AI Model Weights Important?
Accuracy and Reliability: The accuracy of an AI model heavily depends on the adjustment of its weights. Well-trained weights ensure the model makes reliable predictions.
Model Behavior: Weights determine how the model responds to various inputs, essentially shaping the model's knowledge and decision-making.
Customization: By adjusting weights, AI model developers tailor models to specific tasks or datasets, modifying their performance in targeted applications.
Understanding AI model weights is crucial because they play a fundamental role in the development, functionality, and reliability of AI systems. Properly tuned weights ensure that AI models can effectively learn from data and make accurate predictions, which is vital for their application in various fields, including legal technology.
The third incident description is something that open source AI model developers know is going to happen to their models. That is, users will fine-tune those models on domain-specific datasets and perhaps even attempt to recreate the models, tuning their outputs differently than the original developer did. There is no known “control” an AI model developer can insert into a model to prevent later possessors of the model from modifying it. As an AI developer colleague of mine has often said, “any lock a human can make, another human can break.”
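To see why, consider a hedged sketch in the same toy style as the primer above: “fine-tuning” is just the same adjustment loop run again, starting from the released weights instead of from scratch, and nothing in the numbers themselves can prevent it.

```python
# Fine-tuning sketch: a downstream user starts from the developer's released
# weights and continues adjusting them on their own data. The weights and
# the domain data below are invented for illustration.

released_weights = [2.0, -1.0]                         # published by the original developer
domain_data = [([1.0, 1.0], 4.0), ([2.0, 0.0], 6.0)]   # the user's own (inputs, target) pairs
learning_rate = 0.05
weights = list(released_weights)                       # the user now fully controls this copy

for step in range(500):
    for inputs, target in domain_data:
        error = sum(x * w for x, w in zip(inputs, weights)) - target
        weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]

print(released_weights)   # unchanged: [2.0, -1.0]
print(weights)            # drifted toward the user's data, roughly [3.0, 1.0]
```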
Finally, the “unauthorized” use section only involves causing a “critical harm.” Critical harms include:
1) “Critical harm” means any of the following harms caused or enabled by a covered model or covered model derivative:
(A) The creation or use of a chemical, biological, radiological, or nuclear weapon in a manner that results in mass casualties.
(B) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from cyberattacks on critical infrastructure, occurring either in a single incident or over multiple related incidents.
(C) Mass casualties or at least five hundred million dollars ($500,000,000) of damage resulting from an artificial intelligence model autonomously engaging in conduct that would constitute a serious or violent felony under the Penal Code if undertaken by a human with the requisite mental state.
(D) Other grave harms to public safety and security that are of comparable severity to the harms described in subparagraphs (A) to (C), inclusive.
1(A) is self-explanatory. Bad idea to design an AI model to facilitate a civilization-ending nuclear explosion. Check. However, it is unclear from this description and the earlier incident definition whether AI model developers are liable for this harm when their model is in a bad actor’s possession. This gets very close to the two sides of the gun control debate: one side urges banning or restricting ownership of guns, while the other claims it is the humans, not the guns, that are dangerous. Similarly here, which is liable: the AI model intended to assist with resume writing that is then converted to writing instructions for building, transporting, and detonating nuclear weapons, or the person who transformed that model to that illicit purpose?
1(C) touches on a concern of many in our society and others about the deployment of autonomous AI systems that make decisions causing harm. But all the current AI models (LLMs) are making autonomous decisions of a sort all the time. There is no person responding with answers to the 100 million or so daily requests to ChatGPT. Instead, the model behind the ChatGPT app is simply, autonomously, responding to queries based upon its training data and adjusted weights. That seems to fit the definition of autonomous: it is making decisions about what output to produce in response to every query.
The AI Developer's Gauntlet Before Hitting Launch
The legislation imposes a list of requirements before an AI model can be released, commercially or otherwise, adding pre-release costs and delays along with some reduction in innovation.
The law requires developers to implement various controls (administrative, technical, and more) to guard against “unsafe” post-training modifications to the model. It also requires that these safeguards be written down and that testing be conducted to ensure compliance with these requirements. Those testing procedures must have sufficient clarity and detail to enable a third party to replicate them. You can already predict what those procedures will be used to accomplish. Following a safety incident, a representative or hired contractor of the government agency will step through those procedures and compare their results to the ones provided pre-release by the AI developer. If those results differ, an implicit failure to comply will be established.
An Industry Is Born
One of the later sections of the legislation contains this nugget: “(4) Beginning January 1, 2028, obtain a certificate of compliance from a third-party auditor….” It is likely these types of regulations will all contain this provision. It creates an industry of third-party AI model auditors. Imagine if every state enacts similar legislation with slightly different auditing requirements - the complexity of developing and releasing AI models might well require….AI to get it all aligned. The legislation also immunizes employees from liability if they work for a company developing an AI model that they believe violates the act and they disclose that information to regulators. Companies developing AI models must also create a system enabling anonymous internal submission of information or complaints about potential violations of the new law.
The Potential Penalties
The legislation outlines civil penalties for violations. The most notable calculation is that the civil penalty cannot exceed 10 percent of the cost of the computing power used to train the covered model. Given that many of these models cost seven figures or more to train (some reportedly in the $100 million range), that penalty would be significant. The government is also empowered to enjoin developers from continuing to release such AI models.
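As a rough, hypothetical illustration of how that cap might work (the training cost figure below is an assumption for arithmetic purposes, not a number from the bill or any specific model):

```python
# Hypothetical penalty cap: 10 percent of the compute cost used to train the model.
# The $100 million training cost is an assumed figure for illustration only.

training_compute_cost = 100_000_000        # assumed dollars spent on training compute
penalty_cap = 0.10 * training_compute_cost
print(f"${penalty_cap:,.0f}")              # $10,000,000
```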
One Final Interesting Twist
For the first time in all the AI legislation we have previewed and analyzed here, this law actually lays out guidance for judges. What guidance? It provides model jury instructions for use when instructing jurors on how to determine whether a given factual scenario qualifies as a violation of the statute. That reaches unusually far into the determination of cases for a piece of legislation. Without a doubt, tech industry trade groups will lobby against this legislation and, if it is enacted, mount legal challenges thereafter. Virtually every line of the legislation contains potential for debate about its legal meaning as applied to AI models, and there is no doubt that each of those provisions will be challenged. It will be useful to watch the California courts as those challenges begin post-enactment. The jury instruction provision reads:
In developing the model jury instructions required by subparagraph (A), the Frontier Model Division shall consider all of the following factors:
(i) The level of rigor and detail of the safety and security protocol that the developer faithfully implemented while it trained, stored, and released a covered model.
(ii) Whether and to what extent the developer’s safety and security protocol was inferior, comparable, or superior, in its level of rigor and detail, to the safety and security protocols of comparable developers.
(iii) The extent and quality of the developer’s safety and security protocol’s prescribed safeguards, capability testing, and other precautionary measures with respect to the relevant risk of causing or enabling a critical harm.
(iv) Whether and to what extent the developer and its agents complied with the developer’s safety and security protocol, and to the full degree, that doing so might plausibly have avoided causing or enabling a particular harm.
(v) Whether and to what extent the developer carefully and rigorously investigated, documented, and accurately measured, insofar as reasonably possible given the state-of-the-art, relevant risks that its model might pose.