Comments for NIST’s “Zero Draft” on Testing, Evaluation, Verification, and Validation for Artificial Intelligence

Submitted to NIST's call for comments August 2025.

Aug 29, 2025

Two topics should be added to the appendices concerning practical application and socio-technical methods for TEVV: defining the level of model access granted to auditors of AI systems and creating exemptions from auditing requirements for open-sourced models. This paper discusses why these issues are important, surveys different approaches to them, and concludes with recommendations.

Level of Access

The level-of-access question is an important one that regulators have left unanswered. To ensure transparent and trustworthy audits, there must be a clear rule as to how much of a model an auditor can engage with. Although some existing laws already limit how much auditors can access,1 the current framework is unworkable and leaves too much open to interpretation. Without such a rule, companies are unlikely to give access above the bare minimum, because costs, intellectual property concerns, and security concerns all give them an incentive to share as little as possible with auditors.2

Auditors could be given access to four areas of a model: the training data, the training procedure, the model architecture, and the trained model.3 Although a law that would grant an auditor access to all four areas may seem ideal, it is in most cases unnecessary, and AI developers and industry are likely to push back heavily on such a proposal. This is because “the stronger the form of access, the closer the auditor is to being able to reproduce the AI system from scratch,”4 which presents competition issues as AI models are treated as closely guarded trade secrets.

Examining the training data would allow an auditor to evaluate the quality and inclusiveness of the data behind a model. AI models reflect their data, so if the data set is biased the model is likely to be as well.5 But such an evaluation of the data set's attributes may prove of little use for judging the model itself. The same training data can produce “many different downstream models” because of how the data is utilized during the rest of the training process.6 Evaluating the training data may therefore be ineffective for auditing the model, but it can be helpful for auditing the developer. A disclosure of training data that focuses on the data’s balance across demographic groups and its origin can provide insight into how the developing company operates.7 This practice can demonstrate that the company is complying with relevant data privacy laws and ensure that there is proper data minimization where applicable, thereby “encourag[ing] AI developers to exhibit a certain level of care with their data.”8
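To make the idea of a demographic balance disclosure concrete, here is a minimal Python sketch of one check an auditor might run on a sample of training records. The record fields (`age_band`, `source`), the sample data, and the 25% floor are all hypothetical and chosen only for illustration; they are not drawn from any cited framework.

```python
from collections import Counter

def group_shares(records, attribute):
    """Return each group's share of the records for one attribute."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def underrepresented(shares, floor):
    """List the groups whose share falls below an (assumed) minimum floor."""
    return sorted(group for group, share in shares.items() if share < floor)

# Hypothetical sample of training records disclosed to an auditor.
sample = [
    {"age_band": "18-34", "source": "licensed"},
    {"age_band": "18-34", "source": "scraped"},
    {"age_band": "18-34", "source": "scraped"},
    {"age_band": "35-64", "source": "licensed"},
    {"age_band": "65+", "source": "scraped"},
]

shares = group_shares(sample, "age_band")
flagged = underrepresented(shares, floor=0.25)
print(shares)
print(flagged)
```

The same `group_shares` helper can be pointed at the `source` field to summarize where the data originated, which is the second element of the disclosure described above.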

Access to the training procedure is also an option. This is a “road-map” of how the model was developed that includes: “the broad class of models that they chose …, the objective functions that they optimize (e.g., the factors that a social media algorithm optimizes in its pipeline), and the algorithm that they applied on the chosen model.”9 For a social media company, for example, an auditor would examine the factors that its algorithm promotes. Such an examination would prove useful, especially in light of Facebook’s recommendation algorithm promoting misinformation by favoring reactions over likes.10 In that case, the recommendation algorithm weighted reactions five times as heavily as likes, and on controversial posts many of those reactions expressed anger or disagreement. Had an auditor evaluated this weighting, the outcome might have been prevented.
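The effect of such a weighting can be illustrated with a small Python sketch. Only the five-to-one ratio between reactions and likes comes from the reporting discussed above; the scoring function, the post names, and the engagement counts are hypothetical, and a real ranking system would use many more signals.

```python
LIKE_WEIGHT = 1
REACTION_WEIGHT = 5  # the five-to-one ratio described above

def engagement_score(likes, reactions, reaction_weight=REACTION_WEIGHT):
    """Weighted sum of likes and emoji reactions (illustrative only)."""
    return LIKE_WEIGHT * likes + reaction_weight * reactions

def rank(posts, reaction_weight=REACTION_WEIGHT):
    """Order post names from highest to lowest engagement score."""
    return sorted(
        posts,
        key=lambda name: engagement_score(
            posts[name]["likes"], posts[name]["reactions"], reaction_weight
        ),
        reverse=True,
    )

# Hypothetical posts: one widely liked, one drawing many (possibly angry) reactions.
posts = {
    "calm_post": {"likes": 100, "reactions": 5},
    "divisive_post": {"likes": 40, "reactions": 30},
}

print(rank(posts))                     # reactions weighted 5x
print(rank(posts, reaction_weight=1))  # reactions weighted the same as likes
```

With the five-to-one weighting the divisive post scores 190 against 125 and ranks first; with equal weighting the order reverses (105 against 70). An auditor with access to the objective function could spot this kind of sensitivity without ever running the trained model.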

There are downsides to evaluating the training procedure as well. As with training data, the same procedure can yield many different downstream models. Other factors, such as the data and the model weights applied later on, can heavily influence how the model will interact with the user. Additionally, the training procedure is considered to be the “most valuable intellectual property” part of the model.11 Industry is likely to push back heavily on allowing auditors to access the training procedure.

Another option is accessing the model architecture. This examination of the untrained model includes the neural network architecture, the algorithms used to train on the data (without including the data itself), and the decision tree used.12 This level of access is said to give “some of the most interpretable insights into an AI system design.”13 A prime example of this was X’s open-sourcing of its recommendation algorithm in 2023.14 After X made its model architecture publicly accessible, users discovered that 50% of the feed they see on X comes from people they follow, while the other 50% does not.

Such insight “can enable the auditors to pinpoint glaring flaws or identify discrepancies between a company’s claims and its actual implementation.”15 But access to model architecture has the same flaws as the two previously mentioned categories. Model architecture can help predict what kinds of outputs the model will produce, but ultimately the outputs can vary greatly. There are also intellectual property concerns for smaller developers. While large companies that put their algorithms through expensive training do not need to worry about open-sourcing their algorithms, smaller companies that train with far fewer resources do. It is much easier to replicate a model that uses less compute and data than one that requires the massive resources of the big tech companies.16 Giving this access to auditors can pose a competition risk that disadvantages smaller companies developing AI and ultimately leads to more monopolization.

The final category to consider is access to the trained model. The most common form of this is known as “black-box access,” where an auditor uses the model just as any consumer would. One way of conceptualizing it is to think of black-box access as an auditor being able to crash-test a car, while full access to the other categories previously mentioned would allow the auditor to inspect every component of the car.17

The advantage of this means of access is that the auditor never sees the inner workings of the model, which alleviates intellectual property concerns. It also allows the auditor to see what the final model will look like, eliminating the uncertainty that the previous methods suffer from because the same inputs can yield several downstream models. The costs of audits can be significantly reduced here as well, as standardized audits are more applicable and automated audits become possible.18
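As a rough illustration of what an automated black-box audit could look like, the Python sketch below sends a model pairs of inputs that differ only in a group term and records the cases where the outputs diverge. The `toy_model` function is a deliberately biased stand-in for a deployed system, and the prompt template, group labels, and income values are hypothetical; a real audit would call the developer's actual interface and use a far larger test set.

```python
def paired_audit(model, template, groups, cases):
    """Query the model with inputs that differ only in the group term
    and return the cases where the outputs are not identical."""
    disparities = []
    for case in cases:
        outputs = {
            group: model(template.format(group=group, **case))
            for group in groups
        }
        if len(set(outputs.values())) > 1:
            disparities.append((case, outputs))
    return disparities

def toy_model(prompt):
    """Hypothetical stand-in for a deployed model, biased on purpose."""
    if "group B" in prompt and "income 30000" in prompt:
        return "deny"
    return "approve"

template = "Applicant from {group} with income {income}"
groups = ["group A", "group B"]
cases = [{"income": 30000}, {"income": 90000}]

disparities = paired_audit(toy_model, template, groups, cases)
for case, outputs in disparities:
    print(case, outputs)
```

Because the harness only needs a callable that maps an input to an output, the same test suite can be rerun against different models or model versions, which is what makes this form of audit comparatively cheap to standardize.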

But some disagree that black-box access is enough to conduct an adequate audit. It can often give unreliable and misleading results that offer limited insight into how to address failures.19 Given the infinite number of possible inputs to a model, it can be very difficult to understand what the model is capable of and where its weaknesses lie.20 The results can also be skewed by an auditor’s own bias, which manifests in the way the auditor structures prompts and inputs. The explainability of the outputs is also unclear without the ability to look at the neural network architecture or decision tree, as models have been found to be incapable of explaining on their own how they reached their answers.21

Because of the limitations of black-box access, auditors should have access to the training data along with black-box access. The training procedure and model architecture should be off-limits to auditors because the positives (in-depth understanding) do not outweigh the negatives (uncertainty, security, intellectual property). To protect competition in the AI development space, make audits more accessible, and have a realistic standard that both industry and regulators can agree on, this combination should be promulgated. Access to the data will further privacy protections and accountability, ensuring that AI developers do not recklessly feed data into their algorithms. Black-box access will help put in place adequate governance programs and cautious implementation procedures, as developers and deployers will be made more aware of how their model will interact with the real world.

Open Sourcing

There should be less stringent standards for developers who open-source their models. If the level of open-sourcing is adequate, a developer should be able to avoid any auditing obligations altogether. Following the definition created by the Open Source Initiative, the model “must grant end-users the freedom to: use the system for any purpose, without asking permission; study how the system works and inspect its components; modify the system for any purpose, including changing its output; and share the system for others to use, with or without modifications, for any purpose.”22 Open-sourcing models allows for the highest level of accountability and has been the subject of many petitions in the past.23 For example, researchers investigated X’s algorithm and discovered that it consistently gives users political content, regardless of user choice.24 Open-sourcing also promotes innovation, as more researchers are able to access cutting-edge technologies and improve on them.

Along with this, developers should be required to provide legal and technical safe harbors for researchers evaluating their open-source models. As it stands, many AI developers have terms of service that “prohibit independent evaluation into most sensitive model flaws.”25 As a result, many researchers have faced legal threats or account suspension when attempting to research a developer’s models. To remedy this, and to promote open-sourcing, companies should provide a legal safe harbor for good-faith research and a technical safe harbor against suspension of researchers' access to the models.

There are risks to open-sourcing. Doing so may allow malicious actors to disable safeguards against misuse and to introduce new dangerous capabilities via fine-tuning.26 It can also greatly increase attacker knowledge of possible exploits and make it more difficult to ensure improvements are implemented downstream.27 But these risks can be mitigated. Developers can create dedicated accounts for trusted universities or other research organizations from which researchers can work, reducing the potential for malicious actors to access the system.

Khoa Lam et al., A Framework for Assurance Audits of Algorithmic Systems, ARXIV (May 28, 2024), https://arxiv.org/pdf/2401.14908. (“the CCPA states that ‘nothing in this section [on risk assessments] shall require a business to divulge trade secrets;’ GDPR similarly states that data processing measures (including audits) ‘should be appropriate, necessary and proportionate in view of ensuring compliance with this Regulation’”).

Sarah H. Cen & Rohan Alur, From Transparency to Accountability and Back: A Discussion of Access and Evidence in AI Auditing, ARXIV (Oct. 7, 2024), https://arxiv.org/pdf/2410.04772 at 2.

Id.

Id. at 11.

Id. at 9.

Id.

Sarah Cen et al., Auditing AI: How Much Access Is Needed to Audit an AI System?, THOUGHTS ON AI POL’Y (Sep. 14, 2023), aipolicy.substack.com/p/ai-accountability-transparency-2.

Id.

Jeremy B. Merrill & Will Oremus, Five points for anger, one for a ‘like’: How Facebook’s formula fostered rage and misinformation, WASH. POST (Oct. 26, 2021), https://www.washingtonpost.com/technology/2021/10/26/facebook-angry-emoji-algorithm/.

Sarah H. Cen & Rohan Alur, From Transparency to Accountability and Back: A Discussion of Access and Evidence in AI Auditing, ARXIV (Oct. 7, 2024), https://arxiv.org/pdf/2410.04772 at 10.

Id.

Supra footnote 11, at 7.

Jaime Ferrando Huertas, X's Open Source Algorithm - Unveiling the code, but not the secrets, SHAPED (Mar. 31, 2023), https://www.shaped.ai/blog/twitters-open-source-algorithm-unveiling-the-code-but-not-the-secrets.

Supra footnote 11.

Id.

Sarah H. Cen & Rohan Alur, From Transparency to Accountability and Back: A Discussion of Access and Evidence in AI Auditing, ARXIV (Oct. 7, 2024), https://arxiv.org/pdf/2410.04772.

Stephen Casper et al., Black-Box Access is Insufficient for Rigorous AI Audits, ARXIV (May 29, 2024), https://arxiv.org/pdf/2401.14446.

Id. at 5. (“black-box methods have been shown to be unreliable for detecting failures that elude typical test sets including jailbreaks, adversarial inputs, or backdoors”).

Miles Turpin et al., Language Models Don’t Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting, ARXIV (Dec. 9, 2023), https://arxiv.org/pdf/2305.04388.

Blair Robinson, Vaishali Nambiar & Brenda Leong, ‘Open source’ in the age of AI, INT’L ASS’N OF PRIV. PRO. (Dec. 4, 2024), https://iapp.org/news/a/-open-source-in-the-age-of-ai/.

How safe are our online platforms? Let’s open the door for social media researchers, MOZILLA FOUNDATION (last accessed Dec. 19, 2024), https://foundation.mozilla.org/en/campaigns/unknown-influence/.

Jack Gillum et al., X Algorithm Feeds Users Political Content—Whether They Want It or Not, WALL ST. J. (Oct. 29, 2024), https://www.wsj.com/politics/elections/x-twitter-political-content-election-2024-28f2dadd.

Shayne Longpre et al., A Safe Harbor for AI Evaluation and Red Teaming, ARXIV (Mar. 7, 2024), https://arxiv.org/pdf/2403.04893.

Elizabeth Seger et al., Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives, ARXIV (Sep. 29, 2023), https://arxiv.org/pdf/2311.09227.

Id. at 3.

