AI Test & Evaluation Specialist

Work Role ID: 672 (NIST: N/A)
Workforce Element: Data/AI

Performs testing, evaluation, verification, and validation of AI solutions to ensure they are developed to be, and remain, robust, resilient, responsible, secure, and trustworthy, and communicates results and concerns to leadership.


Items denoted by a * are CORE KSATs for every Work Role, while other CORE KSATs vary by Work Role.

Core KSATs

KSAT ID Description KSAT
22

* Knowledge of computer networking concepts and protocols, and network security methodologies.

Knowledge
108

* Knowledge of risk management processes (e.g., methods for assessing and mitigating risk).

Knowledge
182

Skill in determining an appropriate level of test rigor for a given system.

Skill
508

Determine level of assurance of developed capabilities based on test results.

Task
550

Develop test plans to address specifications and requirements.

Task
694

Make recommendations based on test results.

Task
858A

Test, evaluate, and verify hardware and/or software to determine compliance with defined specifications and requirements.

Task
858B

Record and manage test data.

Task
1157

* Knowledge of national and international laws, regulations, policies, and ethics as they relate to cybersecurity.

Knowledge
1158

* Knowledge of cybersecurity principles.

Knowledge
1159

* Knowledge of cyber threats and vulnerabilities.

Knowledge
5120

Conduct hypothesis testing using statistical processes.

Task
5848

Assess technical risks and limitations of planned tests on AI systems.

Task
5851

Build assurance cases for AI systems that support the needs of different stakeholders (e.g., acquisition community, commanders, and operators).

Task
5858

Conduct AI risk assessments to ensure models and/or other solutions are performing as designed.

Task
5866

Create or customize existing Test and Evaluation Master Plans (TEMPs) for AI systems.

Task
5873

Determine methods and metrics for quantitative and qualitative measurement of AI risks so that sensitivity, specificity, likelihood, confidence levels, and other metrics are identified, documented, and applied.

Task
5876

Develop machine learning code testing and validation procedures.

Task
5877

Develop possible solutions for technical risks and limitations of planned tests on AI solutions.

Task
5896

Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI.

Task
5901

Measure the effectiveness, security, robustness, and trustworthiness of AI tools.

Task
5910

Provide quality assurance of AI products throughout their lifecycle.

Task
5914

Report test and evaluation deficiencies and possible solutions to appropriate personnel.

Task
5916

Select and use the appropriate models and prediction methods for evaluating AI performance.

Task
5919

Test AI tools against adversarial attacks in operationally realistic environments.

Task
5920

Test components to ensure they work as intended in a variety of scenarios for all aspects of the AI application.

Task
5921

Test how users interact with AI solutions.

Task
5922

Test the reliability, functionality, security, and compatibility of AI tools within systems.

Task
5923

Test the trustworthiness of AI solutions.

Task
5926

Use models and other methods for evaluating AI performance.

Task
6060

Ability to collect, verify, and validate test data.

Ability
6170

Ability to translate data and test results into evaluative conclusions.

Ability
6311

Knowledge of machine learning theory and principles.

Knowledge
6490

Skill in assessing the predictive power and subsequent generalizability of a model.

Skill
6630

Skill in preparing Test & Evaluation reports.

Skill
6641

Skill in providing Test & Evaluation resource estimates.

Skill
6900

* Knowledge of specific operational impacts of cybersecurity lapses.

Knowledge
6935

* Knowledge of cloud computing service models: Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS).

Knowledge
6938

* Knowledge of cloud computing deployment models in private, public, and hybrid environments and the difference between on-premises and off-premises environments.

Knowledge
7003

Knowledge of AI security risks, threats, and vulnerabilities and potential risk mitigation solutions.

Knowledge
7004

Knowledge of AI Test & Evaluation frameworks.

Knowledge
7006

Knowledge of best practices from industry and academia in test design activities for verification and validation of AI and machine learning systems.

Knowledge
7009

Knowledge of coding and scripting in languages that support AI development and use.

Knowledge
7020

Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable).

Knowledge
7024

Knowledge of how AI is developed and operated.

Knowledge
7025

Knowledge of how AI solutions integrate with cloud or other IT infrastructure.

Knowledge
7028

Knowledge of how to automate development, testing, security, and deployment of AI/machine learning-enabled software to the DoD.

Knowledge
7029

Knowledge of how to collect, store, and monitor data.

Knowledge
7030

Knowledge of how to deploy test infrastructures with AI systems.

Knowledge
7034

Knowledge of interactions and integration of DataOps, MLOps, and DevSecOps in AI.

Knowledge
7036

Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government.

Knowledge
7037

Knowledge of machine learning operations (MLOps) processes and best practices.

Knowledge
7038

Knowledge of metrics to evaluate the effectiveness of machine learning models.

Knowledge
7041

Knowledge of remedies against unintended bias in AI solutions.

Knowledge
7044

Knowledge of testing, evaluation, validation, and verification (T&E V&V) tools and procedures to ensure systems are working as intended.

Knowledge
7045

Knowledge of the AI lifecycle.

Knowledge
7048

Knowledge of the benefits and limitations of AI capabilities.

Knowledge
7051

Knowledge of the possible impacts of machine learning blind spots and edge cases.

Knowledge
7053

Knowledge of the user experience (e.g., decision making, user design, and human-computer interaction) as it relates to AI systems.

Knowledge
7054

Knowledge of tools for testing the robustness and resilience of AI products and solutions.

Knowledge
7065

Skill in explaining AI concepts and terminology.

Skill
7067

Skill in identifying low-probability, high-impact risks in machine learning training data sets.

Skill
7069

Skill in identifying risk over the lifespan of an AI solution.

Skill
7070

Skill in integrating AI Test & Evaluation frameworks into test strategies for specific projects.

Skill
7075

Skill in testing and evaluating machine learning algorithms or AI solutions.

Skill
7076

Skill in testing for bias in data sets and AI system outputs, as well as determining whether historically or often underrepresented and marginalized groups are properly represented in the training, testing, and validation data sets and in AI system outputs.

Skill
7077

Skill in translating operational requirements for AI systems into testing requirements.

Skill
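
As an illustration of the kind of measurement KSAT 5873 describes (sensitivity, specificity, and confidence levels), the following minimal Python sketch computes both rates for a binary classifier and attaches a Wilson score interval to each. The function names, the 0/1 label encoding, and the default 95% level (z = 1.96) are assumptions made for this example; they are not prescribed by the work role.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


def sensitivity_specificity(y_true, y_pred):
    """Return sensitivity and specificity with Wilson intervals.

    Labels are assumed to be encoded as 1 (positive) and 0 (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # wilson_interval raises ValueError if a class has no examples.
    sens_ci = wilson_interval(tp, tp + fn)
    spec_ci = wilson_interval(tn, tn + fp)
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": sens_ci,
        "specificity": tn / (tn + fp),
        "specificity_ci": spec_ci,
    }


if __name__ == "__main__":
    # Hypothetical test labels and model predictions.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    for name, value in sensitivity_specificity(y_true, y_pred).items():
        print(name, value)
```

Reporting an interval alongside each point estimate makes small test sets visible: with only a handful of positive examples, the interval around sensitivity is wide, which is itself a finding worth documenting in a test report.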

Additional KSATs

KSAT ID Description KSAT
40

Knowledge of the organization’s evaluation and validation requirements.

Knowledge
765B

Perform AI architecture security reviews, identify gaps, and develop a risk management plan to address issues.

Task
942

Knowledge of the organization’s core business/mission processes.

Knowledge
1133

Knowledge of service management concepts for networks and related standards (e.g., Information Technology Infrastructure Library, current version [ITIL]).

Knowledge
5850

Assist integrated project teams to identify, curate, and manage data.

Task
5889

Identify and submit exemplary AI use cases, best practices, failure modes, and risk mitigation strategies, including after-action reports.

Task
7012

Knowledge of current test standards and safety standards that are applicable to AI (e.g., MIL-STD-882E, DO-178C, ISO 26262).

Knowledge
7040

Knowledge of Protected Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability considerations for AI solutions.

Knowledge