AI Test & Evaluation Specialist

Work Role ID: 672 (NIST: N/A)
Workforce Element: AI/Data

Performs testing, evaluation, verification, and validation of AI solutions to ensure they are developed to be, and remain, robust, resilient, responsible, secure, and trustworthy; communicates results and concerns to leadership.


Items denoted by an asterisk (*) are CORE KSATs for every Work Role; other CORE KSATs vary by Work Role.

Core KSATs

KSAT ID / Description / KSAT Type
22

* Knowledge of computer networking concepts and protocols, and network security methodologies.

Knowledge
108

* Knowledge of risk management processes (e.g., methods for assessing and mitigating risk).

Knowledge
182

Skill in determining an appropriate level of test rigor for a given system.

Skill
508

Determine level of assurance of developed capabilities based on test results.

Task
550

Develop test plans to address specifications and requirements.

Task
694

Make recommendations based on test results.

Task
858A

Test, evaluate, and verify hardware and/or software to determine compliance with defined specifications and requirements.

Task
858B

Record and manage test data.

Task
1157

* Knowledge of national and international laws, regulations, policies, and ethics as they relate to cybersecurity.

Knowledge
1158

* Knowledge of cybersecurity principles.

Knowledge
1159

* Knowledge of cyber threats and vulnerabilities.

Knowledge
5120

Conduct hypothesis testing using statistical processes.

Task
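Task 5120 can be made concrete with a short, self-contained sketch. Everything below is illustrative: the per-example correctness data is invented, and a permutation test is only one of several statistical processes a T&E specialist might choose for comparing model variants.

```python
import random
import statistics

def permutation_test(scores_a, scores_b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference in mean per-example score."""
    rng = random.Random(seed)
    observed = statistics.mean(scores_a) - statistics.mean(scores_b)
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            hits += 1
    return observed, hits / n_resamples

# Hypothetical per-example correctness (1 = correct) for two model versions.
model_a = [1] * 88 + [0] * 12   # 88% accuracy on a 100-example test set
model_b = [1] * 80 + [0] * 20   # 80% accuracy on a 100-example test set
diff, p_value = permutation_test(model_a, model_b)
print(f"observed accuracy difference = {diff:.2f}, p = {p_value:.3f}")
```

A small p-value would support reporting the accuracy gap as real rather than sampling noise; the resample count and significance threshold are test-design choices to record in the test plan.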
5848

Assess technical risks and limitations of planned tests on AI systems.

Task
5851

Build assurance cases for AI systems that support the needs of different stakeholders (e.g., acquisition community, commanders, and operators).

Task
5858

Conduct AI risk assessments to ensure models and/or other solutions are performing as designed.

Task
5866

Create or customize existing Test and Evaluation Master Plans (TEMPs) for AI systems.

Task
5873

Determine methods and metrics for quantitative and qualitative measurement of AI risks so that sensitivity, specificity, likelihood, confidence levels, and other metrics are identified, documented, and applied.

Task
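As an illustration of task 5873, the sketch below computes sensitivity and specificity from hypothetical confusion-matrix counts and attaches a Wilson score interval to document the confidence level. The counts and the choice of interval are assumptions for illustration, not prescribed by the framework.

```python
import math

def sensitivity_specificity(tp, fn, tn, fp):
    """Point estimates from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for a proportion (documents the confidence level)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical detector results: 90 true positives, 10 false negatives,
# 95 true negatives, 5 false positives.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=95, fp=5)
lo, hi = wilson_interval(90, 100)
print(f"sensitivity={sens:.2f} (95% CI {lo:.2f}-{hi:.2f}), specificity={spec:.2f}")
```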
5876

Develop machine learning code testing and validation procedures.

Task
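Task 5876 can be illustrated with a minimal sketch of machine learning code validation checks. The `classify` function is a hypothetical stand-in model, and the three checks (output range, determinism, monotonicity) are common validation patterns rather than an exhaustive procedure.

```python
import math

def classify(features):
    """Hypothetical stand-in model: a fixed linear score squashed to [0, 1]."""
    score = 0.8 * features[0] - 0.3 * features[1]
    return 1.0 / (1.0 + math.exp(-score))

def test_output_range():
    # Probabilities must stay in [0, 1] across a sweep of inputs.
    for x in (-5.0, 0.0, 5.0):
        for y in (-5.0, 0.0, 5.0):
            assert 0.0 <= classify((x, y)) <= 1.0

def test_determinism():
    # Same input, same output -- catches hidden randomness in the pipeline.
    assert classify((1.0, 2.0)) == classify((1.0, 2.0))

def test_monotonicity():
    # Raising the positively-weighted feature must not lower the score.
    assert classify((2.0, 1.0)) >= classify((1.0, 1.0))

for check in (test_output_range, test_determinism, test_monotonicity):
    check()
print("all validation checks passed")
```

In practice such checks would live in a test suite run automatically on every model or code change, so regressions surface before deployment.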
5877

Develop possible solutions for technical risks and limitations of planned tests on AI solutions.

Task
5896

Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI.

Task
5901

Measure the effectiveness, security, robustness, and trustworthiness of AI tools.

Task
5910

Provide quality assurance of AI products throughout their lifecycle.

Task
5914

Report test and evaluation deficiencies and possible solutions to appropriate personnel.

Task
5916

Select and use the appropriate models and prediction methods for evaluating AI performance.

Task
5919

Test AI tools against adversarial attacks in operationally realistic environments.

Task
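A hedged sketch of the adversarial-testing idea in task 5919: for a hypothetical linear scorer, a fast-gradient-sign-style perturbation has a closed form, which makes the robustness check easy to demonstrate. Real operational testing would target actual models under realistic threat models; the weights and inputs below are invented.

```python
import math

# Hypothetical linear scorer: w . x + b, predicted label = (score > 0).
W = [0.8, -0.3]
B = 0.1

def score(x):
    return sum(w * xi for w, xi in zip(W, x)) + B

def fgsm_perturb(x, eps):
    """Fast-gradient-sign-style worst case for a linear scorer:
    push each feature by eps against the current decision."""
    direction = -1.0 if score(x) > 0 else 1.0
    return [xi + direction * eps * math.copysign(1.0, w) for xi, w in zip(x, W)]

def robust_at(x, eps):
    """True if the predicted label survives the eps-bounded perturbation."""
    return (score(x) > 0) == (score(fgsm_perturb(x, eps)) > 0)

x = [1.0, 0.5]               # score = 0.8 - 0.15 + 0.1 = 0.75 > 0
print(robust_at(x, 0.1))     # small budget: score drops to 0.64, label holds
print(robust_at(x, 1.0))     # large budget: score drops to -0.35, label flips
```

For a linear scorer the worst-case score shift is exactly eps times the L1 norm of the weights, which is why the two budgets above land on opposite sides of the decision boundary.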
5920

Test components to ensure they work as intended in a variety of scenarios for all aspects of the AI application.

Task
5921

Test how users interact with AI solutions.

Task
5922

Test the reliability, functionality, security, and compatibility of AI tools within systems.

Task
5923

Test the trustworthiness of AI solutions.

Task
5926

Use models and other methods for evaluating AI performance.

Task
6060

Ability to collect, verify, and validate test data.

Ability
6170

Ability to translate data and test results into evaluative conclusions.

Ability
6311

Knowledge of machine learning theory and principles.

Knowledge
6490

Skill in assessing the predictive power and subsequent generalizability of a model.

Skill
6630

Skill in preparing Test & Evaluation reports.

Skill
6641

Skill in providing Test & Evaluation resource estimates.

Skill
6900

* Knowledge of specific operational impacts of cybersecurity lapses.

Knowledge
6935

* Knowledge of cloud computing service models: Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS).

Knowledge
6938

* Knowledge of cloud computing deployment models in private, public, and hybrid environments, and the difference between on-premises and off-premises environments.

Knowledge
7003

Knowledge of AI security risks, threats, and vulnerabilities and potential risk mitigation solutions.

Knowledge
7004

Knowledge of AI Test & Evaluation frameworks.

Knowledge
7006

Knowledge of best practices from industry and academia in test design activities for verification and validation of AI and machine learning systems.

Knowledge
7009

Knowledge of coding and scripting in languages that support AI development and use.

Knowledge
7020

Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable).

Knowledge
7024

Knowledge of how AI is developed and operated.

Knowledge
7025

Knowledge of how AI solutions integrate with cloud or other IT infrastructure.

Knowledge
7028

Knowledge of how to automate development, testing, security, and deployment of AI/machine learning-enabled software to the DoD.

Knowledge
7029

Knowledge of how to collect, store, and monitor data.

Knowledge
7030

Knowledge of how to deploy test infrastructures with AI systems.

Knowledge
7034

Knowledge of interactions and integration of DataOps, MLOps, and DevSecOps in AI.

Knowledge
7036

Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government.

Knowledge
7037

Knowledge of machine learning operations (MLOps) processes and best practices.

Knowledge
7038

Knowledge of metrics to evaluate the effectiveness of machine learning models.

Knowledge
7041

Knowledge of remedies against unintended bias in AI solutions.

Knowledge
7044

Knowledge of testing, evaluation, validation, and verification (T&E V&V) tools and procedures to ensure systems are working as intended.

Knowledge
7045

Knowledge of the AI lifecycle.

Knowledge
7048

Knowledge of the benefits and limitations of AI capabilities.

Knowledge
7051

Knowledge of the possible impacts of machine learning blind spots and edge cases.

Knowledge
7053

Knowledge of the user experience (e.g., decision making, user design, and human-computer interaction) as it relates to AI systems.

Knowledge
7054

Knowledge of tools for testing the robustness and resilience of AI products and solutions.

Knowledge
7065

Skill in explaining AI concepts and terminology.

Skill
7067

Skill in identifying low-probability, high-impact risks in machine learning training data sets.

Skill
7069

Skill in identifying risk over the lifespan of an AI solution.

Skill
7070

Skill in integrating AI Test & Evaluation frameworks into test strategies for specific projects.

Skill
7075

Skill in testing and evaluating machine learning algorithms or AI solutions.

Skill
7076

Skill in testing for bias in data sets and AI system outputs, as well as determining whether historically underrepresented and marginalized groups are properly represented in the training, testing, and validation data sets and in AI system outputs.

Skill
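Skill 7076 can be sketched with two simple measurements on a hypothetical evaluation set: subgroup representation rates in the data, and per-subgroup positive-decision rates (a demographic parity check) on system outputs. The record schema, subgroup labels, and counts are assumptions for illustration.

```python
from collections import Counter

def representation_rates(records, group_key):
    """Share of each subgroup in a data set (hypothetical record schema)."""
    counts = Counter(r[group_key] for r in records)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def selection_rates(records, group_key, outcome_key):
    """Positive-outcome rate per subgroup, for a demographic parity check."""
    by_group = {}
    for r in records:
        by_group.setdefault(r[group_key], []).append(r[outcome_key])
    return {g: sum(v) / len(v) for g, v in by_group.items()}

# Hypothetical evaluation records: subgroup label and model decision (1 = positive).
data = (
    [{"group": "A", "decision": 1}] * 60 + [{"group": "A", "decision": 0}] * 40
    + [{"group": "B", "decision": 1}] * 30 + [{"group": "B", "decision": 0}] * 70
)
rates = selection_rates(data, "group", "decision")
parity_gap = abs(rates["A"] - rates["B"])
print(f"selection rates: {rates}, demographic parity gap: {parity_gap:.2f}")
```

A large parity gap does not by itself prove unfair behavior, but it flags where deeper analysis (base rates, error-rate balance, data provenance) is warranted before fielding.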
7077

Skill in translating operational requirements for AI systems into testing requirements.

Skill

Additional KSATs

KSAT ID / Description / KSAT Type
40

Knowledge of organization’s evaluation and validation requirements.

Knowledge
765B

Perform AI architecture security reviews, identify gaps, and develop a risk management plan to address issues.

Task
942

Knowledge of the organization’s core business/mission processes.

Knowledge
1133

Knowledge of service management concepts for networks and related standards (e.g., Information Technology Infrastructure Library, current version [ITIL]).

Knowledge
5850

Assist integrated project teams to identify, curate, and manage data.

Task
5889

Identify and submit exemplary AI use cases, best practices, failure modes, and risk mitigation strategies, including after-action reports.

Task
7012

Knowledge of current test standards and safety standards that are applicable to AI (e.g., MIL-STD-882E, DO-178C, ISO 26262).

Knowledge
7040

Knowledge of Personal Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability considerations for AI solutions.

Knowledge