AI Test & Evaluation Specialist
Performs testing, evaluation, verification, and validation on AI solutions to ensure they are developed to be and remain robust, resilient, responsible, secure, and trustworthy; and communicates results and concerns to leadership.
Core KSATs
KSAT ID | Description | KSAT |
---|---|---|
22 | * Knowledge of computer networking concepts and protocols, and network security methodologies. | Knowledge |
108 | * Knowledge of risk management processes (e.g., methods for assessing and mitigating risk). | Knowledge |
182 | Skill in determining an appropriate level of test rigor for a given system. | Skill |
508 | Determine level of assurance of developed capabilities based on test results. | Task |
550 | Develop test plans to address specifications and requirements. | Task |
694 | Make recommendations based on test results. | Task |
858A | Test, evaluate, and verify hardware and/or software to determine compliance with defined specifications and requirements. | Task |
858B | Record and manage test data. | Task |
1157 | * Knowledge of national and international laws, regulations, policies, and ethics as they relate to cybersecurity. | Knowledge |
1158 | * Knowledge of cybersecurity principles. | Knowledge |
1159 | * Knowledge of cyber threats and vulnerabilities. | Knowledge |
5120 | Conduct hypothesis testing using statistical processes. | Task |
5848 | Assess technical risks and limitations of planned tests on AI systems. | Task |
5851 | Build assurance cases for AI systems that support the needs of different stakeholders (e.g., acquisition community, commanders, and operators). | Task |
5858 | Conduct AI risk assessments to ensure models and/or other solutions are performing as designed. | Task |
5866 | Create or customize existing Test and Evaluation Master Plans (TEMPs) for AI systems. | Task |
5873 | Determine methods and metrics for quantitative and qualitative measurement of AI risks so that sensitivity, specificity, likelihood, confidence levels, and other metrics are identified, documented, and applied. | Task |
5876 | Develop machine learning code testing and validation procedures. | Task |
5877 | Develop possible solutions for technical risks and limitations of planned tests on AI solutions. | Task |
5896 | Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI. | Task |
5901 | Measure the effectiveness, security, robustness, and trustworthiness of AI tools. | Task |
5910 | Provide quality assurance of AI products throughout their lifecycle. | Task |
5914 | Report test and evaluation deficiencies and possible solutions to appropriate personnel. | Task |
5916 | Select and use the appropriate models and prediction methods for evaluating AI performance. | Task |
5919 | Test AI tools against adversarial attacks in operationally realistic environments. | Task |
5920 | Test components to ensure they work as intended in a variety of scenarios for all aspects of the AI application. | Task |
5921 | Test how users interact with AI solutions. | Task |
5922 | Test the reliability, functionality, security, and compatibility of AI tools within systems. | Task |
5923 | Test the trustworthiness of AI solutions. | Task |
5926 | Use models and other methods for evaluating AI performance. | Task |
6060 | Ability to collect, verify, and validate test data. | Ability |
6170 | Ability to translate data and test results into evaluative conclusions. | Ability |
6311 | Knowledge of machine learning theory and principles. | Knowledge |
6490 | Skill in assessing the predictive power and subsequent generalizability of a model. | Skill |
6630 | Skill in preparing Test & Evaluation reports. | Skill |
6641 | Skill in providing Test & Evaluation resource estimates. | Skill |
6900 | * Knowledge of specific operational impacts of cybersecurity lapses. | Knowledge |
6935 | * Knowledge of cloud computing service models Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS). | Knowledge |
6938 | * Knowledge of cloud computing deployment models in private, public, and hybrid environments and the difference between on-premises and off-premises environments. | Knowledge |
7003 | Knowledge of AI security risks, threats, and vulnerabilities and potential risk mitigation solutions. | Knowledge |
7004 | Knowledge of AI Test & Evaluation frameworks. | Knowledge |
7006 | Knowledge of best practices from industry and academia in test design activities for verification and validation of AI and machine learning systems. | Knowledge |
7009 | Knowledge of coding and scripting in languages that support AI development and use. | Knowledge |
7020 | Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable). | Knowledge |
7024 | Knowledge of how AI is developed and operated. | Knowledge |
7025 | Knowledge of how AI solutions integrate with cloud or other IT infrastructure. | Knowledge |
7028 | Knowledge of how to automate development, testing, security, and deployment of AI/machine learning-enabled software to the DoD. | Knowledge |
7029 | Knowledge of how to collect, store, and monitor data. | Knowledge |
7030 | Knowledge of how to deploy test infrastructures with AI systems. | Knowledge |
7034 | Knowledge of interactions and integration of DataOps, MLOps, and DevSecOps in AI. | Knowledge |
7036 | Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government. | Knowledge |
7037 | Knowledge of machine learning operations (MLOps) processes and best practices. | Knowledge |
7038 | Knowledge of metrics to evaluate the effectiveness of machine learning models. | Knowledge |
7041 | Knowledge of remedies against unintended bias in AI solutions. | Knowledge |
7044 | Knowledge of testing, evaluation, validation, and verification (T&E V&V) tools and procedures to ensure systems are working as intended. | Knowledge |
7045 | Knowledge of the AI lifecycle. | Knowledge |
7048 | Knowledge of the benefits and limitations of AI capabilities. | Knowledge |
7051 | Knowledge of the possible impacts of machine learning blind spots and edge cases. | Knowledge |
7053 | Knowledge of the user experience (e.g., decision making, user design, and human-computer interaction) as it relates to AI systems. | Knowledge |
7054 | Knowledge of tools for testing the robustness and resilience of AI products and solutions. | Knowledge |
7065 | Skill in explaining AI concepts and terminology. | Skill |
7067 | Skill in identifying low-probability, high-impact risks in machine learning training data sets. | Skill |
7069 | Skill in identifying risk over the lifespan of an AI solution. | Skill |
7070 | Skill in integrating AI Test & Evaluation frameworks into test strategies for specific projects. | Skill |
7075 | Skill in testing and evaluating machine learning algorithms or AI solutions. | Skill |
7076 | Skill in testing for bias in data sets and AI system outputs, as well as determining whether historically or often underrepresented and marginalized groups are properly represented in the training, testing, and validation data sets and AI system outputs. | Skill |
7077 | Skill in translating operational requirements for AI systems into testing requirements. | Skill |
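
A few of the core tasks above lend themselves to short illustrations. KSAT 5120 covers hypothesis testing using statistical processes; the sketch below compares per-example scores from a candidate model against a baseline with a Welch t-test. The score values, significance level, and choice of test are illustrative assumptions, not requirements of the framework.

```python
# Minimal sketch for KSAT 5120: hypothesis testing with a statistical process.
# The scores are synthetic stand-ins; a real evaluation would use recorded
# per-example results from the system under test and its baseline.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
baseline_scores = rng.normal(loc=0.78, scale=0.05, size=200)   # assumed baseline results
candidate_scores = rng.normal(loc=0.81, scale=0.05, size=200)  # assumed candidate results

alpha = 0.05  # significance level taken from a hypothetical test plan
t_stat, p_value = stats.ttest_ind(candidate_scores, baseline_scores, equal_var=False)

print(f"Welch t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: candidate performance differs from the baseline.")
else:
    print("Fail to reject H0: no statistically significant difference detected.")
```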
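
KSATs 5873 and 7038 call for defining and applying metrics such as sensitivity, specificity, and confidence levels. A minimal sketch follows, assuming a binary classification task; the labels, predictions, and use of Wilson intervals are placeholders for whatever the test plan actually specifies.

```python
# Minimal sketch for KSATs 5873/7038: sensitivity, specificity, and 95%
# confidence intervals from a confusion matrix. Labels and predictions are
# fabricated stand-ins for real evaluation data.
import numpy as np
from statsmodels.stats.proportion import proportion_confint

y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))

sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate

# Wilson score intervals on each proportion (one reasonable choice among several)
sens_lo, sens_hi = proportion_confint(tp, tp + fn, alpha=0.05, method="wilson")
spec_lo, spec_hi = proportion_confint(tn, tn + fp, alpha=0.05, method="wilson")

print(f"Sensitivity: {sensitivity:.2f}  95% CI [{sens_lo:.2f}, {sens_hi:.2f}]")
print(f"Specificity: {specificity:.2f}  95% CI [{spec_lo:.2f}, {spec_hi:.2f}]")
```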
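
KSAT 5876 concerns developing machine learning code testing and validation procedures. One common pattern is to express release gates and invariance checks as automated tests; the sketch below uses pytest with a toy scikit-learn model, and the accuracy gate, perturbation tolerance, and dataset are hypothetical project choices.

```python
# Minimal sketch for KSAT 5876: machine learning validation expressed as
# automated tests (run with `pytest`). The model, data, and thresholds are
# stand-ins for a real project's artifacts and release criteria.
import numpy as np
import pytest
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


@pytest.fixture(scope="module")
def trained_model_and_data():
    X, y = make_classification(n_samples=500, n_features=10, class_sep=2.0, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return model, X_te, y_te


def test_accuracy_meets_release_gate(trained_model_and_data):
    model, X_te, y_te = trained_model_and_data
    assert model.score(X_te, y_te) >= 0.80  # hypothetical gate from the test plan


def test_outputs_stable_under_tiny_perturbations(trained_model_and_data):
    model, X_te, _ = trained_model_and_data
    noise = np.random.default_rng(1).normal(scale=1e-6, size=X_te.shape)
    drift = np.abs(model.predict_proba(X_te) - model.predict_proba(X_te + noise))
    assert drift.max() < 1e-3  # simple robustness/invariance check
```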
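
KSAT 7076 includes determining whether historically underrepresented and marginalized groups are properly represented in data sets and system outputs. The sketch below computes group representation and a disparate-impact ratio over a toy data frame; the column names, grouping attribute, and 0.8 review threshold are assumptions, not policy.

```python
# Minimal sketch for KSAT 7076: representation and outcome-disparity checks.
# The data frame, grouping attribute, and 0.8 threshold are illustrative.
import pandas as pd

df = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "B", "B", "A", "B", "A"],
    "outcome": [ 1,   0,   1,   0,   0,   1,   0,   1,   0,   1 ],
})

# Share of each group in the evaluation set
representation = df["group"].value_counts(normalize=True)
print("Representation by group:")
print(representation)

# Favorable-outcome rate per group and the ratio of lowest to highest rate
rates = df.groupby("group")["outcome"].mean()
disparate_impact = rates.min() / rates.max()
print("Outcome rate by group:")
print(rates)
print(f"Disparate impact ratio: {disparate_impact:.2f}")

if disparate_impact < 0.8:  # common rule-of-thumb threshold, not a mandate
    print("Flag for review: outcome rates differ substantially across groups.")
```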
Additional KSATs
KSAT ID | Description | KSAT |
---|---|---|
40 | Knowledge of organization’s evaluation and validation requirements. | Knowledge |
765B | Perform AI architecture security reviews, identify gaps, and develop a risk management plan to address issues. | Task |
942 | Knowledge of the organization’s core business/mission processes. | Knowledge |
1133 | Knowledge of service management concepts for networks and related standards (e.g., Information Technology Infrastructure Library, current version [ITIL]). | Knowledge |
5850 | Assist integrated project teams to identify, curate, and manage data. | Task |
5889 | Identify and submit exemplary AI use cases, best practices, failure modes, and risk mitigation strategies, including after-action reports. | Task |
7012 | Knowledge of current test standards and safety standards that are applicable to AI (e.g., MIL-STD-882E, DO-178C, ISO 26262). | Knowledge |
7040 | Knowledge of Personal Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability considerations for AI solutions. | Knowledge |