Data Scientist

Work Role ID: 423 (NIST: N/A)
Workforce Element: Data/AI

Uncovers and explains actionable insights from data by combining scientific method, math and statistics, specialized programming, advanced analytics, AI, and storytelling.


Items denoted by a * are CORE KSATs for every Work Role, while other CORE KSATs vary by Work Role.

Core KSATs

KSAT ID | Description | KSAT Type
21A | Knowledge of statistical/machine learning algorithms. | Knowledge
22 | * Knowledge of computer networking concepts and protocols, and network security methodologies. | Knowledge
75A | Knowledge of mathematics, including logarithms, trigonometry, linear algebra, calculus, statistics, and operational analysis. | Knowledge
102 | Knowledge of programming language structures and logic. | Knowledge
108 | * Knowledge of risk management processes (e.g., methods for assessing and mitigating risk). | Knowledge
166 | Skill in conducting queries and developing algorithms to analyze data structures. | Skill
172 | Skill in creating and utilizing mathematical or statistical models. | Skill
1120 | Ability to interpret and incorporate data from multiple tool sources. | Ability
1157 | * Knowledge of national and international laws, regulations, policies, and ethics as they relate to cybersecurity. | Knowledge
1158 | * Knowledge of cybersecurity principles. | Knowledge
1159 | * Knowledge of cyber threats and vulnerabilities. | Knowledge
3080 | Ability to use and understand complex mathematical concepts (e.g., discrete math). | Ability
3756 | Skill in developing or recommending analytic approaches or solutions to problems and situations for which information is incomplete or for which no precedent exists. | Skill
5030 | Analyze data sources to provide actionable recommendations. | Task
5120 | Conduct hypothesis testing using statistical processes. | Task
5550 | Program custom algorithms. | Task
5640 | Utilize technical documentation or resources to implement a new mathematical, data science, or computer science method. | Task
5853 | Build predictive, prescriptive, or descriptive models in collaboration with stakeholders. | Task
5906 | Plan and conduct complex analytical, mathematical, and statistical research that informs operational requirements. | Task
5907 | Plan, coordinate, and execute complex studies using advanced data modeling techniques and procedures, data trend analysis, and data algorithms. | Task
5924 | Train and evaluate machine learning models. | Task
5927 | Write and document reproducible code. | Task
6050 | Ability to build complex data structures and high-level programming languages. | Ability
6060 | Ability to collect, verify, and validate test data. | Ability
6120 | Ability to dissect a problem and examine the interrelationships between data that may appear unrelated. | Ability
6490 | Skill in assessing the predictive power and subsequent generalizability of a model. | Skill
6570 | Skill in identifying hidden patterns or relationships. | Skill
6651 | Skill in Regression Analysis (e.g., Hierarchical Stepwise, Generalized Linear Model, Ordinary Least Squares, Tree-Based Methods, Logistic). | Skill
6750 | Skill in using outlier identification and removal techniques. | Skill
6760 | Skill in writing scripts using R, Python, PIG, HIVE, SQL, etc. | Skill
6790A | Utilize open source languages, as appropriate, and apply quantitative techniques (e.g., descriptive and inferential statistics, sampling, experimental design, parametric and non-parametric tests of difference, ordinary least squares regression, general line). | Task
6900 | * Knowledge of specific operational impacts of cybersecurity lapses. | Knowledge
6935 | * Knowledge of cloud computing service models: Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS). | Knowledge
6938 | * Knowledge of cloud computing deployment models in private, public, and hybrid environments and the difference between on-premises and off-premises environments. | Knowledge
7002 | Assist integrated project teams in identifying, curating, and managing test data. | Task
7029 | Knowledge of how to collect, store, and monitor data. | Knowledge
7071 | Skill in labeling data to make it more discoverable and understandable. | Skill

Additional KSATs

KSAT ID | Description | KSAT Type
35 | Knowledge of digital rights management. | Knowledge
506 | Design, develop, and modify software systems, using scientific analysis and mathematical models to predict and measure outcomes and consequences of design. | Task
942 | Knowledge of the organization’s core business/mission processes. | Knowledge
1034A | Knowledge of Personally Identifiable Information (PII) data security standards. | Knowledge
1034C | Knowledge of Personal Health Information (PHI) data security standards. | Knowledge
5854 | Collaborate with appropriate personnel to address Personal Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability concerns for AI solutions. | Task
5884 | Evaluate energy implications (graphics processing unit, tensor processing unit, etc.) when designing AI solutions. | Task
5896 | Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI. | Task
5907 | Plan, coordinate, and execute complex studies using advanced data modeling techniques and procedures, data trend analysis, and data algorithms. | Task
6290 | Knowledge of how to leverage government research and development centers, think tanks, academic research, and industry systems. | Knowledge
6651 | Skill in Regression Analysis (e.g., Hierarchical Stepwise, Generalized Linear Model, Ordinary Least Squares, Tree-Based Methods, Logistic). | Skill
7020 | Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable). | Knowledge
7036 | Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government. | Knowledge
7078 | Skill in using deep learning approaches to build machine learning models. | Skill