Data Scientist
Uncovers and explains actionable insights from data by combining the scientific method, math and statistics, specialized programming, advanced analytics, AI, and storytelling.
Core KSATs
KSAT ID | Description | KSAT |
---|---|---|
21A | Knowledge of statistical/machine learning algorithms. | Knowledge |
22 | * Knowledge of computer networking concepts and protocols, and network security methodologies. | Knowledge |
75A | Knowledge of mathematics, including logarithms, trigonometry, linear algebra, calculus, statistics, and operational analysis. | Knowledge |
102 | Knowledge of programming language structures and logic. | Knowledge |
108 | * Knowledge of risk management processes (e.g., methods for assessing and mitigating risk). | Knowledge |
166 | Skill in conducting queries and developing algorithms to analyze data structures. | Skill |
172 | Skill in creating and utilizing mathematical or statistical models. | Skill |
1120 | Ability to interpret and incorporate data from multiple tool sources. | Ability |
1157 | * Knowledge of national and international laws, regulations, policies, and ethics as they relate to cybersecurity. | Knowledge |
1158 | * Knowledge of cybersecurity principles. | Knowledge |
1159 | * Knowledge of cyber threats and vulnerabilities. | Knowledge |
3080 | Ability to use and understand complex mathematical concepts (e.g., discrete math). | Ability |
3756 | Skill in developing or recommending analytic approaches or solutions to problems and situations for which information is incomplete or for which no precedent exists. | Skill |
5030 | Analyze data sources to provide actionable recommendations. | Task |
5120 | Conduct hypothesis testing using statistical processes. | Task |
5550 | Program custom algorithms. | Task |
5640 | Utilize technical documentation or resources to implement a new mathematical, data science, or computer science method. | Task |
5853 | Build predictive, prescriptive, or descriptive models in collaboration with stakeholders. | Task |
5906 | Plan and conduct complex analytical, mathematical, and statistical research that informs operational requirements. | Task |
5907 | Plan, coordinate, and execute complex studies using advanced data modeling techniques and procedures, data trend analysis, and data algorithms. | Task |
5924 | Train and evaluate machine learning models. | Task |
5927 | Write and document reproducible code. | Task |
6050 | Ability to build complex data structures and high-level programming languages. | Ability |
6060 | Ability to collect, verify, and validate test data. | Ability |
6120 | Ability to dissect a problem and examine the interrelationships between data that may appear unrelated. | Ability |
6490 | Skill in assessing the predictive power and subsequent generalizability of a model. | Skill |
6570 | Skill in identifying hidden patterns or relationships. | Skill |
6651 | Skill in Regression Analysis (e.g., Hierarchical Stepwise, Generalized Linear Model, Ordinary Least Squares, Tree-Based Methods, Logistic). | Skill |
6750 | Skill in using outlier identification and removal techniques. | Skill |
6760 | Skill in writing scripts using R, Python, PIG, HIVE, SQL, etc. | Skill |
6790A | Utilize open source languages, as appropriate, and apply quantitative techniques (e.g., descriptive and inferential statistics, sampling, experimental design, parametric and non-parametric tests of difference, ordinary least squares regression, general line). | Task |
6900 | * Knowledge of specific operational impacts of cybersecurity lapses. | Knowledge |
6935 | * Knowledge of cloud computing service models Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS). | Knowledge |
6938 | * Knowledge of cloud computing deployment models in private, public, and hybrid environment and the difference between on-premises and off-premises environments. | Knowledge |
7002 | Assist integrated project teams identify, curate, and manage test data. | Task |
7029 | Knowledge of how to collect, store, and monitor data. | Knowledge |
7071 | Skill in labeling data to make it more discoverable and understandable. | Skill |
Additional KSATs
KSAT ID | Description | KSAT |
---|---|---|
35 | Knowledge of digital rights management. | Knowledge |
506 | Design, develop, and modify software systems, using scientific analysis and mathematical models to predict and measure outcome and consequences of design. | Task |
942 | Knowledge of the organization’s core business/mission processes. | Knowledge |
1034A | Knowledge of Personally Identifiable Information (PII) data security standards. | Knowledge |
1034C | Knowledge of Personal Health Information (PHI) data security standards. | Knowledge |
5854 | Collaborate with appropriate personnel to address Personal Health Information (PHI), Personally Identifiable Information (PII), and other data privacy and data reusability concerns for AI solutions. | Task |
5884 | Evaluate energy implications (graphical processing unit, tensor processing unit, etc.) when designing AI solutions. | Task |
5896 | Maintain current knowledge of advancements in DoD AI Ethical Principles and Responsible AI. | Task |
5907 | Plan, coordinate, and execute complex studies using advanced data modeling techniques and procedures, data trend analysis, and data algorithms. | Task |
6290 | Knowledge of how to leverage government research and development centers, think tanks, academic research, and industry systems. | Knowledge |
6651 | Skill in Regression Analysis (e.g., Hierarchical Stepwise, Generalized Linear Model, Ordinary Least Squares, Tree-Based Methods, Logistic). | Skill |
7020 | Knowledge of DoD AI Ethical Principles (e.g., responsible, equitable, traceable, reliable, and governable). | Knowledge |
7036 | Knowledge of laws, regulations, and policies related to AI, data security/privacy, and use of publicly procured data for government. | Knowledge |
7078 | Skill in using deep learning approaches to build machine learning models. | Skill |