The field of data science is growing at breakneck speed. More industries are looking to data science professionals to find, compile, explore, and model data from numerous areas of interest. Currently, there are over 130,000 open data science positions. Professional training programs and institutions of higher education are unable to train enough individuals to fill these positions. Initial research has begun reviewing some of the available data, including determining the knowledge, skills, and abilities (KSAs) employers seek in data science candidates and what preparation institutions provide. Given the general problem of too many data science positions and not enough professionals to fill them, this research intends to take a grounded theory approach to evaluating the data sources, looking for emergent theories that may provide additional understanding of how to solve that overarching problem.
These dictionaries are ready to be used with Stanford CoreNLP to classify data scientist, statistics, and technology phrases.
This is a collection of Data Scientist job postings from May 2018 and curriculum data from institutions of higher education that advertise a "Data Science Initiative."
This publication comprises the source code for various text mining utilities written against the Stanford CoreNLP project, along with scripts that plot the formatted output from those programs.
Cite this work
Researchers should cite this work as follows:
- Seliger, C. S. (2018). Knowledge, Skills, and Abilities (KSAs) of Data Science Professionals. Purdue University Research Repository. doi:10.4231/R76971TF