CHI 2019 Evolution of UX Knowledge Stack Exchange Vocabulary


By Yubo Kou (Florida State University) and Colin M. Gray (Purdue University)

This dataset is based on data collected from UX Stack Exchange between 2008 and 2017 and analyzed for the following publication at CHI 2019: A Practice-Led Account of the Conceptual Evolution of UX Knowledge...


Version 1.0, published on 03 Jan 2019. doi:10.4231/WKW4-8Q48. Content may change until committed to the archive on 03 Feb 2019.

Licensed under CC0 1.0 Universal


Our study site is the user experience (UX) community supported by Stack Exchange. Stack Exchange is a large network of 170 Q&A communities, including Stack Overflow, which focuses on programming and is one of the Q&A sites most widely studied by researchers. The UX community we study belongs to this network, and its first question was asked on September 22nd, 2008. We used the official Stack Exchange API to collect Q&A communication among UX practitioners from September 2008 to September 2017. The resulting dataset contains 21,216 questions, 56,486 answers, and 9,936 unique authors who had written at least one question or answer.
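The collection step can be illustrated with a minimal sketch of paged requests against the public Stack Exchange API. This is not the authors' collection script. The API version (2.3), the `ux` site key, the page size, and the `fetch_json` helper are assumptions made for illustration. Answers could presumably be gathered the same way through the `/answers` endpoint.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.3"  # API version is an assumption


def epoch(year, month, day):
    """Convert a UTC calendar date to the Unix timestamp the API expects."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def questions_url(page, fromdate, todate, site="ux", pagesize=100):
    """Build one paged request URL for the /questions endpoint."""
    params = {
        "site": site,
        "fromdate": fromdate,
        "todate": todate,
        "order": "asc",
        "sort": "creation",
        "pagesize": pagesize,
        "page": page,
    }
    return f"{API_ROOT}/questions?{urlencode(params)}"


def collect_questions(fetch_json, fromdate, todate):
    """Follow the API's `has_more` flag until every page has been read.

    `fetch_json` is a hypothetical callable that takes a URL and returns the
    decoded JSON response; injecting it keeps this sketch free of network code.
    """
    items, page = [], 1
    while True:
        payload = fetch_json(questions_url(page, fromdate, todate))
        items.extend(payload.get("items", []))
        if not payload.get("has_more"):
            return items
        page += 1
```

In practice `fetch_json` would wrap an HTTP client and respect the API's rate limits. The exact start and end days passed to `epoch` are left to the caller, since the description only gives the months.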

Our study asks: What concepts and knowledge categories characterize this body of UX knowledge? We treated each identified noun as a candidate UX concept. We used the Stanford Log-linear Part-Of-Speech Tagger to process all text in our dataset and generated a set of UX concept candidates, each listed with its frequency in the dataset. The initial candidate list contained 53,281 words, with frequencies following a long-tailed distribution (max = 90,938, min = 1, avg = 37.5, std = 557.5).
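The candidate-generation step can be sketched as follows, assuming the tagger output is already available as (word, tag) pairs with Penn Treebank tags. Running the Stanford tagger itself is outside this sketch. Lowercasing without lemmatization and the use of the population standard deviation are assumptions, and the input is a toy example, not dataset content.

```python
from collections import Counter
from statistics import mean, pstdev

NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}  # Penn Treebank noun tags


def noun_frequencies(tagged_tokens):
    """Count lowercased tokens whose POS tag marks them as nouns."""
    return Counter(
        word.lower() for word, tag in tagged_tokens if tag in NOUN_TAGS
    )


def summarize(frequencies):
    """Return the kind of summary statistics reported for the candidate list."""
    counts = list(frequencies.values())
    return {
        "candidates": len(counts),
        "max": max(counts),
        "min": min(counts),
        "avg": mean(counts),
        "std": pstdev(counts),  # population std is an assumption
    }


# Toy input standing in for tagger output; not taken from the dataset.
tagged = [
    ("The", "DT"), ("user", "NN"), ("clicks", "VBZ"), ("the", "DT"),
    ("button", "NN"), ("Users", "NNS"), ("like", "VBP"), ("the", "DT"),
    ("user", "NN"), ("menu", "NN"),
]
freqs = noun_frequencies(tagged)
stats = summarize(freqs)
```

Because no lemmatization is applied here, "user" and "users" count as separate candidates; whether the original analysis merged such forms is not stated in the description.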

The criterion we used to delimit a UX concept was that a word must have an unambiguous meaning that is relevant to UX design. For example, “time” is not a UX concept, because it can be used either as a generic everyday word or to refer to the temporal quality of design. “User,” on the other hand, has the unambiguous meaning of a person who uses a designed thing. Two coders with experience and expertise in researching and teaching UX double-coded all the candidates. For words on which the two coders disagreed, they went back to the actual Q&A communication to check whether the word was used with only one meaning. Through this process, we generated a set of 602 UX concepts that we consider a representation of UX knowledge.
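The double-coding step can be pictured with a small sketch that separates candidates both coders accepted, candidates both rejected, and candidates that need a second look in the original Q&A text. The data layout (one dictionary per coder) and the word "layout" with its split decision are invented for illustration; only "user" and "time" come from the description above.

```python
def compare_codings(coder_a, coder_b):
    """Split candidates into agreed concepts, agreed rejections, and disputes.

    Each argument maps a candidate word to True (UX concept) or False.
    Only words coded by both coders are compared.
    """
    agreed_concepts, agreed_rejections, disputed = set(), set(), set()
    for word in coder_a.keys() & coder_b.keys():
        a, b = coder_a[word], coder_b[word]
        if a != b:
            disputed.add(word)  # resolved later by reading the Q&A text
        elif a:
            agreed_concepts.add(word)
        else:
            agreed_rejections.add(word)
    return agreed_concepts, agreed_rejections, disputed


# Hypothetical decisions; "layout" is an invented example of a disagreement.
coder_a = {"user": True, "time": False, "layout": True}
coder_b = {"user": True, "time": False, "layout": False}
concepts, rejections, disputed = compare_codings(coder_a, coder_b)
```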

We used qualitative content analysis to code all 602 concepts. The unit of analysis was the individual concept, and the coding was performed inductively. The same two coders each assigned a code to every concept independently. They then discussed and compared their lists of codes while moving back and forth between codes and concepts. With a consolidated code list, the two coders engaged in further abstraction, drawing on their collective interpretations of the codes. In the end, we generated six primary categories. Each primary category contains one or more secondary categories.



The Purdue University Research Repository (PURR) is a university core research facility provided by the Purdue University Libraries, the Office of the Executive Vice President for Research and Partnerships, and Information Technology at Purdue (ITaP).