Frequently Asked Questions
What is TheoryOn?
TheoryOn is an integrated theory development application that aims to reduce redundancy in behavioral science research. TheoryOn uses natural language processing and information extraction techniques to return constructs, construct relationships, and theoretical models that are semantically related to a user’s query. For an in-depth explanation, please view the introductory video on the homepage.
In what ways is TheoryOn better than other search engines?
For theory development, TheoryOn can outperform other search engines for four main reasons. First, it contains only theory-relevant content such as variable/construct names, definitions, items, and construct-level citations. By contrast, a search for a variable like “ease of use” in other search engines can return more than 200,000 documents, the vast majority of which do not statistically examine the ease of use construct against other constructs and variables. Second, it is able to return results that are semantically related to a search, even if the search contains none of the same words as the results. This reduces the need to search for multiple redundant terms. Third, TheoryOn includes only the top-rated journals from each discipline. This narrows the large number of barely relevant results returned by other search engines to a smaller pool of articles that are more closely related to the original search. Armed with a better understanding of the problem, the user may then move on to other search engines if needed. Last but not least, TheoryOn automatically extracts theoretical models in the form of constructs and their statistical relationships. A user can therefore easily examine the statistically related constructs, relevant control variables, antecedents, and consequents of a construct of interest. Combined with semantic similarity detection, this lets a user integrate multiple theoretical models across papers to obtain a comprehensive understanding of the existing literature and identify research gaps and opportunities.
How do ‘similar variables’ and ‘similar items’ work?
These functions are the result of using the Latent Semantic Analysis technique (Deerwester, Dumais et al. 1990; Larsen and Monarchi 2004) to analyze each paragraph of the articles in the selected database. This creates a high-dimensional semantic space that captures some sense or “understanding” of the underlying language. Each variable’s name, definition, and items are then treated as a block of text and “projected” into the semantic space. The variable’s location in the high-dimensional space is determined and stored in a meta-semantic space (Larsen, Lee et al. 2010). When you request ‘similar variables’ for a variable of interest, we search the meta-semantic space for the variables closest to yours. The process is the same for ‘similar items’ except that only the item texts are projected, stored, and searched for in the meta-semantic space.
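The projection and search steps described above can be illustrated with a minimal sketch. This is not TheoryOn's implementation: the paragraphs, variable texts, stop-word list, number of dimensions (k = 2), and the helper names `project` and `similar_variables` are all assumptions made for illustration, and the sketch uses a plain truncated SVD with the standard LSA "fold-in" projection and cosine similarity.

```python
import numpy as np

# Toy paragraphs standing in for article text (hypothetical data).
paragraphs = [
    "the system is easy to use and easy to learn",
    "users find the software simple to use with little effort",
    "employees trust the manager and trust the organization",
    "trust in the organization grows with honest communication",
]

# Hypothetical variables: name, definition, and items joined into one block of text.
variables = {
    "ease of use": "ease of use the system is easy to use",
    "effort expectancy": "effort expectancy using the software takes little effort",
    "organizational trust": "organizational trust employees trust the organization",
}

STOP_WORDS = {"the", "is", "to", "and", "in", "with", "of", "a"}  # assumed list

def tokenize(text):
    return [w for w in text.lower().split() if w not in STOP_WORDS]

vocab = sorted({w for p in paragraphs for w in tokenize(p)})
index = {w: i for i, w in enumerate(vocab)}

def term_vector(text):
    # Raw term counts; words outside the paragraph vocabulary are ignored.
    v = np.zeros(len(vocab))
    for w in tokenize(text):
        if w in index:
            v[index[w]] += 1.0
    return v

# Term-by-paragraph matrix and its truncated SVD (the "semantic space").
A = np.column_stack([term_vector(p) for p in paragraphs])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2  # number of retained dimensions (assumed; real spaces use far more)
U_k, s_k = U[:, :k], s[:k]

def project(text):
    # Standard LSA fold-in: q_hat = q^T U_k S_k^{-1}
    return term_vector(text) @ U_k / s_k

# Store each variable's location (a stand-in for the meta-semantic space).
meta_space = {name: project(text) for name, text in variables.items()}

def similar_variables(name, top_n=2):
    # Rank the other variables by cosine similarity to the requested one.
    q = meta_space[name]
    scores = []
    for other, v in meta_space.items():
        if other == name:
            continue
        cos = float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
        scores.append((other, cos))
    return sorted(scores, key=lambda t: -t[1])[:top_n]

print(similar_variables("ease of use"))
```

In this toy data, ‘ease of use’ and ‘effort expectancy’ share no retained words, yet the sketch ranks them as closest because their words co-occur in the same paragraphs. A ‘similar items’ search would follow the same pattern with only the item texts projected.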
How does the ‘synonymy’ function work?
This function is only available for variables that have been part of a manual categorization task inside the project. So far, we have only done this for a subset of the MIS discipline. Please keep in mind that no two categorization tasks (or experts for that matter) will ever create the same categorization structure for a sufficiently complex problem. These results provide a window into our (quite rigorous and time-intensive) process. We did this work to compare the abilities of humans with the functionality of our system, as part of the requirements for various papers. We ask that you use these results only for your own literature review rather than as data for your own research.
What is the science behind theoretical model extraction?
We proposed a novel IT artifact built on an information extraction approach to automatically extract the theoretical models tested in each paper. This approach consists of four steps: hypothesis extraction, variable extraction, variable synonymy detection, and variable statistical relationship extraction. Rule-based and machine learning algorithms are evaluated and compared to determine the best approach for each extraction step. Please refer to Li and Larsen (2011) and Li and Larsen (2013) for more details.
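To give a feel for what a rule-based version of the first two steps might look like, here is a minimal sketch. The regular expressions, the example sentences, and the function names `extract_hypotheses` and `extract_relation` are hypothetical and are not the rules used in TheoryOn; the sketch only covers hypothesis extraction and variable extraction (with the hypothesized direction), not synonymy detection or statistical relationship extraction.

```python
import re

# Assumed rule: a hypothesis sentence begins with a label like "H1:" or "Hypothesis 2a."
HYPOTHESIS = re.compile(
    r"^\s*(?:H|Hypothesis\s*)(\d+[a-z]?)\s*[:.]\s*(.+)$", re.IGNORECASE
)

# Assumed rule: "<source> is positively/negatively related to <target>".
RELATION = re.compile(
    r"^(?P<source>.+?)\s+(?:is|are|will be)\s+(?P<sign>positively|negatively)\s+"
    r"(?:related to|associated with)\s+(?P<target>.+?)\.?$",
    re.IGNORECASE,
)

def extract_hypotheses(sentences):
    # Step 1: keep only sentences that look like labeled hypotheses.
    found = []
    for sentence in sentences:
        match = HYPOTHESIS.match(sentence)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found

def extract_relation(statement):
    # Step 2: pull the two variables and the hypothesized direction, if the rule fits.
    match = RELATION.match(statement)
    if match is None:
        return None
    return {
        "source": match.group("source").strip(),
        "sign": match.group("sign").lower(),
        "target": match.group("target").strip(),
    }

# Hypothetical sentences from a paper.
sentences = [
    "H1: Perceived ease of use is positively related to perceived usefulness.",
    "We collected survey data from 200 employees.",
    "Hypothesis 2a. Computer anxiety is negatively associated with perceived ease of use.",
]

for label, statement in extract_hypotheses(sentences):
    print(label, extract_relation(statement))
```

Hand-written rules like these are easy to inspect but miss phrasings they were not written for, which is one reason to compare them against machine learning alternatives.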
Why can’t I sort results by discipline? I’m not interested in Nursing variables.
Have you created an account? If you create an account and log in, you have access to discipline-specific databases. This does not currently solve the problem of a user who wants to get rid of one discipline and keep all other results, but it helps users who are primarily interested in only one discipline. This functionality is on our ‘wish-list’. If you believe that variables from a specific set of journals should be combined into a new database, let us know through the Feedback link at the top of the page.
References
- Bong, C. H., K. R. Larsen and J. Martin (2012). A Large Scale Knowledge Integration Which Leads to Human Decision Making. IEEE Symposium on Computer and Informatics. Penang, Malaysia, IEEE.
- Cook, P. F., K. R. Larsen, T. J. Sakraida and L. Pedro (2012). "A Novel Approach to Concept Analysis: The Inter-Nomological Network." Nursing Research Forthcoming.
- Deerwester, S., S. Dumais, G. Furnas, T. Landauer and R. Harshman (1990). "Indexing by Latent Semantic Analysis." Journal of the American Society for Information Science 41(6): 391-407.
- Larsen, K. R. and D. S. Hovorka (2012). Developing Interfield Nomological Nets. Hawaii International Conference on System Sciences. Maui, Hawaii, IEEE.
- Larsen, K. R., J. Lee, J. Li and C. H. Bong (2010). A Transdisciplinary Approach to Construct Search and Integration. 16th Americas Conference on Information Systems, Lima, Peru, Association of Information Systems.
- Larsen, K. R. and D. E. Monarchi (2004). "A Mathematical Approach to Categorization and Labeling of Qualitative Data: the Latent Categorization Method." Sociological Methodology 34(1): 349-392.
- Larsen, K. R., D. Nevo and E. Rich (2008). Exploring the Semantic Validity of Questionnaire Scales. Hawaii International Conference on System Sciences. Waikoloa, Hawaii: 1-10.
- Li, J. and K. R. Larsen (2011). Establishing Nomological Networks for Behavioral Science: A Natural Language Processing Based Approach. International Conference on Information Systems. Shanghai, China, Association for Information Systems.
- Li, J. and K. R. Larsen (2013). Tracking Behavioral Construct Use through Citations: A Relation Extraction Approach. International Conference on Information Systems. Shanghai, China, Association for Information Systems.