University of Leicester

LScDC Word Clouds and Tables to Visually Present the Most Informative Words in Subject Categories

figure
posted on 2020-04-24, 13:56 authored by Neslihan Suzen
Word Clouds to Visually Present the Most Informative Words in Subject Categories

April 2020 by Neslihan Suzen, PhD student at the University of Leicester (ns433@leicester.ac.uk / suzenneslihan@hotmail.com)
Supervised by Prof Alexander Gorban and Dr Evgeny Mirkes

This publication presents word clouds of the most informative words in Web of Science (WoS) categories [1,2]. The clouds are built from words in the Leicester Scientific Dictionary-Core (LScDC) [3,4]. For each category, we take the list of words with their Relative Information Gains (RIGs), sort the words by RIG in descending order, and show the top 100 in the word cloud [5]. The larger a word appears in a cloud, the more informative it is for that category. This study is part of research on quantifying the meaning of research texts.
Word clouds of the 100 most informative words and histograms of the RIGs of the 10 most informative words, for each of the 252 categories, can be found in the archive published with this description. The 100 most informative words and their RIGs for each category are also presented in the published tables.
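
The selection step described above (sort words by RIG in descending order, keep the top 100) can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the published clouds: the function names, the example RIG values, and the linear mapping from RIG to font size are all assumptions, since the description does not state how word size is computed.

```python
from typing import Dict, List, Tuple


def top_informative_words(
    rig_by_word: Dict[str, float], n: int = 100
) -> List[Tuple[str, float]]:
    """Return the n words with the highest RIG, in descending order of RIG."""
    # Sort by RIG descending; break ties alphabetically so the result is deterministic.
    ranked = sorted(rig_by_word.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


def font_sizes(
    top_words: List[Tuple[str, float]],
    min_size: float = 10.0,
    max_size: float = 60.0,
) -> Dict[str, float]:
    """Map each RIG to a font size by linear scaling (an assumed rule)."""
    if not top_words:
        return {}
    rigs = [rig for _, rig in top_words]
    low, high = min(rigs), max(rigs)
    span = high - low
    sizes = {}
    for word, rig in top_words:
        # If all RIGs are equal, draw every word at the maximum size.
        fraction = (rig - low) / span if span > 0 else 1.0
        sizes[word] = min_size + fraction * (max_size - min_size)
    return sizes


# Hypothetical RIG values for one category (illustrative only, not from the dataset).
example_rigs = {"galaxy": 0.08, "star": 0.05, "orbit": 0.02, "cell": 0.001}
top3 = top_informative_words(example_rigs, n=3)
sizes = font_sizes(top3)
```

With real data, `rig_by_word` would hold one category's column of the word-category RIG matrix [5], and `n` would be 100 for the clouds or 10 for the histograms.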

The published archive contains the following files:
1. Word_Clouds.pdf: Word clouds of the 100 most informative words and histograms of the 10 most informative words for each of the 252 WoS categories.
2. Lists_of_Words.pdf: Lists of the 100 most informative words for each of the 252 WoS categories.

References

[1] Web of Science. (15 July). Available: https://apps.webofknowledge.com/
[2] WoS Subject Categories. Available: https://images.webofknowledge.com/WOKRS56B5/help/WOS/hp_subject_category_terms_tasca.html
[3] Suzen, Neslihan (2019): LScDC (Leicester Scientific Dictionary-Core). figshare. Dataset. https://doi.org/10.25392/leicester.data.9896579.v3
[4] Suzen, N., Mirkes, E. M., & Gorban, A. N. (2019). LScDC-new large scientific dictionary. arXiv preprint arXiv:1912.06858.
[5] Suzen, Neslihan (2020): LScDC Word-Category RIG Matrix. figshare. Dataset. https://doi.org/10.25392/leicester.data.12133431.v1
