https://doi.org/10.1140/epjds/s13688-021-00311-9
Regular Article
The presence of occupational structure in online texts based on word embedding NLP models
1 CSS-Recens Research Group, Centre for Social Sciences – Hungarian Academy of Sciences Centre of Excellence, Tóth Kálmán u. 4, 1097, Budapest, Hungary
2 Faculty of Social Sciences, Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117, Budapest, Hungary
3 Department of Network and Data Science, Central European University, A-1100, Vienna, Austria
Received: 16 May 2021
Accepted: 15 November 2021
Published online: 27 November 2021
Research on social stratification is closely linked to the analysis of the prestige associated with different occupations. This study examines the positions of occupations in the semantic space represented by large amounts of textual data. The results are compared with standard results in social stratification to see whether the classical findings are reproduced and whether additional insights can be gained into the social positions of occupations. The paper gives an affirmative answer to both questions.
The results show a fundamental similarity between the occupational structure obtained from text analysis and the structure described by prestige and social distance scales. While our research reinforces many theories and empirical findings of the traditional literature on social stratification and, in particular, occupational hierarchy, it also points to the importance of a factor not yet discussed in mainstream stratification literature: the power and organizational aspect.
Key words: Social stratification / Prestige / Occupations / Natural Language Processing / Word embedding / Text mining
© The Author(s) 2021
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.