Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Linking Twitter and survey data: asymmetry in quantity and its impact

Al Baghal, Tarek, Wenz, Alexander, Sloan, Luke and Jessop, Curtis 2021. Linking Twitter and survey data: asymmetry in quantity and its impact. EPJ Data Science 10, 32. 10.1140/epjds/s13688-021-00286-7

PDF - Published Version
Available under License Creative Commons Attribution.


Linked social media and survey data have the potential to be a unique source of information for social research. While the potential usefulness of this methodology is widely acknowledged, very few studies have explored methodological aspects of such linkage. Respondents produce planned amounts of survey data, but highly variable amounts of social media data. This study explores this asymmetry by examining the amount of social media data available to link to surveys. The extent of variation in the amount of data collected from social media could affect the ability to derive meaningful linked indicators and could introduce possible biases. Linked Twitter data from respondents to two longitudinal surveys representative of Great Britain, the Innovation Panel and the NatCen Panel, show that there is indeed substantial variation in the number of tweets posted and the number of followers and friends respondents have. Multivariate analyses of both data sources show that only a few respondent characteristics have a statistically significant effect on the number of tweets posted. The number of followers is the strongest predictor of posting in both panels, women post less than men, and there is some evidence, in the Innovation Panel only, that people with higher education post less. We use sentiment analyses of tweets to provide an example of how the amount of Twitter data collected can impact outcomes using these linked data sources. Results show that the number of negatively coded tweets is related to general happiness, but the number of positively coded tweets is not. Taken together, the findings suggest that the amount of data collected from social media that can be linked to surveys is an important factor to consider and indicate the potential for such linked data sources in social research.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Social Sciences (Includes Criminology and Education)
Publisher: SpringerOpen
ISSN: 2193-1127
Funders: ESRC
Date of First Compliant Deposit: 11 June 2021
Date of Acceptance: 25 May 2021
Last Modified: 05 May 2023 10:08

Citation Data

Cited 6 times in Scopus.
