r/birthcontrol: A case study using data from Reddit posts and data science to advance family planning research (Preprint)


BACKGROUND Contraceptive choice is central to reproductive autonomy, and the internet, including online communities like those formed on Reddit, is an important resource for people seeking contraceptive information and peer support. A subreddit dedicated to contraception, r/birthcontrol, provides a platform for people to share narratives, offering real-time insights into contraceptive decision-making processes and use experiences. OBJECTIVE This study explored the use of r/birthcontrol, from the inception of the subreddit through the end of 2020, to describe the online community, identify distinctive interests and themes based upon the textual content of posts, and identify and explore the content of posts with the most user engagement (i.e., ‘popular’ posts). METHODS Data were obtained from the PushShift Reddit API, covering the period from the establishment of r/birthcontrol to the start date of this analysis (July 21, 2011, to December 31, 2020). User interactions within the subreddit were analyzed to describe use of this community over time, specifically the commonality of use based on the volume of posts, the length of posts (character count), and the proportion of posts with any and each flair applied. ‘Scores’, or upvotes minus downvotes, served as a proxy for the popularity of each post and were used to determine ‘popular’ posts on r/birthcontrol (posts with 9 comments and a score of ≥3). TF-IDF analyses were run on all posts with flairs applied, posts within each flair group, and popular posts within each flair group to characterize and compare distinctive language used in each group of posts. RESULTS There were 105,485 posts to r/birthcontrol during the study period, with use of the subreddit increasing over time. The majority of posts were exclusively textual content (96%), had comments (86%), and had a score (96%). Posts averaged 731 characters in length, with a median of 555 characters.
Within the timeframe in which flairs were available on r/birthcontrol (since February 4, 2016), users applied flairs to 78% of posts, with increasing use over time. “SideEffects?” was the most frequently used flair among all posts (40% of posts), while “Experience” and “SideEffects?” were most frequently applied among popular posts (31% and 29%, respectively). TF-IDF analyses of all posts showed interest in contraceptive methods, menstrual experiences, timing, feelings, and unprotected sex. While n-gram results for posts with each flair varied, the contraceptive pill, menstrual experiences, and timing were discussed across flair groups. Among popular posts, IUDs and contraceptive use experiences were often discussed. CONCLUSIONS This study provides insights into how r/birthcontrol has been used as a resource for contraceptive information and support since its inception in 2011 and presents a case study of how public health researchers can use data science methods to study social networking sites, contributing to and expanding public health research and discourse.
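To make the METHODS concrete, the following is a minimal Python sketch of the two steps the abstract names: flagging ‘popular’ posts and computing TF-IDF per flair group. It is an illustration under stated assumptions, not the authors' pipeline. The field names (`flair`, `text`, `score`, `num_comments`), the helper functions, the tokenizer, and the toy posts are all hypothetical. The sketch reads "9 comments" as a minimum of 9, which the abstract does not confirm, and it treats each flair group as one pooled document with unsmoothed log IDF, which is only one of several common TF-IDF variants.

```python
import math
import re
from collections import Counter


def is_popular(post, min_comments=9, min_score=3):
    """Flag a post as 'popular'. ASSUMPTION: '9 comments' is read as 'at least 9'."""
    return post["num_comments"] >= min_comments and post["score"] >= min_score


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (hypothetical tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_by_flair(posts):
    """Compute TF-IDF per flair, treating each flair's pooled posts as one document."""
    counts = {}
    for post in posts:
        counts.setdefault(post["flair"], Counter()).update(tokenize(post["text"]))

    # Document frequency: in how many flair groups each term appears.
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())

    n_groups = len(counts)
    scores = {}
    for flair, term_counts in counts.items():
        total = sum(term_counts.values())
        scores[flair] = {
            term: (n / total) * math.log(n_groups / doc_freq[term])
            for term, n in term_counts.items()
        }
    return scores


# Toy posts invented for illustration only; they are not data from the study.
posts = [
    {"flair": "Experience", "text": "my experience with the iud was fine",
     "score": 12, "num_comments": 15},
    {"flair": "SideEffects?", "text": "spotting and cramps on the pill",
     "score": 1, "num_comments": 2},
    {"flair": "SideEffects?", "text": "is spotting normal on the pill",
     "score": 4, "num_comments": 10},
]

popular_posts = [p for p in posts if is_popular(p)]
scores = tfidf_by_flair(posts)
```

In this variant, a term that appears in every flair group gets an IDF of zero, so only language that distinguishes one group from the others receives weight. Running the same function on `popular_posts` would give the per-flair comparison restricted to popular posts.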