Social Bot Field Experiments in Computational Communication: Concept, Method, and Application
Advances in artificial intelligence have driven the use of social bots in social science research. From the perspective of computational communication, this article discusses the conceptual definition, methodological construction, experimental design, and practical application of social bot field experiments. We argue that social bot field experiments combine the strengths of big data analysis and simulation and have developed into a highly controllable experimental method. They offer a new way to observe, analyze, and understand communication phenomena in digital media environments, and they can support the verification, exploration, and extension of theories in journalism and communication. In the empirical section, we use a social bot field experiment to conduct a preliminary investigation into the causes of filter bubbles. We find that even when the reading preferences of social bot accounts are controlled, the accounts may still fall into filter bubbles after a random-reading experiment.
social bot / field experiment / digital trace / algorithm audit / filter bubble
1. This study uses an open-source model published on GitHub to analyze text similarity. The source code link is as follows:
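2. The Shannon entropy of news categories is computed as $H(X) = -\sum_{i=1}^{n} P(x_i) \log_b P(x_i)$, where the sum runs over the $n$ news categories (hard news, soft news, or other), $P(x_i)$ is the probability that the random variable "news category" takes the specific value $x_i$, and $b$ is the base of the logarithm. This study uses $b = 2$.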
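As an illustration of the entropy formula in note 2, the following is a minimal Python sketch, not the authors' code. It assumes that $P(x_i)$ is estimated as the share of items in each category among the news items an account was shown. The function name `category_entropy` and the labels `"hard"`, `"soft"`, and `"other"` are hypothetical.

```python
import math
from collections import Counter


def category_entropy(categories, base=2):
    """Shannon entropy H(X) = -sum_i P(x_i) * log_b P(x_i) of a list of labels.

    P(x_i) is estimated as the relative frequency of each label
    (an assumption made for this sketch).
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one news item is required")
    entropy = 0.0
    for count in counts.values():
        p = count / total  # empirical probability of this category
        entropy -= p * math.log(p, base)
    return entropy


# Hypothetical feeds: one evenly mixed, one dominated by a single category.
mixed_feed = ["hard", "soft", "other", "hard", "soft", "other"]
narrow_feed = ["soft"] * 9 + ["hard"]
print(category_entropy(mixed_feed))
print(category_entropy(narrow_feed))
```

With three categories and $b = 2$, the value ranges from 0 (every item in one category) to $\log_2 3$ (items spread evenly), so a lower value indicates a less diverse feed.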