Similar literature (20 results)
1.
Online social networks have become immensely popular in recent years and are now a major source for tracking the reverberation of events and news throughout the world. However, their diversity and popularity attract malicious users who inject new forms of spam. Spamming is a malicious activity in which a fake user spreads unsolicited messages in the form of bulk messages, fraudulent reviews, malware or viruses, hate speech, profanity, or advertising for marketing scams. In addition, spammers usually form a connected community of spam accounts and use it to spread spam to a large set of legitimate users. Consequently, it is highly desirable to detect such spammer communities in social networks. Although a significant amount of work has been done on detecting spam messages and accounts, little research has addressed spammer communities and hidden spam accounts. In this work, an unsupervised approach called SpamCom is proposed for detecting spammer communities on Twitter. We model the Twitter network as a multilayer social network and exploit overlapping community-based features of users, represented as hypergraphs, to identify spammers based on their structural behavior and URL characteristics. The use of community-based features, graph and URL characteristics of user accounts, and content similarity among users makes our technique very robust and efficient.

2.

In the past decades, a large number of music pieces have been uploaded to the Internet every day through social networks that concentrate on music and videos, such as Last.fm, Spotify and YouTube, and the amount of music data keeps increasing. At the same time, with so much online music data, users struggle every day to find the music that interests them. Music search and recommendation systems help users find their favorite content in a huge repository of music. However, social influence, which contains rich information about similar interests between users and users' frequent correlated actions, has been largely ignored in previous music recommender systems. In this work, we explore the effects of social influence on developing effective music recommender systems and focus on the problem of social influence aware music recommendation, which aims at recommending a list of music tracks for a target user. To exploit social influence, we first construct a heterogeneous social network, propose a novel meta path-based similarity measure called WPC, and describe the framework of the similarity measure in this network. As a further step, we use the topological potential approach to mine social influence in heterogeneous networks. Finally, in order to improve music recommendation by incorporating social influence, we present a factor graphic model based on social influence. Our experimental results on one real-world dataset verify that our proposed approach substantially outperforms current state-of-the-art music recommendation methods.


3.
Social networks, once an innocuous platform for sharing pictures and thoughts among a small online community of friends, have now transformed into a powerful tool of information, activism, mobilization, and sometimes abuse. Detecting the true identity of social network users is an essential step in making social media an efficient channel of communication. This paper targets the microblogging service Twitter as the social network of choice for investigation. It has been observed that the dissemination of pornographic content and the promotion of follower markets are actively operational on Twitter, which clearly indicates loopholes in Twitter's spam detection techniques. In this work, five types of spammers have been identified: sole spammers, pornographic users, followers market merchants, fake profiles, and compromised profiles. For detection, data on around 1 lakh (100,000) Twitter users and their 20 million tweets has been collected. Users have been classified based on trust, user and content based features using machine learning techniques such as Bayes Net, Logistic Regression, J48, Random Forest, and AdaBoostM1. The experimental results show that the Random Forest classifier is able to predict spammers with an accuracy of 92.1%. Based on these initial classification results, a novel system for real-time streaming of users for spam detection has been developed. We envision that such a system should provide Twitter users with an indication of the identity of other users in real time.

4.
The availability of millions of products and services on e-commerce sites makes it difficult to find the most suitable product for a given set of requirements because so many alternatives exist. The most popular and useful way around this is to follow the reviews, on opinionated social media, of others who have already tried the products. Almost all e-commerce sites let users share their views on and experience of the products and services they have used. Customer reviews are increasingly used by individuals, manufacturers and retailers for purchase and business decisions. As there is no scrutiny of the reviews received, anybody can write anything anonymously, which leads to review spam. Moreover, driven by the desire for profit and/or publicity, spammers produce synthesized reviews to promote some products or brands and demote competitors' products or brands. Deceptive review spam has grown considerably over time. In this work, we have applied supervised as well as unsupervised techniques to identify review spam. The most effective feature sets have been assembled for model building, and sentiment analysis has been incorporated in the detection process. To get the best performance, some well-known classifiers were applied to a labeled dataset. For the unlabeled data, clustering is used after the desired attributes have been computed for spam detection. Additionally, there is a high chance that spam reviewers are also responsible for content pollution in multimedia social networks, because many users now write reviews using their social network logins. Finally, the work can be extended to find suspicious accounts responsible for posting fake multimedia content in the respective social networks.

5.
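He Huan  Zhu Yan  Li Chunping 《计算机工程》 (Computer Engineering) 2021,47(12):192-199
Gray-hat users in social networks are easy to conceal and diverse in type, which makes existing detection algorithms poorly applicable. A social network detection algorithm based on spatio-temporal propagation characteristics is proposed. A propagation network of user-generated content is constructed to measure the different characteristics of white-hat and gray-hat users in the propagation space, and the temporal and spatial propagation characteristics are fused, with their weight ratio adjusted, to improve classification performance. Experimental results show that the algorithm effectively detects different types of gray-hat users. Compared with mainstream gray-hat user detection algorithms such as user feature analysis, social network link analysis, and multi-view fusion, it improves accuracy and AUC by up to 26.08% and 30.54%, respectively, on multiple datasets including CAVERLEE, CRESCI-15, and CRESCI-17.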

6.
With the rise of social networking services such as Facebook and Twitter, the problem of spam and content pollution has become more significant and intractable. Using social networking services, users are able to develop relationships and share messages with others in a very convenient manner; however, they are vulnerable to receiving spam messages. The automatic detection of spammers or content polluters on the network can effectively reduce the burden on the service provider in making a decision on appropriate counteractions. Content polluters can be automatically identified by using the supervised learning technique of artificial intelligence. To build a classification model with high accuracy automatically from the training data set, it is important to identify a set of useful features that can classify polluters and non-polluters. Moreover, because we deal with a huge amount of raw data in this process, the efficiency of data preparation and model creation are also critical issues that need to be addressed. In this paper, we present an efficient method for detecting content polluters on Twitter. Specifically, we propose a set of features that can be easily extracted from the messages and behaviors of Twitter users and construct a new breed of classifiers based on these features. The proposed approach requires only a minimal number of feature values per Twitter user and thus adds considerably less time to the overall mining process compared to other methods. Experiments confirm that the proposed approach outperforms previous approaches in both classification accuracy and processing time.

7.
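A spam tag detection model based on support vector machines   Cited by: 2 (self-citations: 2, citations by others: 0)
To address the problem of spam tags in Folksonomy, a spam tag detection model is proposed. The vector space model is used to represent user features, and a support vector machine then classifies Folksonomy users into two classes. By detecting the spam posters hidden among normal users, the number of spam tags is reduced. Experimental results show that the SVM-based spam tag detection model achieves higher classification accuracy and outperforms other detection methods.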

8.
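With the rapid development of the Internet, social networks have become an important social tool in daily life. However, anomalous users keep emerging in social networks, and the harm they cause is increasingly serious. Identifying and detecting anomalous users in social networks therefore plays an important role in improving user experience and maintaining a healthy network environment. This paper introduces the different types of anomalous users in social networks and the research progress on each type. It then surveys anomaly detection methods, dividing anomaly detection techniques for social networks into six categories (classification, clustering, statistical, information-theoretic, hybrid, and graph-based) and comparing their respective advantages and disadvantages. The survey helps readers understand anomalous users and anomaly detection techniques in social networks and offers ideas for solving anomaly problems.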

9.
The large volume of data associated with social networks hinders the unaided user from interpreting network content in real time. This problem is compounded by the fact that there are limited tools available for enabling robust visual social network exploration. We present a network activity visualization using a novel aggregation glyph called the clyph. The clyph intuitively combines spatial, temporal, and quantity data about multiple network events. We also present several case studies where major network events were easily identified using clyphs, establishing them as a powerful aid for network users and owners.

10.
Liu  Bo  Ni  Zeyang  Luo  Junzhou  Cao  Jiuxin  Ni  Xudong  Liu  Benyuan  Fu  Xinwen 《World Wide Web》2019,22(6):2953-2975

Social networking websites with microblogging functionality, such as Twitter or Sina Weibo, have emerged as popular platforms for discovering real-time information on the Web. Like most Internet services, these websites have become the targets of spam campaigns, which contaminate Web contents and damage user experiences. Spam campaigns have become a great threat to social network services. In this paper, we investigate crowd-retweeting spam in Sina Weibo, the counterpart of Twitter in China. We carefully analyze the characteristics of crowd-retweeting spammers in terms of their profile features, social relationships and retweeting behaviors. We find that although these spammers are likely to connect more closely than legitimate users, the underlying social connections of crowd-retweeting campaigns are different from those of other existing spam campaigns because of the unique features of retweets that are spread in a cascade. Based on these findings, we propose retweeting-aware link-based ranking algorithms to infer more suspicious accounts by using identified spammers as seeds. Our evaluation results show that our algorithms are more effective than other link-based strategies.
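The idea of inferring suspicious accounts from known spammer seeds can be illustrated with a generic seed-propagation ranking. The following is a minimal sketch in the style of personalized PageRank, not the retweeting-aware algorithms the abstract proposes; the function name `seeded_rank`, the `damping` value, and the toy edge list are all assumptions made for illustration.

```python
def seeded_rank(edges, seeds, damping=0.85, iterations=50):
    """Propagate suspicion from seed accounts along directed links.

    edges: iterable of (source, target) pairs (hypothetical link data).
    seeds: accounts already identified as spammers.
    Returns a dict mapping each account to a score; scores sum to 1.
    """
    seeds = set(seeds)
    nodes = set(seeds)
    out_links = {}
    for src, dst in edges:
        nodes.update((src, dst))
        out_links.setdefault(src, []).append(dst)

    # The restart distribution puts all of its mass on the seed accounts.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(restart)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = out_links.get(n)
            if targets:
                share = damping * score[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Accounts without outgoing links return their mass to the seeds.
                for s in seeds:
                    nxt[s] += damping * score[n] / len(seeds)
        score = nxt
    return score


# Toy example: "a" and "b" are reachable from the seed, "c" and "d" are not.
scores = seeded_rank([("s", "a"), ("a", "b"), ("c", "d")], seeds={"s"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Accounts that are closely linked to the seeds end up with higher scores, which mirrors the observation that these spammers connect more closely to each other than legitimate users do. A retweeting-aware variant would presumably weight edges by retweet behavior, which this sketch does not attempt.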


11.
Zhang  Wei  Zhu  Shiwei  Tang  Jian  Xiong  Naixue 《The Journal of supercomputing》2018,74(4):1779-1801

With the development of Internet technology, social networks have become an important application in online life. However, the rapid increase in the number of users has brought an influx of harmful information as well as malicious users. It is therefore urgent to design a valid management scheme for user authentication to ensure the normal operation of social networks. Node trust evaluation is an effective method for dealing with typical network attacks in wireless sensor networks. To address the quantification and uncertainty of trust, this paper proposes a novel trust management scheme based on Dempster–Shafer evidence theory for detecting malicious nodes. First, the trust degree is estimated by taking into account the spatiotemporal correlation of the data collected by sensor nodes in adjacent areas. Second, following D–S theory, a trust model is established that counts the interactions showing trust, distrust or uncertainty, and from these evaluates the direct trust value and the indirect trust value. Then, a flexible synthesis method is adopted to calculate the overall trust and identify the malicious nodes. The simulation results show that the proposed scheme has clear advantages over traditional methods in identifying malicious nodes and in data fusion accuracy, and that it scales well.
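To make the evidence-combination step concrete, the sketch below applies the standard Dempster rule of combination to a direct and an indirect trust assessment over the hypotheses trust, distrust, and uncertainty (the whole frame). It is an illustration only: the abstract does not specify its "flexible synthesis method", and the mass values and the name `combine` are assumed for the example.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


TRUST = frozenset({"trust"})
DISTRUST = frozenset({"distrust"})
UNCERTAIN = TRUST | DISTRUST  # the whole frame expresses uncertainty

# Hypothetical masses for one node, from direct and indirect observation.
direct = {TRUST: 0.6, DISTRUST: 0.1, UNCERTAIN: 0.3}
indirect = {TRUST: 0.5, DISTRUST: 0.2, UNCERTAIN: 0.3}

overall = combine(direct, indirect)
```

With these inputs the conflicting mass is 0.17, so each combined mass is divided by 0.83. A node whose combined distrust mass exceeds some threshold could then be flagged as malicious; the threshold itself would be a design choice that the abstract does not state.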


12.
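Campus networks contain a large number of information systems that record users' daily behavior. By analyzing the daily trajectory information of many users, behavioral correlations between users can be discovered and the strength of their social relationships measured. Based on the characteristics of campus network data from a school in Shanghai, an improved method is proposed that builds on a user time-series model and measures social relationships using the shortest time distance. The method first generates behavior time series from users' behavior data, then measures behavioral correlation on that basis to reflect the strength of users' social relationships in the real world, and finally uses the visit popularity of locations to correct the analysis results. In the experiments, the method was applied to the campus network data of a school in Shanghai to measure the strength of user associations, which verified its effectiveness.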

13.
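The massive data of social networks requires distributed storage; scattering the data of many users across storage and compute nodes maintains parallelism and redundancy. How to store and access user data with high performance within limited distributed storage space is of practical significance. In current social network systems, read and write operations between users' data cause a large number of remote accesses across storage nodes. Reducing remote accesses between nodes lowers network load and access latency and improves user experience. A dynamic partitioning and replication algorithm based on user interaction behavior is proposed. It describes the structure of the social network using friendship relations and commenting behavior between users, and periodically partitions and replicates user data, thereby increasing the local access rate and reducing network load. Validation on real datasets shows that, compared with random partitioning and replication algorithms, the algorithm greatly increases the local access rate and reduces access latency.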

14.
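When high-dimensional data is processed, the number of samples required grows exponentially while distances between samples become less informative, which leads to the curse of dimensionality. Text tag data usually suffers from excessively high dimensionality, which affects the detection of spam tags. This paper uses the mathematical model of the support vector machine to build a large-scale spam tag detection model for Folksonomy. To reduce the impact of high dimensionality on spam tag detection, and inspired by kernel principal component analysis, the idea of dimensionality reduction is introduced into data reduction, and a large-scale SVM dataset reduction model based on kernel principal component analysis is proposed. This is finally instantiated as a new spam tag detection method, namely a large-scale spam tag detection model based on a kernel principal component analysis support vector machine (KPCA-SVM). In spam tag detection, the model shortens testing time without affecting the data features and achieves good detection performance.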

15.
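Detecting newly added malicious users in social networks is a classification task that has long suffered from insufficient data samples and scarce labels for malicious users. To detect malicious users accurately when data is limited, a detection method based on an adaptive differentiated graph convolutional network is proposed. The method builds a social network graph by extracting user features and social relationships from the social network. It then computes the similarity between each node and its neighbors, ranks the neighbors by priority, and samples key neighbors in priority order. The features of the key neighbors are aggregated into the node itself through a weighted average with adaptive weights, which updates the node's features. After the update, a malicious score is computed for each node through feature dimensionality reduction and normalization, and the score is used to judge whether the user is malicious. Experiments show that, compared with other methods, the method achieves higher recall for malicious users and higher overall precision and can quickly complete detection of newly added users, demonstrating that the adaptive differentiated graph convolutional network effectively captures the key features of a small number of samples.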
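The neighbor-ranking and aggregation step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: cosine similarity stands in for the paper's similarity measure, the similarity itself is reused as the aggregation weight (the paper's adaptive weights are presumably learned), and the names `cosine` and `aggregate_key_neighbors` and the parameter `k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def aggregate_key_neighbors(node, neighbors, k=2):
    """Update a node's features from its k most similar neighbors.

    Neighbors are ranked by similarity to the node, the top k are kept,
    and their features are averaged into the node with similarity weights.
    The node itself takes part with weight 1.
    """
    scored = sorted(
        ((cosine(node, nb), nb) for nb in neighbors),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weights = [max(sim, 0.0) for sim, _ in scored]
    total = 1.0 + sum(weights)
    return [
        (x + sum(w * nb[i] for w, (_, nb) in zip(weights, scored))) / total
        for i, x in enumerate(node)
    ]


# Toy features: the dissimilar neighbor [0, 1] is dropped by the top-k sampling.
updated = aggregate_key_neighbors(
    [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=2
)
```

Sampling only the highest-ranked neighbors keeps dissimilar accounts from diluting a node's features, which is one plausible reading of why the approach copes with few labeled samples.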

16.
Twitter spam detection is a recent area of research in which most previous work has focused on the identification of malicious user accounts and on honeypot-based approaches. In this paper, however, we present a methodology based on two new aspects: the detection of spam tweets in isolation, without prior information about the user, and the application of a statistical analysis of language to detect spam in trending topics. Trending topics capture the emerging Internet trends and topics of discussion that are on everybody's lips. This growing microblogging phenomenon therefore allows spammers to disseminate malicious tweets quickly and massively. We present the first work that tries to detect spam tweets in real time using language as the primary tool. We first collected and labeled a large dataset with 34K trending topics and 20 million tweets. We then proposed a reduced set of features that are hard for spammers to manipulate. In addition, we developed a machine learning system with some orthogonal features that can be combined with other feature sets to analyze emergent characteristics of spam in social networks. We also conducted an extensive evaluation showing that our system obtains an F-measure at the same level as the best state-of-the-art systems based on the detection of spam accounts. Thus, our system can be applied to Twitter spam detection in trending topics in real time, mainly because it analyzes tweets instead of user accounts.

17.
With the recent surge of location-based social networks (LBSNs), e.g., Foursquare and Facebook Places, a huge amount of the digital footprints that people leave in the cyber-physical space has become accessible, including users' profiles, online social connections, and especially the places where they have checked in. Unlike social networks (e.g., Flickr, Facebook) that have explicit groups for users to subscribe to or join, LBSNs usually have no explicit community structure. Meanwhile, unlike social networks that contain only a single type of social interaction, the coexistence of online/offline social interactions and user/venue attributes in LBSNs makes the community detection problem much more challenging. To capitalize on the large number of potential users and venues as well as the huge amount of heterogeneous social interactions, a high-quality community detection approach is needed. In this paper, by exploring the heterogeneous digital footprints of LBSN users in the cyber-physical space, we come up with a novel edge-centric co-clustering framework to discover overlapping communities. By employing inter-mode as well as intra-mode features, the proposed framework is able to group like-minded users from different social perspectives. The efficacy of our approach is validated by intensive empirical evaluations based on the collected Foursquare dataset.

18.
Given the increasing applications of service computing and cloud computing, a large number of Web services are deployed on the Internet, triggering research on Web service recommendation. Beyond service QoS, the use of user feedback is becoming the current trend in service recommendation. As in traditional recommender systems, sparsity, cold start and trustworthiness are major issues that challenge similarity-based approaches to service recommendation. Meanwhile, with the prevalence of social networks, people have become active in interacting with various computers and users, resulting in a huge volume of available data, such as service information, user-service ratings, interaction logs, and user relationships. How to combine the trust relationships in social networks with user feedback for service recommendation therefore motivates this work. In this paper, we propose a social network-based service recommendation method with trust enhancement known as RelevantTrustWalker. First, a matrix factorization method is used to assess the degree of trust between users in the social network. Next, an extended random walk algorithm is proposed to obtain recommendation results. To evaluate the accuracy of the algorithm, experiments on a real-world dataset are conducted, and the results indicate that both the quality of the recommendations and the speed of the method are improved compared with existing algorithms.

19.
ABSTRACT

The great number of social network users and the expansion of this kind of tool in recent years demand the storage of a great volume of information regarding user behaviour. In this article, we utilise interaction records from Facebook users and metrics from the study of complex networks to identify different user behaviours using clustering techniques. We found three different user profiles regarding interactions performed in the social network: viewer, participant and content producer. Moreover, the groups we found were characterised by the C4.5 decision-tree algorithm. The 'viewer' mainly observes what happens in the network. The 'participant' interacts more often with the content and thus has a higher closeness centrality. Users with a participant profile are therefore responsible, for example, for the faster transmission of information in the virtual environment, a crucial function for the Facebook social network. We also noted that 'content producer' users had more publications on their pages, leading to a higher degree of incoming interactions than the other two profiles. Finally, we verified that the profiles are not mutually exclusive; that is, a user with one profile can at a given moment exhibit the behaviour of another profile.
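Closeness centrality, the metric that distinguishes the participant profile above, can be computed with a breadth-first search. The sketch below uses the common definition (number of reachable nodes divided by the sum of shortest-path distances to them) on an unweighted graph; the abstract does not say which variant or tooling the authors used, and the toy interaction graph is invented for illustration.

```python
from collections import deque


def closeness_centrality(adjacency, node):
    """Closeness of `node`: reachable nodes / sum of shortest-path distances.

    adjacency: dict mapping each node to an iterable of its neighbors
    (an unweighted graph; hypothetical interaction data).
    """
    distance = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


# Toy star graph: "hub" interacts with everyone, the others only with "hub".
graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
hub_score = closeness_centrality(graph, "hub")
leaf_score = closeness_centrality(graph, "a")
```

In this toy graph the hub is one step from every other user, while a leaf needs two steps to reach the other leaves, so the hub has the higher closeness. This matches the intuition that users who interact widely speed up the transmission of information.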

20.
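As a new medium for spreading user information, microblogs play an important role in the initiation and spread of online public opinion. Users' intentional operations (uploading advertisements) and unintentional operations (reposting) produce large numbers of noisy and similar microblog posts, which are extremely detrimental to public opinion analysis and user browsing. Detecting these noisy and similar posts to purify microblog data has become an urgent problem. Based on statistical data, the characteristics of noisy and similar posts are analyzed, and a filtering method for microblog text streams is proposed that combines noise discrimination with content-similarity detection: noisy posts are filtered by discriminating on features such as URL links, character rate, and high-frequency words, and similar posts are detected and removed through a two-stage content filter consisting of segment filtering and index filtering. Experiments show that the method purifies microblog data effectively and filters out similar and noisy posts efficiently and accurately.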
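A two-stage filter of this kind can be sketched as a noise check followed by near-duplicate removal. The example below is a simplified stand-in, not the method of the paper: the noise rule uses only the share of alphanumeric characters left after removing URLs, and Jaccard similarity over character bigrams replaces the segment and index filtering. The thresholds and the names `is_noise`, `shingles`, and `purify` are assumptions.

```python
import re

URL_RE = re.compile(r"https?://\S+")


def is_noise(text, min_text_ratio=0.5):
    """Flag a post whose non-URL alphanumeric content is a small share of it."""
    if not text:
        return True
    stripped = URL_RE.sub("", text)
    content = sum(1 for ch in stripped if ch.isalnum())
    return content / len(text) < min_text_ratio


def shingles(text, k=2):
    """Set of character k-grams over the lowercased alphanumeric content."""
    chars = [ch for ch in URL_RE.sub("", text).lower() if ch.isalnum()]
    return {"".join(chars[i:i + k]) for i in range(len(chars) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def purify(posts, similarity_threshold=0.8):
    """Drop noisy posts, then drop posts too similar to one already kept."""
    kept, kept_shingles = [], []
    for post in posts:
        if is_noise(post):
            continue
        current = shingles(post)
        if any(jaccard(current, seen) >= similarity_threshold for seen in kept_shingles):
            continue
        kept.append(post)
        kept_shingles.append(current)
    return kept


stream = [
    "Typhoon made landfall this morning",
    "Buy now http://spam.example/abcdefghijklmnop !!!",
    "typhoon made landfall this morning!!",
    "Stock market closed higher today",
]
clean = purify(stream)
```

Comparing every new post with all kept posts is quadratic in the stream length, which is the cost that an index-based filter such as the one the abstract mentions would be expected to avoid.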
