Similar literature
 20 similar articles found (search time: 31 ms)
1.
In social tagging systems such as Delicious and Flickr, users collaboratively manage tags to annotate resources. Naturally, a social tagging system can be modeled as a (user, tag, resource) hypernetwork, where there are three different types of nodes, namely users, resources and tags, and each hyperedge has three end nodes, connecting a user, a resource and a tag that the user employs to annotate the resource. Then how can we automatically cluster related users, resources and tags, respectively? This is a problem of community detection in a 3-partite, 3-uniform hypernetwork. More generally, given a K-partite K-uniform (hyper)network, where each (hyper)edge is a K-tuple composed of nodes of K different types, how can we automatically detect communities for nodes of different types? In this paper, by turning this problem into a problem of finding an efficient compression of the (hyper)network’s structure, we propose a quality function for measuring the goodness of partitions of a K-partite K-uniform (hyper)network into communities, and develop a fast community detection method based on optimization. Our method overcomes the limitations of state-of-the-art techniques and has several desired properties: it is comprehensive, parameter-free, and scalable. We compare our method with existing methods on both synthetic and real-world datasets.

2.
Building application domain models is a time-consuming activity in software engineering. In small teams, it is an activity that involves almost all participants, including developers and domain experts. In our approach, we support the knowledge engineering activity by reusing tagging done by team participants when they search for information on the Web about the application’s domain. Team participants collaborate implicitly when they tag, because their individually created tags are collected and form a folksonomy. This folksonomy reflects their knowledge about the domain and is the basis for eliciting domain model elements in the knowledge acquisition and conceptualization tasks in a consensual way. Experiments provide evidence that our approach helps team participants to build richer domain models than if they do not use our software tool. The tool allows the reuse of simple annotations as long as users learn about the application’s domain.

3.
Users of social Web sites actively create and join communities as a way to collectively share their media content and rich experience with diverse groups of people. In this study we focus on the issue of recommending social communities (or groups) to individual users. We address specifically the potential of social tagging for accentuating users’ interests and characterizing communities. We also discuss some unique methods of improving several techniques that have been adapted for use in the context of community recommendations: collaborative filtering, a random walk model, a Katz influence model, a latent semantic model, and a user-centric tag model. We effectively incorporate social tagging information in each algorithm. We present empirical evaluations using real datasets from CiteULike and Last.fm. Our experimental results demonstrate that the different algorithms incorporated with social tagging offer significant advantages in improving both the recommendation quality and coverage, and demonstrate their feasibility for community recommendations in dealing with sparsity-related limitations.

4.
Social networking sites now offer a rich resource of heterogeneous data. The analysis of such data can lead to the discovery of unknown information and relations in these networks. The detection of communities of ‘similar’ nodes is a challenging topic in the analysis of social network data, and it has been widely studied in the social networking community in the context of the underlying graph structure. Online social networks, in addition to having graph structures, include useful user information within the networks. Using this information can enhance the quality of community discovery. In this study, a community discovery method is proposed that uses content information in addition to communication among nodes to improve the quality of the discovered communities. This is a new approach based on frequent patterns and the actions of users on networks, particularly social networking sites where users carry out their preferred activities. The main contributions of the proposed method are twofold: first, based on the interests and activities of users on networks, some small communities of similar users are discovered; second, by using social relations, the discovered communities are extended. The F-measure is used to evaluate the results on two real-world datasets (Blogcatalog and Flickr), demonstrating that the proposed method improves the community detection quality.

5.
Forking is the creation of a new software repository by copying another repository. Though forking is controversial in the traditional open source software (OSS) community, it is encouraged and is a built-in feature in GitHub. Developers freely fork repositories, use the code as their own and make changes. A deep understanding of repository forking can provide important insights for the OSS community and GitHub. In this paper, we explore why and how developers fork what from whom in GitHub. We collect a dataset containing 236,344 developers and 1,841,324 forks. We conduct surveys, and analyze the programming languages and owners of forked repositories. Our main observations are: (1) Developers fork repositories to submit pull requests, fix bugs, add new features and keep copies etc. Developers find repositories to fork from various sources: search engines, external sites (e.g., Twitter, Reddit), social relationships, etc. More than 42% of the developers that we have surveyed agree that an automated recommendation tool is useful to help them pick repositories to fork, while more than 44.4% of developers do not value a recommendation tool. Developers care about repository owners when they fork repositories. (2) A repository written in a developer’s preferred programming language is more likely to be forked. (3) Developers mostly fork repositories from creators. In comparison with unattractive repository owners, attractive repository owners have a higher percentage of organizations, more followers and earlier registration in GitHub. Our results show that forking is mainly used for making contributions to original repositories, and it is beneficial for the OSS community. Moreover, our results show the value of recommendation and provide important insights for GitHub to recommend repositories.

6.
Context: Open source development allows a large number of people to reuse and contribute source code to the community. Social networking features open opportunities for information discovery, social collaborations, and improved recommendations of potential collaborators.
Objective: Online community and development platforms rely on social network features to increase awareness and attention among community members for improved collaborations. The objective of this work is to introduce an approach for recommending relevant users to follow. Follower networks provide means for informal information propagation. The efficiency and effectiveness of such information flows is impacted by the network structure. Here, we aim to understand the resilience of networks against random or strategic node removal.
Method: Social network features of online software development communities present a new opportunity to enhance online collaboration. Our approach is based on the automatic analysis of user behavior and network structure. The proposed ‘who to follow’ recommendation algorithm can be parametrized for specific contexts. Link-analysis techniques such as PageRank/HITS provide the basis for a novel ‘who to follow’ recommendation model.
Results: We tested the approach using a GitHub-based dataset. Currently, users follow popular community members to get updates regarding their activities instead of maintaining personal relations. Thus, social network features require further improvements to increase reciprocity. The application of our ‘who to follow’ recommendation model using the GitHub dataset shows excellent results with respect to context-sensitive following recommendations. The sensitivity of GitHub’s follower network to random node removal is comparable with that of other social networks, but the network is more sensitive to follower-authority-based node removal.
Conclusion: Link-based algorithms can be used for context-sensitive ‘who to follow’ recommendations. GitHub is highly sensitive to authority-based node removal. Information flow established through follower relations will be strongly impacted if many authorities are removed from the network. This underpins the importance of ‘central’ users and the validity of focusing the ‘who to follow’ recommendations on those users.
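
As a rough illustration of how a link-analysis score can drive a ‘who to follow’ list, the sketch below runs a plain PageRank power iteration over a follower graph and suggests the highest-ranked users that a given user does not yet follow. It is not the model from the abstract, which is described as parametrizable and context-sensitive; the function names `pagerank` and `who_to_follow`, the damping factor of 0.85, and the toy graph are all assumptions made for this example.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    `follows` maps each user to the list of users they follow
    (hypothetical input format chosen for this sketch).
    """
    users = set(follows)
    for targets in follows.values():
        users.update(targets)
    users = sorted(users)
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Every user receives the same "teleport" share.
        new_rank = {u: (1.0 - damping) / n for u in users}
        for u in users:
            targets = follows.get(u, [])
            if targets:
                # A user passes rank evenly to everyone they follow.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # A user who follows nobody spreads rank over all users.
                share = damping * rank[u] / n
                for v in users:
                    new_rank[v] += share
        rank = new_rank
    return rank


def who_to_follow(follows, user, k=3):
    """Suggest up to k highest-ranked users that `user` does not follow yet."""
    rank = pagerank(follows)
    already = set(follows.get(user, [])) | {user}
    candidates = [u for u in rank if u not in already]
    # Sort by descending rank; break ties alphabetically for determinism.
    return sorted(candidates, key=lambda u: (-rank[u], u))[:k]


follows = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["bob"],
    "dave": ["bob", "carol"],
}
print(who_to_follow(follows, "alice", k=1))
```

In this toy graph the suggestion for `alice` is a user who is followed by others, which mirrors the abstract's observation that recommendations concentrate on ‘central’ users.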

7.
Complex software development projects rely on the contribution of teams of developers, who are required to collaborate and coordinate their efforts. The productivity of such development teams, i.e., how their size is related to the produced output, is an important consideration for project and schedule management as well as for cost estimation. The majority of studies in empirical software engineering suggest that - due to coordination overhead - teams of collaborating developers become less productive as they grow in size. This phenomenon is commonly paraphrased as Brooks’ law of software project management, which states that “adding manpower to a software project makes it later”. Outside software engineering, the non-additive scaling of productivity in teams is often referred to as the Ringelmann effect, which is studied extensively in social psychology and organizational theory. Conversely, a recent study suggested that in Open Source Software (OSS) projects, the productivity of developers increases as the team grows in size. Attributing it to collective synergetic effects, this surprising finding was linked to the Aristotelian quote that “the whole is more than the sum of its parts”. Using a data set of 58 OSS projects with more than 580,000 commits contributed by more than 30,000 developers, in this article we provide a large-scale analysis of the relation between size and productivity of software development teams. Our findings confirm the negative relation between team size and productivity previously suggested by empirical software engineering research, thus providing quantitative evidence for the presence of a strong Ringelmann effect. Using fine-grained data on the association between developers and source code files, we investigate possible explanations for the observed relations between team size and productivity. 
In particular, we take a network perspective on developer-code associations in software development teams and show that the magnitude of the decrease in productivity is likely to be related to the growth dynamics of co-editing networks which can be interpreted as a first-order approximation of coordination requirements.

8.
Social tagging systems leverage social interoperability by facilitating the searching, sharing, and exchanging of tagging resources. A major drawback of existing social tagging systems is that social tags are used as keywords in keyword-based search. They focus on keywords and human interpretability rather than on computer-interpretable semantic knowledge. Therefore, social tags are useful for information sharing and organizing, but they lack the computer-interpretability needed to facilitate a personalized social tag recommendation. An interesting issue is how to automatically generate a personalized social tag recommendation list for users when they access a resource. The novel solution proposed in this study is a hybrid approach based on a semantic tag-based resource profile and user preference to provide personalized social tag recommendation. Experiments measuring Precision and Recall show that the proposed hybrid approach effectively improves the accuracy of social tag recommendation.
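
For readers unfamiliar with the two metrics named above, the following minimal sketch computes Precision and Recall for a single recommended tag list against the tags a user actually assigned. The function name `precision_recall` and the sample tags are assumptions for illustration; the abstract does not state how its evaluation was averaged over users or resources.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of one recommended tag list.

    precision = hits / number of recommended tags
    recall    = hits / number of relevant (ground-truth) tags
    """
    # Drop duplicate recommendations while keeping their order.
    recommended = list(dict.fromkeys(recommended))
    relevant = set(relevant)
    hits = sum(1 for tag in recommended if tag in relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four recommended tags, three tags the user really used.
p, r = precision_recall(["python", "web", "ai", "ml"], ["python", "ml", "data"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Two of the four recommended tags are relevant, and two of the three relevant tags are recovered, so precision is 2/4 and recall is 2/3.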

9.
The once-sharp distinction between software users and developers is fading away, and richer ecologies of participation are emerging. In particular, software engineering R&D faces new challenges from the quickly increasing population of software developers who are domain experts but don't have the time or desire to be professional software engineers. The metadesign framework reformulates software development activities as a continuum of different degrees of design and use. It's supported by the "seeding, evolutionary growth, reseeding" model and supports the coevolution of individuals, communities, and systems. Guidelines derived from these models can help software developers produce tools for end-user development. This article is part of a special issue on end-user software engineering.

10.
User communities in social networks are usually identified by considering explicit structural social connections between users. While such communities can reveal important information about their members such as family or friendship ties and geographical proximity, just to name a few, they do not necessarily succeed at pulling like‐minded users that share the same interests together. Therefore, researchers have explored the topical similarity of social content to build like‐minded communities of users. In this article, following the topic‐based approaches, we are interested in identifying communities of users that share similar topical interests with similar temporal behavior. More specifically, we tackle the problem of identifying temporal (diachronic) topic‐based communities, i.e., communities of users who have a similar temporal inclination toward emerging topics. To do so, we utilize multivariate time series analysis to model the contributions of each user toward emerging topics. Further, our modeling is completely agnostic to the underlying topic detection method. We extract topics of interest by employing seminal topic detection methods; one graph‐based and two latent Dirichlet allocation‐based methods. Through our experiments on Twitter data, we demonstrate the effectiveness of our proposed temporal topic‐based community detection method in the context of news recommendation, user prediction, and document timestamp prediction applications, compared with the nontemporal as well as the state‐of‐the‐art temporal approaches.

11.
When discussing programming issues on social platforms (e.g., Stack Overflow, Twitter), developers often mention APIs in natural language texts. Extracting API mentions from natural language texts serves as the prerequisite to effective indexing and searching for API-related information in software engineering social content. The task of extracting API mentions from natural language texts involves two steps: 1) distinguishing API mentions from other English words (i.e., API recognition), 2) disambiguating a recognized API mention to its unique fully qualified name (i.e., API linking). Software engineering social content lacks consistent API mentions and sentence writing format. As a result, API recognition and linking have to deal with the inherent ambiguity of API mentions in informal text. For example, a common word may have both an API sense and a normal sense (e.g., append, apply and merge); the simple name of an API can map to several APIs of the same library or of different libraries; and different written forms of an API should be linked to the same API. In this paper, we propose a semi-supervised machine learning approach that exploits name synonyms and rich semantic context of API mentions for API recognition in informal text. Based on the results of our API recognition approach, we further propose an API linking approach leveraging a set of domain-specific heuristics, including mention-mention similarity, scope filtering, and mention-entry similarity, to determine which API in the knowledge base a recognized API actually refers to. To evaluate our API recognition approach, we use 1205 API mentions of three libraries (Pandas, Numpy, and Matplotlib) from Stack Overflow text. We also evaluate our API linking approach with 120 recognized API mentions of these three libraries.

12.
Many famous online social networks, e.g., Facebook and Twitter, have achieved great success in the last several years. Users in these online social networks can establish various connections via both social links and shared attribute information. Discovering groups of users who are strongly connected internally is defined as the community detection problem. The community detection problem is very important for online social networks and has extensive applications in various social services. Meanwhile, besides these popular social networks, a large number of new social networks offering specific services have also sprung up in recent years. Community detection can be even more important for new networks, as high-quality community detection results enable new networks to provide better services, which can help attract more users effectively. In this paper, we study the community detection problem for new networks, which is formally defined as the “New Network Community Detection” problem. The new network community detection problem is very challenging to solve because information in new networks can be too sparse to calculate effective similarity scores among users, which is crucial in community detection. However, we notice that, nowadays, users usually join multiple social networks simultaneously and those who are involved in a new network may have been using other well-developed social networks for a long time. With full consideration of network difference issues, we propose to propagate useful information from other well-established networks to the new network with efficient information propagation models to overcome the shortage of information. An effective and efficient method, Cat (Cold stArT community detector), is proposed in this paper to detect communities for new networks using information from multiple heterogeneous social networks simultaneously. Extensive experiments conducted on real-world heterogeneous online social networks demonstrate that Cat can address the new network community detection problem effectively.

13.
We report on an exploratory study, which aims at understanding how software communities use blogs compared to conventional development infrastructures. We analyzed the behavior of 1,100 bloggers in four large open source communities, distinguishing between committing bloggers and other community members. We observed that these communities intensively use blogs, with one new entry every 8 hours. A blog entry includes 14 times more words than a commit message. When analyzing the content of the blogs, we found that committers and other bloggers write about similar topics. The most popular topics in committers’ blogs represent high-level concepts such as features and domain concepts, while source code related topics are discussed in 15% of their posts. Other community members frequently write about community events and conferences as well as configuration and deployment topics. We found that the blogging peak period is usually after the software is released. Moreover, committers are more likely to blog after corrective engineering than after forward engineering and re-engineering activities. Our findings call for hypothesis-driven research to (a) further understand the role of social media in dissolving the collaboration boundaries between developers and other stakeholders and (b) integrate social media into development processes and tools.

14.
Programming question and answer (Q&A) websites, such as Stack Overflow, leverage the knowledge and expertise of users to provide answers to technical questions. Over time, these websites turn into repositories of software engineering knowledge. Such knowledge repositories can be invaluable for gaining insight into the use of specific technologies and the trends of developer discussions. Previous work has focused on analyzing the user activities or the social interactions in Q&A websites. However, analyzing the actual textual content of these websites can help the software engineering community to better understand the thoughts and needs of developers. In the article, we present a methodology to analyze the textual content of Stack Overflow discussions. We use latent Dirichlet allocation (LDA), a statistical topic modeling technique, to automatically discover the main topics present in developer discussions. We analyze these discovered topics, as well as their relationships and trends over time, to gain insights into the development community. Our analysis allows us to make a number of interesting observations, including: the topics of interest to developers range widely from jobs to version control systems to C# syntax; questions in some topics lead to discussions in other topics; and the topics gaining the most popularity over time are web development (especially jQuery), mobile applications (especially Android), Git, and MySQL.

15.
Social media such as forums, blogs and microblogs are increasingly used for public information sharing and the exchange of opinions. This has changed the way online communities interact and has led to a new trend of engagement for online retailers, especially on microblogging websites such as Twitter. In this study, we investigated the impact of online retailers' engagement with online brand communities on users' perception of brand image and service. First, we analysed the overall sentiment trends of different brands and the patterns of engagement between companies and customers using tweets collected from Twitter. Then, we studied how different types of engagement affect customer sentiments. Our analysis shows that engagement has an effect on sentiments associated with the brand image, perception and customer service of the online retailers. Our findings indicate that the level, length, type and attitude of retailers' engagement with social media users have a significant impact on their sentiments. Based on our results, we derived several important managerial and practical implications.

16.
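To address the characteristics of online social networks and the shortcomings of existing community detection algorithms, this paper proposes ISLPA (Improved Semantic Label Propagation Algorithm), a community detection algorithm for online social networks based on Semantic Web technology, suited to detecting and labeling communities in large-scale online social networks. ISLPA improves on the semantic label algorithm SemTagP: during community partitioning it treats the online social network as a directed weighted graph and, using Semantic Web and social tagging techniques, combines the rich semantic information of the network with its topological features. ISLPA detects communities without requiring the number or size of communities to be set in advance, and it automatically identifies the resulting communities by their labels. The algorithm has near-linear time complexity and is therefore efficient. Experiments show that ISLPA can effectively partition and label real online social networks.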
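
The label-propagation idea that ISLPA builds on can be sketched on a directed weighted graph as follows. This is plain label propagation only, not ISLPA or SemTagP: it uses no semantic or tag information, and the function name `label_propagation`, the fixed update order, the tie-breaking rule, and the toy edge list are assumptions made for this example.

```python
from collections import defaultdict


def label_propagation(edges, max_rounds=100):
    """Plain label propagation on a directed weighted graph.

    `edges` is an iterable of (source, target, weight) triples.
    Each node repeatedly adopts the label with the largest total
    incoming edge weight. Returns a dict {node: community label}.
    """
    incoming = defaultdict(list)
    nodes = set()
    for src, dst, weight in edges:
        incoming[dst].append((src, weight))
        nodes.update((src, dst))

    # Every node starts in its own community.
    labels = {node: node for node in nodes}
    for _ in range(max_rounds):
        changed = False
        # Fixed (sorted) update order keeps the sketch deterministic.
        for node in sorted(nodes):
            votes = defaultdict(float)
            for src, weight in incoming[node]:
                votes[labels[src]] += weight
            if not votes:
                continue  # no incoming edges: keep the current label
            # Highest total weight wins; ties go to the smallest label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def both_ways(a, b, w):
    """Helper for the toy data: one weighted edge in each direction."""
    return [(a, b, w), (b, a, w)]


# Two tightly connected triangles joined by one weak link (c <-> x).
edges = (
    both_ways("a", "b", 1.0) + both_ways("b", "c", 1.0) + both_ways("a", "c", 1.0)
    + both_ways("x", "y", 1.0) + both_ways("y", "z", 1.0) + both_ways("x", "z", 1.0)
    + both_ways("c", "x", 0.1)
)
print(label_propagation(edges))
```

Like the algorithm described above, this sketch needs no preset number or size of communities; the number of communities falls out of how the labels settle.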

17.
Recent years have witnessed an increasing emphasis on human aspects in software engineering research and practices. Our survey of existing studies on human aspects in software engineering shows that screen-captured videos have been widely used to record developers’ behavior and study software engineering practices. The screen-captured videos provide direct information about which software tools the developers interact with and which content they access or generate during the task. Such Human-Computer Interaction (HCI) data can help researchers and practitioners understand and improve software engineering practices from a human perspective. However, extracting time-series HCI data from screen-captured task videos requires manual transcribing and coding of videos, which is tedious and error-prone. In this paper we report a formative study to understand the challenges in manually transcribing screen-captured videos into time-series HCI data. We then present a computer-vision based video scraping technique to automatically extract time-series HCI data from screen-captured videos. We also present a case study of our scvRipper tool, which implements the video scraping technique, using 29 hours of task videos of 20 developers in two development tasks. The case study not only evaluates the runtime performance and robustness of the tool, but also performs a detailed quantitative analysis of the tool’s ability to extract time-series HCI data from screen-captured task videos. We also study the developers’ micro-level behavior patterns in software development from the quantitative analysis.

18.
The popularity of mobile devices has been steadily growing in recent years. These devices heavily depend on software from the underlying operating systems to the applications they run. Prior research showed that mobile software is different than traditional, large software systems. However, to date most of our research has been conducted on traditional software systems. Very little work has focused on the issues that mobile developers face. Therefore, in this paper, we use data from the popular online Q&A site, Stack Overflow, and analyze 13,232,821 posts to examine what mobile developers ask about. We employ Latent Dirichlet allocation-based topic models to help us summarize the mobile-related questions. Our findings show that developers are asking about app distribution, mobile APIs, data management, sensors and context, mobile tools, and user interface development. We also determine what popular mobile-related issues are the most difficult, explore platform specific issues, and investigate the types (e.g., what, how, or why) of questions mobile developers ask. Our findings help highlight the challenges facing mobile developers that require more attention from the software engineering research and development communities in the future and establish a novel approach for analyzing questions asked on Q&A forums.

19.
Open source software (OSS) projects represent a new paradigm of software creation and development based on hundreds or even thousands of developers and users organised in the form of a virtual community. The success of an OSS project is closely linked to the successful organisation and development of the virtual community of support. The main objective of this article is to analyse the activity of virtual communities. Social network analysis is employed to analyse Linux ports to embedded processors as a case study to achieve this aim. The obtained results confirm the necessity of structuring the virtual community with a selection of active developers and core members to promote community activity and attract peripheral users, expanding the impact of the underlying software. The obtained result will be useful for the software industry migrating to the open source software paradigm.

20.
An online learning community enables learners to access up-to-date information via the Internet anytime–anywhere because of the ubiquity of the World Wide Web (WWW). Students can also interact with one another during the learning process. Hence, researchers want to determine whether such interaction produces learning synergy in an online learning community. In this paper, we take the Technology Acceptance Model as a foundation and extend the external variables as well as the Perceived Variables as our model and propose a number of hypotheses. A total of 436 Taiwanese senior high school students participated in this research, and the online learning community focused on learning English. The research results show that all the hypotheses are supported, which indicates that the extended variables can effectively predict whether users will adopt an online learning community. Finally, we discuss the implications of our findings for the future development of online English learning communities.
