题名非结构化P2P网络测量与分析
作者段世惠
学位类别博士
答辩日期2009-05-26
授予单位中国科学院声学研究所
授予地点声学研究所
关键词对等网络 Gnutella 网络测量 内容相似度 拓扑动态性 搜索算法 布鲁姆过滤器
其他题名Studies on Measurement and Analysis of Unstructured Peer-to-Peer Network
学位专业信号与信息处理
中文摘要近年来,对等网络技术蓬勃发展,已迅速成为下一代互联网技术研究的热点。Gnutella是一个当前非常流行的P2P共享文件系统,是典型的非结构化对等网络,它拥有百万级在线用户群,已成为目前最成熟和应用最广泛的非结构P2P系统。对Gnutella进行测试与分析,将有助于了解P2P网络的运行规律和解决P2P网络优化、网络监管等问题。 本文对Gnutella网络测量与性能分析做了大量研究,主要包括文件共享属性测量、网络拓扑与内容相关性分析、拓扑动态性测量和搜索算法优化等。经过上述研究,本文归纳出Gnutella网络在文件共享和拓扑动态属性方面的一些运行规律,针对拓扑与内容无关性、搜索算法效率低等问题提出了相关的改进方案,有效地改善了Gnutella网络性能。 本文的主要研究成果如下: 1、对Gnutella网络文件共享属性进行了详尽的测量,归纳了Gnutella网络用户在文件共享方面的规律。基于所测量的文件共享属性,本文对这些属性进行了关联分析,给出了这些属性之间的内在联系,并对用户进行了聚类分析。最后,采用词频统计方法,本文尝试对网络上共享的不良信息进行了初步的分析。 2、提出一种基于内容相似度的拓扑改善算法。通过研究与测量表明,现有的Gnutella网络中内容分布与网络拓扑结构无关,因而导致查询效率低下。基于节点之间的内容相似度强弱,该算法有选择地断开相似度弱的节点连接,而重新建立相似度强的节点连接。仿真表明,该算法能够在一定程度上实现网络拓扑结构与内容分布相关,增强内容相似度强的节点之间的连接,改善网络的聚类性质。 3、对Gnutella网络拓扑的动态性做短期型测量,在个体层面上归纳出超级节点在网络中的运行规律。在测量基础上,对节点连接度的变化规律进行了研究,分析表明,超级节点通常以较高概率保持在稳态;采用M/G/C/C排队系统可以很好地模拟连接度的变化;本文还提出一种与实际测量值无显著差异的节点度变化序列生成方法。 4、提出一种基于BloomFilter的提示性搜索算法。该算法利用BF技术生成路由条目并在一定范围内相互交换本地路由表,使节点能够了解一定范围内的节点共享信息,实现有针对性的搜索,避免了传统的盲目性搜索。仿真表明,该算法查询搜索时产生的消息数量比传统算法减少了一个数量级,并能够获得较高的查全率与较低的覆盖率。
英文摘要The P2P technology has been developed fast in recent years and become the study focus in research of the next generation internet technologies. Gnutella is a very popular P2P file sharing system currently and it is a typical unstructured peer-to-peer network, which has mega online global users and turns to be the most mature and widely used unstructured P2P system. It is useful to know the P2P network operation pattern and resolve these problems such as P2P network optimization and supervision by measuring and analyzing Gnutella network. This thesis does lots of research on Gnutella network measurement and performance analysis, including file shared properties measurement, relativity between topology and content, network dynamic properties and search algorithm optimization. Following from the above studies, this thesis concludes some operation patterns in file sharing and topology dynamics, and proposes some novel and effective approaches to address the problems including the independent between network topology and content distribution and search algorithm inefficiency, which effectively improve the performance of Gnutella network. The main research contributions of this thesis include: 1. This thesis does detailed measurement on file sharing properties and concludes the pattern about these properties in the Gnutella network. Based these file sharing properties, this thesis does association analysis between these patterns,gives the internal relationship and do cluster analysis to the users. Finally, with the term frequency statistical method, this thesis tries to analyze the vicious information which is shared in Gnutella network. 2. A topology improved algorithm is proposed based on shared content similarity. The study and measurement shows that it is independent between network topology and content distribution which leads to inefficient search in Gnutella. Based on the similarity between nodes, this algorithm selectively disconnects the nodes which are with weak similarity and reconnects the nodes which are with strong similarity. Our simulation shows that, this algorithm can implement the integration of network topology and content distribution to a certain extent, enhance the connections between nodes which have strong similarity and improve the network clustering characteristic. 3. This thesis does short-term measurement for the Gnutella network dynamic properties and concludes UltraPeer operation pattern at individual level. Based on the measurement, the analysis shows that the UltraPeers usually keep steady-state with high probability; the M/G/C/C queue system can well simulate the change of node connection degree; moreover, a scheme for node connection degree sequence generation is proposed and the produced sequence is not remarkably different from the actual measured sequence. 4. An informed search algorithm is proposed based on BloomFilter. The algorithm generates routing table items using BloomFilter and interchanges these items in a certain range, which makes nodes know the shared information about this range and implement informed search to avoid traditional blind search. Our simulation shows that, the number of messages produced by this algorithm is one order of magnitude lower than the traditional algorithm and this algorithm can gain higher recall ratio and lower coverage ratio.
语种中文
公开日期2011-05-07
页码146
内容类型学位论文
源URL[http://159.226.59.140/handle/311008/356]  
专题声学研究所_声学所博硕士学位论文_1981-2009博硕士学位论文
推荐引用方式
GB/T 7714
段世惠. 非结构化P2P网络测量与分析[D]. 声学研究所. 中国科学院声学研究所. 2009.
个性服务
查看访问统计
相关权益政策
暂无数据
收藏/分享
所有评论 (0)
暂无评论
 

除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。


©版权所有 ©2017 CSpace - Powered by CSpace