Public Cultural Service Platform

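A Container-Based Self-Organizing Storage Model (cited by: 1): 2010; To address the particular storage needs of Internet Web applications, especially Web 2.0 applications, this paper proposes a container-based self-organizing storage model (CSS-M). CSS-M manages storage space with containers and stores each user's files in clusters, which makes backup, migration, and recovery of user data more efficient. The model provides fast file access through unique file identifiers, and it organizes user files into a tree using file sets, which gives flexible file management. Containers serve as the basic unit of data location and replication, and container metadata is maintained in a self-organizing way using peer-to-peer overlay network techniques. In addition, master-slave container replication and container-state-based recovery ensure data reliability and consistency. A prototype storage system was built on CSS-M, and preliminary experimental results show that CSS-M offers good performance and scalability and can meet the storage needs of Internet Web applications.; 余利华, 陈刚, 王伟, 陈柯, 董金祥; Keywords: distributed storage system, peer-to-peer overlay network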

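A Self-Learning Algorithm for Chinese Address Deduplication: With the rapid development of Chinese search engine technology and large-scale data mining, efficient and accurate Chinese address deduplication, as one of their core techniques, has become a focus of academic research. Research on address deduplication for Chinese has not yet been fully developed, and existing work relies on domain knowledge when judging multiple expressions of the same address, ...; 周佳庆, 李晓燕, 陈珂, 胡天磊, 陈刚; Keywords: data cleaning, self-learning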

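Deep Web Query Interface Integration Based on Tree Merging: As online database applications have become popular, the Internet as a whole has rapidly "deepened". Within a given domain of the deep Web, different sites often provide query interfaces with different query capabilities. To integrate the data sources within one domain, the first problem to solve is the integration of query interfaces. But facing the number...; 陶然, 江锦华, 吴羽, 陈刚; Keywords: query interface integration, tree model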

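InfoSigs: A Fine-Grained Clustering Algorithm for Web Objects: Fine-grained clustering of Web objects has become an active research topic. However, most existing clustering models focus only on clustering text content or article topics, so their results are coarse-grained and cannot meet the quality requirements of large-scale Web information retrieval. To address these challenges, this paper mines the tree-structured probability among terms in Web documents...; 盛振华, 吴羽, 江锦华, 寿黎但, 陈刚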

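An Image Retrieval Clustering Method for Collaborative Tagging Systems (cited by: 3): 2010; To make image retrieval more effective, this paper proposes an image retrieval clustering method for Web 2.0 collaborative tagging systems. The algorithm first addresses the inconsistency in the tag space caused by the diversity of tag expressions: it mines lexical relations between tags to perform semantic-level query expansion and obtains an expanded result set of images that may be semantically related. It then uses a relatedness measure between tags to select, from the image result set, the tags that are highly related to the query tag, and applies a top-down heuristic graph partitioning algorithm to classify this set of related tags automatically. Finally, the image result set is clustered according to the tag classification. To validate the method, a large set of real images and their tag metadata was randomly downloaded from the tag-based image sharing site Flickr, and extensive experiments were run on PivotBrowser, an image retrieval prototype system that had already been implemented. The results show that the clustering algorithm effectively resolves both inconsistent tag expression and tag query ambiguity in the tag space, and gives users more satisfactory retrieval.; 李晓燕, 陈刚, 寿黎但, 董金祥; Keywords: tags, ambiguity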

Adaptive Indexing of Moving Objects with Highly Variable Update Frequencies (cited by: 3): 2008; In recent years, management of moving objects has emerged as an active topic in spatial access methods. Various data structures (indexes) have been proposed to handle queries on moving points; for example, the well-known Bx-tree uses a novel mapping mechanism to reduce index update costs. However, almost all existing indexes for predictive queries are not applicable when the update frequencies of moving objects become highly variable and when the system needs to balance the performance of updates and queries. In this paper, we introduce two novel indexes, named By-tree and αBy-tree. By associating a prediction life period with every moving object, the proposed indexes are applicable in environments with highly variable update frequencies. In addition, the αBy-tree can balance the performance of updates and queries according to a balance parameter. Experimental results show that the By-tree and αBy-tree outperform the Bx-tree under various conditions.; 陈楠, 寿黎但, 陈刚, 董金祥; Keywords: index

Bottom-up mining of XML query patterns to improve XML querying (cited by: 1): 2008; Querying XML data is a computationally expensive process due to the complex nature of both the XML data and the XML queries. In this paper we propose an approach to expedite XML query processing by caching the results of frequent queries. We discover frequent query patterns from user-issued queries using an efficient bottom-up mining approach called VBUXMiner. VBUXMiner consists of two main steps. First, all queries are merged into a summary structure named "compressed global tree guide" (CGTG). Second, a bottom-up traversal scheme based on the CGTG is employed to generate frequent query patterns. We use the frequent query patterns in a cache mechanism to improve XML query performance. Experimental results show that our proposed mining approach outperforms previous mining algorithms for XML queries, such as XQPMinerTID and FastXMiner, and that by caching the results of frequent query patterns, XML query performance can be dramatically improved.; Yi-jun BEI, Gang CHEN, Jin-xiang DONG, Ke CHEN; Keywords: XML query, data hiding

Adaptive XML to relational mapping: an integrated approach (cited by: 3): 2008; Storing and querying XML (eXtensible Markup Language) data in relational form can exploit various services offered by modern relational database management systems (RDBMSs). Due to the structural complexity of XML, there are many equivalent relational mapping schemes for the same XML data and queries. In this paper, we propose the adaptive XML to relational mapping (AX2RM) system, which treats finding the optimal XML to relational (X2R) mapping as four separate but correlated procedures: logical database design, data scale estimation, workload transformation, and physical database design. We view the whole process as an autonomic computing problem and formalize the adaptive X2R mapping problem. Search spaces for each procedure are investigated individually, and five approaches for finding the optimal mapping are studied. We propose an integrated approach with greedy pruning (IT-GP), which views the mapping procedures as a whole and exploits heuristic rules in each procedure to prune impossible mappings as early as possible. Evaluation of these approaches shows the validity and high efficiency of IT-GP.; Tian-lei HU, Gang CHEN; Keywords: XML, mapping techniques, relational database

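A State-Based General Autonomic Computing Model (cited by: 11): 2007; Large computing systems make management work complex and the handling of unexpected events inefficient. To address this, the paper proposes a general autonomic computing model. The model summarizes the system's running states and state changes and uses a statistical scoring method to judge and label each state change. In this way it realizes the autonomic computing properties of self-configuration, self-recovery, self-optimization, and self-protection, so that the system can autonomously plan the scheme that best meets its operating requirements. The heavy work of system configuration and incident handling thus shifts from administrators to the system itself. Experimental results show that applying the model achieves reasonable allocation and use of system resources autonomously while reducing the management burden.; 臧铖, 黄忠东, 董金祥; Keywords: access control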

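A Weighted Haar Wavelet Method for Estimating XML Containment Joins (cited by: 3): 2009; For the containment join, a basic query operator of the eXtensible Markup Language (XML), this paper proposes a method based on weighted Haar wavelets for estimating result size. The method uses Haar wavelets to compress the statistics of XML containment join results effectively and maintains the statistics through a wavelet synopsis. In the estimation phase, the wavelet coefficients are used to reconstruct the containment join result size. To reduce estimation error, a weighting model based on the query frequency of tag names is proposed and integrated into the Haar wavelet estimation method. Experiments show that, for estimating the result size of XML containment joins, the weighted Haar wavelet method outperforms earlier estimation methods such as histograms and random sampling. Under the same space constraint, weighted wavelet estimation has a smaller average relative error.; 邵峰, 陈刚, 陈珂, 贝毅君, 董金祥; Keywords: eXtensible Markup Language, Haar wavelet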


National Natural Science Foundation of China (60603044)
