Similar literature
20 similar articles found; search took 399 ms
1.
侯锟  罗海龙 《科技广场》2007,22(3):117-118
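This paper studies a method for extracting list information from Web pages. Extraction knowledge is obtained by analyzing the features of hypertext documents, and self-learning lets the method adapt to page changes, thereby achieving extraction of list information.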

2.
邱金鹏 《科技通报》2019,35(10):133-136
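Traditional semantic annotation methods for Web pages require manual processing, or can only assign data to tags that carry attributes and leave data under attribute-less tags unannotated; they are unsuitable for annotating large numbers of Web pages, and their results are unreliable. To address this, a new dynamic Web page semantic annotation method based on ensemble learning is proposed. The annotation workflow for dynamic Web pages is given. Web pages are converted into DOM trees and the text to be annotated is identified. Features of the extracted information and of the training Web pages are selected, content carrying semantic information is assigned to a conceptually abstracted ontology, and multi-classifier ensemble learning is used to classify whether each item to be annotated is an attribute label or a data element; the agreement among the predictions of the different classifiers measures the confidence that a sample is annotated correctly. Semantic annotation is then carried out using the attribute annotation rule set covered by the training pages and the attribute names in the extracted information. Experimental results show that the proposed method is suitable for semantic annotation of large-scale dynamic Web pages and that its annotation results are reliable.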

3.
彭同坠 《科教文汇》2008,(36):278-278
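Research on information extraction aims to give people a more effective way of obtaining information. To cope with the heterogeneity and dynamic nature of Web pages on the Internet, this paper proposes a general method for extracting information from Web news pages. The method overcomes a drawback of traditional Web page information extraction, namely that a different wrapper must be built for each website. It mainly targets the body text, publication time, and reprint information of news pages, provides corpus support for natural language processing research, and its accuracy meets requirements well.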

4.
Research on information extraction from Web pages described in HTML or XML   Total citations: 1, self-citations: 0, citations by others: 1
谢维成  吕先竞  宋玉忠 《情报科学》2005,23(9):1398-1402
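Mining valuable information from enterprises of the same kind is an important task of enterprise informatization. At present most enterprise information on the Web is expressed in HTML, but the number of enterprise Web pages described in XML is gradually increasing. Web data extraction is the key to mining enterprise information on the Web. This paper proposes a Web data extraction model for Web pages described in HTML and XML and explains how it is implemented.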

5.
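Acquiring online product intelligence with pattern-matching extraction technology   Total citations: 1, self-citations: 0, citations by others: 1
Ever shorter product life cycles force enterprises to pay attention to promptly obtaining new-product information from the large volume of scattered information on the Internet and to tracking competitors' R&D trends. This paper introduces automatic Web information extraction based on pattern matching and describes a method for extracting key product information. Taking the automatic extraction of performance parameters of household refrigerators as an example, it analyzes refrigerator domain knowledge, analyzes and generalizes sample pages, identifies the various attributes of refrigerator products and the pattern features for product information extraction, and finally obtains clean, structured product data. The result is a complete processing workflow model for extracting key information on similar products from Web pages, a solid exploration of a new intelligence research mode for intelligence collection and analysis in a networked environment.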

6.
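[Purpose/Significance] With the explosive growth of Web pages and the steady increase in page noise, the development of enterprise competitive intelligence systems and intelligent websites, as well as reading on mobile devices, urgently need a method that extracts Web page information efficiently and accurately. [Method/Process] This paper proposes a new information extraction method based on repeated pattern recognition. Through page parsing, similarity computation, clustering into groups, and removal of banner ads and navigation links, it extracts the title and main content of detail pages. [Result/Conclusion] For pages with a stable structure, the method achieves fairly high-quality information extraction. Its weakness is that clustering and similarity computation are computationally expensive and time-consuming.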

7.
马玉春  孙冰 《情报科学》2005,23(9):1376-1380
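More and more websites are devoted to stocks. How to extract information from the relevant pages of these sites and derive knowledge that gives investors a reference for trading decisions is a topic worth studying. This paper analyzes the Wrapper method commonly used in information extraction and the methods for acquiring extraction knowledge. Finally, following the principles of visual information extraction, it designs an experiment in visual information extraction, which achieved good results.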

8.
翟东升  余旸 《情报杂志》2005,24(8):33-35
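A feasible scheme is proposed for extracting information from Web page tables in an early-warning system for technical barriers to international trade. Data capture starts from an analysis of the HTML source code of Web pages and adopts an Ontology-based extraction method combined with a series of mature models; on that basis a Web page information collection system was built and tested. Experimental results show that the scheme is practical and that capture is relatively fast and relatively accurate.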

9.
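Research on automatic Web page classification based on an online news corpus   Total citations: 1, self-citations: 0, citations by others: 1
Because Web pages are far richer than plain text files in how they express information, Web page classification differs from plain text classification. Based on the characteristics of Chinese news pages on the Web, we propose a practical, dictionary-free algorithm for extracting topics from Web pages. The extracted class topic concepts are merged into the knowledge base used for classification, and classification is then performed with the hybrid classification algorithm proposed by our research group. The experimental corpus was taken from financial news on Xinhuanet (新华网). The experimental results show that classification performance improves compared with using the full text alone without Web page features.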

10.
杜翠茹 《大众科技》2010,(5):153-154
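Page layout methods are an important part of teaching Web design courses. Layout determines the typesetting of a page and the arrangement of its content; it directly affects how the information on a site's pages is read and whether the pages look good. In addition, because page content is transmitted over a network, page display involves some delay, which calls for designing pages that display their information accurately and quickly. Layout methods are therefore especially important.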

11.
A research group led by Prof. ZHAI Qiwei from the Institute for Nutritional Sciences under the CAS Shanghai Institutes for Biological Sciences has discovered that even relatively low doses of resveratrol, a chemical found in the skins of red grapes and in red wine, can improve the sensitivity of mice to the hormone insulin, according to a report in the October 2007 issue of Cell Metabolism. As insulin resistance is often characterized as the most critical factor contributing to the development of Type 2 diabetes, the findings "provide a potential new therapeutic approach for preventing or treating" both conditions, the researchers said.

12.
This study examined how students who had no prior experience with videoconferencing would react to the use of videoconferencing as an instructional medium. Students enrolled in seven different courses completed a questionnaire at the beginning of the semester and again at the end of the semester. Students at the origination and remote sites did not differ in their reactions toward videoconferencing, but there was a significant difference for gender. Women reacted less favorably to videoconferencing. Compared to the beginning of the semester, students reported significantly less positive attitudes toward taking a course through videoconferencing at the end of the semester. There were no significant differences in students' attitudes toward videoconferencing across courses at the beginning of the semester, but there were significant differences across the courses at the end of the semester. The results suggest the need for better preparation for both students and instructors.

13.
A joint study by Prof. ZHANG Zhibin from the CAS Institute of Zoology and his co-workers from Norway, the US and Switzerland has indicated that historical outbreaks of migratory locusts in China were associated with cold spells, suggesting that China's projected climate warming could decrease the pest's numbers. The study was published in Proceedings of the National Academy of Sciences on 17 September 2007.

14.
A computer-mediated group is a complex entity whose members exchange many types of information via multiple means of communication in pursuit of goals specific to their environment. Over time, they coordinate technical features of media with locally enacted use to achieve a viable working arrangement. To explore this complex interaction, a case study is presented of the social networks of interactions and media use among members of a class of computer-supported distance learners. Results show how group structures associated with project teams dominated who communicated with whom, about what, and via which media over the term, and how media came to occupy their own communication niches: Webboard for diffuse class-wide communication; Internet Relay Chat more to named others but still for general communication across the class; and e-mail primarily for intrateam communication. Face-to-face interaction, occurring only during a short on-campus session, appears to have had a catalytic effect on social and emotional exchanges. Results suggest the need to structure exchanges to balance class-wide sharing of ideas with subgroup interactions that facilitate project completion, and to provide media that support these two modes of interaction.

15.
CAS should stick to the principle of rendering service to, and giving impetus for, the development of China's science enterprise by making S&T innovations, said CAS President LU Yongxiang. The CAS president made the remarks in a recent talk to communicate the gist of the winter session of the Party's Leading-member Group at CAS, which was held from 7 to 11 January in Beijing.

16.
Active biological molecules and functional structures can be fabricated into a bio-mimetic system by using the molecular assembly method. Such materials can be used for drug delivery, disease diagnosis and therapy, and the construction of new nanodevices.

17.
Electronic data interchange (EDI) provides means for interorganizational communication, creates network externalities, requires an advanced information technology (IT) infrastructure, and relies on standards. In the diffusion of such innovations, institutional involvement is imperative. Such institutions include governmental agencies, national and global standardization organizations, local government, and nonprofit private organizations like industry associations. We call the last type of organization intermediating institutions. They intermediate or coordinate ("inscribe") the activities of a group of would-be adopters. Unfortunately, little is known of how these organizations shape the EDI diffusion trajectory. In this article we examine one specific type of intermediating organization (industry associations) and how they advanced the EDI diffusion process in the grocery sectors of Hong Kong, Denmark and Finland. We identify six institutional measures, placed into a matrix formed by the mode of involvement (influence vs. regulation) and the type of diffusion force (supply push vs. demand pull), that can be mobilized to further EDI diffusion. Industry associations were found to be active users of all these measures to varying degrees. Their role was critical especially in knowledge building, knowledge deployment, and standard setting. Furthermore, institutional involvement varied due to policy and cultural contingencies and power dependencies.

18.
With great care, Dr. ZHOU Zhonghe takes out a package wrapped in cotton tissue from a drawer and says: "This is the gem of our collections: the fossil of a bird that lived 125 million years ago!" Then, pointing at a tiny mound, he explains: "Look, this is the claw and that is the head. It was in the egg shell and ready to hatch ... The species fell into a family of waterside inhabitants."

19.
The increasing prospect of digital piracy has prompted a perceived need for electronic publishers to adopt technical systems of protection, and for governments to reform their copyright laws. This article is a preliminary study of the management of intellectual property by electronic publishers, defined as those involved in the production of online databases and CD-ROMs. It focuses on three main issues: (1) how electronic publishers view the increasing threat of piracy; (2) the methods of protection employed to protect intellectual property in digital format; and (3) the importance of technological protection of intellectual property in electronic publications. The analysis is based on a sample of 23 UK electronic publishers. The interviews revealed an interesting assortment of protection methods and did not show that technological protection was a preferred approach. Instead, the means of protection, in addition to copyright law, comprised niche markets, pricing, trust, bad publicity, and nontechnical and technical means.

20.
This essay focuses on universal service and the Internet as means to support social and political participation. The emphasis on access to telecommunications systems in conventional approaches to universal service is contrasted with access to content. A model of the information environment is described that accounts for the roles of content and conduit, both of which are necessary conditions to achieve true access. A method is outlined for employing information indicators to observe or measure the information environment.

