
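GFL: A Graphic Formal Language for Indexing Generic Chemical Structures
Cite this article: Sun Yanling, Zhang Di, Yang Suyan, Su Xiangyin, Gao Ming, Jiang Kexia, Jiang Shumei, Sun Xu, Wang Xin, Liu Huabing, Gan Lin, Xu Jun. GFL: A Graphic Formal Language for Indexing Generic Chemical Structures [J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2007(2): 253-259.
Authors: Sun Yanling  Zhang Di  Yang Suyan  Su Xiangyin  Gao Ming  Jiang Kexia  Jiang Shumei  Sun Xu  Wang Xin  Liu Huabing  Gan Lin  Xu Jun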
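Affiliations: 1. Patent Data R&D Center, Intellectual Property Publishing House, State Intellectual Property Office, Beijing 100088, China
2. School of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China; DPI, 9640 Towne Centre Drive, San Diego, California 92121, USA
Funding: National High Technology Research and Development Program of China (863 Program)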
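Abstract: To meet the growing demand for patent searching, the National High Technology Research and Development Program of China (863 Program) launched the research and development of a generic chemical structure database system. Such a system involves two key technologies: (1) the computer representation of generic chemical structures, and (2) retrieval algorithms for generic chemical structures. This paper focuses on the computer representation. Generic chemical structures in original chemical patent documents are expressed in a natural language that follows certain conventions. To store and retrieve this information in a computer system, the generic chemical structures expressed in natural language must be converted into an unambiguous formal language that a computer can accept. This process is called the indexing of generic chemical structures. The fragment-based formal languages generally used internationally for indexing generic chemical structures were developed in the 1970s and 1980s; they are far removed from the graphical natural language used by chemists, and indexing with them is slow and costly. This paper introduces a graphic formal language for indexing generic chemical structures, developed on the basis of the drawing functions of ISIS/Draw. Its main features are that it is close to the graphical natural language chemists use every day and that its rules are simple and easy to master, which improves indexing efficiency and lowers the cost of implementing a generic chemical structure database system.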

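Keywords: generic chemical structure  Markush structure  indexing  graphic formal language  computer retrieval
Revised manuscript received: April 11, 2006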

GFL: A Graphic Formal Language for Markush Structure Indexing
Sun Yanling, Zhang Di, Yang Suyan, Su Xiangyin, Gao Ming, Jiang Kexia, Jiang Shumei, Sun Xu, Wang Xin, Liu Huabing, Gan Lin, Xu Jun. GFL: A Graphic Formal Language for Markush Structure Indexing [J]. Journal of the China Society for Scientific and Technical Information, 2007(2): 253-259.
Authors: Sun Yanling  Zhang Di  Yang Suyan  Su Xiangyin  Gao Ming  Jiang Kexia  Jiang Shumei  Sun Xu  Wang Xin  Liu Huabing  Gan Lin  Xu Jun
Abstract: The State Intellectual Property Office of the P.R.C. (SIPO) receives a huge number of chemical patent applications each year. Academic institutions and enterprises have to search a large number of chemical patents in order to protect their own intellectual property and to make use of known technology. In 2004, the National High Technology Research and Development Program of China initiated a generic chemical structure database project as a solution to these chemical patent processing challenges. The two core technologies of this project are (1) the computer representation of generic chemical structures and (2) retrieval algorithms for generic chemical structures. This article presents new protocols for representing generic chemical structures in a computer system. A generic chemical structure in a chemical patent is described in natural language, which is not well defined. Such natural language has to be formalized in order to be stored, exchanged, and searched in a database system. The formalized language is called a formal language. Indexing is the process of translating a chemical patent from natural language into a formal language. A number of formal languages for generic chemical structures have been reported in past years. Most of them are based on the concept of chemical structure fragmentation. The main disadvantages of these languages are that (1) their syntax is too complicated to learn, and (2) their rules differ too much from natural chemical language and are hard to understand. These problems make the chemical patent indexing process very costly. In this paper, we propose a novel formal language for representing generic chemical structures that is close to natural chemical language and whose syntax rules are concise and easy to learn. As a result, the new formal language has been well received in our chemical patent indexing process at SIPO.
Keywords: generic chemical structure  Markush structure  indexing  graphic formal language  Markush database
This article is indexed in CNKI, Wanfang Data, and other databases.