Optimizing Extraction of Science Documents' Metadata in PDF Format Based on XSLT (基于XSLT的PDF论文元数据的优化抽取)

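Cite this article:	Chen Junlin, Zhang Wende. Optimizing Extraction of Science Documents' Metadata in PDF Format Based on XSLT [J]. New Technology of Library and Information Service, 2007, 2(2): 18-23.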

Authors:	Chen Junlin, Zhang Wende

Institution:	Fuzhou University Library, Fuzhou 350002

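Abstract:	Briefly describes the conversion tool and extraction language used in PDF information extraction, analyzes the intermediate documents produced by PDFTOHTML format conversion, examines the problems with first-page metadata in PDF scientific papers, and gives solutions to these problems.
Keywords:	PDF to HTML; XSLT; metadata
Received:	2006-11-10
Revised:	2006-11-10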
Optimizing Extraction of Science Documents' Metadata in PDF Format Based on XSLT

Chen Junlin, Zhang Wende. Optimizing Extraction of Science Documents' Metadata in PDF Format Based on XSLT [J]. New Technology of Library and Information Service, 2007, 2(2): 18-23.

Authors:	Chen Junlin, Zhang Wende

Institution:	Library of Fuzhou University, Fuzhou 350002, China

Abstract:	This paper first introduces a format conversion tool and XSLT, the language used to write the extraction rules, and then briefly analyzes the intermediate documents generated by PDF to HTML. It goes on to discuss the problems with metadata in science documents in PDF format and finally gives methods for solving them.

Keywords:	PDF; PDF to HTML; XSLT
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.