1.

Modeling context through domain ontologies (total citations: 1; self-citations: 0; citations by others: 1)

Nathalie Hernandez Josiane Mothe Claude Chrisment Daniel Egret 《Information Retrieval》2007,10(2):143-172

Traditional information retrieval systems aim at satisfying most users for most of their searches, leaving aside the context in which the search takes place. We propose to model two main aspects of context: the themes of the user's information need and the specific data the user is looking for to achieve the task that motivated the search. Both aspects are modeled by means of ontologies. Documents are semantically indexed according to the context representation and the user accesses information by browsing the ontologies. The model has been applied to a case study that has shown the added value of such a semantic representation of context.

2.

Evaluating the effectiveness of content-oriented XML retrieval methods (total citations: 1; self-citations: 0; citations by others: 1)

Norbert Gövert Norbert Fuhr Mounia Lalmas Gabriella Kazai 《Information Retrieval》2006,9(6):699-722

Content-oriented XML retrieval approaches aim at a more focused retrieval strategy: instead of retrieving whole documents, document components that are exhaustive to the information need while at the same time being as specific as possible should be retrieved. In this article, we show that the evaluation methods developed for standard retrieval must be modified in order to deal with the structure of XML documents. More precisely, the size and overlap of document components must be taken into account. For this purpose, we propose a new effectiveness metric based on the definition of a concept space defined upon the notions of exhaustiveness and specificity of a search result. We compare the results of this new metric with the results obtained with the official metric used in INEX, the evaluation initiative for content-oriented XML retrieval.

3.

How to manage an information state: Jean-Baptiste Colbert’s archives and the education of his son

Jacob Soll 《Archival Science》2007,7(4):331-342

This article examines the archival methods developed by Colbert to train his son in state administration. Based on Colbert’s correspondence with his son, it reveals the practices Colbert thought necessary to collect and manage information in his state encyclopedic archive during the last half of the 17th century.

4.

Canadian Social Science and Humanities Online Journal Publishing, the Synergies Project, and the Creation and Representation of Knowledge

Rowland Lorimer John Maxwell 《Publishing Research Quarterly》2007,23(3):175-193

5.

Multilingual phrase-based concordance generation in real-time

Kumiko Tanaka-Ishii Yuichiro Ishii 《Information Retrieval》2007,10(3):275-295

We present software that generates phrase-based concordances in real-time based on Internet searching. When a user enters a string of words for which he wants to find concordances, the system sends this string as a query to a search engine and obtains search results for the string. The concordances are extracted by performing statistical analysis on search results and then fed back to the user. Unlike existing tools, this concordance consultation tool is language-independent, so concordances can be obtained even in a language for which there are no well-established analytical methods. Our evaluation has revealed that concordances can be obtained more effectively than by only using a search engine directly.

6.

Chinese Publishing Industry Going Global: Background and Performance

Lifang Xu Qing Fang 《Publishing Research Quarterly》2008,24(1):64-72

To put an end to the large copyright trade deficit, both Chinese government agencies and publishing houses have been striving to enter the international publication market. The article analyzes the background of the going-global strategy, and sums up the performance of both Chinese administrations and publishers.

Corresponding author: Qing Fang

7.

The Identification of Digital Book Content

Andy Weissberg 《Publishing Research Quarterly》2008,24(4):255-260

This article analyzes current industry practices toward the identification of digital book content. It highlights key technology trends, workflow considerations and supply chain behaviors, and examines the implications of these trends and behaviors on the production, discoverability, purchasing and consumption of digital book products.

8.

Electronic Books in the 2003–2005 Period: Some Reflections on Their Apparent Potential and Actual Development

Gemma Towle James A. Dearnley Cliff McKnight 《Publishing Research Quarterly》2007,23(2):95-104

This paper, based on PhD research, reflects upon the market for electronic books in the general trade sectors of UK and US publishers during the early years of the 21st century. The paper reports on interviews carried out with publishers between 2003 and 2005, and reflects upon four areas which presented and still present challenges to the uptake of e-books—negative perceptions from consumers; formats; pricing and issues regarding digital rights. The paper concludes that the development and uptake of electronic books has some way to go in the general trade/mass-market sectors.

9.

International Investments and Acquisitions in India: Tax and Regulatory Aspects

Sandeep Chaufla 《Publishing Research Quarterly》2008,24(3):187-201

A review and analysis of the rules and regulations including the tax aspects of making an investment in India is presented. The full range from Foreign Direct Investment to different forms of doing business with specific examples from the publishing industry is explored to help understand current policies and regulations.

10.

Chinese Children’s Book Market and the German Experiences in Cooperation with Chinese Publishers

Bartz Jing 《Publishing Research Quarterly》2008,24(1):73-78

A summary overview of the children’s and young adult publishing industry in China, with a focus on the size of the market, ten major publishing houses, copyright and trends. Special emphasis is placed on specific transactions for the sale of translation rights from German-language publishers to China and on the minimal activity in German rights sold to Chinese publishers.

11.

The long-term preservation of identifiable personal data: a comparative archival perspective on privacy regulatory models in the European Union, Australia, Canada and the United States

Livia Iacovino Malcolm Todd 《Archival Science》2007,7(1):107-127

This article analyses the extent to which archival exemptions for historical, scientific and statistical research in privacy legislation support preservation in selected European Union countries, and comparable aspects of Australian, American and Canadian law within a legal, ethical and digital archival perspective. The authors recommend that the further processing of personal data under data protection law be given a wider scope of interpretation for archival preservation purposes in both the public and private sector, coupled with the use of researcher and archival codes in relation to access to personal data. They also recommend early appraisal and integration of privacy with freedom of information and archival regimes.

12.

Teaching mathematics for search using a tutorial style of delivery

Andrew MacFarlane 《Information Retrieval》2009,12(2):162-178

Understanding of mathematics is needed to underpin the process of search, either explicitly with Exact Match (Boolean logic, adjacency) or implicitly with Best Match natural language search. In this paper we outline some pedagogical challenges in teaching mathematics for information retrieval (IR) to postgraduate information science students. The aim is to take these challenges, found either by experience or in the literature, and to identify both theoretical and practical ideas in order to improve the delivery of the material and positively affect the learning of the target audience by using a tutorial style of teaching. Results show that there is evidence to support the notion that a more pro-active style of teaching using tutorials yields benefits both in terms of assessment results and student satisfaction.

13.

Extending WHIRL with background knowledge for improved text classification

Sarah Zelikovitz William W. Cohen Haym Hirsh 《Information Retrieval》2007,10(1):35-67

Intelligent use of the many diverse forms of data available on the Internet requires new tools for managing and manipulating heterogeneous forms of information. This paper uses WHIRL, an extension of relational databases that can manipulate textual data using statistical similarity measures developed by the information retrieval community. We show that although WHIRL is designed for more general similarity-based reasoning tasks, it is competitive with mature systems designed explicitly for inductive classification. In particular, WHIRL is well suited for combining different sources of knowledge in the classification process. We show on a diverse set of tasks that the use of appropriate sets of unlabeled background knowledge often decreases error rates, particularly if the number of examples or the size of the strings in the training set is small. This is especially useful when labeling text is a labor-intensive job and when there is a large amount of information available about a particular problem on the World Wide Web.

14.

Celebrating Book Culture: The Aims and Outcomes of UNESCO's World Book and Copyright Day in Europe

Carlota Larrea Alexis Weedon 《Publishing Research Quarterly》2007,23(3):224-234

World Book and Copyright Day was established by a resolution of the 28th General Council of UNESCO in 1995. Its avowed aim was ‘to pay a world-wide tribute to books and authors on this date, encouraging everyone, and in particular young people, to discover the pleasure of reading and gain a renewed respect for the irreplaceable contributions of those who have furthered the social and cultural progress of humanity.’ This article examines the context for World Book and Copyright Day and the extent to which cultural and commercial interests have converged in the activities of the day, and argues that an analysis of the activities of the day reveals a specifically European attitude to book culture.

15.

Query structuring and expansion with two-stage term dependence for Japanese web retrieval (total citations: 1; self-citations: 1; citations by others: 0)

Koji Eguchi W. Bruce Croft 《Information Retrieval》2009,12(3):251-274

In this paper, we propose a new term dependence model for information retrieval, which is based on a theoretical framework using Markov random fields. We assume two types of dependencies of terms given in a query: (i) long-range dependencies that may appear for instance within a passage or a sentence in a target document, and (ii) short-range dependencies that may appear for instance within a compound word in a target document. Based on this assumption, our two-stage term dependence model captures both long-range and short-range term dependencies differently when more than one compound word appears in a query. We also investigate how query structuring with term dependence can improve the performance of query expansion using a relevance model. The relevance model is constructed using the retrieval results of the structured query with term dependence to expand the query. We show that our term dependence model works well, particularly when using query structuring with compound words, through experiments using a 100-gigabyte test collection of web documents mostly written in Japanese. We also show that the performance of the relevance model can be significantly improved by using the structured query with our term dependence model.

16.

On knowledge-poor methods for person name matching and lemmatization for highly inflectional languages (total citations: 1; self-citations: 1; citations by others: 0)

Jakub Piskorski Karol Wieloch Marcin Sydow 《Information Retrieval》2009,12(3):275-299

Web person search is one of the most common activities of Internet users. Recently, a vast amount of work on applying various NLP techniques for person name disambiguation in large web document collections has been reported, where the main focus was on English and a few other major languages. This article reports on knowledge-poor methods for tackling person name matching and lemmatization in Polish, a highly inflectional language with a complex person name declension paradigm. These methods apply mainly well-established string distance metrics, some new variants thereof, automatically acquired simple suffix-based lemmatization patterns and some combinations of the aforementioned techniques. Furthermore, we also carried out some initial experiments on deploying techniques that utilize the context in which person names appear. Results of numerous experiments are presented. The evaluation carried out on a data set extracted from a corpus of on-line news articles revealed that achieving lemmatization accuracy figures greater than 90% seems to be difficult, whereas combining string distance metrics with suffix-based patterns results in 97.6–99% accuracy for the name matching task. Interestingly, no significant additional gain could be achieved through integrating some basic techniques that try to exploit the local context the names appear in. Although our explorations were focused on Polish, we believe that the work presented in this article constitutes practical guidelines for tackling the same problem for other highly inflectional languages with similar phenomena.
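
As a rough illustration of what a string distance metric for name matching looks like, the sketch below computes the Levenshtein edit distance and a length-normalized similarity. The abstract does not say which metrics the authors used, so the choice of Levenshtein, the function names `levenshtein` and `name_similarity`, and the normalization are all assumptions made for this example.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    # Keep the shorter string as the inner dimension to save memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def name_similarity(a, b):
    """Hypothetical helper: similarity in [0, 1], case-insensitive."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest


# An inflected Polish surname form compared against its base form.
print(levenshtein("Kowalski", "Kowalskiego"))
```

A matcher built on this would typically accept a pair of names when the similarity exceeds some threshold; the threshold itself would have to be tuned on data and is not taken from the article.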

17.

Consumer Magazines in Argentina: A Market to Recover

Ethel Alejandra Pis Diez 《Publishing Research Quarterly》2007,23(3):194-209

18.

Publishing in Scotland: Reviewing the Fragile Revival

Alistair McCleery Marion Sinclair Linda Gunn 《Publishing Research Quarterly》2008,24(2):87-97

A comparison of analyses of the Scottish publishing industry carried out in 1992, 2002 and 2007 underscores the fragility of the sector within a small country within the English-language community. A number of indices reveal either stability or stagnation and the picture emerges of the remarkable tenacity of publishing in Scotland. Although there is already a significant and vital element of state support for publishing in Scotland, further intervention will be necessary to ensure fulfilment of its potential.

19.

Collection-based compound noun segmentation for Korean information retrieval

In-Su Kang Seung-Hoon Na Jong-Hyeok Lee 《Information Retrieval》2006,9(5):613-631

Compound noun segmentation is a key first step in language processing for Korean. Thus far, most approaches require some form of human supervision, such as pre-existing dictionaries, segmented compound nouns, or heuristic rules. As a result, they suffer from the unknown word problem, which can be overcome by unsupervised approaches. However, previous unsupervised methods normally do not consider all possible segmentation candidates, and/or rely on character-based segmentation clues such as bi-grams or all-length n-grams. So, they are prone to falling into a local solution. To overcome the problem, this paper proposes an unsupervised segmentation algorithm that searches for the most likely segmentation result among all possible segmentation candidates using a word-based segmentation context. As word-based segmentation clues, a dictionary is automatically generated from a corpus. Experiments using three test collections show that our segmentation algorithm is successfully applied to Korean information retrieval, improving on a dictionary-based longest-matching algorithm.
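
For orientation, the dictionary-based longest-matching baseline named in this abstract can be sketched as a greedy left-to-right scan. This is a generic illustration of the baseline idea only, not the authors' unsupervised algorithm; the function name `longest_match_segment`, the toy English dictionary, and the single-character fallback for unknown material are assumptions.

```python
def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest-match segmentation (illustrative only)."""
    max_len = max((len(word) for word in dictionary), default=1)
    segments = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            # Assumed fallback: emit one character when nothing matches.
            if candidate in dictionary or length == 1:
                segments.append(candidate)
                i += length
                break
    return segments


toy_dictionary = {"information", "retrieval", "in", "form"}
print(longest_match_segment("informationretrieval", toy_dictionary))
```

Because each step commits to the longest local match, the scan never revisits earlier choices, which is the kind of limitation an approach that scores all segmentation candidates is meant to avoid.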

20.

A new unsupervised method for document clustering by using WordNet lexical and conceptual relations

Diego Reforgiato Recupero 《Information Retrieval》2007,10(6):563-579

Text document clustering provides an effective and intuitive navigation mechanism to organize a large amount of retrieval results by grouping documents in a small number of meaningful classes. Many well-known methods of text clustering make use of a long list of words as vector space, which is often unsatisfactory for a couple of reasons: first, it keeps the dimensionality of the data very high, and second, it ignores important relationships between terms like synonyms or antonyms. Our unsupervised method solves both problems by using ANNIE and WordNet lexical categories and WordNet ontology in order to create a well-structured document vector space whose low dimensionality allows common clustering algorithms to perform well. For the clustering step we have chosen bisecting k-means and the Multipole tree, a modified version of the Antipole tree data structure, for their accuracy and speed, respectively.
