Similar Literature
 20 similar documents found; search time: 687 ms
1.
This paper gives an overview of the archival issues that relate to digitally signed documents. First, by way of introduction, the advanced digital signature is presented briefly. In the second part, a number of problems are discussed that present themselves when a digital signature is used as a proof of authenticity and integrity for digital documents in general. In particular, the paper investigates whether it makes any sense for the archivist to digitally sign all electronic records under his or her management. Problems relating to the (medium) long-term archiving of digitally signed documents are dealt with in the third part. After an overview of the sticking points for long-term validation (“Archival issues”), a number of possible solutions are discussed (“Solutions for long-term archiving”).
Filip Boudrez

2.
This article is a general introduction to the special issue of Archival Science on “archiving research data”. It summarizes the different contributions and gives an overview of the main issues in this special field of archiving. One of the leading questions is how and why research data archives differ from public record offices. In the past, the developments in these two worlds have been rather separate. There are, however, signs that they are converging in the digital world. In particular, this can be seen in the areas of metadata and Internet dissemination, as these are strongly influenced by the rapid changes in information technology. These changes have also led to important new developments in the infrastructure of research data, to which special attention is paid. New concepts such as collaboratories, data curation, Open Access and the Open Archives Initiative are discussed.
Heiko Tjalsma

3.
Evaluating the effectiveness of content-oriented XML retrieval methods   Cited by: 1 (self-citations: 0, citations by others: 1)
Content-oriented XML retrieval approaches aim at a more focused retrieval strategy: instead of retrieving whole documents, document components that are exhaustive to the information need while at the same time being as specific as possible should be retrieved. In this article, we show that the evaluation methods developed for standard retrieval must be modified in order to deal with the structure of XML documents. More precisely, the size and overlap of document components must be taken into account. For this purpose, we propose a new effectiveness metric based on the definition of a concept space defined upon the notions of exhaustiveness and specificity of a search result. We compare the results of this new metric with the results obtained with the official metric used in INEX, the evaluation initiative for content-oriented XML retrieval.
Gabriella Kazai

4.
Smoothing of document language models is critical in language modeling approaches to information retrieval. In this paper, we present a novel way of smoothing document language models based on propagating term counts probabilistically in a graph of documents. A key difference between our approach and previous approaches is that our smoothing algorithm can iteratively propagate counts and achieve smoothing with remotely related documents. Evaluation results on several TREC data sets show that the proposed method significantly outperforms the simple collection-based smoothing method. Compared with those other smoothing methods that also exploit local corpus structures, our method is especially effective in improving precision in top-ranked documents through “filling in” missing query terms in relevant documents, which is attractive since most users only pay attention to the top-ranked documents in search engine applications.
ChengXiang Zhai
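For orientation, the "simple collection-based smoothing" that the abstract uses as its baseline can be sketched as a linear interpolation between a document's term frequencies and the collection's term frequencies. This is a minimal illustration only and not the authors' graph-based propagation method; the function name `collection_smoothed_model`, the interpolation weight `lam`, and the toy documents are assumptions made for the example.

```python
from collections import Counter


def collection_smoothed_model(doc_tokens, collection_counts, lam=0.5):
    """Return a function giving P(term | document), interpolated with the collection model.

    lam is a hypothetical interpolation weight: 0 means no smoothing,
    1 means the collection model is used on its own.
    """
    doc_counts = Counter(doc_tokens)
    doc_len = sum(doc_counts.values())
    coll_len = sum(collection_counts.values())

    def prob(term):
        # Maximum-likelihood estimate from the document itself.
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        # Background estimate from the whole collection.
        p_coll = collection_counts.get(term, 0) / coll_len
        return (1 - lam) * p_doc + lam * p_coll

    return prob


# Toy collection of two tokenised documents (illustrative data only).
docs = [["xml", "retrieval", "xml"], ["language", "model", "retrieval"]]
collection_counts = Counter(t for d in docs for t in d)

p = collection_smoothed_model(docs[0], collection_counts, lam=0.5)
print(p("xml"))       # seen in the document
print(p("language"))  # unseen in the document, but non-zero after smoothing
```

The second call shows the point of smoothing: a query term that is missing from a document still receives a small non-zero probability, so the document is not ruled out entirely.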

5.
With growing numbers of non-English-language Web searchers, the problems of efficiently handling non-English Web documents and user queries are becoming major issues for search engines. The main aim of this review paper is to make researchers aware of the existing problems in monolingual non-English Web retrieval by providing an overview of open issues. A significant number of papers are reviewed and the research issues investigated in these studies are categorized in order to identify the research questions and solutions proposed in these papers. Further research is proposed at the end of each section.
Efthimis N. Efthimiadis

6.
This article describes the first half century of the Communist government’s supervision and management of the central-government archives of the last two dynasties. Immediately after the Communist ascent to power in 1949, the new government took great interest in assembling and protecting the country’s archival documents, readying the Ming-Qing archives for access to scholars, and preparing for publication of selected materials. By the 1980s Beijing’s Number One Historical Archives, in charge of the largest holding of Ming-Qing documents, had become the first Chinese authority to complete a full sorting and preliminary catalogues for such a collection. Moreover, to facilitate searches, an attempt has recently begun to create a subject-heading system for these and other holdings in the country. In the first half century’s final decades, foreign researchers were admitted for the first time and tours and international exchanges began to take place.
Beatrice S. Bartlett

7.
Documents formatted in eXtensible Markup Language (XML) are available in collections of various document types. In this paper, we present an approach for the summarisation of XML documents. The novelty of this approach lies in that it is based on features not only from the content of documents, but also from their logical structure. We follow a machine learning, sentence extraction-based summarisation technique. To find which features are more effective for producing summaries, this approach views sentence extraction as an ordering task. We evaluated our summarisation model using the INEX and SUMMAC datasets. The results demonstrate that the inclusion of features from the logical structure of documents increases the effectiveness of the summariser, and that the learnable system is also effective and well-suited to the task of summarisation in the context of XML documents. Our approach is generic, and is therefore applicable, apart from entire documents, to elements of varying granularity within the XML tree. We view these results as a step towards the intelligent summarisation of XML documents.
Mounia Lalmas

8.
To put an end to the large copyright trade deficit, both Chinese government agencies and publishing houses have been striving to enter the international publication market. The article analyzes the background of the going-global strategy and sums up the performance of both Chinese administrations and publishers.
Qing Fang (Corresponding author)

9.
We adapt the cluster hypothesis for score-based information retrieval by claiming that closely related documents should have similar scores. Given a retrieval from an arbitrary system, we describe an algorithm which directly optimizes this objective by adjusting retrieval scores so that topically related documents receive similar scores. We refer to this process as score regularization. Because score regularization operates on retrieval scores, regardless of their origin, we can apply the technique to arbitrary initial retrieval rankings. Document rankings derived from regularized scores, when compared to rankings derived from un-regularized scores, consistently and significantly result in improved performance given a variety of baseline retrieval algorithms. We also present several proofs demonstrating that regularization generalizes methods such as pseudo-relevance feedback, document expansion, and cluster-based retrieval. Because of these strong empirical and theoretical results, we argue for the adoption of score regularization as a general design principle or post-processing step for information retrieval systems.
Fernando Diaz
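The idea that topically related documents should receive similar scores can be illustrated with a small iterative smoothing over a document similarity matrix. This is a sketch under stated assumptions and not the algorithm from the paper: the update rule, the mixing weight `alpha`, the iteration count, and the function name `regularize_scores` are all hypothetical choices made for illustration.

```python
def regularize_scores(scores, similarity, alpha=0.5, iterations=20):
    """Pull each score towards the similarity-weighted average of its neighbours.

    scores     -- initial retrieval scores, one per document
    similarity -- square matrix of non-negative document similarities
    alpha      -- hypothetical mixing weight between original and neighbour scores
    """
    n = len(scores)
    weights = []
    for i in range(n):
        # Ignore self-similarity, then normalise the row to sum to 1.
        row = [similarity[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        if total:
            weights.append([w / total for w in row])
        else:
            # A document with no neighbours keeps its own score.
            weights.append([1.0 if j == i else 0.0 for j in range(n)])

    current = list(scores)
    for _ in range(iterations):
        current = [
            (1 - alpha) * scores[i]
            + alpha * sum(weights[i][j] * current[j] for j in range(n))
            for i in range(n)
        ]
    return current


# Documents 0 and 1 are closely related; document 2 is unrelated to both.
initial = [1.0, 0.0, 0.2]
sim = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
regularized = regularize_scores(initial, sim)
print(regularized)
```

In this toy input the two related documents end up with closer scores than they started with, while the isolated document is left unchanged. Because the function only needs scores and a similarity matrix, it can be applied after any initial ranking, which mirrors the post-processing use the abstract argues for.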

10.
This article analyzes current industry practices toward the identification of digital book content. It highlights key technology trends, workflow considerations and supply chain behaviors, and examines the implications of these trends and behaviors on the production, discoverability, purchasing and consumption of digital book products.
Andy Weissberg

11.
This article examines the archival methods developed by Colbert to train his son in state administration. Based on Colbert’s correspondence with his son, it reveals the practices Colbert thought necessary to collect and manage information in his state encyclopedic archive during the last half of the 17th century.
Jacob Soll

12.
A review and analysis of the rules and regulations including the tax aspects of making an investment in India is presented. The full range from Foreign Direct Investment to different forms of doing business with specific examples from the publishing industry is explored to help understand current policies and regulations.
Sandeep Chaufla

13.
Modern retrieval test collections are built through a process called pooling in which only a sample of the entire document set is judged for each topic. The idea behind pooling is to find enough relevant documents such that when unjudged documents are assumed to be nonrelevant the resulting judgment set is sufficiently complete and unbiased. Yet a constant-size pool represents an increasingly small percentage of the document set as document sets grow larger, and at some point the assumption of approximately complete judgments must become invalid. This paper shows that the judgment sets produced by traditional pooling when the pools are too small relative to the total document set size can be biased in that they favor relevant documents that contain topic title words. This phenomenon is wholly dependent on the collection size and does not depend on the number of relevant documents for a given topic. We show that the AQUAINT test collection constructed in the recent TREC 2005 workshop exhibits this biased relevance set; it is likely that the test collections based on the much larger GOV2 document set also exhibit the bias. The paper concludes with suggested modifications to traditional pooling and evaluation methodology that may allow very large reusable test collections to be built.
Ellen Voorhees
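The pooling process described in the abstract can be sketched as follows: the top-ranked documents from each participating system are merged into a pool, only pooled documents are judged, and everything outside the pool is assumed nonrelevant. The run names, document identifiers, pool depth, and helper names (`build_pool`, `judged_relevance`) below are hypothetical and chosen only for illustration.

```python
def build_pool(runs, depth):
    """Union of the top `depth` documents from every run for one topic."""
    pool = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def judged_relevance(doc_id, pool, true_relevance):
    """Relevance as the test collection records it.

    Documents outside the pool are never shown to an assessor,
    so they are assumed to be nonrelevant.
    """
    if doc_id not in pool:
        return False
    return true_relevance.get(doc_id, False)


# Two hypothetical system runs for a single topic.
runs = {
    "sysA": ["d1", "d2", "d3", "d4"],
    "sysB": ["d2", "d5", "d1", "d6"],
}
# Hypothetical ground truth that an assessor would find if every document were judged.
true_relevance = {"d1": True, "d6": True}

pool = build_pool(runs, depth=2)
print(sorted(pool))
print(judged_relevance("d1", pool, true_relevance))
print(judged_relevance("d6", pool, true_relevance))
```

Here `d6` is relevant but falls below the pool depth in every run, so the judgment set records it as nonrelevant. This is the kind of incompleteness that grows as a fixed-size pool covers a smaller share of a larger collection.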

14.
Muniments and monuments: the dawn of archives as cultural patrimony   Cited by: 1 (self-citations: 1, citations by others: 0)
Around 1800 the “paradigm of patrimony” recognized archives as cultural and national patrimony. That paradigm was, however, not a new revolutionary invention. It had been fostered by a “patrimony consciousness” which had developed in the seventeenth and eighteenth centuries. The value of archives as a patrimony to future generations was acknowledged first in the private sphere by families and then by cities—communities of memory becoming communities of archives.
Eric Ketelaar

Eric Ketelaar is Professor of Archivistics in the Department of Mediastudies of the Faculty of Humanities of the University of Amsterdam. He is Honorary Professor at Monash University, Melbourne (Faculty of Information Technology). He engages with the social history of archives by researching the history of recordkeeping and the use of records and archives, resulting in articles on thirteenth century Dordrecht, sixteenth century Leiden, the eighteenth century Court of Holland, Dutch public administration 1795–1950, and record creation in the context of systematic management in Dutch enterprise, 1870–1940. He is particularly interested in the relationship between recordkeeping and organizational, professional, and national cultures, past and present. This led him further to study the role of records and archives in times of oppression, war, liberation, and reconciliation.

15.
16.
Arabic documents that are available only in print continue to be ubiquitous, and they can be scanned and subsequently OCR’ed to ease their retrieval. This paper explores the effect of context-based OCR correction on the effectiveness of retrieving Arabic OCR documents using different index terms. Different OCR correction techniques based on language modeling with different correction abilities were tested on real OCR and synthetic OCR degradation. Results show that the reduction of word error rates needs to pass a certain limit to get a noticeable effect on retrieval. If only moderate error reduction is available, then using short character n-grams for retrieval without error correction is not a bad strategy. Word-based correction in conjunction with language modeling had a statistically significant impact on retrieval even for character 3-grams, which are known to be among the best index terms for OCR-degraded Arabic text. Further, using a sufficiently large language model for correction can minimize the need for morphologically sensitive error correction.
Kareem Darwish
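To see why short character n-grams tolerate OCR errors, the sketch below splits a word into overlapping character 3-grams and compares a clean word with a corrupted one. The helper name `char_ngrams` and the English example word are assumptions made for readability; the function works on any Python string, including Arabic text, because it slices by Unicode code point.

```python
def char_ngrams(word, n=3):
    """Return the overlapping character n-grams of a word.

    Words no longer than n are returned whole so that they are still indexed.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


clean = char_ngrams("retrieval")
noisy = char_ngrams("retrieva1")  # simulated OCR error: final "l" read as "1"

shared = set(clean) & set(noisy)
print(clean)
print(noisy)
print(len(shared), "of", len(clean), "3-grams still match")
```

A single character error breaks an exact word match, but it only disturbs the n-grams that overlap the damaged position, so most index terms still match the query.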

17.
This paper reviews the archival process at the Inter-university Consortium for Political and Social Research (ICPSR), a repository of digital social science data, and maps ICPSR’s Ingest and Access operations to the Open Archival Information System (OAIS) Reference Model. The paper also assesses ICPSR’s conformance with the archival responsibilities of “trusted” OAIS repositories, with the proviso that audit criteria for archival certification are still under development. The ICPSR to OAIS mapping exercise has benefits for the larger social science archiving community because it provides an interpretation of the reference model in the quantitative social science environment and points to preservation-related issues that may be salient for other social science archives. Building on the archives’ long tradition of shared norms and cooperation, we may ultimately be able to design a federated system of trusted social science repositories that provides access to the global heritage.
Cole Whiteman

18.
This article gives a summary overview of the children’s and young adult publishing industry in China, with a focus on the size of the market, ten major publishing houses, copyright and trends. Special emphasis is placed on specific transactions for the sale of translation rights from German-language publishers to China and on the minimal activities of German rights sold to Chinese publishers.
Jing Bartz

19.
Precision prediction based on ranked list coherence   Cited by: 1 (self-citations: 0, citations by others: 1)
We introduce a statistical measure of the coherence of a list of documents called the clarity score. Starting with a document list ranked by the query-likelihood retrieval model, we demonstrate the score's relationship to query ambiguity with respect to the collection. We also show that the clarity score is correlated with the average precision of a query and lay the groundwork for useful predictions by discussing a method of setting decision thresholds automatically. We then show that passage-based clarity scores correlate with average-precision measures of ranked lists of passages, where a passage is judged relevant if it contains correct answer text, which extends the basic method to passage-based systems. Next, we introduce variants of document-based clarity scores to improve the robustness, applicability, and predictive ability of clarity scores. In particular, we introduce the ranked list clarity score that can be computed with only a ranked list of documents, and the weighted clarity score where query terms contribute more than other terms. Finally, we show an approach to predicting queries that perform poorly on query expansion that uses techniques expanding on the ideas presented earlier.
W. Bruce Croft
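The abstract does not give a formula. A clarity-style score is commonly described as the Kullback-Leibler divergence between a language model of the retrieved list and the language model of the whole collection, and the sketch below follows that description in its simplest form. It is a simplification and not the authors' estimator: it uses unsmoothed relative frequencies, treats every ranked document equally, and assumes each ranked document also belongs to the collection. The function name `clarity_score` and the toy data are hypothetical.

```python
import math
from collections import Counter


def clarity_score(ranked_docs, collection_docs):
    """KL divergence (in bits) between the ranked-list and collection term distributions.

    Assumes every document in ranked_docs is also in collection_docs,
    so every listed term has a non-zero collection probability.
    """
    list_counts = Counter(t for d in ranked_docs for t in d)
    coll_counts = Counter(t for d in collection_docs for t in d)
    list_len = sum(list_counts.values())
    coll_len = sum(coll_counts.values())

    score = 0.0
    for term, count in list_counts.items():
        p_list = count / list_len
        p_coll = coll_counts[term] / coll_len
        score += p_list * math.log2(p_list / p_coll)
    return score


# Toy collection: two documents about cars, two about animals (illustrative only).
collection = [
    ["jaguar", "car", "engine"],
    ["jaguar", "cat", "jungle"],
    ["car", "engine", "speed"],
    ["cat", "jungle", "prey"],
]
focused = [collection[0], collection[2]]  # both results are about cars
mixed = [collection[0], collection[1]]    # results split across two senses

print(clarity_score(focused, collection))
print(clarity_score(mixed, collection))
```

The topically coherent list diverges more from the collection than the mixed list does, which is the intuition behind using such a score as a signal of query ambiguity.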

20.
Through a reading of the archived letters of Henry Garnet (1555–1606), Superior of the Jesuit order in England and suspected Gunpowder plotter, this article investigates the nature of the archive in relation to narrative theory. Figuring the archive as one of the number of narrating voices accrued by the individual record, I argue that models of communication such as those put forward by Roman Jakobson, Wayne C. Booth and Seymour Chatman afford useful insights into the ways in which power is inscribed and reinscribed in the record through successive acts of reading and rewriting.
Paul Wake

Paul Wake is a Senior Lecturer in English Literature at Manchester Metropolitan University. He is the author of Conrad’s Marlow (2007), editor, with Simon Malpas, of The Routledge Companion to Critical Theory (2006), and he has published articles on narrative theory and postmodernism.
