Automatically assigned DDC number: 0063
Manually assigned DDC number: 00635
Number of references: 0
Title: Selective Sampling In Natural Language Learning
Author: Ido Dagan
Author: Sean P. Engelson
Subject: Ido Dagan, Sean P. Engelson Selective Sampling In Natural Language Learning
Description: Many corpus-based methods for natural language processing are based on supervised training, requiring expensive manual annotation of training corpora. This paper investigates reducing annotation cost by selective sampling. In this approach, the learner examines many unlabeled examples and selects for labeling only those that are most informative at each stage of training. In this way it is possible to avoid redundantly annotating examples that contribute little new information. The paper first analyzes the issues that need to be addressed when constructing a selective sampling algorithm, arguing for the attractiveness of committee-based sampling methods. We then focus on selective sampling for training probabilistic classifiers, which are commonly applied to problems in statistical natural language processing. We report experimental results of applying a specific type of committee-based sampling during training of a stochastic part-of-speech tagger, and demonstrate substantially improv...
Contributor: The Pennsylvania State University CiteSeer Archives
Publisher: unknown
Date: 1995-08-22
Pubyear: 1995
Format: ps
Identifier: http://citeseer.ist.psu.edu/140541.html
Source: http://www.cs.biu.ac.il:8080/~argamon/Access/Access/Access/Access/Access/Access/Access/Access/Access/ijcai-ml-nl95.ps.Z
Language: en
Rights: unrestricted
<?xml version="1.0" encoding="UTF-8"?>
<references_metadata>
<rec ID="SELF" Type="SELF" CiteSeer_Book="SELF" CiteSeer_Volume="SELF" Title="Selective Sampling In Natural Language Learning">
<identifier Org="ISBN:3790814369" Paper_ID="SELF" Extracted="3790814369" DDC="006.3" Normalized_DDC="0063" Normalized_Weight="1.0" />
</rec>
</references_metadata>