Page "Automatic summarization" Paragraph 21


We also need to create features that describe the examples and are informative enough to allow a learning algorithm to discriminate keyphrases from non-keyphrases.

Typical features include various term frequencies (how many times a phrase appears in the current text or in a larger corpus), the length of the example, the relative position of its first occurrence, and various Boolean syntactic features (e.g., whether it contains all caps).
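As a rough illustration of the feature types listed above, the following Python sketch computes a handful of them for one candidate phrase. It is not the feature set of any particular system: the names `CandidateFeatures`, `tokenize`, and `extract_features` are hypothetical, the tokenizer is a deliberately crude regular expression, and the choice of 1.0 as the position of a phrase that never occurs is an assumption made only for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class CandidateFeatures:
    """Illustrative feature vector for one candidate phrase (hypothetical)."""
    tf_document: int         # occurrences of the phrase in the current text
    df_corpus: int           # number of corpus documents containing the phrase
    length_in_words: int     # length of the candidate, in words
    first_position: float    # relative position of first occurrence, in [0, 1]
    has_all_caps_word: bool  # Boolean feature: some word is written in all caps


def tokenize(text):
    """Crude tokenizer (an assumption): runs of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)


def find_positions(tokens, phrase_tokens):
    """Return the start index of every occurrence of phrase_tokens in tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return []
    return [i for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == phrase_tokens]


def extract_features(phrase, document, corpus):
    """Compute the illustrative features for `phrase` within `document`.

    `corpus` is a list of other document strings used for the
    larger-corpus frequency.
    """
    raw_tokens = tokenize(document)
    doc_tokens = [t.lower() for t in raw_tokens]
    phrase_tokens = [t.lower() for t in tokenize(phrase)]
    n = len(phrase_tokens)

    positions = find_positions(doc_tokens, phrase_tokens)

    # Count corpus documents in which the phrase occurs at least once.
    df_corpus = sum(
        1 for other in corpus
        if find_positions([t.lower() for t in tokenize(other)], phrase_tokens)
    )

    if positions and doc_tokens:
        first_position = positions[0] / len(doc_tokens)
        surface = raw_tokens[positions[0]:positions[0] + n]
    else:
        first_position = 1.0        # assumption: an absent phrase counts as "at the end"
        surface = tokenize(phrase)  # fall back to the phrase as it was given

    # Single letters such as "A" are ignored so that they do not count as all caps.
    has_all_caps_word = any(len(t) > 1 and t.isupper() for t in surface)

    return CandidateFeatures(
        tf_document=len(positions),
        df_corpus=df_corpus,
        length_in_words=n,
        first_position=first_position,
        has_all_caps_word=has_all_caps_word,
    )
```

Each candidate phrase would be turned into one such feature vector and paired with a keyphrase or non-keyphrase label before being handed to a learning algorithm. A real system would normally normalize or discretize these values and use a proper tokenizer and stemmer.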

The Turney paper used about 12 such features.

Hulth uses a reduced set of features, which were found most successful in the KEA (Keyphrase Extraction Algorithm) work derived from Turney's seminal paper.


Most text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply.