Keyword extraction by nonextensivity measure

Ali Mehri and Amir H. Darooneh
Phys. Rev. E 83, 056106 – Published 10 May 2011

Abstract

The presence of a long-range correlation in the spatial distribution of a relevant word type, in spite of random occurrences of an irrelevant word type, is an important feature of human-written texts. We classify the correlation between the occurrences of words by nonextensive statistical mechanics for the word-ranking process. In particular, we look at the nonextensivity parameter as an alternative metric to measure the spatial correlation in the text, from which the words may be ranked in terms of this measure. Finally, we compare different methods for keyword extraction.

    • Received 9 August 2010

    DOI:https://doi.org/10.1103/PhysRevE.83.056106

    ©2011 American Physical Society

    Authors & Affiliations

    Ali Mehri* and Amir H. Darooneh

    • Department of Physics, Zanjan University, Zanjan, Iran

    • *alimehri@znu.ac.ir
    • darooneh@znu.ac.ir

    Issue

    Vol. 83, Iss. 5 — May 2011
