Industrial and Government Applications Track Paper
concentrating on incremental work in mature fields.

3. If unrestricted text is allowed as descriptions for expertise, it is rare that potential reviewers, program directors, and proposal authors all select the same free text terms. Numerous studies of information retrieval systems have found low agreement among individuals assigning keywords to content (e.g., [8]).
4. There is not high compliance with requests of users to enter information into the database. Many researchers are too busy to fill out forms or hesitant to “volunteer” for reviewing. While agreeing to review proposals is a service to the funding agency, being asked to review proposals is as welcome to some as other forms of service such as serving on jury duty.
5. The interface for submitting proposals to NSF, Fastlane, does not allow keywords to be entered describing the proposals. While this could be added to the interface, doing so would require consensus that this will facilitate proposal handling and this has not been demonstrated convincingly.

Due to the limitations of keyword-based database systems, when they are used within NSF, they are limited to suggesting a pool of candidates for a panel on a given topic. While Computer and Information Science and Engineering at NSF has experimented with a keyword system (e.g., in the 2001 ITR competition), it was not used in subsequent years.

Finally, NSF has experimented with systems that allow panelists to indicate preferences for reviewing proposals within a panel.
In such systems, panelists indicate their preference for reviewing a proposal on a numeric scale. Many conferences also use similar systems such as Cyberchair [9]. In Cyberchair, a constraint satisfaction algorithm assigns people the proposals they are most interested in. These systems only address part of the reviewer assignment problem. They do not assist with identifying panelists, only with assigning proposals to panelists once they have been identified. There has been an issue with compliance on these systems as well, i.e., not every panelist promptly enters preference data, and a single person not replying can delay the assignments for all others. In addition, it isn’t clear what the preference scores mean or how much thought goes into the assignments. While the intent is to judge how well qualified a reviewer is to review a proposal, we have observed many panelists having a strong preference for proposals by well-known researchers and fewer having a preference for proposals by less established researchers. While NSF typically asks for preferences on 20-30 proposals, some conferences ask for preference data on 200-300 papers. The second author admits that when presented with 300 papers in Cyberchair, not as much time is spent reviewing the abstracts of the last batch of papers as the first to determine preferences. Finally, there is also a problem with multidisciplinary proposals if people from one discipline have a preference for a paper. It can occur that all computer scientists and no biologists give high preference scores to a bioinformatics proposal, in which case a preference-based system will result in one aspect of the proposal not being reviewed.

3. Revaide

We have deployed a prototype system, Revaide, within NSF that addresses the problems with previous fully autonomous systems. The philosophy behind the system is to assist program directors and not replace their judgment with a black box system.
One key design criterion is that Revaide offers suggestions that may be accepted or declined individually. In this section, we introduce Revaide, its tasks and solution, and evaluate the utility of using Revaide. We introduce a measure to evaluate how well the expertise of a group of reviewers is suited for a proposal. Following the discussion of the key components of Revaide in this section, we will report on the experiences using the algorithm.

3.1 Representing Proposals

Proposals are submitted to NSF in PDF form. Revaide converts the proposals to ASCII and represents proposals in the standard TF-IDF vector space [10] as term vectors in the space of all words in the document collection. The entire proposal is used, including the references and resume of the investigator. One simple use of Revaide is to annotate spreadsheets of proposals with the 20 terms that have the highest TF-IDF weights. These keywords are often more informative to program directors than the title to determine what a proposal is about. While early versions of Revaide used stemming [11] to convert words to root forms, we found that stemming reduced the human comprehensibility of the resulting term vector representation. Experience showed that using stemming did not increase the quality of the suggestions made by Revaide. Therefore, we no longer use stemming.

One other enhancement also increased the comprehensibility of the resulting term representation. We augmented the stoplist of items that should not be used as keywords. While most stoplists include common words such as articles and prepositions, we augmented the stoplist to include words that appeared in proposals that were not descriptive of the proposal content, including the e-mail addresses of PIs and the name and city of the university.
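The representation described in this section can be sketched in a few lines of Python. This is a minimal illustration and not Revaide's implementation: the text specifies only "the standard TF-IDF vector space", so the tf × log(N/df) weighting, the L2 normalization, the regex tokenizer, the stoplist entries, the toy documents, and the names tokenize, tfidf_vectors, and top_terms are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical augmented stoplist: common function words plus
# proposal-specific noise such as an institution name and city.
STOPLIST = {"the", "of", "and", "a", "in", "to", "for", "is", "at", "with",
            "university", "springfield"}


def tokenize(text, stoplist=STOPLIST):
    """Lowercase, split on non-letters, drop stoplisted terms (no stemming)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in stoplist]


def tfidf_vectors(docs, stoplist=STOPLIST):
    """Return one {term: weight} dict per document, L2-normalized.

    Assumed weighting: raw term count times log(N / document frequency).
    """
    token_lists = [tokenize(d, stoplist) for d in docs]
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        raw = {t: c * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors.append({t: w / norm for t, w in raw.items()} if norm else raw)
    return vectors


def top_terms(vector, k=20):
    """The k highest-weighted terms, ties broken alphabetically."""
    return sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy stand-ins for proposal text (invented for illustration only).
docs = [
    "Relevance feedback for image retrieval at the University of Springfield",
    "Protein folding simulation at the University of Springfield",
    "Image retrieval with multimodal relevance judgments",
]
vectors = tfidf_vectors(docs)
keywords = [top_terms(v, k=3) for v in vectors]
```

On this toy collection, "feedback" ranks first for the first document because it appears in no other document, while the stoplisted "university" and "springfield" never surface as keywords.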
These non-descriptive words frequently occur within a few proposals and not in many others, giving them high TF-IDF weights, but they confused program directors when used as keywords and degraded the quality of Revaide’s suggestions.

An example will illustrate the representation used by Revaide for one proposal. The terms with the highest weights and their weights were image: 0.031, judgments: 0.028, feedback: 0.027, relevance: 0.026, multimodal: 0.020, retrieval: 0.019, and preference: 0.017. To preserve the privacy of the submitter, we cannot provide the title or abstract, but we find that the automatically extracted keywords do indeed provide a compact representation that makes sense to program directors and provides a basis to assist reviewers.

3.2 Representing Reviewer Expertise

Revaide represents the expertise of a reviewer with the TF-IDF representation of the proposals they have submitted to NSF in the past. While it would be possible to use published papers of authors downloaded from Citeseer [12] or Google Scholar as measures of expertise, there are advantages in using NSF proposals in a practical system deployed at NSF. First, all proposals are similar in style and length. These conditions are