ACL 2010: The 48th Annual Meeting of the Association for Computational
Linguistics

Review form for SHORT EMPIRICAL/DATA-DRIVEN research papers

This review form is appropriate for papers that present research that
is evaluated on a corpus of data.


APPROPRIATENESS (1-5)

Does the paper fit in ACL 2010 as a short paper? Does this paper have
any particular attributes that make it suitable for being a short
paper? (Please answer this question in light of the desire to broaden
the scope of the research areas represented at ACL, and with reference
to the examples listed in the Call for Papers as being suitable for
short papers: reporting smaller experiments; describing
work-in-progress; single-author position papers; challenge papers;
descriptions of new language resources or evaluation methodologies;
presenting negative results.)

5 = Certainly.
4 = Probably.
3 = Unsure.
2 = Probably not.
1 = Certainly not. 


CLARITY (1-5)

For the reasonably well-prepared reader, is it clear what was done and
why? Is the paper well-written and well-structured?

5 = Very clear.
4 = Understandable by most readers.
3 = Mostly understandable to me with some effort.
2 = Important questions were hard to resolve even with effort.
1 = Much of the paper is confusing. 


ORIGINALITY / INNOVATIVENESS (1-5)

How original is the approach? Does this paper break new ground in
topic, methodology, or content? How exciting and innovative is the
research it describes?

Note that a paper could score high for originality even if the results
do not show a convincing benefit.

5 = Seminal: Significant new problem, technique, methodology, or
    insight -- no prior research has attempted something similar.
4 = Creative: An intriguing problem, technique, or approach
    that is substantially different from previous research. 
3 = Respectable: A nice research contribution that represents a
    significant extension of prior approaches or methodologies.
2 = Pedestrian: Obvious, or a minor improvement on familiar
    techniques.
1 = Significant portions have actually been done before or done
    better.


SOUNDNESS / CORRECTNESS (1-5)

First, is the technical approach sound and well-chosen? Second, can
one trust the claims of the paper -- are they supported by proper
experiments and are the results of the experiments correctly
interpreted?

5 = The approach is very apt, and the claims are convincingly supported.
4 = Generally solid work, although there are some aspects of the
    approach or evaluation I am not sure about.
3 = Fairly reasonable work. The approach is not bad, and at least the
    main claims are probably correct, but I am not entirely ready to
    accept them (based on the material in the paper).
2 = Troublesome. There are some ideas worth salvaging here, but the
    work should really have been done or evaluated differently.
1 = Fatally flawed. 


MEANINGFUL COMPARISON (1-5)

Does the author make clear where the problems and methods sit with
respect to existing literature? Are the references adequate? Are the
experimental results meaningfully compared with the best prior
approaches?

5 = Precise and complete comparison with related work. Good job given
    the space constraints.
4 = Mostly solid bibliography and comparison, but there are some
    references missing.
3 = Bibliography and comparison are somewhat helpful, but it could be
    hard for a reader to determine exactly how this work relates to
    previous work.
2 = Only partial awareness and understanding of related work, or a
    flawed empirical comparison.
1 = Little awareness of related work, or lacks necessary empirical
    comparison.


IMPACT OF IDEAS OR RESULTS (1-5)

How significant is the work described? If the ideas are novel, will
they also be useful or inspirational? If the results are sound, are
they also important? Does the paper bring new insights into the nature
of the problem?

5 = Will affect the field by altering other people's choice of
    research topics or basic approach.
4 = Some of the ideas or results will substantially help other
    people's ongoing research.
3 = Interesting but not too influential. The work will be cited, but
    mainly for comparison or as a source of minor contributions.
2 = Marginally interesting. May or may not be cited.
1 = Will have no impact on the field.


REPLICABILITY (1-5)

Will members of the ACL community be able to reproduce or verify the
results in this paper?

Members of the ACL community:

5 = could easily reproduce the results.
4 = could mostly reproduce the results, but there may be some
    variation because of sample variance or minor variations in their
    interpretation of the protocol or method.
3 = could reproduce the results with some difficulty. The settings of
    parameters are underspecified or subjectively determined; the
    training/evaluation data are not widely available.
2 = would be hard pressed to reproduce the results. The contribution
    depends on data that are simply not available outside the author's
    institution or consortium; not enough details are provided.
1 = could not reproduce the results here no matter how hard they
    tried.


IMPACT OF RESOURCES (1-5)

In addition to its direct intellectual contributions, does the paper
promise to release any new resources, such as an implementation, a
toolkit, or new data?

If so, is it clear what will be released and when? Will these
resources be valuable to others in the form in which they are
released? Do they fill an unmet need? Are they at least sufficient to
replicate or better understand the research in the paper?

(This question encourages authors to help the field advance, by
releasing their systems, data, or tools.)

5 = Enabling: The newly released resources should affect other
    people's choice of research or development projects to undertake.
4 = Useful: I would recommend the new resources to other researchers
    or developers for their ongoing work.
3 = Potentially useful: Someone might find the new resources useful
    for their work.
2 = Documentary: The new resources are useful to study or replicate
    the reported research, although for other purposes they may have
    limited interest or limited usability. (Still a positive rating.)
1 = No usable resources released.


RECOMMENDATION (1-6)

There are many good submissions competing for slots at ACL 2010; how
important is it to feature this one? Will people learn a lot by
reading this paper or seeing it presented?

In deciding on your ultimate recommendation, please think over all
your scores above. But remember that no paper is perfect, and remember
that we want a conference full of interesting, diverse, and timely
work. If a paper has some weaknesses, but you really got a lot out of
it, feel free to fight for it. If a paper is solid but you could live
without it, let us know that you're ambivalent. Remember also that the
author has a few weeks to address reviewer comments before the
camera-ready deadline.

Should the paper be accepted or rejected?

6 = Exciting: I'd fight to get it accepted; probably would be one
    of the best short papers at the conference.
5 = Strong: I'd like to see it accepted; it will be one of the
    better short papers at the conference.
4 = Worthy: A good short paper that is worthy of being presented at ACL.
3 = Ambivalent: OK but does not seem up to the standards of ACL.
2 = Leaning against: I'd rather not see it in the conference.
1 = Poor: I'd fight to have it rejected.


REVIEWER CONFIDENCE (1-5)

5 = Positive that my evaluation is correct. I read the paper very
    carefully and am familiar with related work.  
4 = Quite sure. I tried to check the important points carefully. It's
    unlikely, though conceivable, that I missed something that should
    affect my ratings.
3 = Pretty sure, but there's a chance I missed something. Although I
    have a good feel for this area in general, I did not carefully check
    the paper's details, e.g., the math, experimental design, or novelty.
2 = Willing to defend my evaluation, but it is fairly likely that I
    missed some details, didn't understand some central points, or can't
    be sure about the novelty of the work.
1 = Not my area, or paper is very hard to understand. My evaluation is
    just an educated guess.


RECOMMENDATION FOR BEST SHORT PAPER AWARD (1-3)

3 = Definitely.
2 = Maybe.
1 = Definitely not.