ACL 2012: The 50th Annual Meeting of the Association for Computational 
Linguistics

Review form for RESOURCES/EVALUATION research papers

This review form is appropriate for papers that describe a new
set of resources or a new evaluation methodology for computational 
linguistics research.


APPROPRIATENESS (1-5)

Does the paper fit in ACL 2012? (Please answer this question in light
of the desire to broaden the scope of the research areas represented
at ACL.)

5 = Certainly.
4 = Probably.
3 = Unsure.
2 = Probably not.
1 = Certainly not.


CLARITY (1-5)

For the reasonably well-prepared reader, is it clear what resource has
been produced or evaluation methodology proposed and why? Is the paper
well-written and well-structured?

5 = Very clear.
4 = Understandable by most readers.
3 = Mostly understandable to me with some effort.
2 = Important questions were hard to resolve even with effort.
1 = Much of the paper is confusing. 


ORIGINALITY / INNOVATIVENESS (1-5)

How original is the approach? Does this paper break new ground in
topic, methodology, or content? How exciting and innovative is the
research it describes?

5 = Seminal: Significant new resource or evaluation methodology -- no
    prior research has attempted something similar.
4 = Creative: An intriguing resource or evaluation methodology that is
    substantially different from previous research.
3 = Respectable: A nice research contribution that represents a
    notable extension of prior resources or evaluations.
2 = Pedestrian: Obvious, or a minor extension to existing work.
1 = Significant portions have actually been done before or done
    better.


SOUNDNESS / CORRECTNESS (1-5)

Is the methodology used to produce the resource or carry out the
evaluation sound and well-chosen?

5 = The methodology is very apt, and any claims are convincingly
    supported.
4 = Generally solid work, although there are some aspects of the
    methodology or evaluation I am not sure about.
3 = Fairly reasonable work. The approach is not bad, but I am not
    entirely ready to accept the resource or evaluation methodology
    (based on the material in the paper).
2 = Troublesome. There are some ideas worth salvaging here, but the
    work should really have been done differently.
1 = Fatally flawed. 


MEANINGFUL COMPARISON (1-5)

Does the author make clear where the resource or evaluation
methodology sits with respect to existing literature? Are the
references adequate?

5 = Precise and complete comparison with related work. Good job given
    the space constraints.
4 = Mostly solid bibliography and comparison, but there are some
    references missing.
3 = Bibliography and comparison are somewhat helpful, but it could be
    hard for a reader to determine exactly how this work relates to
    previous work.
2 = Only partial awareness and understanding of related work, or a
    flawed comparison.
1 = Little awareness of related work, or lacks necessary comparison.


SUBSTANCE (1-5)

Does this paper have enough substance, or would it benefit from more
ideas or results?

Note that this question mainly concerns the amount of work; its
quality is evaluated in other categories.

5 = Contains more ideas or results than most publications in this
    conference; goes the extra mile.
4 = Represents an appropriate amount of work for a publication in this
    conference. (most submissions)
3 = Leaves open one or two natural questions that should have been
    pursued within the paper.
2 = Work in progress. There are enough good ideas, but perhaps not
    enough in terms of outcome.
1 = Seems thin. Not enough ideas here for a full-length paper.


IMPACT OF IDEAS OR RESULTS (1-5)

How significant is the work described? If the ideas are novel, will
they also be useful or inspirational? Does the paper bring any new
insights?

5 = Will affect the field by altering other people's choice of
    research topics or basic approach.
4 = Some of the ideas or results will substantially help other
    people's ongoing research.
3 = Interesting but not too influential. The work will be cited, but
    mainly for comparison or as a source of minor contributions.
2 = Marginally interesting. May or may not be cited.
1 = Will have no impact on the field.

IMPACT OF ACCOMPANYING SOFTWARE (1-5)

If software was submitted along with the paper, what is the expected
impact of the software package? Will this software be valuable to 
others? Is the software well written and easy to use? Does it have 
a complete README file? Does the software match the research described 
in the paper? 


5 = Enabling: The newly released software should affect other
    people's choice of research or development projects to undertake.
4 = Useful: I would recommend the new software to other researchers
    or developers for their ongoing work.
3 = Potentially useful: Someone might find the new software useful
    for their work.
2 = Documentary: The new software is useful to study or replicate
    the reported research, although for other purposes it may have
    limited interest or limited usability. (Still a positive rating)
1 = No usable software released.



IMPACT OF ACCOMPANYING DATASET (1-5)


If a dataset was submitted along with the paper, what is the expected
impact of the dataset? Will this dataset be valuable to others in
the form in which it is released? Does it fill an unmet need? Is
it at least sufficient to replicate or better understand the
research in the paper?


5 = Enabling: The newly released datasets should affect other
    people's choice of research or development projects to undertake.
4 = Useful: I would recommend the new datasets to other researchers
    or developers for their ongoing work.
3 = Potentially useful: Someone might find the new datasets useful
    for their work.
2 = Documentary: The new datasets are useful to study or replicate
    the reported research, although for other purposes they may have
    limited interest or limited usability. (Still a positive rating)
1 = No usable datasets submitted.


RECOMMENDATION (1-6)

There are many good submissions competing for slots at ACL 2012; how
important is it to feature this one? Will people learn a lot by
reading this paper or seeing it presented?

In deciding on your ultimate recommendation, please think over all
your scores above. But remember that no paper is perfect, and remember
that we want a conference full of interesting, diverse, and timely
work. If a paper has some weaknesses, but you really got a lot out of
it, feel free to fight for it. If a paper is solid but you could live
without it, let us know that you're ambivalent. Remember also that the
author has a few weeks to address reviewer comments before the
camera-ready deadline.

Should the paper be accepted or rejected?

6 = Exciting: I'd fight to get it accepted; probably would be one
              of the best papers at the conference.
5 = Strong: I'd like to see it accepted; it will be one of the
            better papers at the conference.
4 = Worthy: A good paper that is worthy of being presented at ACL.
3 = Ambivalent: OK but does not seem up to the standards of ACL.
2 = Leaning against: I'd rather not see it in the conference.
1 = Poor: I'd fight to have it rejected.


REVIEWER CONFIDENCE (1-5)

5 = Positive that my evaluation is correct. I read the paper very
    carefully and am familiar with related work.  
4 = Quite sure. I tried to check the important points carefully. It's
    unlikely, though conceivable, that I missed something that should
    affect my ratings.
3 = Pretty sure, but there's a chance I missed something. Although I
    have a good feel for this area in general, I did not carefully check
    the paper's details, e.g., the math, experimental design, or novelty.
2 = Willing to defend my evaluation, but it is fairly likely that I
    missed some details, didn't understand some central points, or can't
    be sure about the novelty of the work.
1 = Not my area, or paper is very hard to understand. My evaluation is
    just an educated guess.


RECOMMENDATION FOR BEST LONG PAPER AWARD (1-3)

3 = Definitely.
2 = Maybe.
1 = Definitely not.