2014Q1 Reports: Info Officer

From Admin Wiki

[Link to 2013 Q3 Report] [Link to 2013 Q1 Report]

The Information Officer (IO) portfolio covers the integration of the ACL-wide activities related to information dissemination, including the Anthology, website, wiki, portal and archive. Plans include integrating logins (through OpenID and OAuth; IN PROGRESS) and refreshing our information services so that they are up to date and professionally designed (PLANNED).

Long-term goals are to move the aclweb.org infrastructure to a more modern webhost, to ensure the accessibility and long-term maintenance of aclweb.org and our other sites, and to make the information services cost-neutral through sponsorship by corporate interests.

IO Overview

Budget. The IO has a budget to oversee part-time staff allocated to help improve our association's websites, which includes maintenance, upgrading, migration and backup. So far, we have incurred costs of 2,270, with a further amount of over 400 outstanding. This is well in line with our projections.

DOIs/CrossRef. We are now a registered body for assigning DOIs to scholarly materials in CrossRef. This costs USD 275 per year, plus USD 1 per resource registered for a DOI. Under a previous agreement, we decided to start DOI assignment with our new TACL journal, but it seems that the journal's EICs have not had the bandwidth to deal with this issue, as it requires some reworking of how cited items are reported in their system. One important consideration in joining CrossRef and being able to assign DOIs is CrossRef's mandate that all journal articles do outbound citation linking within 18 months of joining CrossRef (see here, rule #6). We will work with TACL to establish this workflow first, and then try to propagate it for use in our conference systems (this may have to be tied to START development).

Given this bottleneck and the unforeseeable delay in getting a workflow established, I would suggest that we start with our voluminous conference proceedings data instead. I will start with the annual ACL proceedings, then deal with the chapter conferences (EACL, NAACL) next.

Collaboration with ELRA. We have also discussed joint work on information services with ELRA's executive officers, Nicoletta Calzolari and Khalid Choukri. As a result, there was a meeting in Paris on 17 Nov 2013 that led Gertjan to sign the NLP12 Paris Declaration on ACL's behalf. With respect to the IO portfolio, this explicitly endorses the common scheme for an International Standard Language Resource Number (ISLRN) to itemize and inventory language resources for searching and bibliometrics.

Other plans include:

  • Investigating the citation indexing of our materials -- it appears that ACL materials are somewhat haphazardly indexed by Elsevier and SCI. We need to address this because many of our members depend on the availability of indexing when deciding whether to publish in our venues. The beta Anthology (see below) has proper metadata on each paper's individual page (not present in the original Anthology), which will make it easier for clients (like Google Scholar) to index the metadata properly.
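
As an illustration of the per-paper metadata mentioned above, the following is a minimal Python sketch that renders Highwire Press-style `citation_*` meta tags, the form that Google Scholar's inclusion guidelines describe for indexing. The `Paper` record, its field names and the example values are hypothetical; they are not taken from the beta Anthology's code, whose implementation is not described in this report.

```python
from dataclasses import dataclass
from html import escape
from typing import List


@dataclass
class Paper:
    """Hypothetical record for one paper; the field names are assumptions."""
    title: str
    authors: List[str]
    year: int
    pdf_url: str


def scholar_meta_tags(paper: Paper) -> str:
    """Render one <meta> tag per metadata field, with one tag per author."""
    pairs = [("citation_title", paper.title)]
    pairs += [("citation_author", author) for author in paper.authors]
    pairs.append(("citation_publication_date", str(paper.year)))
    pairs.append(("citation_pdf_url", paper.pdf_url))
    # Escape values so that characters such as & or " cannot break the HTML.
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in pairs
    )


if __name__ == "__main__":
    # Placeholder values only; this is not a real Anthology entry.
    example = Paper(
        title="An Example Paper Title",
        authors=["Doe, Jane", "Roe, Richard"],
        year=2014,
        pdf_url="http://example.org/anthology/X14-0000.pdf",
    )
    print(scholar_meta_tags(example))
```

The output would be placed in the `<head>` of each paper's page, so that a crawler can read the bibliographic data without parsing the visible page text.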


Anthology

The ACL Anthology is a digital archive of research papers in computational linguistics, sponsored by the CL community, and freely available to all. We employ a Creative Commons Attribution Non-Commercial, Share-Alike license for materials published by ACL; dual licensing for a fee is presumably possible, but is not currently exercised.

This past half year, aside from regular ingestion, videos from NAACL have been made available as links in the old Anthology.

The Anthology now contains over 24,500 papers (up from 21,900 a year ago and 23,000 six months ago). The new ACL Anthology is now active but not yet out of beta, so we are going to maintain both sites for the time being. The domain name aclanthology.info has been registered for 10 years and points to the webhost that provides the Anthology.

Mailing List. The membership of the Anthology mailing list (http://groups.google.com/group/acl-anthology) has grown to 469 members (up from 394 a year ago and 426 six months ago). This is an announcement-only list, on which we notify members of newly released materials.

Plans. A key thrust this year will be to start assigning DOIs, as part of the ACL's initiative to take DOIs under our control. We are working towards doing this, pending the Exec's approval.

A second thrust is to archive other forms of scientific knowledge that we are interested in. These include software, datasets and videos. The procedures for integrating these with START and the submission process need to be worked out, and the space requirements for these services assessed. For the time being, we will concentrate on videos.

A third thrust for this year will be to incorporate the results of the R50 workshop into the Anthology, and to provide an API that allows third-party applications to automatically annotate articles and papers in the Anthology with new metadata as they become available. Such an API will raise the visibility of the Anthology as an object of study, complementing our earlier work to make the Anthology's text a corpus.

We have long-term plans to work on the following problems, which are less urgent:

  • A previous discussion (with Ken Church) proposed that we create a single BibTeX file for all Anthology materials. The beta Anthology can generate such information fairly easily from its database backing; we plan to have this file available soon (before the ACL 2014 conference).
  • Collaboration with START and aclpub (this may also involve the Conference Officer's work).
  • Collaboration with ELRA with respect to the use of the LRE Map and ISLRNs, and voluntarily helping them to scan backlog archives into digital form.
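
The single BibTeX file proposed in the first item above could be produced along the lines of the minimal Python sketch below, which serializes a list of records into the text of one `.bib` file. The record layout, the citation keys and the sample entry are hypothetical; the beta Anthology's actual database schema is not described in this report.

```python
from typing import Dict, Iterable, Tuple


def bibtex_entry(key: str, fields: Dict[str, str],
                 entry_type: str = "inproceedings") -> str:
    """Format one record as a BibTeX entry, skipping empty fields.

    Note: values are written as-is; a real export would also need to
    escape characters that are special in BibTeX/LaTeX.
    """
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in fields.items() if value
    )
    return f"@{entry_type}{{{key},\n{body}\n}}"


def bibtex_dump(records: Iterable[Tuple[str, Dict[str, str]]]) -> str:
    """Join all entries, separated by blank lines, into one file's text."""
    return "\n\n".join(bibtex_entry(key, fields) for key, fields in records) + "\n"


if __name__ == "__main__":
    # Placeholder record; in practice these would come from the database.
    sample = [
        ("doe-roe:2014:EXAMPLE", {
            "title": "An Example Paper Title",
            "author": "Doe, Jane and Roe, Richard",
            "booktitle": "Proceedings of an Example Conference",
            "year": "2014",
            "pages": "",  # empty, so it is omitted from the output
        }),
    ]
    print(bibtex_dump(sample), end="")
```

Generating the file from the database on each ingestion, instead of maintaining it by hand, would keep it consistent with the per-paper pages.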

Website / Portal

The ACL website continues to serve as the primary online resource for the organization. It contains the main ACL site; an ACL Wiki, which serves as a resource for the general computational linguistics community; an ACL Admin wiki, used to store and maintain ACL-specific resources such as reports, handbooks and policies; and an exec wiki reserved for the use of ACL execs. We also maintain mirrors of individual ACL conference websites, membership email lists for ACL announcements and a listing of resolutions of the ACL Exec Committee.

The ACL Portal was created to provide a web-based platform to house facilities for the benefit of members. The Portal currently serves little function other than maintaining a list of current members and a payment gateway for membership. We are currently working towards integrating the Portal into the website's functionality, now that both systems are run on a common platform (Drupal 7). Integration will involve upgrading existing custom modules developed by Ben Phelan (the previous developer) for the Portal to Drupal 7; this is ongoing work.

We are now working in parallel on consolidating the ACL Website and the ACL Portal, and on establishing a central login for ACL services (something akin to an "ACL Account" a la Google or Facebook). We are planning to use OpenID and OAuth, which would allow members to link their ACL account with other services (e.g., Google, LinkedIn, Twitter, Microsoft/Hotmail), so that one could use login credentials from those services for ACL use.
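
To make the planned account linking more concrete, here is a minimal Python sketch of the first step of an OAuth 2.0 authorization-code flow: building the URL to which a member is redirected in order to approve the link with an external provider. The endpoint, client ID and redirect URI are placeholders, and the choice of OAuth 2.0 specifically is an assumption; the plan above names only "OpenID and OAuth".

```python
import secrets
from urllib.parse import urlencode


def authorization_url(auth_endpoint: str, client_id: str,
                      redirect_uri: str, scope: str, state: str) -> str:
    """Build the redirect URL for an OAuth 2.0 authorization-code request."""
    params = {
        "response_type": "code",       # ask the provider for an authorization code
        "client_id": client_id,        # identifier issued to us by the provider
        "redirect_uri": redirect_uri,  # where the provider sends the member back
        "scope": scope,                # what we are asking to access
        "state": state,                # random value echoed back, to detect forgery
    }
    return f"{auth_endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    state = secrets.token_urlsafe(16)  # would be stored in the member's session
    print(authorization_url(
        "https://provider.example/oauth2/authorize",
        "acl-portal-demo",
        "https://portal.example/oauth/callback",
        "openid email",
        state,
    ))
```

On return, the site would check that the `state` value matches the stored one and then exchange the returned code for a token; in a Drupal 7 deployment this would more likely be handled by an existing module than by hand-written code.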

Update. From 2013 Q3 to now, we have reached several milestones:

  • We completed updates to our software installations to fix security concerns and updated the website to be responsive (so that it functions somewhat better on mobile devices such as smartphones and tablets). In particular, our MediaWiki installations were updated, and the main website was upgraded to a Drupal 7 installation that is responsive. This makes the website largely compatible with the code for the Portal.
  • We have also mirrored a few additional old conference websites (especially where those domains have gone offline and have not been renewed) and listed them within the website, and now have a better policy for this.
  • ACL Elections were also run by the webmaster, with help from Drago. We inherited his code, installed it, and were able to run the elections smoothly.
  • Two new resolutions were passed by the ACL Exec Committee and added to the list maintained on the Admin Wiki.
  • There were small problems that were a bit difficult to trace in the registration and payment code in the Portal. These are now fixed.
  • There are some communication problems with respect to migrating and updating information in the ACL website. The webmaster needs to be aware that the business manager, secretary and the information officer all have jurisdiction over the work. This was not made clear in the first few tasks, which caused confusion for the webmaster.
  • As time management became an issue, our current webmaster, Joshua Herring, has voluntarily resigned and will no longer serve, as of the end of March.

We would like to place on the record a note of thanks to Josh for his service. Josh's work on the projects has incurred a cost of about 2K USD for his time, which has been paid out of the 10K USD budget allocated to the IO portfolio.

Plans. In the time remaining, Josh will be working to close out his duties on the integration between the Portal and the website. We are currently planning to source a replacement, without the involvement of the ACL Exec. We are also planning to move the site to a host other than our current provider, 1and1.com, as the current webhost provides fairly limited facilities for both DNS and host access.