OLAC Report for 2002
OLAC - Open Language Archives Community - www.language-archives.org
Steven Bird and Gary Simons
OLAC, the Open Language Archives Community, is an international
partnership of institutions and individuals who are creating a
worldwide virtual library of language resources by: (i) developing
consensus on best current practice for the digital archiving of
language resources, and (ii) developing a network of interoperating
repositories and services for housing and accessing such resources.
OLAC builds on the Open Archives Initiative and the Dublin Core
Metadata Initiative, and is sponsored by the NSF/EC project
"International Standards in Language Engineering" (ISLE).
The OLAC homepage, at www.language-archives.org, hosts many documents
providing full technical details of OLAC, including the OLAC Metadata
Set, the Metadata Schema, the OLAC Process document, information for
implementers, and the OLAC-General mailing list.
In the past 12 months, there have been three major developments with OLAC:
LAUNCHES: OLAC was formally launched in North America in a symposium held
at the 76th Annual Meeting of the Linguistic Society of America in San
Francisco (January 2002). OLAC was also launched in Europe in a
symposium held at the 3rd International Conference on Language Resources
and Evaluation, in Las Palmas, Spain (May 2002).
PARTICIPATION: Participation in OLAC has doubled. Today there are over 20
participating archives and services: ACL, ANLC, AISRI, APS, ATILF, CBOLD,
DFKI, ELRA, Ethnologue, LACITO, LDC, LINGUIST, OLACA, OTA, Perseus
Project, Rosetta, SCOIL, SIL, TalkBank, TRC and TRACTOR. Together these
institutions have documented some 30,000 language resources (corpora,
tools, publications, etc.), and they can be searched simultaneously.
Note that the ACL Anthology is indexed by OLAC.
INFRASTRUCTURE: Key software infrastructure is now in place, including
Vida, a virtual data provider for proxied services; ORyX, an XML format
for representing OLAC repositories; ORE, the OLAC repository editor, a
forms-based metadata editor; and DP9, a web-crawler gateway
permitting search engines to index all OLAC content.
There are three ways in which members of the community can become involved:
1. Use OLAC to search for needed language resources. Users can search
based on language, linguistic data type, linguistic software
2. Document new resources using OLAC metadata. Use OLAC tools to
create/publish descriptions of language resources.
3. Contribute to the development of controlled vocabularies for language resource
To find out more, please visit www.language-archives.org and join the OLAC-General