UserNLP: User-centered Natural Language Processing Workshop

Natural Language Processing (NLP) models are vital for analyzing, retrieving, and summarizing the vast amounts of digital information produced every day. However, models trained as "one-size-fits-all" do not explicitly consider language and interpretation diversity among individuals or groups of individuals. Such user-level variation (delineated by, e.g., demographics, culture, user interests) can cause stylistic and even semantic disparities, which decrease dialogue coherence, harm fairness, and reduce model robustness. User-centered NLP can fill these gaps by explicitly taking these variations into account and focusing on user-level modeling tasks. Incorporating user-level elements into NLP models, for example by inferring user preferences, beliefs, and behavior, has an increasing impact on a broad range of downstream applications, including text understanding and generation, conversational information retrieval, and mental health care.

The UserNLP Workshop will provide a unique chance for researchers to collect and aggregate current developments, challenges, and opportunities across a multitude of related fields. These include:

privacy and fairness concerns in user data collection and annotation
opportunities and stereotyping risks in modeling
challenges of personalized evaluation

We aim to create a platform where researchers can present rising challenges in building user-centered NLP models, going beyond modeling individual documents, and exploring various ways to capture stylistic variations and enhance model personalization.

Since annotating user-level data requires making non-trivial judgements based on a collection of documents, the complexity of user information can prevent researchers from developing precise annotations on user-level corpora. While data statements and schemas exist for document-level annotations, few studies have explored data schemas and standards at the user level.

Moreover, conventional evaluation metrics and schemes for the document-level models may not be appropriate to capture the diversity and complexity of user-level information. The lack of ethically appropriate, standardized, and easily accessible evaluation data and metrics is perhaps the major hindrance to the development of this field, impeding the reproducibility of experimental results.

Finally, user-centered NLP raises a wide range of important ethical questions, such as algorithmic fairness and user privacy. An informed discussion on these timely topics requires gathering at one table researchers who encounter stylistic disparities and user-level tasks directly or indirectly in their work.

While an increasing number of NLP research works develop user-level models and datasets, there is no de facto venue to bring this topic to interdisciplinary researchers and communities. Some NLP workshops, such as those focusing on social media or dialog systems, have recently raised awareness of user-related issues. Our proposed workshop will provide a platform beyond those venues by focusing solely on the emerging challenges of user-level models and applications, such as privacy and fairness, human-level modeling, and personalization for interdisciplinary communities. Furthermore, our venue offers a unique chance to present user-involved NLP topics, such as interactive and human-in-the-loop NLP.

The goals of our proposed workshop squarely align with the Web Conference’s mission of developing technologies, practices and applications to create a Web ecosystem that is efficient, trustworthy, safe, open and inclusive for everyone. Of particular interest are the topics of user modeling and personalization, which have been crucial to the success of web technologies such as search and recommendation systems. As we deploy systems that perform inferences over personal data, it is critical to ensure that these models can accurately represent individuals and account for their uniqueness. Better user models and representations can directly contribute to improvements in areas that are relevant for the Web Conference community, such as methods for Social Network Analysis and applications for Computational Social Sciences and Digital Epidemiology, to name a few. However, this also raises important questions related to the ethics, fairness, privacy, transparency, and accountability of Web technologies, which will be of interest to participants of the Web4Good: FATES workshop.

Call for Papers
The overarching questions that motivate this workshop are:

To what extent do stylistic variations indirectly impact downstream applications which were historically treated as stylistically uniform?
To what extent is it desirable to exploit individual variations to reduce demographic disparity, promote user-level models, and personalize NLP applications?
To what extent can recent advances in related areas, including representation learning, domain adaptation, and transfer learning, leverage individual variations, understand user intentions, customize NLP models, and deliver interpretable outputs for users’ specific needs?
How can we better evaluate user-centric models to shed light on user-level disparities and the impact of personalized models?
How can we better achieve privacy-preserving user-centric NLP models across a wide range of personalization and user-level tasks?
How much user data is sufficient for system performance?

A non-exhaustive list of proposed topics and applications of interest follows. Suggested topics include:

Effects of stylistic variation on downstream tasks
User-level distributional vector models
Personalization and user-aware natural language generation
Fairness and ethics in user-level tasks
User modeling and user behavior analysis
Effective approaches to evaluate user-level models
Interactive and personalized information retrieval
Challenges in user privacy and private user-centered models

Potential applications include:

User sociodemographic inference applications, together with their issues and risks
Personalized text generation
User modeling for health applications (e.g. mental health, preventive care)
Identifying trustworthiness and deception of users
Rhetoric and personalization (e.g. stylistic choices in political speeches, etc.)

Submission Guidelines:

Full research papers (up to 8 pages for main content)
Short research papers (up to 4 pages for main content)
Vision/Position papers (up to 4 pages for main content)

The workshop calls for full research papers (up to 8 pages + 2 pages of appendices + 2 pages of references), describing original work on the listed topics, and short papers (up to 4 pages + 2 pages of appendices + 2 pages of references), on early research results, new results on previously published works, demos, and projects. In accordance with Open Science principles, research papers may also take the form of data papers and software papers (short or long). The former present the motivation and methodology behind the creation of data sets that are of value to the community, e.g., annotated corpora, benchmark collections, and training sets. The latter present software functionality, its value for the community, and its application, in terms accessible to a non-specialist reader. To enable reproducibility and peer review, authors will be requested to share the DOIs of the data sets and the software products described in the articles and to thoroughly describe their construction and reuse.

The workshop also calls for vision/position papers (up to 4 pages + 2 pages of appendices + 2 pages of references) providing insights into new or emerging areas, innovative or risky approaches, or emerging applications that will require extensions to the state of the art. These need not include results yet, but should carefully elaborate on the motivation and the ongoing challenges of the described area.

Submissions for review must be anonymous and in PDF format and must adhere to the ACM template and format. Submissions that do not follow these guidelines, or do not view or print properly, may be rejected without review.

The proceedings of the workshops will be published jointly with The Web Conference 2022 proceedings.

Submit your contributions following the link: https://easychair.org/cfp/UserNLP_2022

The deadline for submission is 11:59pm GMT-12 on February 3rd, 2022.
