Conversational search dataset with labels

The CAsT 2019 data is split into two files, one for training and one for testing.
- Training set: CAsT 2019 conversations from the training set and from the test set without qrels, plus the ConvQ dataset
- Test set: CAsT 2019 conversations with qrels

Labels
- SE: classification label for utterances that are Self Explanatory (i.e., they do not need any rewriting)
- FT: classification label for utterances referring to the First Topic in the conversation
- PT: classification label for utterances referring to a Previous Topic in the conversation (different from the first topic)
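As an illustration only, the three labels above could be modeled as a small enumeration when loading the data. This is a minimal sketch: the names `UtteranceLabel` and `label_distribution` are hypothetical, the example label sequence is made up, and nothing here assumes a particular file layout, since this page does not describe one.

```python
from collections import Counter
from enum import Enum


class UtteranceLabel(Enum):
    """The three labels listed on this page (descriptions paraphrased)."""
    SE = "self-explanatory, needs no rewriting"
    FT = "refers to the first topic of the conversation"
    PT = "refers to a previous topic other than the first"


def label_distribution(labels):
    """Count label strings; raises KeyError for anything other than SE/FT/PT."""
    counts = Counter()
    for raw in labels:
        # Lookup by member name, tolerant of case and surrounding whitespace.
        counts[UtteranceLabel[raw.strip().upper()]] += 1
    return counts


# Hypothetical label sequence for one conversation (not taken from the dataset).
example = ["SE", "FT", "FT", "PT", "se"]
dist = label_distribution(example)
```

Looking labels up by member name means an unexpected value fails loudly instead of being silently counted as a fourth class.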

Personal Data Attributes

Description: Personal Data related Information

Field Value
ChildrenData No
Personal Data No
Personal data was manifestly made public by the data subject N/A (Not applicable)
Additional Info
Field Value
Accessibility Both
Accessibility Mode Download
Accessibility Mode OnLine Access
Availability On-Line
Basic rights Copying
Basic rights Download
Creation Date 2021-11-08
Creator Muntean, Cristina Ioana
Dataset Citation [Mele2021] Ida Mele, Cristina Ioana Muntean, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto, Ophir Frieder, Adaptive utterance rewriting for conversational search, Information Processing & Management, Volume 58, Issue 6, 2021, https://doi.org/10.1016/j.ipm.2021.102682.
Dataset Re-Use Safeguards Please cite the original paper.
Field/Scope of use Non-commercial research only
Group Others
License term 2022-03-08 / 2024-12-31
Manifestation Type Virtual
Processing Degree Primary
Retention Period 2022-03-08 / 2024-12-31
Sublicense rights No
Territory of use World Wide
Thematic Cluster Other
Thematic Cluster Text and Social Media Mining [TSMM]
system:type Dataset
Management Info
Field Value
Author Muntean Cristina
Maintainer Muntean Cristina
Version 1
Last Updated 24 June 2023, 01:15 (CEST)
Created 18 March 2022, 00:45 (CET)