-
Dataset for Evaluating Abstractive Summaries of Crisis-Related Social Media
The dataset was created for the evaluation of summaries generated from social media posts made during five natural disasters. The dataset contains: ground truth reports created by human...
-
WIRE dataset
This dataset consists of 503 pairs of Wikipedia entities drawn from the New York Times dataset with a human-assigned relatedness score. The domain experts based their...
-
Ariadne English Dendrochronology Entity Recognizer
Identifies terms and phrases in English for analysing archaeological text. The method delivers named entities of archaeological elements, wood material, sample, and date, with...
-
method-engine
-
Ariadne Dutch Dendrochronology Entity Recognizer
Identifies terms and phrases in Dutch for analysing archaeological text. The method delivers named entities of archaeological elements, wood material, sample, and date, with...
-
method-engine
-
Amazon reviews
This (link to the) dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014. This dataset includes reviews...
-
HTML
-
Cross-Lingual Dataset of Crisis-Related Social Media
If you use this dataset, please cite the following paper: Fedor Vitiugin, Carlos Castillo: Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive...
-
Dictionary creator
This tool creates a dictionary with inverse document frequency (idf) values from the Google NGrams dataset.
-
GATE Cloud Chemical Entity Recogniser
This service annotates chemical named entities using the open source OSCAR4 tagger. As well as the names of the detected entities, the tagger also returns their structure in...
-
method-engine
-
Ariadne Swedish Archaeology Named Entity Recognizer
Identifies terms and phrases in Swedish for analysing archaeological text. The method delivers named entities of archaeological context, physical object, material, time...
-
method-engine
-
The Italian Music Dataset
The dataset is built using the Spotify and SoundCloud APIs. It is composed of over 14,500 different songs by both famous and less famous Italian musicians. Each song...
-
JSON
-
Ariadne Swedish Dendrochronology Entity Recognizer
Identifies terms and phrases in Swedish for analysing archaeological text. The method delivers named entities of archaeological elements, wood material, sample, and date, with...
-
method-engine
-
ArchiveSpark
ArchiveSpark is an Apache Spark framework for easy data access, processing, extraction, and derivation for Web archives and archival collections. It has a simple and...
-
Product Reviews for Ordinal Quantification
This data set comprises a labeled training set, validation samples, and testing samples for ordinal quantification. It appears in our research paper "Ordinal Quantification...
-
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of Wikipedia non-stub articles from a December 2015 English dump with the following parameters: -cbow 0...
-
Cherenkov Telescope Data for Ordinal Quantification
This labeled data set is targeted at ordinal quantification. It appears in our research paper "Ordinal Quantification Through Regularization", which we have published at...
-
Learning to quantify: LeQua 2022 datasets
The aim of LeQua 2022 (the 1st edition of the CLEF “Learning to Quantify” lab) is to allow the comparative evaluation of methods for “learning to quantify” in textual...
-
Ariadne Dutch Archaeology Named Entity Recognizer
Identifies terms and phrases in Dutch for analysing archaeological text. The method delivers named entities of archaeological context, physical object, material, time...
-
method-engine
-
Ariadne English Archaeology Named Entity Recognizer
Identifies terms and phrases in English for analysing archaeological text. The method delivers named entities of archaeological context, physical object, material, time...
-
method-engine
-
Wikinews dataset
This dataset consists of a sample of 365 news articles published by Wikinews from November 2004 to June 2014 and annotated with about 5000 entities, each associated with a saliency...
-
JSON