-
EMAKG: Enhanced Microsoft Academic Knowledge Graph
The EMAKG is a large dataset of scientific publications and related entities such as authors, affiliations, venues, and fields of study. Data includes authors' careers and... -
Where do migrants and natives belong in a community: a Twitter case study and...
Today, many users are actively using Twitter to express their opinions and to share information. Thanks to the availability of the data, researchers have studied behaviours... -
Private Cybersecurity NER dataset
Our dataset is created by merging APTNER and CyNER datasets, containing 13601 sentences, 347779 tokens, and 37684 entities. The split ratio was roughly 70% for training and... -
Iperf K8s-based Power and Resource consumption dataset
The data were collected in a Prometheus-like data format: each entry has a timestamp, a value and key-value labels containing additional information. Metrics were gathered...-
CSV
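A minimal sketch of the entry shape described for this dataset (a timestamp, a value, and key-value labels). The class name `MetricEntry`, the helper `select`, the label keys, and the sample values are all hypothetical and only illustrate the layout; the actual column and label names in the CSV files are not given in this listing.

```python
from dataclasses import dataclass, field


@dataclass
class MetricEntry:
    """One Prometheus-like sample: timestamp, value, and free-form labels."""
    timestamp: float                      # assumed to be seconds since the epoch
    value: float
    labels: dict = field(default_factory=dict)


def select(entries, **wanted):
    """Return the entries whose labels contain every requested key-value pair."""
    return [
        e for e in entries
        if all(e.labels.get(k) == v for k, v in wanted.items())
    ]


# Toy entries with invented label names, for illustration only.
entries = [
    MetricEntry(1700000000.0, 12.5, {"metric": "power_watts", "pod": "iperf-client"}),
    MetricEntry(1700000001.0, 0.42, {"metric": "cpu_usage", "pod": "iperf-client"}),
    MetricEntry(1700000001.0, 13.1, {"metric": "power_watts", "pod": "iperf-server"}),
]

power = select(entries, metric="power_watts")
server_power = select(entries, metric="power_watts", pod="iperf-server")
```

Keeping labels as a plain dictionary mirrors the key-value structure mentioned in the description and lets a query filter on any subset of labels.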
-
Stroke and sepsi
The considered stroke dataset (DOI:10.17632/x8ygrw87jw.1, DOI:10.1016/j.artmed.2019.101723) was pre-processed by removing attributes with more than 30% missing values, by... -
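The pre-processing step mentioned above (removing attributes with more than 30% missing values) could be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `drop_sparse_attributes`, the use of `None` to mark a missing value, and the toy rows and column names are all hypothetical.

```python
def drop_sparse_attributes(rows, max_missing=0.30):
    """Drop every attribute whose share of missing (None) values exceeds max_missing."""
    if not rows:
        return []
    n = len(rows)
    keep = [
        col for col in rows[0].keys()
        if sum(r[col] is None for r in rows) / n <= max_missing
    ]
    return [{col: r[col] for col in keep} for r in rows]


# Toy records with invented attribute names, for illustration only.
rows = [
    {"age": 71, "glucose": None, "score": 4},
    {"age": 64, "glucose": None, "score": 7},
    {"age": 58, "glucose": 5.9, "score": None},
    {"age": 80, "glucose": 6.4, "score": 2},
]

cleaned = drop_sparse_attributes(rows)
```

In this toy input, `glucose` is missing in 2 of 4 rows (50%) and is dropped, while `score` is missing in 1 of 4 rows (25%) and is kept.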
Know your trees dataset
A set of images of urban trees in Tortona specifically focusing on images of trees, leaves, bark and habits along with general information, taxonomy, and selected biometric...-
ZIP
-
Cybersecurity NER SecureBERT model
This method includes a Python script and files of a SecureBERT model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that will be...-
JSON
-
Characterising different communities of Twitter users: migrants and natives
Today, many users are actively using Twitter to express their opinions and to share information. Thanks to the availability of the data, researchers have studied behaviours... -
Origin and destination attachment: study of cultural integration on Twitter
The cultural integration of immigrants conditions their overall socio-economic integration as well as natives’ attitudes towards globalisation in general and immigration in...-
HTML
-
Cybersecurity NER RoBERTa-base model
This method includes a Python script and files of a RoBERTa-base model fine-tuned on our Cybersecurity NER dataset. The method requires as input a list of sentences that will...-
JSON
-
Private Vegetation of a basin of the Po river Dataset
We provide two climatological datasets composed of D = 136 (with 1038 samples) and D = 1991 (with 981 samples) continuous climatological features and a scalar target, which... -
y/Politics 1k
Social simulation data generated using Y Social focused on political-related topics. Y Social is a Digital Twin of an online social media platform that allows researchers to...-
ZIP
-
Private Word-in-Context task for Italian
The general goal of the WiC-ITA task is to establish whether a word w occurring in two different sentences, s_1 and s_2, has the same meaning or not. In particular, our task... -
Measuring the Salad Bowl: Superdiversity on Twitter
Superdiversity refers to large cultural diversity in a population due to immigration. In this paper, we introduce a superdiversity index based on the changes in the emotional... -
Private EnviroStream
This repository contains datasets, queries and a generator for the EnviroStream, a benchmark for Stream Reasoning (SR) systems. SR focuses on applying inference to dynamic... -
Human and mouse gene regulatory networks
The dataset was built by considering gene expression data related to 6 different organs (liver, lung, brain, skin, bone marrow, heart), obtained by control samples available... -
Combining Twitter and Mobile Phone Data to Observe Border-Rush: The Turkish-E...
Following Turkey's 2020 decision to revoke border controls, many individuals journeyed towards the Greek, Bulgarian, and Turkish borders. However, the lack of verifiable... -
Private Highway driving simulation
The SUMO simulator is used to model scenarios with different road topologies and traffic intensities, randomizing the flow of vehicles, to ensure the generation of sufficiently... -
Digital footprints of international migration on twitter
Studying migration using traditional data has some limitations. To date, there have been several studies proposing innovative methodologies to measure migration stocks and...