Multilevel Monitoring of Activity and Sleep in Healthy people
The Multilevel Monitoring of Activity and Sleep in Healthy people (MMASH) dataset provides 24 hours of continuous beat-to-beat heart data, triaxial accelerometer data, sleep...
GPS Origin Destination Matrix in Tuscany
This dataset is the origin-destination matrix among the municipalities of Tuscany, extracted from GPS tracks of private vehicles collected from 2014-02-10 to...
CSV
Soccer Events
This dataset contains data on one full season of soccer games. For each player it records the locations (positions on the pitch) visited and all the events they generated...
ZIP
Introduction to Data Curation
This course is an introduction to data collection, data preparation and transformation, and data analysis. It contains the essential concepts for a researcher in order to...
PDF
Social Network dataset - LiveJournal
LiveJournal is a free online blogging community where users declare friendship with each other. LiveJournal also allows users to form a group which other members can then join. We...
HTML
Call Data Record District of Pisa 2013 October
The dataset contains mobile phone records collected in the provinces of Pisa, Lucca, Livorno and Firenze in October 2013. It contains about 60 million Call Data Records (CDR),...
ClueWeb09
The ClueWeb09 dataset consists of about 1 billion web pages in ten languages that were collected in January and February 2009. It was created to support research on...
Official administrative information of Tuscany
The data contains the spatial partitioning of Tuscany and some statistical information published by the Italian Statistical Bureau.
LOD
A hybrid approach for PPI
We propose a new framework that can exploit topological and biological information to predict protein-protein interactions. The algorithm relies on the underlying hypothesis...
German Credit
In the German Credit dataset, each of the 1,000 persons is classified as a good or bad creditor according to attributes like age, sex, checking_account, credit_amount,...
CSV
Twitter Dumps
The dataset consists of 10% of the daily stream of tweets produced on Twitter, filtered into 3 subsets: English, Italian, geo-referenced. The tweets are a random sample of...
Open data from NervousNet
This dataset contains anonymized proximity information sent by 154 mobile phones (both Android and iPhone) via phone apps. This information is sent by Bluetooth beacons every...
ZIP
Car sharing dataset
The dataset comprises pickup and drop-off times and locations of vehicles in 10 European cities for one of the major free-floating car sharing operators. For nine of these...
Twitter social bots
Spambots are automated accounts (i.e., accounts driven by a bot) that repeatedly advertise unsolicited and often harmful content (e.g., malware, URLs to phishing Web sites,...
Broad Twitter Corpus
The Broad Twitter Corpus is a named entity-annotated dataset of tweets, collected in order to capture temporal, spatial and social diversity. The goal of the corpus is to...
JSON
Gene-specific regularization for COPD partial-correlation estimation
We introduce a gene-specific regularization factor when computing the Partial Correlation score to make the indeterminate regression feasible. We decided to slightly modify...
Estonian public sector electronic services and service providers and consumers
The dataset contains records of electronic services (aka X-Road services), service providers and consumers harvested in April 2014 from RIHA (https://riha.eesti.ee). The data...
Twitter fake followers
Fake followers are fake accounts created in bulk to follow a target account, and they can be bought from online markets. In other words, their goal is that of increasing the...
Disease Twitter Dataset
This Twitter dataset covers two recent outbreaks: Ebola and Zika. About 60 million tweets were collected through query-based access to the Twitter Streaming API, covering...
e-MID interbank transactions
This dataset is an edgelist containing daily interbank transactions as registered in the electronic Market for Interbank Deposits (e-MID), in the period 2010-2014. e-MID is...