228 items found

Organisations: SoBigData Catalogue

  • Dataset

    Emergency Tweets 2013 Milan blackout

    This dataset is related to a power outage (i.e., a blackout) that occurred in the city of Milan, in northern Italy, on the night between 14 and 15 May 2013. Despite not...
    • CSV: 'PWO-MIL_tweets.csv' (login required)
  • Dataset

    Call Data Record Tuscan cities 2014

    The dataset contains mobile phone records collected in the provinces of Pisa, Lucca, Livorno and Firenze in March 2014. It contains about 50 million Call Data Records (CDRs) of...
  • Dataset

    Twitter Dataset 2013-2014

    The dataset was collected by the Archive team through the Twitter Streaming API, which provides free access to 1% of public tweets. The covered time period is from January 1st...
  • Dataset

    City-to-city migration

    Census data recording the migration of people between metropolitan areas in the US
  • Dataset

    Wyscout soccer-logs dataset

    A dataset of soccer-logs for all the main soccer leagues in the world, from season 2014/2015 to the current one.
  • Dataset

    Dataset Adult

    The Adult dataset includes 48,842 instances with demographic information such as age, workclass, marital-status, race, capital-loss, capital-gain, etc. The income attribute...
    • CSV: 'Adult' (login required)
  • Dataset

    Emergency Tweets 2009 L'Aquila earthquake

    This dataset comprises 1,100 Italian tweets shared in the aftermath of the 2009 L’Aquila earthquake (https://en.wikipedia.org/wiki/2009_L%27Aquila_earthquake). The earthquake...
    • ZIP: 'EAQ-LAQ.zip' (login required)
  • Dataset

    Multilevel Monitoring of Activity and Sleep in Healthy people

    The Multilevel Monitoring of Activity and Sleep in Healthy people (MMASH) dataset provides 24 hours of continuous beat-to-beat heart data, triaxial accelerometer data, sleep...
  • Dataset

    GPS Origin Destination Matrix in Tuscany

    This dataset is the origin-destination matrix among the municipalities of Tuscany, extracted from GPS tracks of private vehicles collected from 2014-02-10 to...
    • CSV: 'GPS Origin Destination Matrix' (login required)
  • Dataset

    Soccer Events

    This dataset contains data on one full season of soccer games. For each player there are the locations (positions on the pitch) visited and all the events they generated...
    • ZIP: 'Soccer event data' (login required)
  • Dataset

    Social Network dataset - LiveJournal

    LiveJournal is a free online blogging community where users declare friendship with each other. LiveJournal also allows users to form groups that other members can then join. We...
    • HTML: 'LiveJournal social network ...' (login required)
  • Dataset

    Call Data Record District of Pisa 2013 October

    The dataset contains mobile phone records collected in the provinces of Pisa, Lucca, Livorno and Firenze in October 2013. It contains about 60 million Call Data Records (CDRs),...
  • Dataset

    ClueWeb09

    The ClueWeb09 dataset consists of about 1 billion web pages in ten languages that were collected in January and February 2009. It was created to support research on...
  • Dataset

    Official administrative information of Tuscany

    The data contains the spatial partitioning of Tuscany and some statistical information published by the Italian Statistical Bureau.
    • LOD: 'Linked Open Data' (login required)
  • Method

    A hybrid approach for PPI

    We propose a new framework that can exploit topological and biological information to predict protein-protein interactions. The algorithm relies on the underlying hypothesis...
  • Dataset

    German Credit

    In the German Credit dataset, each of the 1,000 persons is classified as a good or bad creditor according to attributes like age, sex, checking_account, credit_amount,...
    • CSV: 'German Credit' (login required)
  • Dataset

    Twitter Dumps

    The dataset consists of 10% of the daily stream of tweets produced on Twitter, filtered into 3 subsets: English, Italian, geo-referenced. The tweets are a random sample of...
  • Dataset

    Open data from NervousNet

    This dataset contains anonymized proximity information sent by 154 mobile phones (both Android and iPhone) via phone apps. This information is sent by Bluetooth beacons every...
    • ZIP: 'open data from NervousNet' (login required)
  • Dataset

    Car sharing dataset

    The dataset comprises pickup and drop-off times and locations of vehicles in 10 European cities for one of the major free-floating car sharing operators. For nine of these...
  • Dataset

    Twitter social bots

    Spambots are automated accounts (i.e., accounts driven by a bot) that repeatedly advertise unsolicited and often harmful content (e.g., malware, URLs to phishing Web sites,...