501 items found

Organisations: SoBigData Catalogue

  • Dataset

    Emergency Tweets 2014 Genoa flood

    This dataset contains Italian tweets collected during and in the aftermath of the floods that occurred near the city of Genoa between 9 and 11 October 2014...
    • ZIP: 'FLO-GEN.zip' (login required)
  • Dataset

    Testing NBA dataset

    Just for platform reviewing
    • CSV: 'nbaStats.csv' (login required)
  • Dataset

    Global Peace Index data

    A dataset of the Global Peace Index (GPI), which ranks 163 independent states and territories according to their level of peacefulness. The GPI covers 99.7 per cent of the...
  • Dataset

    Sheffield NERD Tweet Corpus

    The dataset contains 794 tweets annotated with named entities disambiguated against DBpedia, and split into equally sized training and test portions. 400 tweets from 2013 come...
    • FINF: 'Sheffield NERD Tweet Corpus' (login required)
  • Method

    Modelling Scientific Migration

    This method is an adaptation of the general migration models to understand scientific migration. Under development.
  • Dataset

    Italian Tourism Dataset

    A set of users' comments crawled and scraped from two main tourism websites (Booking.com and Tripadvisor.com) related to the main tourist points of interest in Italy and, in...
    • HTML: 'tourism-dataset' (login required)
  • Dataset

    CDR data - Tuscany

    The dataset contains mobile phone records collected in Tuscany between September 2015 and August 2016. It contains Call Data Records (CDRs) of phone users, and the corresponding...
  • Dataset

    DE webarchive

    The dataset consists of all the content from the .de top level domain as crawled by the Internet Archive.
    • HTML: 'Internet Archive Wayback ...' (login required)
  • Dataset

    Aalto-Twitter

    The dataset consists of about 418 million tweets from June 25, 2015 to September 19, 2015. The tweets are about trending hashtags gathered through the public Twitter API.
  • Dataset

    Yeast

    The yeast dataset is a collection of yeast microarray expressions and phylogenetic profiles which can be used to learn the yeast gene functional categories. One row of this...
    • ARFF: 'Yeast Dataset' (login required)
  • Dataset

    UK General Election Vote Intent

    A list of Twitter users for whom party political allegiance/vote intent has been established.
  • Dataset

    Emergency Tweets 2013 Sardinia flood

    This dataset is related to the floods that occurred in the Sardinia regional district between 17 and 19 November 2013 (https://en.wikipedia.org/wiki/2013_Sardinia_floods), as...
    • ZIP: 'FLO-SAR.zip' (login required)
  • Method

    A New Topological Approach for the Prediction of Protein-Protein Interactions

    We propose "Maximum-Proteins-Similarity (Topological)": MPS(T). MPS(T) is a topological three-length path method that scores the potential interaction between proteins by...
  • Dataset

    GPS Tracks - Calabria, Italy 2012

    The dataset consists of GPS tracks of private vehicles collected in the Calabria region (Italy). It comprises about 28 million trajectories from about 115,000 users. Data are in the...
  • Dataset

    Soccer Team Performance

    The dataset contains the performance features (passes, shots, goals, tackles, etc.) of soccer teams during the games of six major European leagues in three seasons. The dataset...
  • Dataset

    Formal network of Estonian companies and board members

    This dataset consists of managed and continuously updated data about Estonian companies and board members since 1994. Technical documentation of data structures and the REST API...
    • ZIP: 'Dataset' (login required)
  • Dataset

    Congress Network

    Network built on top of US Congress voting data made available on the website GovTrack.us. Nodes identify congressmen and edges represent the semantic "have supported the...
    • HTML: 'Original data' (login required)
  • Method

    Prediction of next career moves from scientific profiles

    This is a two-stage predictive model for the mobility of scientists. First, data mining is used to predict which researcher will move in the next year on the basis of their...