-
Synthetic Dataset for Causal Analysis
The dataset is a synthetic version of the well-known German Credit dataset (https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data). It includes variables such as...
CSV
-
SWH Filenames
A 69 GB dataset with ~2.3 billion strings representing deduplicated names of source code files collected by Software Heritage, the great library of source code...
ZIP
-
DeLag: Microservices execution traces
The dataset contains execution traces collected from the well-known open-source microservices system Train-ticket. The traces are generated over a variety of scenarios,...
parquet
-
DBLP Network
The DBLP computer science bibliography provides a comprehensive list of research papers in computer science. This dataset is a co-authorship network constructed upon the DBLP...
HTML
-
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump, with the following parameters: -cbow 0...