Stroke and sepsi

The stroke dataset (DOI:10.17632/x8ygrw87jw.1, DOI:10.1016/j.artmed.2019.101723) was pre-processed by removing attributes with more than 30% missing values and by converting categorical descriptive variables into numerical ones through one-hot encoding. After these steps, the dataset consists of 1 id (the first column), 13 descriptive variables, and the target variable (the last column). Its instances represent patients who did not have a recurrent stroke (the majority class, last_column=0) and patients who did (the minority class, last_column=1).

The sepsis dataset (DOI:10.1038/s41598-020-73558-3) consists of 1 id (the first column), 3 descriptive variables, and the target variable (the last column). The task is to predict whether the patient survived sepsis. The dataset contains patients who died of sepsis (last_column=1) and patients who survived (last_column=0).

The target variables of the two datasets were aligned so that the label indicating a recurrent stroke in the cerebral stroke dataset corresponds to the label indicating patients who did not survive in the sepsis dataset. Finally, a reduced version was built by: i) imposing a balanced class distribution (obtained by downsampling the majority class); ii) further reducing the source domain dataset through a 10% stratified random sample, so that competitor methods can be run more easily for the comparative analysis.
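The steps described above can be sketched with pandas. This is an illustrative sketch only, not the authors' code: the function names, the `max_missing` and `frac` parameters, and the random seed are assumptions. The only thing taken from the description is the table layout (id in the first column, target in the last column).

```python
import pandas as pd


def preprocess(df: pd.DataFrame, max_missing: float = 0.30) -> pd.DataFrame:
    """Drop sparse attributes, then one-hot encode categorical ones.

    Assumes the layout from the description: id first, target last.
    """
    id_col, target_col = df.columns[0], df.columns[-1]
    features = df.iloc[:, 1:-1]
    # Keep only attributes whose fraction of missing values is at most the threshold.
    features = features.loc[:, features.isna().mean() <= max_missing]
    # One-hot encode object/category attributes; numeric attributes pass through.
    features = pd.get_dummies(features, dtype=int)
    return pd.concat([df[[id_col]], features, df[[target_col]]], axis=1)


def balance_by_downsampling(df: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Downsample every class to the size of the smallest class."""
    target_col = df.columns[-1]
    n_minority = df[target_col].value_counts().min()
    return df.groupby(target_col).sample(n=n_minority, random_state=seed)


def stratified_sample(df: pd.DataFrame, frac: float = 0.10, seed: int = 0) -> pd.DataFrame:
    """Draw the same fraction of rows from each class (stratified sampling)."""
    target_col = df.columns[-1]
    return df.groupby(target_col).sample(frac=frac, random_state=seed)
```

A reduced table would then be obtained by chaining the calls, for example `stratified_sample(balance_by_downsampling(preprocess(df)))`. The seed is fixed only so that the sketch is reproducible.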

Data and Resources
  • Stroke and sepsi

    Stroke and sepsi dataset for binary classification and transfer learning

Personal Data Attributes

Description: Personal Data related Information

Field Value
ChildrenData No
Personal Data No
Personal data was manifestly made public by the data subject No
Additional Info
Field Value
Accessibility Both
Accessibility Mode Download
Associate Project FAIR
Availability On-Line
Basic rights Download
Basic rights Copying
Basic rights Distribution
Basic rights Modification
Basic rights Communication
Basic rights Making available to the public
Creation Date 2023-10-01
Creator Mignone, Paolo, orcid.org/0000-0002-8641-7880
Dataset Citation DOI: 10.1016/j.bdr.2024.100456
Dataset Re-Use Safeguards none
Field/Scope of use Any use
Group Health Studies
Group Others
License term 2024-07-04 /3024-07-04
Manifestation Type Virtual
Processing Degree Primary
Retention Period 3024-07-04 /4024-07-04
SoBigData Node SoBigData EU
SoBigData Node SoBigData IT
Sublicense rights No
Territory of use World Wide
Thematic Cluster Other
system:type Dataset
Management Info
Field Value
Author Mignone Paolo
Maintainer Mignone Paolo
Version 1
Last Updated 23 November 2024, 16:03 (CET)
Created 23 November 2024, 16:03 (CET)