This repository contains the Ethereum Transaction Data Generator (ETDG) and the Ethereum Transaction Fraud Detection (ETFD) dataset.
Existing Ethereum transaction fraud detection datasets often suffer from single-cardinality and high-cardinality features, missing values, categorical data encoding problems, and stale data. ETDG is a tool for generating high-quality transaction datasets suitable for classification tasks. It extracts features using graph traversal and a genetic algorithm with a novel fitness function. This approach mitigates the cardinality, data encoding, and staleness problems found in Ethereum transaction data.
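
To give a rough idea of how a genetic algorithm can drive feature selection, here is a minimal, generic sketch in Python. It is not ETDG's implementation: the function names, the `RELEVANCE` scores, and especially `toy_fitness` are hypothetical placeholders standing in for ETDG's own fitness function, and the graph traversal step is not shown.

```python
import random

def evolve_feature_mask(n_features, fitness, generations=30, pop_size=20,
                        mutation_rate=0.1, seed=0):
    """Evolve a 0/1 mask over features; 1 means the feature is kept."""
    rng = random.Random(seed)  # local RNG so runs are reproducible
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # keep the fitter half as parents
        children = [ranked[0][:]]              # elitism: carry over the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)   # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    return max(population, key=fitness)

# Hypothetical per-feature usefulness scores, invented for this example.
RELEVANCE = [0.9, 0.1, 0.7, 0.05, 0.6, 0.2]

def toy_fitness(mask):
    """Placeholder fitness: reward useful features, penalise large subsets."""
    gain = sum(score for score, bit in zip(RELEVANCE, mask) if bit)
    return gain - 0.3 * sum(mask)

best = evolve_feature_mask(len(RELEVANCE), toy_fitness)
print(best, toy_fitness(best))
```

In a real pipeline the fitness would be computed from the candidate features themselves, for example from how well a classifier performs on them; the placeholder above only keeps the sketch self-contained.
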
ETFD is a comprehensive, high-quality dataset intended to support research and development in fraudulent transaction detection on the Ethereum blockchain. Generated using ETDG, the ETFD dataset addresses common problems in public Ethereum fraud detection datasets, such as single cardinality, high cardinality, missing values, and data encoding issues, thereby reducing the risk of model overfitting and improving model performance.
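
The issues named above can be checked for in any tabular transaction dataset. The following is a minimal sketch, not part of ETDG: it assumes that "single cardinality" means a column with at most one distinct value and that "high cardinality" means a text column whose ratio of distinct values to rows exceeds a threshold. The function `audit_columns`, the threshold, and the column names in the sample rows are all hypothetical.

```python
def audit_columns(rows, high_cardinality_ratio=0.5):
    """Return {column: [issue, ...]} for a list of dict rows."""
    columns = {key for row in rows for key in row}
    report = {}
    for col in sorted(columns):
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        issues = []
        if len(present) < len(values):
            issues.append("missing values")
        distinct = set(present)
        if len(distinct) <= 1:
            # Assumption: a constant (or empty) column carries no signal.
            issues.append("single cardinality")
        elif (all(isinstance(v, str) for v in present)
              and len(distinct) / len(present) > high_cardinality_ratio):
            # Assumption: only text (categorical) columns are checked here.
            issues.append("high cardinality")
        if issues:
            report[col] = issues
    return report

# Invented sample rows; the column names are illustrative only.
rows = [
    {"from_address": "0xa1", "token_type": "ERC20", "value_eth": 1.5, "gas_price": 21},
    {"from_address": "0xb2", "token_type": "ERC20", "value_eth": None, "gas_price": 30},
    {"from_address": "0xc3", "token_type": "ERC20", "value_eth": 0.2, "gas_price": 25},
    {"from_address": "0xd4", "token_type": "ERC20", "value_eth": 3.0, "gas_price": 21},
]

report = audit_columns(rows)
for column, issues in report.items():
    print(column, issues)
```

Columns with no detected issue, such as `gas_price` in the sample, are left out of the report.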