Enhancing Classification Accuracy Through Feature Extraction: A Comparative Study of Discretization and Clustering Approaches on Sensor-Based Datasets

Date

2023

Authors

Esme, Engin

Publisher

Springer London Ltd

Green Open Access

No

Publicly Funded

No
Impulse
Average
Influence
Average
Popularity
Average

Abstract

Accuracy in a classification problem depends directly on how well the features represent the differences between classes. In sensor-based datasets, the measurements taken from the sensors form the feature vectors. Measuring a given physical signal with different sensors allows it to be expressed with several feature vectors, which is why sensor fusion is often preferred in data acquisition. However, each sensor added to the system brings problems such as more complex sensing and supply circuitry, extra energy consumption, more complex signal sampling, and additional time consumption. Conversely, where sensor fusion cannot be applied, data from a single sensor may not represent the classes adequately. To avoid these problems, discretization and clustering can be used to derive more features from fewer sensors. The aim is to improve classifier accuracy by deriving new feature vectors that represent the sensor data. This research examines the contribution of clustering and discretization, used as feature extraction methods, to classification accuracy. Three widely used machine learning techniques are investigated on the Perfume, Wine, Seeds, and Gas datasets from the UCI repository. This empirical study indicates that classifier accuracy improves by up to 20% on datasets obtained from some sensors when both discretization and clustering are used as feature extraction methods.
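
The idea of deriving extra features from a single sensor can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which discretization or clustering algorithms, bin counts, or cluster counts were used, so equal-width binning and a small one-dimensional k-means stand in for them, and the function names and sample readings are hypothetical.

```python
# Illustrative sketch only. Equal-width binning and a tiny 1-D k-means are
# assumptions, not the methods or settings reported in the paper.

def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def kmeans_1d(values, k, n_iter=20):
    """Lloyd's algorithm on scalars; returns (labels, centers)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers evenly spaced across the value range.
    centers = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: nearest center for every value.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: move each non-empty center to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels, centers


def augment(readings, n_bins=3, k=2):
    """Return rows of [raw reading, bin index, cluster label]."""
    bins = equal_width_bins(readings, n_bins)
    labels, _ = kmeans_1d(readings, k)
    return [[x, b, c] for x, b, c in zip(readings, bins, labels)]


# Hypothetical readings from one sensor.
readings = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
features = augment(readings)
```

Each row of `features` holds the raw reading plus two derived columns, and the widened rows would then be passed to a classifier in place of the single raw value. In a real experiment the bin edges and cluster centers should be fitted on the training split only and then applied to the test split.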

Description

Article; Early Access

Keywords

Hybrid classifier, Machine learning, Discretization, Clustering

WoS Q

Q2

Scopus Q

Q2
OpenCitations Citation Count
1

Source

Knowledge and Information Systems

Volume

66

Start Page

339

End Page

356

SCOPUS™ Citations

1

checked on Feb 03, 2026

Web of Science™ Citations

1

checked on Feb 03, 2026

OpenAlex FWCI
0.15887805
