Price Rank Prediction of a Company by Utilizing Data Mining Methods on Financial Disclosures

Date

2023

Publisher

IEICE-INST ELECTRONICS INFORMATION COMMUNICATION ENGINEERS

Open Access Color

GOLD

Green Open Access

No

Publicly Funded

No

Abstract

The use of reports in practice has grown significantly in recent decades as data has become digitized. However, traditional statistical methods are no longer adequate given the rapid growth and complexity of raw data, so it is crucial to clean and analyze financial data using modern machine learning methods. In this study, the quarterly reports (i.e., 10-Q filings) of publicly traded companies in the United States were analyzed using data mining methods. The study used 8905 quarterly reports of companies from 2019 to 2022. The proposed approach consists of two phases that combine three machine learning methods: the first two methods generate a dataset from the 10-Q filings by extracting new features, and the third performs classification. The Doc2Vec method in the Gensim framework was used to generate vectors from the textual tags in the 10-Q filings. The generated vectors were clustered using the K-means algorithm to group tags by their semantics. In this way, 94000 tags representing different financial items were reduced to 20000 clusters, making the analysis more efficient and manageable. The dataset was created from the values corresponding to the tags in each cluster. In addition, the PriceRank metric was added to the dataset as a class label indicating the price strength of each company for the next financial quarter. The aim is thus to determine the effect of a company's quarterly reports on its market price in the next period. Finally, a Convolutional Neural Network model was used for classification. To evaluate the results, all stages of the proposed hybrid method were compared with other machine learning techniques. This novel approach could assist investors in examining companies collectively and inferring new, significant insights.
The proposed method was compared with alternative approaches to both dataset creation (feature extraction) and classification, and was evaluated with several metrics. It outperformed the other machine learning methods at predicting future price strength from past reports, reaching an accuracy of 84% on the created 10-Q filings dataset.
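
The tag-reduction step described in the abstract (embed tags, cluster them with K-means, then build features from the values of the tags in each cluster) can be illustrated with a minimal sketch. This is not the authors' implementation. The two-dimensional vectors below are hand-made stand-ins for Doc2Vec embeddings, the tag names and filing values are illustrative, the helper names `kmeans` and `cluster_features` are hypothetical, and summing values within a cluster is an assumption, since the abstract does not state how values are combined.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on a list of equal-length float vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_features(filing, tag_to_cluster, k):
    """Build one feature per cluster by summing a filing's tag values.

    Summation is an assumption made for this sketch.
    """
    features = [0.0] * k
    for tag, value in filing.items():
        features[tag_to_cluster[tag]] += value
    return features


# Toy stand-ins for Doc2Vec tag embeddings (assumed values).
tag_vectors = {
    "Revenues": [0.90, 0.10],
    "SalesRevenueNet": [0.85, 0.15],
    "Liabilities": [0.10, 0.90],
    "LiabilitiesCurrent": [0.15, 0.85],
}

k = 2
tags = list(tag_vectors)
labels = kmeans([tag_vectors[t] for t in tags], k)
tag_to_cluster = dict(zip(tags, labels))

# Two filings that report similar items under different tags.
filing_a = {"Revenues": 100.0, "Liabilities": 40.0}
filing_b = {"SalesRevenueNet": 120.0, "LiabilitiesCurrent": 30.0}

features_a = cluster_features(filing_a, tag_to_cluster, k)
features_b = cluster_features(filing_b, tag_to_cluster, k)
```

In this toy setup the two revenue-like tags land in one cluster and the two liability-like tags in the other, so both filings populate the same two feature columns even though they use different tags. In the study's pipeline, a class label (PriceRank) would then be attached to each feature vector before classification.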

Keywords

10-Q Filings, PriceRank, XBRL, Doc2Vec, K-means, CNN, Convolutional Neural Network, Language, Filings

WoS Q

Q4

Scopus Q

Q3

OpenCitations Citation Count

2

Source

IEICE Transactions on Information and Systems

Volume

E106D

Issue

9

Start Page

1461

End Page

1471
PlumX Metrics
Citations

CrossRef : 2

Scopus : 2

Captures

Mendeley Readers : 5

OpenAlex FWCI
0.64697809

Sustainable Development Goals

3

GOOD HEALTH AND WELL-BEING

6

CLEAN WATER AND SANITATION

7

AFFORDABLE AND CLEAN ENERGY

8

DECENT WORK AND ECONOMIC GROWTH

9

INDUSTRY, INNOVATION AND INFRASTRUCTURE

11

SUSTAINABLE CITIES AND COMMUNITIES

12

RESPONSIBLE CONSUMPTION AND PRODUCTION

13

CLIMATE ACTION

14

LIFE BELOW WATER