Human Action Recognition Using Attention Based LSTM Network with Dilated CNN Features

Date

2021

Publisher

Elsevier

Green Open Access

No

Publicly Funded

No

Impulse

Top 0.1%

Influence

Top 1%

Popularity

Top 0.1%

Abstract

Human action recognition in videos is an active area of research in computer vision and pattern recognition. Artificial intelligence (AI) based systems are now needed for human-behavior assessment and security purposes. Existing action recognition techniques mainly rely on pre-trained weights of different AI architectures for the visual representation of video frames in the training stage, which affects how well discrepancies between features, such as the distinction between visual and temporal cues, can be determined. To address this issue, we propose a bi-directional long short-term memory (BiLSTM) based attention mechanism with a dilated convolutional neural network (DCNN) that selectively focuses on effective features in the input frame to recognize different human actions in videos. In this network, the DCNN layers extract salient discriminative features, and residual blocks enhance these features so that they retain more information than those of a shallow layer. We then feed these features into a BiLSTM to learn long-term dependencies, followed by an attention mechanism that boosts performance and extracts additional high-level, selective action-related patterns and cues. We further combine center loss with Softmax to improve the loss function, which yields higher performance in video-based action classification. The proposed system is evaluated on three benchmarks, the UCF11, UCF Sports, and J-HMDB datasets, on which it achieves recognition rates of 98.3%, 99.1%, and 80.2%, respectively, a 1%-3% improvement over state-of-the-art (SOTA) methods. © 2021 Elsevier B.V. All rights reserved.
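
Three of the building blocks named in the abstract (dilated convolution, attention over time steps, and a Softmax loss combined with center loss) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 1-D convolution stands in for the 2-D DCNN layers, the dot-product attention with a hypothetical `query` vector is only one common way to weight BiLSTM outputs, the center-loss term follows the commonly used form (cross-entropy plus lam/2 times the squared distance to the class center), and the weight `lam` is an assumed hyperparameter. The BiLSTM itself is omitted; `hidden_states` is assumed to hold its per-frame outputs.

```python
import math

def dilated_conv1d(signal, kernel, dilation):
    """Valid 1-D dilated cross-correlation: kernel taps sit `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation  # receptive field minus one
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each time step by softmax(h . query) and return the weighted sum.

    `query` is a hypothetical learned vector; here it is passed in by hand.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights

def softmax_center_loss(logits, feature, label, centers, lam=0.5):
    """Cross-entropy on softmax(logits) plus (lam / 2) * ||feature - centers[label]||^2."""
    cross_entropy = -math.log(softmax(logits)[label])
    dist_sq = sum((f - c) ** 2 for f, c in zip(feature, centers[label]))
    return cross_entropy + 0.5 * lam * dist_sq

# Usage with toy values (all numbers are illustrative, not from the paper).
features = dilated_conv1d([1.0, 2.0, 3.0, 4.0, 5.0], kernel=[1.0, 1.0], dilation=2)
context, weights = attention_pool(
    hidden_states=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], query=[1.0, 0.0]
)
loss = softmax_center_loss(
    logits=[0.0, 0.0], feature=[1.0, 1.0], label=1,
    centers=[[1.0, 1.0], [0.0, 0.0]],
)
print(features, weights, loss)
```

With a dilation of 2, each output sample combines inputs two positions apart, which widens the receptive field without adding kernel weights. The center-loss term grows as a sample's feature vector moves away from its class center, so minimizing the combined loss encourages compact per-class feature clusters in addition to correct classification.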

Keywords

Artificial Intelligence, Action Recognition, Attention Mechanism, Big Data, Dilated Convolutional Neural Network, Deep Bi-Directional LSTM, Multimedia Data Security, Framework, Security, Internet, Machine, Fusion, System, Things

Fields of Science

0211 other engineering and technologies; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology

WoS Q

Q1

Scopus Q

Q1

OpenCitations Citation Count

174

Source

Future Generation Computer Systems: The International Journal of eScience

Volume

125

Start Page

820

End Page

830

PlumX Metrics

Citations

CrossRef : 213

Scopus : 241

Captures

Mendeley Readers : 167

SCOPUS™ Citations

237

checked on Feb 03, 2026

Web of Science™ Citations

173

checked on Feb 03, 2026

OpenAlex FWCI
18.80838413
