Cognitive model for object detection based on speech-to-text conversion.

Jithendra, Pavuluri; Sai, Tummala Vinay; Mannam, Raj Kumar; Manideep, Ramini; Bano, Shahana

doi:10.1109/ICISS49785.2020.9315985

Authors

Pavuluri Jithendra

Tummala Vinay Sai

Raj Kumar Mannam

Ramini Manideep

Dr Shahana Bano s.bano@rgu.ac.uk
Lecturer

Abstract

The goal of this paper is to develop a model that integrates speech recognition and object detection. The model was developed following a survey of the literature and of existing models for object detection and speech recognition, of which several types are available. In addition to the existing models, this paper proposes a new model named "Cognitive Model for Object Detection based on Speech-to-Text Conversion," which integrates both speech recognition and object detection models. First, a speech command is provided as input to the model; the model processes the command and then detects the specified object in a source of images. The detected object is marked with a rectangular box. This approach is implemented with Google Speech Recognition and YOLO object detection models, using the Darknet and OpenCV frameworks.
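The flow described in the abstract (spoken command, text, selection of the named object, rectangular box) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the speech recognizer has already produced a text command and the detector has already produced labelled boxes, so both of those steps are left out. The names `Detection`, `target_from_command`, `select_detections`, the class list, and the 0.5 confidence threshold are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One detector output (hypothetical structure, not from the paper)."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def target_from_command(command: str, class_names: Sequence[str]) -> Optional[str]:
    """Return a known class name mentioned in the transcribed command.

    Longer names are checked first so that a multi-word class such as
    "traffic light" is preferred over any shorter name it contains.
    """
    # Lowercase, replace punctuation with spaces, and pad with spaces so
    # that only whole words (or whole phrases) can match.
    words = re.sub(r"[^a-z0-9 ]+", " ", command.lower()).split()
    text = " " + " ".join(words) + " "
    for name in sorted(class_names, key=len, reverse=True):
        if f" {name.lower()} " in text:
            return name
    return None


def select_detections(
    detections: Sequence[Detection],
    target: Optional[str],
    min_confidence: float = 0.5,  # assumed threshold, not from the paper
) -> List[Detection]:
    """Keep only detections of the requested class above the threshold."""
    if target is None:
        return []
    return [
        d for d in detections
        if d.label == target and d.confidence >= min_confidence
    ]


if __name__ == "__main__":
    # Made-up values standing in for recognizer and detector output.
    class_names = ["person", "dog", "traffic light"]
    detections = [
        Detection("dog", 0.91, (40, 60, 120, 80)),
        Detection("person", 0.88, (200, 30, 90, 220)),
        Detection("dog", 0.30, (5, 5, 20, 20)),
    ]
    target = target_from_command("Please find the dog.", class_names)
    for det in select_detections(detections, target):
        print(det.label, det.box)
```

In a full pipeline the boxes returned by `select_detections` would be the ones drawn on the image; matching by whole words keeps a class name from firing on a longer word that merely contains it.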

Citation

PAVULURI, J., SAI, T.V., MANNAM, R.K., MANIDEEP, R. and BANO, S. 2020. Cognitive model for object detection based on speech-to-text conversion. In Proceedings of the 3rd International conference on intelligent sustainable systems (ICISS 2020), 3-5 December 2020, Thoothukudi, India. Piscataway: IEEE [online], pages 843-847. Available from: https://doi.org/10.1109/ICISS49785.2020.9315985

Presentation Conference Type	Conference Paper (published)
Conference Name	3rd International conference on intelligent sustainable systems (ICISS 2020)
Start Date	Dec 3, 2020
End Date	Dec 5, 2020
Acceptance Date	Oct 22, 2020
Online Publication Date	Jan 18, 2021
Publication Date	Dec 31, 2020
Deposit Date	Sep 20, 2023
Publicly Available Date	Sep 20, 2023
Publisher	Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed	Peer Reviewed
Pages	843-847
ISBN	9781728170909
DOI	https://doi.org/10.1109/ICISS49785.2020.9315985
Keywords	Speech recognition; Object detection; Machine learning
Public URL	https://rgu-repository.worktribe.com/output/2064056