Cognitive model for object detection based on speech-to-text conversion.
Authors
Jithendra, Pavuluri; Sai, Tummala Vinay; Mannam, Raj Kumar; Manideep, Ramini; Bano, Shahana
Abstract
This paper develops a model that integrates speech recognition and object detection. The model was developed following a literature survey of existing object detection and speech recognition models, of which several types are available. In addition to these existing models, the paper proposes a new model named "Cognitive Model for Object Detection based on Speech-to-Text Conversion", which integrates both speech recognition and object detection. A speech command is first provided as input to the model; the model processes the command and then detects the specified object in a source of images. The detected object is marked with a rectangular box. The approach is implemented using Google Speech Recognition and the YOLO object detection model, with the Darknet and OpenCV frameworks.
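The flow described in the abstract (spoken command, text, target object, matching boxes) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `Detection`, `CLASS_NAMES`, `target_from_command`, and `select_detections` are hypothetical, the class list is a small illustrative subset, and the detections are assumed to come from a YOLO model run elsewhere (for example through OpenCV's `dnn` module). The `transcribe_command` helper assumes the third-party SpeechRecognition package and a working microphone.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Illustrative subset of class labels (assumption); a real YOLO setup
# would typically load the full label list from a names file.
CLASS_NAMES = ["person", "bicycle", "car", "dog", "cat", "bottle"]


@dataclass
class Detection:
    """One detected object (hypothetical container for detector output)."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def transcribe_command() -> str:
    """Record one spoken command and return it as text.

    Assumes the SpeechRecognition package is installed and a microphone
    is available; it is imported lazily so the rest runs without it.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)


def target_from_command(command: str,
                        class_names: Sequence[str] = CLASS_NAMES) -> Optional[str]:
    """Return the first single-word class label mentioned in the command."""
    known = set(class_names)
    for word in re.findall(r"[a-z]+", command.lower()):
        if word in known:
            return word
    return None


def select_detections(detections: Sequence[Detection], target: str,
                      min_confidence: float = 0.5) -> List[Detection]:
    """Keep only detections of the requested class above a confidence floor."""
    return [d for d in detections
            if d.label == target and d.confidence >= min_confidence]


if __name__ == "__main__":
    # Hard-coded detections stand in for real detector output.
    detections = [
        Detection("dog", 0.91, (10, 20, 100, 80)),
        Detection("car", 0.84, (0, 0, 50, 50)),
        Detection("dog", 0.30, (5, 5, 10, 10)),
    ]
    target = target_from_command("Please detect the dog.")
    for d in select_detections(detections, target):
        print(d.label, d.box)
```

Each selected `box` could then be drawn on the image as a rectangle, which matches the rectangular box mentioned in the abstract. Matching on single words keeps the sketch short; multi-word labels would need a different lookup.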
Citation
PAVULURI, J., SAI, T.V., MANNAM, R.K., MANIDEEP, R. and BANO, S. 2020. Cognitive model for object detection based on speech-to-text conversion. In Proceedings of the 3rd International conference on intelligent sustainable systems (ICISS 2020), 3-5 December 2020, Thoothukudi, India. Piscataway: IEEE [online], pages 843-847. Available from: https://doi.org/10.1109/ICISS49785.2020.9315985
Field | Value |
---|---|
Presentation Conference Type | Conference Paper (published) |
Conference Name | 3rd International conference on intelligent sustainable systems (ICISS 2020) |
Start Date | Dec 3, 2020 |
End Date | Dec 5, 2020 |
Acceptance Date | Oct 22, 2020 |
Online Publication Date | Jan 18, 2021 |
Publication Date | Dec 31, 2020 |
Deposit Date | Sep 20, 2023 |
Publicly Available Date | Sep 20, 2023 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
Pages | 843-847 |
ISBN | 9781728170909 |
DOI | https://doi.org/10.1109/ICISS49785.2020.9315985 |
Keywords | Speech recognition; Object detection; Machine learning |
Public URL | https://rgu-repository.worktribe.com/output/2064056 |
Files
PAVULURI 2020 Cognitive model for object (AAM)
(3 Mb)
PDF
Copyright Statement
© IEEE