Content-based Image Retrieval using Tesseract OCR Engine and Levenshtein Algorithm

dc.contributor.author: Adjetey, C.
dc.contributor.author: Adu-Manu, K.S.
dc.date.accessioned: 2022-01-10T14:48:04Z
dc.date.available: 2022-01-10T14:48:04Z
dc.date.issued: 2021
dc.description: Research Article [en_US]
dc.description.abstract: Image Retrieval Systems (IRSs) are applications that allow one to retrieve images saved at any location on a network. Most IRSs use reverse lookup to find images stored on the network based on image properties such as size, filename, title, color, texture, shape, and description. This paper provides a technique for obtaining the full image document when the user has only some portions of the document being searched for. To demonstrate the reliability of the proposed technique, we designed a system that implements the algorithm. The system implementation combines an Optical Character Recognition (OCR) engine with an improved text matching algorithm: the Tesseract OCR engine and the Levenshtein Algorithm were integrated to perform the image search. The extracted text is compared to the text stored in the database. For example, a query result is returned when a ratio of 0.15 or above, which is treated as significant, is obtained. The results showed 100% successful retrieval of the appropriate file based on the match, even when partial query images were submitted. [en_US]
dc.identifier.uri: http://ugspace.ug.edu.gh/handle/123456789/37516
dc.language.iso: en [en_US]
dc.publisher: IJACSA [en_US]
dc.subject: Image Retrieval Systems [en_US]
dc.subject: image processing [en_US]
dc.subject: Optical Character Recognition (OCR) [en_US]
dc.subject: text matching algorithm [en_US]
dc.subject: Tesseract OCR engine [en_US]
dc.subject: Levenshtein Algorithm [en_US]
dc.title: Content-based Image Retrieval using Tesseract OCR Engine and Levenshtein Algorithm [en_US]
dc.type: Article [en_US]
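
The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the ratio is computed, so the normalization used here (one minus the edit distance divided by the longer string's length) is an assumption, and the names levenshtein_distance, similarity_ratio and search are hypothetical. The OCR step is left out: plain strings stand in for the text that Tesseract would extract from the images. Only the 0.15 threshold comes from the abstract.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance between a and b; insert, delete and substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep a match)
            ))
        previous = current
    return previous[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for no overlap."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


THRESHOLD = 0.15  # value quoted in the abstract


def search(query_text, stored_texts, threshold=THRESHOLD):
    """Return (name, ratio) pairs at or above the threshold, best match first.

    stored_texts is a hypothetical mapping of file name to stored text.
    """
    scored = [(name, similarity_ratio(query_text, text))
              for name, text in stored_texts.items()]
    hits = [(name, ratio) for name, ratio in scored if ratio >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Under this assumed normalization, a query holding only part of a document scores well below 1.0 against the full stored text, which is one plausible reason a threshold as low as 0.15 could still separate matches from non-matches.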

Files

Original bundle

Name: Contentbased-Image-Retrieval-using-Tesseract-OCR-Engine-and-Levenshtein-AlgorithmInternational-Journal-of-Advanced-Computer-Science-and-Applications.pdf
Size: 3.25 MB
Format: Adobe Portable Document Format