GROUNDLIGHT AI/DOCUMENT EXTRACTION
(GENAI ON EDGE DEVICE)
SHANTANU RAUT
PG Student
Department of Computer Science,
G.H. Raisoni University, Amravati, India
Received on: 11 May 2024 Revised on: 18 June 2024 Published on: 29 June 2024
Abstract – The “Document Data Extraction Tool: Integrating GroundLight AI and Tesseract OCR” project addresses the need for efficient data extraction from unstructured documents such as PDFs and images. It combines GroundLight AI for structured data extraction with Tesseract OCR for text extraction from images, with the aim of automating document processing and improving its accuracy. The tool provides a user-friendly interface, supports multiple document formats, and organizes the extracted data systematically, reducing manual effort and errors. Testing validates the tool’s reliability and performance, making it useful for organizations that want to draw insights from their document repositories efficiently. Future work includes integrating natural language processing for advanced analysis, adding machine learning for document classification, and expanding format support, to further improve document management and data utilization across sectors.
Keywords – Edge-device Application, Python, Document Extraction.
Doi Link – https://doi.org/10.69758/GIMRJ2406I8V12P114