deadduck83’s Profile | Apple Developer Forums

!! Assistance needed: Create ML - Scan file with multiple tables on it

Objective: I am developing an application that uses machine learning (Core ML) to work with photographs of documents, specifically documents that contain tables.

Step 1: Capturing the Image
The application starts by letting users take photos of documents. The key here is not just any part of the document, but specifically the sections where tables are present.

Step 2: Image Analysis through Machine Learning
Once the image is captured, a machine learning model analyzes it. The plan is to train the model with Apple's Create ML tool and run it in the app from Swift through Core ML. The model's task is two-fold:
Identifying the Table: Distinguish the table from the other document content, recognizing and isolating the table structure within the photograph.
Ignoring Irrelevant Information: At the same time, disregard all non-table content, so the application's resources go to the table data.

Step 3: Data Extraction and Training
Once the table is identified, the real work begins. The model is trained on specific datasets to recognize row and column data, so that the application can 'read' the table much like a human would, by identifying how the information is organized into rows and columns.

Step 4: Information Storage
After the analysis, the application extracts the data and stores it in a structured format. Each identifiable piece of information from the rows and columns is organized into a Dictionary or an Object. This structure is meant for immediate use and also for later data operations within the app.

Conclusion: Through these sequential steps, the application goes from merely capturing an image to recognizing, deciphering, and storing table data from a physical document. The whole process depends on integrating machine learning into the app, with the goal of efficient and accurate data handling.

Realistically, I have not found any good examples out there, so I am attempting to create my own ML model (with no experience 😅). Any guidance or help would be very much appreciated.
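
To make Steps 3 and 4 more concrete, here is a minimal sketch in plain Swift of how recognized text could be grouped into rows and stored as dictionaries. It is only an illustration under assumptions: `TextFragment`, `groupIntoRows`, `tableRecords`, and the `rowTolerance` parameter are hypothetical names, and the sketch assumes some earlier step (for example a text recognizer such as Vision's `VNRecognizeTextRequest`) has already produced each piece of text together with a position in top-left image coordinates. It also assumes the first row of the table holds the column headers.

```swift
import Foundation

// Hypothetical input type: one recognized piece of text plus the centre of
// its bounding box, in image coordinates with the origin at the top-left.
struct TextFragment {
    let text: String
    let x: Double
    let y: Double
}

// Groups fragments into rows: a fragment joins the current row when its y
// value lies within `rowTolerance` of that row's first fragment.
// Each row is then ordered left to right and reduced to its text.
func groupIntoRows(_ fragments: [TextFragment], rowTolerance: Double) -> [[String]] {
    let sorted = fragments.sorted { $0.y < $1.y }
    var rows: [[TextFragment]] = []
    for fragment in sorted {
        if let first = rows.last?.first, abs(fragment.y - first.y) <= rowTolerance {
            rows[rows.count - 1].append(fragment)
        } else {
            rows.append([fragment])
        }
    }
    return rows.map { row in row.sorted { $0.x < $1.x }.map { $0.text } }
}

// Treats the first row as column headers and turns every later row into a
// dictionary keyed by header. Cells beyond the header count are dropped.
func tableRecords(from rows: [[String]]) -> [[String: String]] {
    guard let headers = rows.first else { return [] }
    return rows.dropFirst().map { row in
        var record: [String: String] = [:]
        for (header, cell) in zip(headers, row) {
            record[header] = cell
        }
        return record
    }
}

// Usage with made-up fragments from a two-column table.
let fragments = [
    TextFragment(text: "Qty", x: 120, y: 11),
    TextFragment(text: "Item", x: 20, y: 10),
    TextFragment(text: "Apples", x: 22, y: 40),
    TextFragment(text: "3", x: 121, y: 42),
]
let records = tableRecords(from: groupIntoRows(fragments, rowTolerance: 5))
// records == [["Item": "Apples", "Qty": "3"]]
```

Grouping by a fixed vertical tolerance is the simplest possible choice and would likely break on skewed photos or on cells that wrap over several lines, which is part of why a trained model for the table structure still seems necessary.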

Posted by deadduck83.
