The Scientific Document Management System (SDMS) allows different types of documents to be uploaded into the system from different sources. SDMS recognizes the file type based on predefined and learned recognition templates. The relevant data is then extracted from the document based on data extraction templates. The extracted data can include images, tables, and keywords, which are then exported to STARLIMS or third-party systems.
The following steps are used by the system to recognize and extract data from a "raw" document:
1. | Provide definitions - Prior to uploading documents for processing, you should: |
• | Classify the document's file type. The file type is used to associate the other elements used for processing the file, such as the UXML structure (for export of extracted data), import script (to import extracted data to STARLIMS), File Recognition Pattern (FRP) and Data Recognition Pattern. It also allows the system to automatically associate the file with a workflow. |
• | Define unified XML templates to allow the system to bind the unstructured data from the document to structured UXML which is an accessible format in other systems. The user can use STARLIMS predefined objects. |
• | If you would like your document to be reviewed, approved or rejected by other users, create a workflow for the document. The workflow steps can include creating, reviewing and modifying, approving, exporting the data to other systems, and other actions that may be defined. Permissions are assigned to users for each of the steps in the workflow. |
For more information about these and other settings, refer to the Configuring SDMS Administration Tabs chapter.
2. | Upload files into SDMS - After settings are defined, documents can be uploaded into SDMS for processing. Documents can be uploaded from multiple sources: |
• | From STARLIMS - Such as through an attachment. |
• | SDMS Grabber - A polling application that checks a location on a defined server to see if there are files to be uploaded. |
• | Direct upload through the Upload control such as from the Documents and Workflows modules in SDMS. |
• | Microsoft Office - Users working on documents using Microsoft Word, Excel, PowerPoint or Outlook can upload a document from within these applications. |
• | Web services - Another system can potentially be linked to SDMS using Web services. |
For more information about upload methods, refer to the Loading Files into SDMS chapter.
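The polling behavior described for SDMS Grabber can be pictured with a minimal sketch. This is an illustration only, not the Grabber's actual implementation: the function name `poll_once`, the `upload` callback, and the in-memory `seen` set are all assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, Set


def poll_once(watch_dir: Path, seen: Set[Path], upload: Callable[[Path], None]) -> int:
    """Check the watched location once and upload files not handled before.

    Returns the number of files passed to the upload callback.
    """
    uploaded = 0
    for path in sorted(watch_dir.iterdir()):
        if path.is_file() and path not in seen:
            upload(path)    # hypothetical upload hook, not an SDMS API
            seen.add(path)  # remember the file so it is not uploaded twice
            uploaded += 1
    return uploaded
```

A scheduler would call `poll_once` repeatedly at a fixed interval. A second call on an unchanged folder uploads nothing, because every file is already in `seen`.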
3. | Create templates for unrecognized files - After a document is uploaded, the system attempts to recognize it, based on meta tag values in the document that uniquely identify the document. For example, an instrument ID can identify an instrument type document. When a new type of document is uploaded, it is typical to create both a file recognition template (FRP) and a data extraction template (DRP) for it. |
• | File recognition pattern (FRP) - Includes the meta tag values, file type, and optionally a workflow. When uploading similar documents, the system will search the meta tags in the document and will assign the relevant file type and workflow. |
• | Document recognition pattern (DRP) - Includes the method for extracting the required data objects such as keywords, tables and graphs, and transforming the raw data from an unstructured format into the XML structured format. |
For more information about creating recognition and data extraction methods, refer to the Designing Document Templates chapter.
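As a rough illustration of how meta tag matching could assign a file type and workflow, consider the following sketch. The class `RecognitionPattern`, the function `recognize`, and the tag names are hypothetical; the rule that every tag in the pattern must match exactly is an assumption, and the real FRP matching logic may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RecognitionPattern:
    """Hypothetical stand-in for an FRP: meta tags, file type, optional workflow."""
    file_type: str
    meta_tags: Dict[str, str]
    workflow: Optional[str] = None


def recognize(doc_tags: Dict[str, str],
              patterns: List[RecognitionPattern]) -> Optional[RecognitionPattern]:
    """Return the first pattern whose meta tag values all appear in the document."""
    for pattern in patterns:
        # A pattern without tags is skipped so that it cannot match every document.
        if pattern.meta_tags and all(
            doc_tags.get(tag) == value for tag, value in pattern.meta_tags.items()
        ):
            return pattern
    return None  # unrecognized: a new FRP and DRP would be created for this document
```

For example, a pattern holding the tag `InstrumentID = "HPLC-01"` would match any uploaded document that carries that tag value, and the document would receive the pattern's file type and workflow.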
4. | Map data to the Unified XML - Once the DRP design is done, it is necessary to bind the extracted data to a unified XML structure which allows the system to export the data to other applications. The data is extracted using the Unified XML template previously created. |
For more information about Unified XML templates, refer to the Binding Extracted Information to Unified XML section.
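The idea of binding extracted keywords and tables to a structured XML document can be sketched as follows. The element and attribute names (`Document`, `Keywords`, `Keyword`, `Table`, `Row`, `Cell`) are invented for the example and do not represent the actual UXML schema; the sketch uses only Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def bind_to_xml(file_type: str,
                keywords: Dict[str, str],
                table: List[Dict[str, str]]) -> str:
    """Serialize extracted keywords and table rows into a structured XML string."""
    root = ET.Element("Document", {"fileType": file_type})

    # One <Keyword> element per extracted keyword.
    keywords_el = ET.SubElement(root, "Keywords")
    for name, value in keywords.items():
        ET.SubElement(keywords_el, "Keyword", {"name": name}).text = value

    # One <Row> per table row, one <Cell> per column value.
    table_el = ET.SubElement(root, "Table")
    for row in table:
        row_el = ET.SubElement(table_el, "Row")
        for column, value in row.items():
            ET.SubElement(row_el, "Cell", {"column": column}).text = value

    return ET.tostring(root, encoding="unicode")
```

Once the data is in a structured form like this, a receiving application can read values by element path instead of parsing the raw document.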
5. | Execute Workflow Tasks - You can link a document to a workflow and require steps for the approval of the document. Based on the release steps, actions, and permissions defined in the workflow, different users can be asked to perform tasks such as reviewing, editing, approving, or rejecting the document. |
For more information about designing workflows, refer to the Workflow Designer section.
NOTE Workflow tasks can be managed either through the My Tasks STARLIMS application or the Outlook SDMS Add-Ins. For more information about executing workflows using My Tasks and Outlook SDMS Add-Ins, refer to the Executing Workflow Tasks (My Tasks) chapter.
6. | Export data from SDMS - The last step of a workflow is usually the Export stage, after which the file can be sent to STARLIMS or a third-party system. A batch process runs in the background. It identifies file types that have an import script and checks whether the files are in the Export stage. The data is then exported to the STARLIMS tables. |
NOTE For more information about exporting extracted data, refer to the Exporting Data from SDMS section.
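The selection rule applied by the background process (the file type has an import script and the file is in the Export stage) can be illustrated with a short sketch. The `StoredFile` class, the `select_for_export` function, and the stage label `"Export"` are assumptions for the example, not the product's data model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StoredFile:
    """Hypothetical record for a document held in the system."""
    name: str
    file_type: str
    stage: str


def select_for_export(files: List[StoredFile],
                      import_scripts: Dict[str, str]) -> List[StoredFile]:
    """Keep files whose type has an import script and whose stage is Export."""
    return [
        f for f in files
        if f.file_type in import_scripts and f.stage == "Export"
    ]
```

Files that are still under review, or whose file type has no import script, are left out of the result and would be picked up by a later run once both conditions hold.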