As many as one in every three documents can be hidden within a file or document management system. Thousands of image-based files such as JPEGs, TIFFs, and PNGs are added to repositories like Dropbox and SharePoint without the document owner realizing they aren’t fully text-searchable.
Not being able to search a document’s pages for names, phrases, or other general information can make locating it a difficult and time-consuming task.
Some organizations have tried, and failed, to stop the creation of hidden documents by relying on multi-function devices. A scanner can’t process a backlog of files already in a repository, for example, or documents that enter the organization in other ways, such as email attachments.
John Kelly, Head of IT at ByrneWallace, said hidden documents had become an issue for the Irish firm, partly because of patchy OCR processing.
John explained that “because the firm didn’t actively OCR documents, we ended up with so many that weren’t text-searchable.”
Hidden documents were considered an unacceptable risk by John and the firm’s stakeholders, so John began to look for integrated OCR software that could recognize and process non-searchable documents.
“The driving force for change was the risk that we wouldn’t be able to discover historical documents because we couldn’t search for the content,” said John.
The search led John to OCR software that could recognize non-searchable documents and automatically add a text layer.
OCR software finds and converts hidden documents at ByrneWallace
At ByrneWallace, OCR software is processing backlogs of legacy documents, plus any newly profiled documents, to identify hidden files that need converting to text-searchable PDFs.
When it runs in the backend, such as within ByrneWallace’s iManage document management system, OCR software can find and convert non-searchable files as staff upload them.
This set-and-forget style of OCR processing avoids the issue of people forgetting to OCR documents prior to uploading.
Because it runs in the backend and not at the point of scanning, OCR software can process an entire repository of documents. This was important for ByrneWallace since it wanted historical documents in its iManage repository to be 100% discoverable.
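For readers who want a feel for what "finding hidden documents" involves, the backlog scan described above can be roughly sketched in Python. This is only an illustration under stated assumptions, not how the OCR software at ByrneWallace or iManage works: the extension list, the function names `needs_ocr` and `find_hidden_documents`, and the optional `pdf_has_text` callback are all hypothetical.

```python
from pathlib import Path

# Hypothetical list of image-based formats that never carry a text layer.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".png"}


def needs_ocr(path: Path, pdf_has_text=None) -> bool:
    """Return True if the file looks like a 'hidden' (non-searchable) document.

    pdf_has_text is an optional, caller-supplied function (an assumption of
    this sketch) that reports whether a PDF already contains a text layer.
    Without it, PDFs are not flagged.
    """
    suffix = path.suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return True
    if suffix == ".pdf" and pdf_has_text is not None:
        return not pdf_has_text(path)
    return False


def find_hidden_documents(root: Path, pdf_has_text=None) -> list[Path]:
    """Walk an entire folder tree (the 'backlog') and list files needing OCR."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and needs_ocr(p, pdf_has_text)
    )
```

Because the scan works on the repository rather than at the scanner, the same check can cover both legacy files and new uploads, whichever route they arrived by.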
John reports that since the IT team integrated OCR software with its iManage document management system, the staff at ByrneWallace can search for 100% of documents.
OCR processing is just one of the ways ByrneWallace is using our product suite. Read the case study and discover how the firm is managing the risk of email data breaches, simplifying PDF redaction, and achieving more reliable document comparison.
About the author
Melody has more than 15 years' experience in marketing products and services to the legal industry. She spearheads marketing efforts in the EMEA region for DocsCorp, and can be spotted behind the booth at events and conferences across the region. When not working from the London office, Melody enjoys travelling and spending quality time with her family.