Sharepoint Forum


Indexing scanned images

  Asked By: Katie    Date: Feb 20    Category: Sharepoint    Views: 1041

This is not so much a SharePoint technical question, but I'm hoping for some insight from experienced SharePointers.
I've been asked to look into solving a specific problem using the tools at hand, which at this point include SPS 2003. The main issue boils down to document storage and search. A particular department has documents (mostly PDF and TIF) coming into a set of public folders in Exchange from remote locations. As far as I know, all of these files are manually scanned paper pages. Users have a very hard time finding what they need in the ensuing mess.

In a test environment I have duplicated the process, connecting the public folders involved to document libraries in a WSS site. I have turned on TIF OCR and installed the PDF IFilter from Adobe.

1. In general, OCR doesn't seem to be accurate enough for either PDF or TIF documents, and it isn't automatic for the PDF files in any case. Does anyone have experience with this, or know of a better third-party OCR product?

2. I've tried a custom list requiring metadata tagging of the searchable info from the documents, but this seems cumbersome and a bit of overkill, frankly. Each incoming document would need to be opened as it arrived, and the pertinent data extracted and typed into a custom field for later searches.
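
To make the tagging step in point 2 concrete, here is a rough sketch of what pulling "pertinent data" out of a document might look like if it were scripted instead of typed by hand. It assumes the OCR text is already available as a plain string; the field names (`invoice_number`, `date`) and the patterns are hypothetical placeholders, not anything SPS provides, and any OCR errors in the text would still cause misses.

```python
import re

# Hypothetical field patterns for illustration only; real documents
# would need patterns matched to their actual layout.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"\bINV[- ]?(\d{4,})\b", re.IGNORECASE),
    "date": re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b"),
}


def extract_fields(ocr_text):
    """Return the first match for each known field, or None if absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields
```

The resulting dictionary could then be reviewed by a person before it goes into the custom fields, which would cut down on typing but not on the need to check each document.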

In the end, I don't believe that SPS is the tool to solve this problem. There is some collaboration needed on the resulting documents, but the main problem is indexing and searching scanned images for text and numbers. If this could be solved adequately in Exchange alone, there would be little need for SPS, IMHO.



2 Answers Found

Answer #1    Answered By: Kyla Eckert     Answered On: Feb 20

Are you saying that the automatic OCR indexing of TIFF files in SPS
isn't meeting your needs?

If so, I don't know of any way to make that more accurate. If not, you
might want to consider learning about this default option of SPS.

Answer #2    Answered By: Alisha Holmes     Answered On: Feb 20

Correct - I was referring to the automatic OCR of TIFF files being less accurate than hoped for!

I think this falls way outside the scope of SPS, so I'll focus on other projects instead.
