Using full-text search with PDF files in SQL Server 2008

I have SQL Server 2008 R2 and am trying to implement full-text search on a PDF BLOB.

I have installed the Adobe PDF iFilter and confirmed that it is registered using:

EXEC sp_help_fulltext_system_components 'filter';

filter .pdf E8978DA6-047F-4E3D-9C78-CDBE46041603
C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin\PDFFilter.dll
11.0.1.36 Adobe Systems, Inc.

I then created a full-text catalog and created the full-text index on it:

CREATE FULLTEXT INDEX ON Compliance_Updates
( 
FileDesc
 Language 1033,
 FileData
   TYPE COLUMN FileDataType
) 
 KEY INDEX PK_Compliance_Updates
     ON FT_Compliance_Updates; 

I then forced a rebuild of the index after adding some PDFs to the table. The index shows:

Catalogue Size : 0MB
Item Count : 2
Unique Key Count : 7
Name : FT_Compliance_Updates
Last Population Date : 12/11/2013 09:36
Population Status : Idle

However, when I perform the following search, I get zero results...

SELECT FileID, FileDesc, PubDate 
FROM Compliance_Updates 
WHERE CONTAINS(FileData, 'mortgage')

I've tried deleting the catalog, removing all the table records and indexes (including the PK), re-running the iFilter install, and running:

exec sp_fulltext_service 'load_os_resources', 1;
exec sp_fulltext_service 'verify_signature', 0;

I also restarted SQL Server and re-created the indexes and the FT catalog, but nothing seems to work.

Paulinapauline answered 12/11, 2013 at 9:45 Comment(0)
  • Version 11.x didn't work for me, but 9.x did.
  • You also need to append C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\ to the system PATH variable: Start > Control Panel > System > Advanced > Environment Variables > System Variables > PATH.
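
Once the 9.x filter is installed and PATH is updated, SQL Server still has to pick the filter up and re-crawl the existing rows. A minimal sketch of that sequence, reusing the catalog name from the question; it assumes the SQL Server service has been restarted after the PATH change so that it sees the new value:

```sql
-- Allow SQL Server to load filters registered with the operating system.
EXEC sp_fulltext_service 'load_os_resources', 1;
-- Do not require the filter binaries to be signed.
EXEC sp_fulltext_service 'verify_signature', 0;
GO

-- Confirm that a filter is registered for .pdf and check which DLL it points to.
SELECT document_type, path, version, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';
GO

-- Rebuild the catalog so existing rows are re-crawled with the new filter.
-- FT_Compliance_Updates is the catalog name used in the question.
ALTER FULLTEXT CATALOG FT_Compliance_Updates REBUILD;
GO
```

If the path column still points at the 11.x folder, the older filter has probably not replaced the newer registration.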
Inearth answered 10/3, 2014 at 7:9 Comment(0)

Version 11.x didn't work for me either; 9.x works :) The 64-bit 9.x installer is hard to find on Adobe's website, but it is available on their FTP server: ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/

Botch answered 27/11, 2014 at 10:30 Comment(1)
I don't seem to be able to install 9.x on Windows 10. The setup runs but then immediately disappears without installing anything. – Pasahow

FWIW, even with SQL Server 2014, I was not able to get Version 11.x to work and so downloaded Version 9.x from the FTP link kindly provided above. Version 9.x still seems to be the way to go as it also worked for me! :^)
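
To check whether a given filter version actually extracted text from the PDFs, rather than relying on the catalog's item count, something like the following may help. It is a rough sketch that reuses the table and column names from the question and assumes the table lives in the default schema of the current database:

```sql
-- The type column must hold the file extension of each document, e.g. '.pdf',
-- otherwise the PDF filter is never selected for that row.
SELECT FileDataType, COUNT(*) AS DocumentCount
FROM Compliance_Updates
GROUP BY FileDataType;

-- List the terms that full-text indexing extracted from the table.
-- If only words from FileDesc appear, the PDF content was not indexed.
SELECT TOP (50) display_term, column_id, document_count
FROM sys.dm_fts_index_keywords(DB_ID(), OBJECT_ID('Compliance_Updates'))
ORDER BY document_count DESC;
```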

Pox answered 17/6, 2015 at 17:29 Comment(0)

2022

Adobe's traditional IFilter is still available, but as browsers move from FTP to HTTP(S), the old legacy download links are often considered insecure.

Currently available:

http://ftp.adobe.com/pub/adobe/acrobat/win/11.x/PDFFilter64Setup.msi
http://download.adobe.com/pub/adobe/acrobat/win/9.x/PDFiFilter64installer.zip

Alternatively

SumatraPDF (which I support) is free for commercial use and, if installed, can add a search filter via its installer options. However, previewing in Outlook and SQL searching are not its intended uses, so support may be limited. Make sure the index is rebuilt, after first testing on smaller areas to confirm that the Windows Search index is functioning. For support on index searching, ask Microsoft!

There is also the TET PDF IFilter from PDFlib, which is free for non-commercial use: https://www.pdflib.com/download/tet-pdf-ifilter/ Business users should get paid support.

On Windows Server systems TET PDF IFilter can be evaluated without a license. However, it will only process PDF documents with up to 10 pages and 1 MB size unless a valid license key has been applied.

In all cases the designed use is Windows Search, such as here, finding the requested word "mortgage". Note that SumatraPDF does NOT highlight the result in page previews, only once the documents are opened.

[Screenshot: Windows Search results for "mortgage"]

Vagina answered 22/5, 2022 at 11:10 Comment(0)
