Create a thumbnail preview of documents (PDF, DOC, XLS, etc.) in PHP (LAMP)

When users upload certain files to my site (such as .doc, .xls, .pdf, etc) I'd like to be able to generate a preview thumbnail (of the first page of the document). I'm working with PHP in a LAMP stack but would be happy with any library or command-line tool that can do the job (Linux highly preferred).

Bushweller answered 10/10, 2011 at 4:27 Comment(3)
@BrianRoach Nope - already saw that question before posting. It only refers to PDFs. I'm looking for a tool that can handle general documents (including PDFs but also XLS, DOC, and so on). - Bushweller
Well, there is a trick for this: combine #1225730 and #468293, that is, convert the XLS (or whatever) to PDF, then get the image from the PDF. - Sculley
I'm looking for this same thing and agree that this applies to more than just PDF/Office docs (e.g. LaTeX or SAS). - Photocomposition

It's not easy to convert certain document formats to an image, and PHP alone cannot do it. The 'proper' way is to first install a program on your server that can open documents in that format. For example, for .doc documents you can use OpenOffice, which can also open most other document formats. You then need to set up OpenOffice to work in 'headless' mode, sending its output to a virtual display (Xvfb is what you are going to need on Linux).

Your PHP script will then call OpenOffice, passing the path to the uploaded doc. OpenOffice will actually open that doc. Then you need to create an image from the screen buffer. You can use ImageMagick for that.

Then, once you have the capture of your screen, you can resize it to a thumbnail.

Look at this link for more details:

http://www.mysql-apache-php.com/website_screenshot.htm
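
A minimal PHP sketch of a related route, under stated assumptions: rather than capturing a virtual display, it uses the office suite's own headless PDF export (the `soffice --headless --convert-to pdf` command line that LibreOffice provides) and then lets Imagick rasterize the first page. The function names, the `soffice` binary being on the PATH, Ghostscript being available to Imagick for PDF input, and the 200-pixel width are all illustrative assumptions, not part of the answer above.

```php
<?php
// Sketch only: assumes LibreOffice's `soffice` is on the PATH and the Imagick
// extension can read PDFs (which normally requires Ghostscript).

/** Build the shell command that converts one document to PDF in $outDir. */
function buildPdfConvertCommand(string $inputPath, string $outDir): string
{
    return 'soffice --headless --convert-to pdf --outdir '
        . escapeshellarg($outDir) . ' ' . escapeshellarg($inputPath);
}

/** Path of the PDF that the command above is expected to write. */
function convertedPdfPath(string $inputPath, string $outDir): string
{
    return rtrim($outDir, '/') . '/' . pathinfo($inputPath, PATHINFO_FILENAME) . '.pdf';
}

/** Convert $inputPath to PDF, then save its first page as a JPEG thumbnail. */
function documentThumbnail(string $inputPath, string $outDir, string $thumbPath, int $width = 200): void
{
    exec(buildPdfConvertCommand($inputPath, $outDir), $output, $exitCode);
    $pdf = convertedPdfPath($inputPath, $outDir);
    if ($exitCode !== 0 || !is_file($pdf)) {
        throw new RuntimeException("Conversion to PDF failed for $inputPath");
    }

    $im = new Imagick();
    $im->setResolution(150, 150);          // must be set before reading the PDF
    $im->readImage($pdf . '[0]');          // [0] selects the first page
    $im->setImageBackgroundColor('white'); // flatten so transparency is not rendered black
    $im = $im->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN);
    $im->setImageFormat('jpg');
    $im->thumbnailImage($width, 0);        // 0 keeps the aspect ratio
    $im->writeImage($thumbPath);
    $im->clear();
}
```

Calling `documentThumbnail('/uploads/report.doc', '/tmp/previews', '/tmp/previews/report.jpg')` would be one way to use it; the paths here are placeholders.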

Ventriloquist answered 12/10, 2011 at 14:23 Comment(1)
Good answer, but I think you might have meant ImageMagick, in case someone is looking for it. - Photocomposition

There are plenty of ways to tackle this, given the wide range of available APIs (some require a subscription). If you prefer to stay in PHP without relying on third-party applications, a few libraries come in handy, such as PHPOffice (note that the version to use depends on your PHP version, and an older, deprecated version can still be found online).

The approach in this answer requires Composer and the Imagick extension for PHP. It covers creating thumbnails for Excel, PDF, and Word files only; for PowerPoint files, the PHP library that handles them has an issue with creating thumbnails due to the lack of a PDF writer, as stated in this Stack Overflow question (Convert PPT and PPTX to PDF - PHP).

After installing Composer and making sure the Imagick extension is available in your PHP installation, run the Composer commands below to install the libraries (go to your project directory and open a terminal there):

PHPWord

composer require phpoffice/phpword:dev-master

PhpSpreadsheet

composer require phpoffice/phpspreadsheet

mPDF (required by the Mpdf writers used below)

composer require mpdf/mpdf

Add these lines at the top of the PHP script that will perform this task:

require_once '../vendor/autoload.php'; // Load Composer's autoloader

// PhpSpreadsheet: reader factory, mPDF-based PDF writer, and fill constants
use PhpOffice\PhpSpreadsheet\IOFactory as SpreadsheetIOFactory;
use PhpOffice\PhpSpreadsheet\Writer\Pdf\Mpdf as excelMPDF;
use PhpOffice\PhpSpreadsheet\Style\Fill;

// PHPWord: reader factory and mPDF-based PDF writer
use PhpOffice\PhpWord\IOFactory as wordIOFactory;
use PhpOffice\PhpWord\Writer\Pdf\Mpdf as wordMPDF;

The approach is to convert every non-PDF document to PDF first and then use the Imagick PHP extension to create the thumbnail from that PDF.
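
As a rough illustration of that dispatch, the sketch below maps a file extension to the step that has to run before Imagick can read the file. The function name `thumbnailRoute` and the route labels are hypothetical, chosen only for this example.

```php
<?php
// Sketch only: the route names are hypothetical labels, not library identifiers.

/** Decide how a file gets to the point where Imagick can read it. */
function thumbnailRoute(string $path): ?string
{
    $routes = [
        'pdf'  => 'imagick',         // already a PDF: read the first page directly
        'xls'  => 'phpspreadsheet',  // convert to PDF with PhpSpreadsheet first
        'xlsx' => 'phpspreadsheet',
        'doc'  => 'phpword',         // convert to PDF with PHPWord first
        'docx' => 'phpword',
    ];
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return $routes[$ext] ?? null;    // null: unsupported type (e.g. PowerPoint)
}
```

A lookup table like this keeps the list of supported types in one place, so adding a format later means adding one entry plus its converter.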

I have written code that does that for you. As explained earlier, it starts by creating an Imagick object and then builds the thumbnail based on the input file's extension. Note that the code only needs the path to the file ($pf); nothing has to be uploaded, because the file is read directly by the library and the Imagick extension.

Note: [0] is appended to the path passed to $im->readImage to select the first page of the PDF.

$pf = 'path/to/file.xlsx'; // Path to the input document
$ext = strtolower(pathinfo($pf, PATHINFO_EXTENSION)); // The extension selects the conversion route
$outputFilePath = 'path/filename.pdf'; // Intermediate PDF for non-PDF inputs

$im = new Imagick();
$im->setResolution(600, 600); // Must be set before reading; a lower value is faster
if ($ext == 'pdf') {
    $im->readImage($pf . '[0]');
} else if ($ext == 'xls' || $ext == 'xlsx') {
    // Load the spreadsheet
    $spreadsheet = SpreadsheetIOFactory::load($pf);
    // Use a solid white fill as the default cell style
    $fill = $spreadsheet->getDefaultStyle()->getFill();
    $fill->setFillType(\PhpOffice\PhpSpreadsheet\Style\Fill::FILL_SOLID);
    $fill->getStartColor()->setARGB('FFFFFFFF');
    // Create a new PDF writer using mPDF
    $writer = new excelMPDF($spreadsheet);
    // Write the PDF to the output file path, then read its first page
    $writer->save($outputFilePath);
    $im->readImage($outputFilePath . '[0]');
} else if ($ext == 'doc' || $ext == 'docx') {
    // Load the Word document; legacy .doc files need the MsDoc reader
    $phpWord = wordIOFactory::load($pf, $ext == 'doc' ? 'MsDoc' : 'Word2007');
    // Set up the PDF writer
    $writer = new wordMPDF($phpWord);
    // Write the PDF to the output file path, then read its first page
    $writer->save($outputFilePath);
    $im->readImage($outputFilePath . '[0]');
} else {
    throw new InvalidArgumentException('Unsupported file type: ' . $ext);
}
// Flatten onto a white background so transparent areas do not turn black in the JPEG
$im->setImageBackgroundColor('#FFFFFF');
$im = $im->mergeImageLayers(Imagick::LAYERMETHOD_FLATTEN);
$im->setImageFormat('jpg');
// Scale down to thumbnail size; 200 px wide is an example value, 0 keeps the aspect ratio
$im->thumbnailImage(200, 0);
// Write the thumbnail to disk and release the image resources
$im->writeImage('path/image_name.jpg');
$im->clear();

echo 'Thumbnail Created!';
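
If the thumbnail has to fit inside a fixed box, the target size can be worked out before resizing. The helper below is a minimal sketch; `fitWithin` and the 200x200 box in the usage comment are hypothetical, and Imagick's own `thumbnailImage($w, $h, true)` best-fit mode is an alternative that avoids the manual arithmetic.

```php
<?php
// Sketch only: fitWithin() is a hypothetical helper, not part of Imagick.

/**
 * Scale ($width, $height) down to fit inside ($maxWidth, $maxHeight),
 * keeping the aspect ratio and never enlarging the image.
 *
 * @return int[] [newWidth, newHeight]
 */
function fitWithin(int $width, int $height, int $maxWidth, int $maxHeight): array
{
    if ($width <= 0 || $height <= 0 || $maxWidth <= 0 || $maxHeight <= 0) {
        throw new InvalidArgumentException('All dimensions must be positive');
    }
    // The smaller ratio is the one that keeps both sides inside the box.
    $scale = min($maxWidth / $width, $maxHeight / $height, 1.0);

    return [
        max(1, (int) round($width * $scale)),
        max(1, (int) round($height * $scale)),
    ];
}

// Possible use with an Imagick object $im that already holds the page:
// [$w, $h] = fitWithin($im->getImageWidth(), $im->getImageHeight(), 200, 200);
// $im->thumbnailImage($w, $h);
```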

The code is straightforward and can be integrated into whatever project you are working on. Unfortunately, due to limitations in PHPPresentation, handling PowerPoint files is not covered in this answer for now.

Hope this helps and that it saved someone's time.

Microreader answered 27/4, 2023 at 2:58 Comment(0)
