Perforce Chronicle 2012.2/486814
API Documentation
Filter to convert a Microsoft Word 2007 document to text.
Public Member Functions
filter ($docx)
    Extract the text content of a Word 2007 document.
This implementation uses Zend_Search_Lucene_Document_Docx to extract the text content of a Word document. Only the Word 2007 format is supported.
P4Cms_Filter_DocxToText::filter ( $docx )
Extract the text content of a Word 2007 document.
Parameters:
    string $docx    the Docx to be filtered.
Exceptions:
    Zend_Search_Lucene_Document_Exception
{
    // shortcut if we have an empty string
    if (!strlen($docx)) {
        return;
    }

    // write contents to a tmp file
    $tempFile = tempnam(sys_get_temp_dir(), 'word');
    file_put_contents($tempFile, $docx);
    $document = Zend_Search_Lucene_Document_Docx::loadDocxFile($tempFile);

    // remove the temp file
    unlink($tempFile);

    return $document->getFieldValue('body');
}
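The listing above suggests two things a caller may want to handle: an empty input makes filter() return nothing (null), and a document that cannot be parsed raises Zend_Search_Lucene_Document_Exception. The following is a minimal usage sketch rather than part of Chronicle. The helper name extractDocxText is hypothetical, and the sketch assumes that P4Cms_Filter_DocxToText can be constructed without arguments and that the Chronicle and Zend Framework classes are already loadable.

```php
<?php
/**
 * Hypothetical helper (not part of Chronicle): returns the extracted text,
 * or null when the input is empty or cannot be parsed as a Word 2007 document.
 *
 * @param  string      $docx  raw contents of a .docx file
 * @return string|null
 */
function extractDocxText($docx)
{
    // Assumption: the filter takes no constructor arguments.
    $filter = new P4Cms_Filter_DocxToText();

    try {
        // filter() itself returns null for an empty string.
        return $filter->filter($docx);
    } catch (Zend_Search_Lucene_Document_Exception $e) {
        // Treat an unparseable document as "no text available".
        return null;
    }
}
```

A caller would pass the file contents, not a path, for example extractDocxText(file_get_contents($path)) with a $path of its own choosing. Returning null on failure is a design choice of this sketch; code that needs to distinguish an empty input from a corrupt document should let the exception propagate instead.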