PHP Classes

Create Microsoft Word DOCX files from HTML in PHP Part 2: More Complex Documents



Categories: PHP Tutorials

The first part of this article presented the VsWord class as a means to create Microsoft Word DOCX documents from HTML.

Read this article to learn how to compose more complex documents either using HTML or calls to the class that can insert document elements programmatically.





Contents

More Complex Documents

How to Align a Text

Using Font Styles

Page Breaks

Creating Tables Dynamically

Creating an Ordered List

Conclusion


More Complex Documents

The main feature of the VsWord package is its ability to parse HTML and convert it into document elements that are saved in DOCX format, as explained in the first part of this article.

However, you can compose the document directly using classes that represent those different types of document elements.

Let us take a look at how you can create more complex documents using either HTML or document element classes.

How to Align a Text

One of the main elements in a word processing document is the paragraph. In this example you can see how to create a paragraph and align it to the left, right, center or both sides (justified) using the AlignNode class.

<?php

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $doc = new VsWord();

 $paragraph = new PCompositeNode(); 
 $paragraph->addPNodeStyle( new AlignNode(AlignNode::TYPE_RIGHT) );
 $paragraph->addText("Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...");
 $doc->getDocument()->getBody()->addNode( $paragraph );

 $paragraph = new PCompositeNode(); 
 $paragraph->addPNodeStyle( new AlignNode(AlignNode::TYPE_LEFT) );
 $paragraph->addText("Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...");
 $doc->getDocument()->getBody()->addNode( $paragraph );

 $paragraph = new PCompositeNode(); 
 $paragraph->addPNodeStyle( new AlignNode(AlignNode::TYPE_CENTER) );
 $paragraph->addText("Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...");
 $doc->getDocument()->getBody()->addNode( $paragraph );

 $paragraph = new PCompositeNode(); 
 $paragraph->addPNodeStyle( new AlignNode(AlignNode::TYPE_BOTH) );
 $paragraph->addText("Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...");
 $doc->getDocument()->getBody()->addNode( $paragraph );

 echo '<pre>'.($doc->getDocument()->getBody()->look()).'</pre>';

 $doc->saveAs('align.docx');

?>

In this example we first include the VsWord class and register the autoloader for the package classes. Then we create a new object of that class.

As I mentioned before, the DOCX format is based on XML, so it consists of nodes. For each paragraph I create a new composite node, then add styles and data to that node.

Once we have a paragraph node, we add a style that defines the alignment. As you can see, we use a predefined constant for the alignment. There are four types of alignment: right, left, center, and both.

After this we add the text to the paragraph. If we have an article of ten paragraphs, we create ten separate paragraph nodes. Each paragraph is then added to the main document body. This is repeated until we have created all the paragraphs we need, and finally the document is saved to its own DOCX file.

OK, let us do the same now with HTML:

<?php

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $doc = new VsWord();  
 $parser = new HtmlParser($doc);  

 $html = '<p align="right">Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...</p>';

 $html .= '<p align="left">Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...</p>';

 $html .= '<p align="center">Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...</p>';

 $html .= '<p align="justify">Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...</p>';

 $parser->parse($html); 
 echo '<pre>'.($doc->getDocument()->getBody()->look()).'</pre>'; 
 $doc->saveAs('align.docx');

?>
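The four paragraphs above differ only in their align attribute, so the HTML can also be generated in a loop. The following is a minimal sketch, not part of VsWord: alignedParagraph() is a hypothetical helper introduced here for illustration. It escapes the text with htmlspecialchars(); whether HtmlParser decodes such entities is an assumption you should verify with your version of the package.

```php
<?php

// Hypothetical helper (not part of VsWord): builds one aligned paragraph.
function alignedParagraph(string $text, string $align = 'left'): string
{
    // Only the four alignments used in this article are accepted.
    $allowed = array('left', 'right', 'center', 'justify');
    if (!in_array($align, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported alignment: $align");
    }
    // Escape the text so characters like < and & cannot break the markup.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return '<p align="' . $align . '">' . $safe . '</p>';
}

// Build one paragraph per alignment.
$html = '';
foreach (array('right', 'left', 'center', 'justify') as $align) {
    $html .= alignedParagraph('Some more text ... More text about...', $align);
}
```

The resulting $html string can then be passed to $parser->parse($html) in the same way as in the listing above.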

Using Font Styles

Just as we define the alignment of a paragraph, we can also change the font style of its text. For that we can use several classes that represent the different types of text styles, like font size, bold, italic, underline, etc.

Let us take a look at an example of how to do it using these classes:

<?php

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $doc = new VsWord();
 $body = $doc->getDocument()->getBody();

 $title = new PCompositeNode();
 $rTitle = new RCompositeNode();
 $title->addNode($rTitle); 
 $rTitle->addTextStyle(new BoldStyleNode());
 $rTitle->addTextStyle(new FontSizeStyleNode(36));
 $rTitle->addText("Header 1");
 $body->addNode( $title );

 $paragraph = new PCompositeNode();
 $rParagraph= new RCompositeNode();
 $paragraph->addNode($rParagraph);  
 $rParagraph->addTextStyle(new FontSizeStyleNode(14));
 $rParagraph->addText("Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...");
 $body->addNode( $paragraph );

 $paragraph2 = new PCompositeNode();
 $rParagraph2 = new RCompositeNode();
 $paragraph2->addNode($rParagraph2);  
 $rParagraph2->addTextStyle(new FontSizeStyleNode(14));
 $rParagraph2->addText("Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...");

 $rParagraph3 = new RCompositeNode();
 $paragraph2->addNode($rParagraph3);
 $rParagraph3->addTextStyle(new FontSizeStyleNode(11));
 $rParagraph3->addTextStyle(new ItalicStyleNode());
 $rParagraph3->addText('Italic text');

 $body->addNode( $paragraph2 );
 
 $doc->saveAs('./base.docx');

?>

In this example I created a title as a header. Using the RCompositeNode class I added text styles to it: first bold, then a font size of 36. The same approach is used for the next two paragraphs, which use a font size of 14. At the end I add italic text of size 11 to the second paragraph.

Now let us do the same using HTML:

<?php

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $doc = new VsWord();  
 $parser = new HtmlParser($doc);  

 $html = '<font size="36"><b>Header 1</b></font>';

 $html .= '<p><font size="14">Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...</font></p>';

 $html .= '<p><font size="14">Some more text ... More text about... Some more text ... More text about... Some more text ... More text about...</font> <font size="11"><i>Italic text</i></font></p>';

 $parser->parse($html); 
 echo '<pre>'.($doc->getDocument()->getBody()->look()).'</pre>'; 
 $doc->saveAs('base.docx');

?>
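When many text fragments need the same kind of styling, the font, b and i tags can be produced by a small function. This is only a sketch under stated assumptions: styledText() is a hypothetical helper, not a VsWord function, and it emits the same tags as the listing above.

```php
<?php

// Hypothetical helper (not part of VsWord): wraps text in <font>, <b> and <i> tags.
function styledText(string $text, int $size, bool $bold = false, bool $italic = false): string
{
    // Escape the text so it cannot break the surrounding markup.
    $out = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    if ($italic) {
        $out = '<i>' . $out . '</i>';
    }
    if ($bold) {
        $out = '<b>' . $out . '</b>';
    }
    return '<font size="' . $size . '">' . $out . '</font>';
}

// Rebuild the same structure as the example: a header and two paragraphs.
$html  = styledText('Header 1', 36, true);
$html .= '<p>' . styledText('Some more text ...', 14) . '</p>';
$html .= '<p>' . styledText('Some more text ...', 14) . ' '
       . styledText('Italic text', 11, false, true) . '</p>';
```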

Page Breaks

The examples above presented a paragraph and a title. Now I will show how to make a page break. HTML pages normally do not need one, but in Word documents we can insert a page break to start a new page.

<?php

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $doc = new VsWord();
 $body = $doc->getDocument()->getBody();

 $title = new PCompositeNode();
 $rTitle = new RCompositeNode();
 $title->addNode($rTitle); 
 $rTitle->addTextStyle(new BoldStyleNode());
 $rTitle->addTextStyle(new FontSizeStyleNode(36));
 $rTitle->addText("Header 1");
 $body->addNode( $title );

 $body->addNode(new PageBreakNode());

 $title = new PCompositeNode();
 $rTitle = new RCompositeNode();
 $title->addNode($rTitle); 
 $rTitle->addTextStyle(new BoldStyleNode());
 $rTitle->addTextStyle(new FontSizeStyleNode(36));
 $rTitle->addText("Header 2");
 $body->addNode( $title );

 $doc->saveAs('./pagebreak.docx');

?>

A page break can be added simply by adding a page break node to the document body object, as you can see in the line $body->addNode(new PageBreakNode());

This line sits between the two headers, so the second header starts on a new page.

In HTML you can use the CSS style page-break-after to force a page break in the document.

<?php

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $doc = new VsWord();  
 $parser = new HtmlParser($doc);  

 $html = '<font size="36"><b>Header 1</b></font>';
 $html .= '<p style="page-break-after: always;"></p>';
 $html .= '<font size="36"><b>Header 2</b></font>';

 $parser->parse($html); 
 echo '<pre>'.($doc->getDocument()->getBody()->look()).'</pre>'; 
 $doc->saveAs('base.docx');

?>
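If a document is assembled from several HTML fragments that should each start on a new page, the break paragraph can be inserted between them automatically. A minimal sketch, assuming the empty page-break-after paragraph shown above is what the parser recognizes; joinWithPageBreaks() is a hypothetical helper, not part of VsWord.

```php
<?php

// Hypothetical helper (not part of VsWord): joins HTML fragments so that
// a page break paragraph sits between every two fragments.
function joinWithPageBreaks(array $sections): string
{
    // Same empty paragraph the example above uses to force a break.
    $break = '<p style="page-break-after: always;"></p>';
    return implode($break, $sections);
}

$html = joinWithPageBreaks(array(
    '<font size="36"><b>Header 1</b></font>',
    '<font size="36"><b>Header 2</b></font>',
));
```

Using implode() means no break is added before the first fragment or after the last one, so the document does not end with an empty page.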

Creating Tables Dynamically

Another important element in word processing documents is a table. Above we saw how to create documents from HTML, how to create paragraphs or headers and how to add styles. Here is how to add tables.

<?php

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $data = array(
  array(
   "Name"=>"A. Pushkin",
   "Age"=>"31",
   "Phone"=>"none",
   "Address"=>"SPb, pr. Moyki 17",
   "Mail"=>"none",
  ),
  array(
   "Name"=>"M. Ivanov",
   "Age"=>"54",
   "Phone"=>"521-8798",
   "Address"=>"Moskov, pr. Lenina 56",
   "Mail"=>"m.ivanov@info.com",
  ),
  array(
   "Name"=>"M. Chernova",
   "Age"=>"23",
   "Phone"=>"+7-911-7865421",
   "Address"=>"Penza, pr. Lenina 12",
   "Mail"=>"none",
  ),
  array(
   "Name"=>"V. Ut",
   "Age"=>"34",
   "Phone"=>"none",
   "Address"=>"SPb, pr. Lenina 12",
   "Mail"=>"none",
  ),
 );

 $doc = new VsWord();
 $body = $doc->getDocument()->getBody();

 $table = new TableCompositeNode();
 $body->addNode($table);
 $style = new TableStyle(1); 
 $table->setStyle($style);
 $doc->getStyle()->addStyle($style);

 $first = TRUE;
 foreach($data as $item) {

  if($first) {//add header
   $tr = new TableRowCompositeNode();
   $table->addNode( $tr ); 
   foreach($item as $key=>$value) {
    $col = new TableColCompositeNode(); 
    $tr->addNode($col);
	 
    $rTitle = new RCompositeNode();
    $col->getLastPCompositeNode() -> addNode( $rTitle ); 
    $rTitle->addTextStyle(new BoldStyleNode());
    $rTitle->addTextStyle(new FontSizeStyleNode(18));
    $rTitle->addText($key);
		
   }
   $first = false;
  }

  $tr = new TableRowCompositeNode();
  $table->addNode($tr); 
  foreach($item as $key=>$value) {
	
   if($key == "Mail") {
    $col = new TableColCompositeNode();
    $tr->addNode($col); 
    $link = new HyperlinkCompositeNode(); 
    $rLink = new RCompositeNode();
    $rLink->addText( $value );
    $link->addNode( $rLink );
    $col->getLastPCompositeNode()->addNode( $link );
    $col->getLastPCompositeNode()->addNode( new BrNode() );
   } else {
    $col = new TableColCompositeNode(); 
    $col->addText( $value );
    $col->getLastPCompositeNode()->addNode( new BrNode() );
    $tr->addNode( $col );
   }
  }
 }

 echo '<pre>'.($doc->getDocument()->getBody()->look()).'</pre>';

 $doc->saveAs('./table2.docx');

?>

Here I am creating a table from an array. First we create a TableCompositeNode object and choose a style for it.

Then we loop through the array. The keys of the first item are used to build the table header row; every item then becomes a normal row. Each row also contains columns, so we need a nested loop. Notice that we can manipulate every cell individually, as we did with the Mail cells, which are turned into hyperlinks.

Let us make the same table using the HTML method:

<?php

 $data = array(
  array(
   "Name"=>"A. Pushkin",
   "Age"=>"31",
   "Phone"=>"none",
   "Address"=>"SPb, pr. Moyki 17",
   "Mail"=>"none",
  ),
  array(
   "Name"=>"M. Ivanov",
   "Age"=>"54",
   "Phone"=>"521-8798",
   "Address"=>"Moskov, pr. Lenina 56",
   "Mail"=>"m.ivanov@info.com",
  ),
  array(
   "Name"=>"M. Chernova",
   "Age"=>"23",
   "Phone"=>"+7-911-7865421",
   "Address"=>"Penza, pr. Lenina 12",
   "Mail"=>"none",
  ),
  array(
   "Name"=>"V. Ut",
   "Age"=>"34",
   "Phone"=>"none",
   "Address"=>"SPb, pr. Lenina 12",
   "Mail"=>"none",
  ),
 );

 require_once '../vsword/VsWord.php'; 
 VsWord::autoLoad();

 $doc = new VsWord();  
 $parser = new HtmlParser($doc);  

 $html = '<table>';

 $first = TRUE;
 foreach($data as $item) {

  if($first) { //add header
   $html .= '<tr>';
   foreach($item as $key=>$value) {
    
    $html .= '<th>';
    $html .= '<font size="18">'.$key.'</font>';
    $html .= '</th>';
    
   }
   $html .= '</tr>';
   $first = false;
  }

  $html .= '<tr>';
  foreach($item as $key=>$value) {
  
   if($key == "Mail"){
    $html .= '<td>';
    $html .= '<a href="'.$value.'">'.$value.'</a>';
    $html .= '</td>';
   } else {
    $html .= '<td>';
    $html .= $value;
    $html .= '</td>';
   }
  }
  $html .= '</tr>';
 }

 $html .= '</table>';

 $parser->parse($html); 
 echo '<pre>'.($doc->getDocument()->getBody()->look()).'</pre>'; 
 $doc->saveAs('./table2.docx');
 
?>
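The table markup can also be wrapped in a reusable function. The sketch below is not part of VsWord: rowsToHtmlTable() is a hypothetical helper. It differs from the listing above in two ways that are my own choices: it escapes every value, and it links only values that pass PHP's FILTER_VALIDATE_EMAIL check, using a mailto: URL, so placeholders such as "none" stay plain text. How HtmlParser treats mailto: links and HTML entities is an assumption to verify.

```php
<?php

// Hypothetical helper (not part of VsWord): turns a list of associative
// arrays into an HTML table, using the keys of the first row as headers.
function rowsToHtmlTable(array $rows): string
{
    if (count($rows) === 0) {
        return '<table></table>';
    }
    $esc = function ($value) {
        return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    };

    // Header row built from the keys of the first item.
    $html = '<table><tr>';
    foreach (array_keys(reset($rows)) as $key) {
        $html .= '<th><font size="18">' . $esc($key) . '</font></th>';
    }
    $html .= '</tr>';

    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $value) {
            // Link only values that look like e-mail addresses.
            if (filter_var($value, FILTER_VALIDATE_EMAIL) !== false) {
                $html .= '<td><a href="mailto:' . $esc($value) . '">' . $esc($value) . '</a></td>';
            } else {
                $html .= '<td>' . $esc($value) . '</td>';
            }
        }
        $html .= '</tr>';
    }
    return $html . '</table>';
}

$html = rowsToHtmlTable(array(
    array('Name' => 'M. Ivanov', 'Mail' => 'm.ivanov@info.com'),
    array('Name' => 'V. Ut', 'Mail' => 'none'),
));
```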

Creating an Ordered List

Finally I will show an example of how to make an ordered list. Since doing it programmatically using objects would be very complex, I am presenting here just the way to do it with HTML.

<?php

require_once '../vsword/VsWord.php'; 
VsWord::autoLoad();

$doc = new VsWord();  
$parser = new HtmlParser($doc);  

$html = '
<ul>
 <li>Level Node 1</li>
 <li>Level Node 2</li>
 <li>Level Node 3 
  <ul>
   <li>Level Node 2</li>
   <li>Level Node 2</li>
   <li>Level Node 2
    <ul>
     <li>Level Node 3</li>
     <li>Level Node 3</li>
     <li>Level Node 3
      <ul>
       <li>Level Node 4</li>
      </ul>
      </li>
    </ul>
    </li>
    <li>Level Node 2</li>
   </ul>
   </li>
 <li>Level Node 1</li>
 <li>Level Node 1</li>
 <li>Level Node 1</li>
</ul>';

$parser->parse($html); 
echo '<pre>'.($doc->getDocument()->getBody()->look()).'</pre>'; 
$doc->saveAs('tree.docx');

?>

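As with the earlier HTML examples, the list is written directly in HTML. It can also be built from nodes and styles, but that takes considerably more code. Note that the markup above uses nested ul elements, which HTML defines as unordered lists; ol is the HTML element for ordered lists.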
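Nested list markup like this is easy to get wrong by hand, so it can help to generate it from a nested array. The following recursive function is a sketch only: renderList() is a hypothetical helper, not part of VsWord, and the array convention (a plain string is a leaf item, a string key with an array value is an item with a sub-list) is my own assumption.

```php
<?php

// Hypothetical helper (not part of VsWord): renders a nested array as
// nested <ul> markup. A plain string value is a leaf item; a string key
// with an array value is an item that contains a sub-list.
function renderList(array $items): string
{
    $html = '<ul>';
    foreach ($items as $key => $value) {
        if (is_array($value)) {
            // The key is the item text, the value is the nested list.
            $html .= '<li>' . htmlspecialchars((string) $key, ENT_QUOTES, 'UTF-8')
                   . renderList($value) . '</li>';
        } else {
            $html .= '<li>' . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8') . '</li>';
        }
    }
    return $html . '</ul>';
}

$html = renderList(array(
    'Level 1, item 1',
    'Level 1, item 2' => array(
        'Level 2, item 1',
        'Level 2, item 2' => array('Level 3, item 1'),
    ),
    'Level 1, item 3',
));
```

Because the function closes every tag it opens, the output is always well-formed regardless of how deep the array is nested.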

Conclusion

As you have read, the VsWord class provides two ways to compose DOCX documents: one using the package objects for the document elements, and another using the parser class to convert HTML to the corresponding document objects.

As you have noticed, as the documents grow in complexity, the work that the developer needs to do to compose documents programmatically becomes hard and tedious.

The downside of using HTML to compose the DOCX document is that the parser takes some time and CPU to convert the HTML. However, the gains in productivity for the developer are tremendous, so you may end up always doing it with HTML thanks to this brilliant VsWord class.
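If you want to know how much time the parser actually costs for your documents, you can measure it. The sketch below uses only standard PHP; timeCall() is a hypothetical helper, and the workload shown is a stand-in so the snippet runs without the package. In a real script you would pass a closure that calls $parser->parse($html) instead.

```php
<?php

// Hypothetical helper: runs a callable and returns its result together
// with the elapsed wall-clock time in seconds.
function timeCall(callable $fn): array
{
    $start = microtime(true);
    $result = $fn();
    return array($result, microtime(true) - $start);
}

// Stand-in workload. With VsWord loaded, replace it with something like:
// function () use ($parser, $html) { $parser->parse($html); }
list($result, $seconds) = timeCall(function () {
    return str_repeat('<p>Some more text ...</p>', 1000);
});

printf("Took %.4f seconds\n", $seconds);
```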

If you liked this article or you have questions about composing DOCX documents using this class, post a comment here.




