PHP Classes

File: example.txt

Role: Sample output
Content type: text/plain
Description: usage examples
Class: PHP Split File by Pattern
Split files into chunks divided by pattern strings
Author: Philipp Strazny
Date: 11 years ago
Size: 2,974 bytes
 

Contents

Example for using FilePatternSplitter

max_und_moritz.txt contains a text-only version of a famous German cartoon.
For some purposes, it may be useful to have each chapter in its own file.
Luckily, each chapter heading in this file starts with an underscore, so we issue:

$ php FilePatternSplitter.php split max_und_moritz.txt '/^_/'
./fps00001_max_und_moritz.txt
./fps00002_max_und_moritz.txt
./fps00003_max_und_moritz.txt
./fps00004_max_und_moritz.txt
./fps00005_max_und_moritz.txt
./fps00006_max_und_moritz.txt
./fps00007_max_und_moritz.txt
./fps00008_max_und_moritz.txt
./fps00009_max_und_moritz.txt
./fps00010_max_und_moritz.txt

When we check the first line of each file, we see that the chapters have been put into individual files and separated from the Gutenberg preamble:

$ head -n 1 fps*
==> fps00001_max_und_moritz.txt <==
The Project Gutenberg EBook of Max und Moritz, by Wilhelm Busch

==> fps00002_max_und_moritz.txt <==
_VORWORT._

==> fps00003_max_und_moritz.txt <==
_Erster Streich._

==> fps00004_max_und_moritz.txt <==
_Zweiter Streich._

==> fps00005_max_und_moritz.txt <==
_Dritter Streich._

==> fps00006_max_und_moritz.txt <==
_Vierter Streich._

==> fps00007_max_und_moritz.txt <==
_Fünfter Streich._

==> fps00008_max_und_moritz.txt <==
_Sechster Streich._

==> fps00009_max_und_moritz.txt <==
_Letzter Streich._

==> fps00010_max_und_moritz.txt <==
_SCHLUSS._

However, the last file contains not only the final chapter but also the Gutenberg license.
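To make the splitting rule concrete, here is a minimal sketch of the same idea using awk as a stand-in. It is not how FilePatternSplitter is implemented; it only assumes the behavior visible above, namely that a matching line starts a new chunk and that text before the first match goes into chunk 1. The directory fps_demo and the file sample.txt are hypothetical names chosen for this illustration.

```shell
# Hypothetical stand-in for max_und_moritz.txt: a preamble and two "chapters".
rm -rf fps_demo && mkdir fps_demo
printf '%s\n' 'Preamble line' '_VORWORT._' 'text a' \
    '_Erster Streich._' 'text b' > fps_demo/sample.txt

# Open a new chunk whenever a line matches /^_/. Each line is written to the
# current chunk, so text before the first match ends up in chunk 1.
awk -v dir="fps_demo" -v base="sample.txt" '
    BEGIN { n = 1 }
    /^_/ && NR > 1 { close(out); n++ }
    { out = sprintf("%s/fps%05d_%s", dir, n, base); print > out }
' fps_demo/sample.txt

ls fps_demo
```

The zero-padded counter in the chunk names mirrors the fps00001, fps00002, ... naming shown in the output above, which keeps the chunks in order under a plain glob.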
To separate the license as well, we simply add a second pattern:

$ php FilePatternSplitter.php split max_und_moritz.txt '/^_/' '/^End of the/'
./fps00001_max_und_moritz.txt
./fps00002_max_und_moritz.txt
./fps00003_max_und_moritz.txt
./fps00004_max_und_moritz.txt
./fps00005_max_und_moritz.txt
./fps00006_max_und_moritz.txt
./fps00007_max_und_moritz.txt
./fps00008_max_und_moritz.txt
./fps00009_max_und_moritz.txt
./fps00010_max_und_moritz.txt
./fps00011_max_und_moritz.txt

$ head -n 1 fps*
==> fps00001_max_und_moritz.txt <==
The Project Gutenberg EBook of Max und Moritz, by Wilhelm Busch

==> fps00002_max_und_moritz.txt <==
_VORWORT._

==> fps00003_max_und_moritz.txt <==
_Erster Streich._

==> fps00004_max_und_moritz.txt <==
_Zweiter Streich._

==> fps00005_max_und_moritz.txt <==
_Dritter Streich._

==> fps00006_max_und_moritz.txt <==
_Vierter Streich._

==> fps00007_max_und_moritz.txt <==
_Fünfter Streich._

==> fps00008_max_und_moritz.txt <==
_Sechster Streich._

==> fps00009_max_und_moritz.txt <==
_Letzter Streich._

==> fps00010_max_und_moritz.txt <==
_SCHLUSS._

==> fps00011_max_und_moritz.txt <==
End of the Project Gutenberg EBook of

To verify that the split lost nothing, we can merge the files again:

$ php FilePatternSplitter.php merge .
merged into max_und_moritz.txt.merged

and then check the result against the original:

$ diff -s max_und_moritz.txt max_und_moritz.txt.merged
Files max_und_moritz.txt and max_und_moritz.txt.merged are identical
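The two-pattern split and the merge check can be sketched in the same spirit with standard tools. This is an illustration under stated assumptions, not the tool's own code: awk and cat stand in for the split and merge commands, a line matching either pattern is assumed to start a new chunk, and fps_demo2 and sample.txt are hypothetical names.

```shell
# Hypothetical miniature input: preamble, one chapter, license footer.
rm -rf fps_demo2 && mkdir fps_demo2
printf '%s\n' 'Preamble' '_SCHLUSS._' 'final verse' \
    'End of the sample' 'license text' > fps_demo2/sample.txt

# Two patterns: a line matching either one starts a new chunk.
awk -v dir="fps_demo2" '
    BEGIN { n = 1 }
    (/^_/ || /^End of the/) && NR > 1 { close(out); n++ }
    { out = sprintf("%s/fps%05d_sample.txt", dir, n); print > out }
' fps_demo2/sample.txt

# Zero-padded names sort in chunk order, so the glob concatenates them
# back in sequence; cmp exits 0 only if the result matches the input.
cat fps_demo2/fps*_sample.txt > fps_demo2/sample.txt.merged
cmp fps_demo2/sample.txt fps_demo2/sample.txt.merged && echo "round trip OK"
```

Because every input line is written to exactly one chunk and nothing is dropped at the boundaries, concatenating the chunks in name order reproduces the input, which is the same property the diff -s check above confirms for the real tool.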