Miss Christine Anderson's website has great design & contents: worth your visit!
THE PHP STRING FUNCTIONS BLUES
|
|
A foreword on an inherent php lack of clarity as far as strings are concerned
|
Nearly all of the Php functions meant to manage strings are listed and described here.
Let me say that Php string functions seem especially tailored to drive you insane. I believe I can spare you a few headaches with this rule of thumb: there is a great divide among Php string functions between those meant to handle strings as a whole and those meant to handle the single chars within a string.
The issue gets thorny for two reasons. First, Php documentation rarely states this divide with sufficient clarity. Second, the functions geared to handle isolated chars within an input string (that is, extracting, replacing or locating single chars) are assembled in such a way that, if you pass an entire string to an argument meant to be a single char, they silently extract only the first char of that argument and go on working without apparent errors, treating the whole word you passed as if it were nothing but its first char.
This is bound to multiply your confusion: since the function keeps working, you go on assuming it is operating on the whole word you passed, whilst it stealthily works on its first char only. You may well spend a few sleepless nights before you understand this. Thus far I've found no manual (I bought 6) that stresses this absurd situation in order to save the poor, besieged webmaster some fruitless days.
An instance: the Php function named strrchr takes in two arguments, an input string and a char to be located in it, scanning from the right edge rather than from the beginning, say like a crab. Scanning from the end is not the questionable part: many scripting languages have indexOf and lastIndexOf methods meant to scan for matches either from the beginning or from the end of a string. The questionable part is the silent extraction of a single char from a whole word, without any manual warning about the confusion that inevitably ensues:
strrchr ('hallo world hi', 'hallo')
You may believe the function is to report where the segment 'hallo' can be found in the string 'hallo world hi'.
Wrong: it searches just for the h of 'hallo'. Nonetheless it does not complain, manuals seem reticent, and you end up wasting hours unproductively when a mere clear warning would have been sufficient.
By and large, whenever you see the substring chr in the name of a Php function, be warned that this problem may be in place: the function is meant for chars, yet it won't complain when you pass whole words. You would, though, trying to understand why no error gets generated and yet the expected outcome doesn't show up either. If you are especially unlucky, like myself, you can spend days on it because the testing examples you chose were by chance tricky and deluded you into thinking the chr function was matching the whole string you passed, while it was not.
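The pitfall is easy to reproduce. The lines below are only a minimal sketch (the variable names are mine): the two calls look different but return the same thing, because only the first char of the second argument is ever used.

```php
<?php
$input = 'hallo world hi';

// Looks like a search for the word 'hallo', but only its first char (h)
// is searched for, from the right: the last h is the one in 'hi'.
$withWord = strrchr($input, 'hallo');
print $withWord . "\n"; // prints: hi

// Passing the single char gives exactly the same result.
$withChar = strrchr($input, 'h');
print $withChar . "\n"; // prints: hi
```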
In this file I have tried to bring them to the fore: every section that includes one of them carries a blue label saying char oriented.
Also, as far as nomenclatures are concerned, you may want to keep in mind that wherever in the name of a Php function you find two letters r one after the other, chances are that the second r stands for reversed; well ok, take it this way: at least it saves typing time.
Of course, there are a few functions (like strrpos and strtr) that contain no chr in their name and are nonetheless char oriented.
I just guess otherwise it would have been too simple. What do you think?
PHP STRING FUNCTIONS
|
|
PHP Built in String management functions
|
 |
 |
chr & ord |
 |
 used |
ARGUMENTS |
RETURNS |
chr( number )
ord( stringChar ) |
chr: string ord: number |
 |
USAGE |
 |
chr, given a number, returns the ASCII character corresponding to that number.
ord, given a character in between quotes, returns the corresponding ASCII number.
print( ord('A') );
//prints: 65
print( chr(65) );
//prints: A
|
 |
 |
is_string & strlen & str_repeat & strrev |
 |
 commonly used |
ARGUMENTS |
RETURNS |
is_string( argument )
strlen( string )
str_repeat(string, number)
strrev(string) |
is_string: boolean
strlen: number
str_repeat: string
strrev: string
|
 |
USAGE |
 |
is_string is a simple built in method to assess whether the passed argument you're dealing with is a String or not.
strlen just returns the length of a string (the amount of chars in it, that is).
str_repeat takes in a string as first argument and a number as its second argument: it returns a string where the input string is repeated for number times:
print str_repeat('hallo ', 3);
//prints: hallo hallo hallo
The second argument is mandatory, even if zero. If zero it prints nothing.
strrev takes in a string and reverses both its words and letters (original not affected):
$string='hallo my friend';
print strrev($string);
//prints: dneirf ym ollah
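Since is_string and strlen got no example above, here is a small sketch of the two working together; the sample values are made up for illustration:

```php
<?php
// A mixed bag of values: only two of them are real strings.
$candidates = array('hallo', 42, '42', null);
$report = array();

foreach ($candidates as $value) {
    // strlen is only called once is_string has confirmed the type
    $report[] = is_string($value)
        ? 'string of length ' . strlen($value)
        : 'not a string';
}

print implode("\n", $report) . "\n";
// prints:
// string of length 5
// not a string
// string of length 2
// not a string
```

Note that the number 42 and the string '42' are told apart: is_string looks at the data type, not at the look of the value.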
|
 |
 |
chop & trim & ltrim & rtrim |
 |
 commonly used |
ARGUMENTS |
RETURNS |
chop( string )
trim( string ) ltrim( string ); rtrim( string ) |
string |
 |
USAGE |
 |
These functions remove leading or trailing white spaces, and other non-printable characters, from a string.
I assume you are aware that a string may contain not only white spaces but also new line, carriage return and tabulation chars (at times all of these are subsumed under the name white spaces, meaning the whole of the non-printable characters), and that you may want to remove them as well when they sit not within a string but on its edges.
An instance: a user types a username and by "mistake" adds a white space. Before your php script checks such username against the table of the valid usernames, you may want to remove these spurious chars from the beginning or the end of the user input. Remember that "hallo" and "hallo " with a trailing white space are the same to human eyes but not to a computer, which deals with the white space as an added char, not unlike, say, a letter s added at the end: something that entirely changes the text. As I said elsewhere, when scripting, almost a match means no match at all.
- chop (& rtrim): removes white spaces, new lines, tabs and carriage return chars from the end of a string. Original not affected.
$string=' hallo world ' ;
print 'string length BEFORE operation: '.strlen($string).'<br>';
//prints: string length BEFORE operation: 13
$curtail=chop($string);
print '<br>string length AFTER operation: '.strlen($curtail);
//prints: string length AFTER operation: 12
- trim: removes white spaces, new lines, tabs and carriage return chars from the end and the beginning of a string. Original not affected.
$string=' hallo world ' ;
print 'string length BEFORE operation: '.strlen($string).'<br>';
//prints: string length BEFORE operation: 13
$curtail=trim($string);
print '<br>string length AFTER operation: '.strlen($curtail);
//prints: string length AFTER operation: 11
- ltrim: removes white spaces, new lines, tabs and carriage return chars from the beginning of a string. Original not affected.
$string=' hallo world ' ;
print 'string length BEFORE operation: '.strlen($string).'<br>';
//prints: string length BEFORE operation: 13
$curtail=ltrim($string);
print '<br>string length AFTER operation: '.strlen($curtail);
//prints: string length AFTER operation: 12
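The username scenario described above can be sketched as follows. The list of valid usernames and the typed input are hypothetical, made up just for the illustration:

```php
<?php
// Hypothetical table of valid usernames (illustration only)
$validUsernames = array('anna', 'marco');

// What the user typed: a stray trailing space and a new line sneaked in
$typed = "anna \n";

// Without trimming, almost a match means no match at all
$rawMatch = in_array($typed, $validUsernames, true);

// After trimming both edges the comparison succeeds
$cleanMatch = in_array(trim($typed), $validUsernames, true);

print $rawMatch ? "raw input: match\n" : "raw input: no match\n";
print $cleanMatch ? "trimmed input: match\n" : "trimmed input: no match\n";
// prints:
// raw input: no match
// trimmed input: match
```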
|
 |
 |
explode & implode |
 |
 commonly used |
ARGUMENTS |
RETURNS |
explode( separator, string )
implode( separator, array) |
explode: array implode: string |
 |
USAGE |
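No usage text survives for this pair, so take the following as a minimal sketch of mine rather than the original explanation: explode cuts a string into an array wherever the separator occurs, and implode does the opposite, gluing the entries of an array into one string with the separator in between. The sample data is made up.

```php
<?php
$csv = 'red,green,blue';

// explode: separator first, then the string; returns an array
$colours = explode(',', $csv);
print count($colours) . "\n"; // prints: 3
print $colours[1] . "\n";     // prints: green

// implode: separator first, then the array; returns a string
$joined = implode(' | ', $colours);
print $joined . "\n";         // prints: red | green | blue
```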
 |
|
 |
 |
substr & substr_count |
 |
 commonly used, they find or count |
ARGUMENTS |
RETURNS |
substr( string, startNum, lengthNum )
substr_count( string, subString ) |
substr: string
substr_count: number |
 |
USAGE |
 |
substr_count returns a number representing the amount of times the second argument substring is found within first argument string. That simple.
substr returns a segment (a substring, in fact) of the first argument string. The segment starts at the offset index given by the second argument and proceeds for as many chars as dictated by the third argument. The second argument is mandatory: if you omit it, Php strangely enough does not assume a default of zero (the onset of the input string), presumably because asking for the whole string as a substring makes little sense; but you see, I prefer code that includes chances rather than excluding them. The third argument is optional: if you omit it, Php assumes you mean up to the end of the string. Cool.
You can use as a startNum argument a negative number, and thus the function will start from the tail of the string: -1 being the last char index, -2 being the index of the char before the last one, and so on.
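A few calls make the positive and negative offsets easier to picture; this is just a sketch with a sample string of my choosing:

```php
<?php
$string = 'hallo world';

$fromSix  = substr($string, 6);     // from index 6 to the end
$tail     = substr($string, -5);    // the last five chars
$lastChar = substr($string, -1);    // the last char only
$middle   = substr($string, -5, 3); // start five from the end, take three

print "$fromSix $tail $lastChar $middle\n";
// prints: world world d wor
```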
The difference between strstr and substr is that the former takes in a string and returns the sub string starting from its first occurrence, whereas substr takes in a numerical index and returns the sub string starting from that index.
Basically, calling strstr with a substring is the same as getting the numerical index of that substring by strpos and passing the result to substr:
$string='hallo world hallo';
print substr($string, 0);
print substr($string, strpos($string, 'hallo'));
print strstr($string, 'hallo');
//all the above print: hallo world hallo
In javascript a built in method named substring takes in two numbers: the former is the starting offset of the substring (inclusive) and the latter is the ending offset of the substring (exclusive).
Now, if you are more familiar with the javascript implementation, all you have to keep in mind is that the 3rd argument of the php substr function (mirrored in javascript by the second argument of substring) is a length, not an index. To convert indexes into a length, subtract the starting index from the exclusive ending index. Imagine you want the substring from index 4 up to and including index 12; in javascript it would be:
String.substring(4,12+1);
Its php equivalent is:
substr($String, 4, 12-4+1);
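To double check the arithmetic, here is a sketch of a hypothetical helper (js_substring is my own name, not a Php built in) that takes javascript style indexes and hands substr the matching length. It covers only the plain case of two non negative indexes in ascending order; javascript's substring also swaps reversed arguments, which this sketch does not attempt.

```php
<?php
// Hypothetical helper: $start is inclusive, $end is exclusive,
// as in javascript's substring(start, end).
function js_substring($string, $start, $end) {
    // length = exclusive end index minus start index
    return substr($string, $start, $end - $start);
}

$string = 'abcdefghijklmnopqrstuvwxyz';

// From index 4 (e) up to and including index 12 (m): nine chars
$viaHelper = js_substring($string, 4, 12 + 1);
$viaSubstr = substr($string, 4, 12 - 4 + 1);

print $viaHelper . "\n"; // prints: efghijklm
print $viaSubstr . "\n"; // prints: efghijklm
```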
|
 |
 |
strpos & stripos & strstr & stristr & strchr & strrpos & strripos & strrchr |
 |
 commonly used, they find positions
strrpos, strrchr: Php 4 char oriented |
ARGUMENTS |
RETURNS |
strpos(string, string, num)
stripos(string, string, num)
strstr(string, string)
stristr(string, string)
strchr(string, string)
strrpos(string, string)
strripos(string, string)
strrchr(string, string) |
strpos, stripos, strrpos, strripos: number | boolean: false
strstr, stristr, strchr, strrchr: string | boolean: false |
 |
USAGE |
 |
You provide these functions with a string to be searched as first argument and a string to be located as second argument. If the latter is included in the former, the pos functions return a number representing the position offset where the first letter of the string to be found is located within the scanned string, while the str and chr functions return the portion of the scanned string starting from there.
- strpos locates (by returning a numerical index) the first instance position
$string='hallo world hallo' ;
print strpos($string,'hallo');
//prints: 0
- stripos is the case insensitive version of strpos.
- strstr: is a conceptual integration of strpos above: whereas strpos locates a match and returns the number of its index offset, strstr still locates the first match but doesn't return the number but the whole string from there:
$string='hallo world hallo' ;
print strstr($string,'hallo');
//prints: hallo world hallo
The difference between strstr and substr is that the former takes in a string and returns the sub string starting from its first occurrence, whereas substr takes in a numerical index and returns the sub string starting from that index.
- stristr is the same as strstr, but this round case insensitive.
- strchr: apparently this is exactly an alias of strstr; at least so also Leon Atkinson reports. Why it has been named with the chr suffix is beyond my comprehension: the function indeed seems not to be char oriented:
$string='hallo world hallo';
var_dump(strchr($string,'hxallo'));
//prints: bool(false)
Returning false means it didn't limit itself to locating the first char of the argument "hxallo", namely a mere h, but searched for the whole word, which obviously was not present in the input string.
- strrpos locates (by returning a numerical index) the last instance position
$string='hallo world hallo' ;
print strrpos($string,'hallo');
//prints: 12
$string='hallo' ;
print strrpos($string,'ovest');
/*in Php 4 prints: 4, which is absurd: implies ovest is inside hallo...*/
The only case in which you may have found strrpos useful up to and including Php 4 was a search for a single char that most of the time concludes the strings you're dealing with, like the forward slash that often concludes a web address.
Conversely, strpos is enormously useful.
Of course, a problem arises up to Php 4: if you want(ed) the numerical index of a whole substring occurrence (and not just of a char, that is) searching from the rightmost edge of an input string, up to Php 5 you miss(ed) an adequate function to do that. I have crafted one that does that.
Following the naming practice of Php I called it strposr. It takes in three arguments: the input string, the string you search for, and an optional third argument which, if passed, makes the search case sensitive (the default being case insensitive). It returns the numerical index of the last instance found, or boolean false if none is found (to check whether the outcome is false see gettype below).
Starting with Php 5 strrpos searches also for whole strings.
- strripos is the case insensitive version of strrpos, with the same inconveniences as strrpos till Php 4 included. From Php 5 these limitations have been finally eliminated.
- strrchr locates the last instance position and returns the string from there (so it is similar to strstr):
$string='hallo world hallo';
print strrchr($string, 'r');
//prints: rld hallo
Therefore strrchr is not just an alias of strrpos as some manuals report: as you see strrchr returns a string, whereas strrpos returns a number! Also, strrchr's second argument, like strrpos's, does not match the whole passed argument but just the first char present in such argument, and instead of reporting just its index like strrpos does, it reports the whole string from there:
$string='hallo guys';
print strrchr($string,'ovest');
//prints: o guys
as you see it prints onward from the last occurrence of the first letter of 'ovest' found, namely reports a string (not a number) and grabs it from a searched string (ovest) which was nowhere in the input meant to be searched upon.
In case of no match all these functions return a boolean, which is: false. I do strongly recommend that, to verify whether an unsuccessful match occurred, you do not check whether
if(strpos($string, 'hallo')==0)
because integer zero would be assumed equivalent to boolean false by Php (zero and false are considered synonymous by many scripting languages when loosely compared), but at the same time zero could even be the returned offset of a... successful match! In fact zero would be the returned number in case a successful match got detected at the very onset of the $string.
Therefore to verify whether strpos or strrpos have returned a successful match, use the php function gettype() and verify whether
if( gettype( strpos($string, 'not there') ) == 'boolean'){
print "no match";
}
Write boolean within quotes: gettype returns the string 'boolean'. Older Php versions also accepted boolean as a bareword (no quotes and no leading dollar sign, that is) by silently treating it as a string, but recent versions reject it.
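An alternative to gettype, and I believe the more common idiom, is the identity operator === (and its negation !==), which compares the data type as well as the value, so that integer zero and boolean false are no longer mixed up. A minimal sketch, with variable names of my own:

```php
<?php
$string = 'hallo world hallo';

$atStart = strpos($string, 'hallo');     // integer 0: a successful match
$missing = strpos($string, 'not there'); // boolean false: no match

// Loose comparison wrongly reads offset 0 as a failure
$looseSaysMissing = ($atStart == false);   // true, which is misleading

// Strict comparison tells 0 and false apart
$strictSaysMissing = ($atStart === false); // false, which is correct

if ($missing === false) {
    print "no match\n";
}
if ($atStart !== false) {
    print "match at offset $atStart\n";
}
// prints:
// no match
// match at offset 0
```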
These functions are case sensitive so if you're concerned about a case insensitive search you'd first turn both strings either to lowercase or to uppercase [strtolower or strtoupper].
The third strpos argument is an optional number (note that the strstr function, although an integration to strpos, lacks this feature): you can use it to set the offset of the input string from which you want to start your search, in case you don't want to start from the very beginning.
Such number may also be used to iterate through a text and find all the occurrences (though a regular expression might still be best suited in this case):
$string='hallo world hallo' ;
$searched='hallo';
$offsets=array();
$next=0;
while(1){
$entry = strpos(
$string,
$searched,
$next+$offsets[sizeof($offsets)-1] /*juggling*/
);
if(gettype($entry)=='boolean'){break;}
$next=strlen($searched);
$offsets[]=$entry;
}
print implode(",",$offsets);
//prints: 0,12
The code above initializes an array named $offsets and returns all the offsets of all the found instances as an array of numbers.
TECH NOTE: when I declare:
$next+$offsets[ sizeof($offsets)-1 ]
on the first round I grab an entry of the $offsets array which does not exist (-1): the returned value of that invocation is therefore null. How can I then sum a number ($next) with a NULL without breaking the script?
By exploiting the so called juggling of the data types of Php: Php can deduce from the operator (in our case: +) that the operation you want to perform is a sum, and it converts the incompatible data type to the least harmful among the compatible ones, which in most numerical cases is zero (a zero is far from harmless in multiplications or divisions, so be careful with jugglings! Envision first what the return of your juggling is to be according to your variables). Be aware that Php still reports a notice (a warning in recent versions) about the missing array entry.
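If you'd rather not lean on juggling at all, the same search can be written so that no missing array entry is ever read. This is only a sketch under my own assumptions: find_all_offsets is a hypothetical helper name, and matches are taken as non overlapping, as in the loop above.

```php
<?php
// Hypothetical helper: returns the offsets of every (non overlapping)
// occurrence of $needle inside $haystack, as an array of numbers.
function find_all_offsets($haystack, $needle) {
    $offsets = array();
    if ($needle === '') {
        return $offsets; // nothing sensible to search for
    }
    $from = 0;
    // strict !== keeps a match at offset 0 from being read as a failure
    while (($found = strpos($haystack, $needle, $from)) !== false) {
        $offsets[] = $found;
        $from = $found + strlen($needle); // resume right after the match
    }
    return $offsets;
}

$all = find_all_offsets('hallo world hallo', 'hallo');
print implode(',', $all) . "\n";
// prints: 0,12
```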
Searching for a match using these functions (strpos chiefly, for I told you what big problems the strrpos function may entail) takes less time, in case of long inputs, than using regular expressions. Of course, as far as generality of the searched text is concerned, these functions have not in the least the flexibility of regular expressions: by these functions you can search for a fully determined text, whereas by regular expressions you can search for a pattern of text, regardless of its specific embodiments (trivial instance: search for any word of 7 letters that starts with an h). Therefore the "unsaid" I think many manuals mean when they suggest to use these functions rather than regular expressions is this: use regular expressions only when you're searching for a pattern.
|
 |
 |
strtolower & strtoupper & ucfirst & ucwords |
 |
 commonly used |
ARGUMENTS |
RETURNS |
strtolower(string) strtoupper(string) ucfirst(string) ucwords(string)
|
string |
 |
USAGE |
 |
- strtolower converts a whole string to lowercase.
- strtoupper converts a whole string to uppercase.
- ucfirst converts the first letter of the string to upper case (only the very first one, not the first of each new line).
- ucwords converts the first letter of each word (regardless of whether at the beginning of a new line or not) to uppercase.
Originals not affected:
$string='hallo world hallo';
print ucwords ($string);
//prints: Hallo World Hallo
print $string;
//still prints: hallo world hallo
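The difference between ucfirst and ucwords shows best on a string that spans two lines; a small sketch with a sample string of mine:

```php
<?php
$string = "hallo world\nhallo folks";

// ucfirst touches only the very first char of the whole string
$first = ucfirst($string);

// ucwords capitalizes every word; a new line separates words too
$words = ucwords($string);

print $first . "\n";
// prints:
// Hallo world
// hallo folks
print $words . "\n";
// prints:
// Hallo World
// Hallo Folks
```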
|
 |
 |
strcmp & strcasecmp & strncmp & strncasecmp & strnatcmp & strnatcasecmp |
 |
 commonly used |
ARGUMENTS |
RETURNS |
strcmp(string1, string2) strcasecmp(string1, string2)
strncmp(string1, string2, num)
strncasecmp(string1, string2, num)
strnatcmp(string1, string2) strnatcasecmp(string1, string2)
|
number |
 |
USAGE |
 |
strcmp is rarely explained in a smooth way.
It takes in two mandatory arguments, both strings, and compares them. If they are identical it returns zero. This is the ground upon which you will rely upon strcmp most of the time.
Manuals spend a lot of time explaining to you what is returned when the strings are not identical. What matters is that it is not zero: it is a number less than zero when the first string sorts before the second and a number greater than zero when it sorts after it, the strings being compared char by char from the left; in fact:
print strcmp('hallo', 'hallo');
//prints: 0
//but now see:
print strcmp('allowed', 'hallo');
//prints: -1
print strcmp('wwwwwww', 'hallo');
//prints: 1
Note that the sign has nothing to do with the length of the strings: above both 'wwwwwww' and 'allowed' have a length of 7 chars, but compared against the same string 'hallo' they yield two different numbers, because a comes before h whereas w comes after it. The exact number other than zero may vary across Php versions, so rely on its sign only.
Also, note that to verify whether two strings are identical the direct comparison is simpler:
if('hallo'=='hallo'){print 0;};
//prints: 0
strcasecmp is the case insensitive version of strcmp:
print strcmp('hallo', 'Hallo');
//prints: 1 (or another number greater than zero)
print strcasecmp('hallo', 'Hallo');
//prints: 0
strncmp is nearly the same as strcmp: it differs in this, that it takes in a third argument which must be a number, and upon the two strings passed for the comparison it checks only as many characters from the beginning of the strings as the number passed as third parameter dictates:
print strncmp('hallo', 'haww',2);
//prints: 0
Printing zero, it means that as far as the leading two chars are concerned, the strings are identical.
strncasecmp is the case insensitive version of strncmp.
As for strnatcmp, it compares two strings following the so called natural order (check on the file about natsort what the issue about natural comparisons is).
Therefore if you're dealing with a natural comparison issue, you can pass strnatcmp as argument to a sorting function so that the sorting process follows a natural compare logic (to understand what is meant you'd first read what a custom sorting process in php is).
strnatcasecmp is the case insensitive version of strnatcmp. Perfect for custom sorting of arrays [usort].
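A sketch of the usort pairing just mentioned, with made up file names: the same array is sorted once with strcmp and once with strnatcmp, so you can see what the natural order buys you.

```php
<?php
$files = array('img12.png', 'img10.png', 'img2.png', 'img1.png');

// Plain comparison: strings are compared char by char, so 10 sorts before 2
$plain = $files;
usort($plain, 'strcmp');

// Natural comparison: runs of digits are compared as numbers
$natural = $files;
usort($natural, 'strnatcmp');

print implode(', ', $plain) . "\n";
// prints: img1.png, img10.png, img12.png, img2.png
print implode(', ', $natural) . "\n";
// prints: img1.png, img2.png, img10.png, img12.png
```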
|
 |
 |
strspn & strcspn & similar_text |
 |
 rarely used, they find
char oriented |
ARGUMENTS |
RETURNS |
strspn( string, string)
strcspn( string, string)
similar_text(string, string) |
number |
 |
USAGE |
 |
strspn takes as arguments 2 strings. The second string is actually considered as a collection of chars. Starting from the beginning of the first string, the function counts how many consecutive chars belong to that collection, and stops at the first char that does not. If the very first char is not in the collection it returns zero.
In other words, the function returns the length of the initial segment of the first argument string which is made up only of chars listed in the second argument.
Since this function performs such comparison only with the beginning of the first string, I never saw its utility:
print strspn('jim hallo','jki');
//prints: 2
//but:
print strspn('hallo jim','jki');
//prints: 0
It means that at the beginning of the string there were two letters (j,i) which were listed in 'jki'.
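strcspn is the complement: it returns the length of the initial segment that contains none of the listed chars, which amounts to the index position of the first listed char found; for 'hallo jim' it would return 6. Slightly more flexible than strspn, because in both the cases above strcspn returns a usable index position (0 in the first case, 6 in the second) whereas strspn in the second case just returns zero.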
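Side by side the two functions look like this. The first four calls reuse the strings of the examples above; the last one is a use case of my own (counting leading digits), added only to show a situation where strspn earns its keep:

```php
<?php
$spanFirst   = strspn('jim hallo', 'jki');  // j and i are listed, m is not
$spanSecond  = strspn('hallo jim', 'jki');  // h is not listed: stop at once
$cspanFirst  = strcspn('jim hallo', 'jki'); // a listed char sits at index 0
$cspanSecond = strcspn('hallo jim', 'jki'); // first listed char is the j at 6

print "$spanFirst $spanSecond $cspanFirst $cspanSecond\n";
// prints: 2 0 0 6

// How many digits open the string?
$leadingDigits = strspn('2024-report', '0123456789');
print $leadingDigits . "\n";
// prints: 4
```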
similar_text returns a number which is the amount of chars the two string arguments have in common. It does not simply count shared letters, though: it looks for the longest substring the two strings share, counts its chars, and then repeats the search on the pieces left and right of it, so the order of the chars matters (for instance 'ab' and 'ba' yield 1, not 2). It is case sensitive.
print similar_text('www.google.com','www.altavista.com');
//prints: 9
in fact we have:
w in common: 3
c in common: 1
o in common: 1
m in common: 1
dots in common: 2
l in common: 1
____________
=9
|
 |
 |
strtr |
 |
 used, it replaces
char oriented |
ARGUMENTS |
RETURNS |
strtr( string, string, string ) |
string |
 |
USAGE |
 |
strtr takes in a string and replaces within it the chars listed in the second argument with the corresponding chars of the third argument. If no occurrences are found, it returns the original string.
There is something very subtle to keep in mind:
both the second and third argument are not considered as strings actually but as collections of chars. The function does not look for that sequence of chars as a whole: it substitutes each char of the second argument wherever it can be located in the input string, in whatever order, and replaces each occurrence with the char at the same position in the third argument:
$string='hallo world hallo';
print strtr($string, 'hw','HW');
//prints: Hallo World Hallo
As you may have noted, it didn't search for "hw" but for separate instances of "h" and "w", and replaced them with the items in the third argument, assuming the first char of the third argument has to replace the first char of the second, the second char of the third argument has to replace the second char of the second argument, and so on...
If the second argument has fewer chars than the third, no problem: the extra chars of the third are ignored. If the second argument includes more chars than the third, only those with a corresponding entry in the third argument are replaced:
$string='hallo world hallo';
print strtr($string, 'hwl','HW'); //prints: Hallo World Hallo
As you see the second argument would ask for replacement of the lowercase L too, but since it missed a corresponding entry in the third argument which included just 2 chars, the Ls have not been replaced.
Alternatively, the function accepts only two parameters: the input string and an array. Within the latter, the keys found in the input string are swapped with the corresponding values. This is a valuable exception: in this fashion strtr is no longer char oriented and swaps whole combinations of chars too, example:
print strtr("hallo>", array(">"=>">", "ll"=>"LL"));
//prints: haLLo>
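To put the two modes next to each other, here is a sketch with sample strings of my own: the three argument form swaps single chars, the array form swaps whole words.

```php
<?php
$string = 'hi all';

// Char oriented: every a becomes e, every i becomes o
$byChars = strtr($string, 'ai', 'eo');

// Array form: whole keys are swapped with their values
$byWords = strtr($string, array('hi' => 'bye', 'all' => 'folks'));

print $byChars . "\n"; // prints: ho ell
print $byWords . "\n"; // prints: bye folks
```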
|
 |
 |
chunk_split & wordwrap |
 |
 used, they split |
ARGUMENTS |
RETURNS |
chunk_split( string, num, string )
wordwrap(string, num, string) |
string |
 |
USAGE |
 |
chunk_split takes in the input string and, optionally, a second parameter which has to be a number higher than zero (an error ensues otherwise); if it is not passed, it defaults to 76.
The function divides (splits) the input string into sub segments whose length amounts to the number passed as second argument; after each segment it appends a carriage return plus new line by default, or the third argument if you pass it (and it should be a string).
Of course if the last run of chars is shorter than the given length (that is, the length of the string is not perfectly divided by the second argument), it just gets added as it is, followed by the separator too.
$string='hallo';
print chunk_split($string, 4,'X');
//prints: hallXoX
$string='hallo world hallo';
print chunk_split($string, 3,'X');
//prints: halXlo XworXld XhalXloX
Note that original white spaces already present in the input string are correctly just considered and handled as elements and chars like any other, thus entering in the amount of "splittable" chars.
wordwrap wraps a string to the given number of characters (passed as second argument) using a string break character (the third argument).
The default value for the second argument is 75, and the third argument default is a new line. Beware that hidden chars such as already present new lines or carriage returns are counted in as valid chars, though invisible to human eyes.
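The difference between the two is that chunk_split cuts blindly every so many chars, whereas wordwrap breaks only between words. A small sketch on sample strings of mine:

```php
<?php
// wordwrap: lines of at most 10 chars, broken at the spaces
$wrapped = wordwrap('The quick brown fox', 10, "\n");
print $wrapped . "\n";
// prints:
// The quick
// brown fox

// chunk_split: a separator after every 4 chars, words or not
$chunked = chunk_split('hallo world', 4, '|');
print $chunked . "\n";
// prints: hall|o wo|rld|
```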
|
 |
 |
preg_match & preg_match_all & preg_split & split & spliti & preg_replace & str_replace & str_ireplace & substr_replace & wordwrap |
 |
 commonly used |
ARGUMENTS |
RETURNS |
preg_match( STRINGEDregExp, input, array )
preg_match_all( STRINGEDregExp, input, array, number )
preg_split( STRINGEDregExp, input, number )
split(splitter, input)
spliti(splitter, input)
preg_replace( STRINGEDregExp, string, input )
str_replace(find, replacement, input)
str_ireplace(find, replacement, input)
substr_replace(input, replacement, number, number)
wordwrap(input, number, string)
|
preg_match: integer
preg_match_all: number
preg_split: array
split: array
spliti: array
preg_replace: string
str_replace: string
substr_replace: string
wordwrap: string |
 |
USAGE |
 |
You'd use these functions to find matches or splitting or replacing items only when you don't know the precise embodiments of a string pattern: that is, you know for instances that you are looking for, say, sets of two subsequent numbers and of course you don't know which one they may be exactly. Of course, nothing would prevent you from using regular expressions for finding precise, exact words, but since regular expressions are really time consuming, you'd use them only for patterns, and whenever you know the exact word use one of the alternatives that accomplish the same task without resorting to regular expressions.
I assume you're already a bit familiar with regular expressions.
The regular expressions passed to these php functions must always be passed as strings (in between quotes, namely), including the forward slashes that delimit them as well as whatever backward slashes they contain. That is, if your regular expression would be:
/\d\d/
namely searching for two subsequent digits, nest it in between quotes:
'/\d\d/'
Do not add the global g modifier to your regular expression. Adding the case insensitive modifier i appears conversely safe.
- preg_match:
$string='phone number is 55 55 56 country code is 01';
print preg_match('/\d\d/', $string);
//prints: 1 (would have printed 0 if no matches)
This function has absurd, convoluted ways to report the matches. That is, to know which the actual matches are you initialize within it a third argument:
preg_match('/\d\d/', $string, $matches);
print gettype($matches);
//prints: array
Now the function still returns an integer, but you can also scan $matches (or whatever other name you may have given to such variable):
$string='phone number is 55 55 56 country code is 01';
preg_match('/\d\d/', $string, $matches);
print($matches[0]);
//prints: 55
I'd suggest you resort to the following function rather than to preg_match. Use preg_match only to check whether at least one instance of your pattern is present, and for nothing else.
- preg_match_all:
$string='phone number is 55 55 56 country code is 01';
print preg_match_all('/\d\d/', $string, $matches);
//prints: 4
Namely it returns the amount of matches. The third argument, which you can initialize right on the spot and with whatever name you prefer as long as it does not clash with another variable in the same scope, must apparently always be included or an error may follow.
So to know also which the matches are, you take avail of this third argument: it is going to become an array storing the instances:
$string='phone number is 55 55 56 country code is 01';
preg_match_all('/\d\d/', $string, $matches);
Now $matches is available to be scanned. Its shape is as follows:
$matches[0]: it is an array of all the instances found:
print_r($matches[0]);
//prints: Array ( [0] => 55 [1] => 55 [2] => 56 [3] => 01 )
If you perform a capture of the elements by parenthesis (to know what capturing is see the file on regular expressions; it is geared on javascript but the concepts are nearly identical) then $matches would store, as a subsequent set of indexes from 1 onward, the captured instances found for each parenthesized element as arrays (whereas index [0] would still report the matches to the whole pattern, regardless of the parenthesis meant to instruct the captures) or, to use the terms used at php.net:
Orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized [=captured, my note] subpattern, and so on.
Example:
$string=$_SERVER['HTTP_USER_AGENT'];
preg_match_all('/(.+)([0-9])\.([0-9]+)/',$string,$matches);
//full match:
print_r($matches[0]); print '<br>';
//browser:
print_r($matches[1]); print '<br>';
//browser version:
print_r($matches[2]); print '<br>';
//browser subversion:
print_r($matches[3]);
On my Internet Explorer this prints:
Array ( [0] => Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0 )
Array ( [0] => Mozilla/4.0 (compatible; MSIE 6.0; Windows NT )
Array ( [0] => 5 )
Array ( [0] => 0 )
namely each of those entries is an array; consequently, for instance:
print($matches[2][0]);
//just prints: 5
The fourth argument, if passed, changes the layout of the returned matrix $matches: PREG_PATTERN_ORDER (the default, described above) groups the results by parenthesis, whereas PREG_SET_ORDER groups them by match, so that $matches[0] holds the first match with all its captures, $matches[1] the second, and so on.
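Here is a minimal sketch of the two layouts side by side. The variable names $byPattern and $bySet are mine, picked just for illustration, and the pattern is the two-digit one used above with each digit captured on its own.

```php
<?php
// Sketch: the same pattern read out under both orderings of $matches.
$string = 'phone number is 55 55 56 country code is 01';

// Default ordering (PREG_PATTERN_ORDER): one sub-array per parenthesis.
$count = preg_match_all('/(\d)(\d)/', $string, $byPattern);

// PREG_SET_ORDER: one sub-array per match (full match first, then captures).
preg_match_all('/(\d)(\d)/', $string, $bySet, PREG_SET_ORDER);

print $count . "\n";                       // how many matches were found
print implode(',', $byPattern[0]) . "\n";  // every full match
print implode(',', $bySet[3]) . "\n";      // fourth match with its two captures
```

Which layout is handier depends on the loop you mean to write: by parenthesis if you want all the values of one capture, by match if you want to walk the matches one at a time.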
- preg_split:
$string='hallo world hallo folks';
$splitted=preg_split('//',$string);
print_r($splitted);
This returns an array in which each element is one of the chars composing the string, white spaces included, plus one empty entry at each end; thus:
print(sizeof($splitted));
//prints: 25
print($splitted[1]); print($splitted[5]);
//prints: ho
Note that the empty pattern also matches before the first char and after the last one, therefore
$splitted[0]
prints an empty slot: it is the empty string that precedes the first char. Ok?
$string='hallo world hallo folks';
$splitted=preg_split('/hallo/',$string);
print_r($splitted);
//prints: Array ( [0] => [1] => world [2] => folks )
The notation [0] => means that entry [0] is an empty string: the input begins with the splitter, so nothing precedes the first split.
Of course, remember that as usual the splitter is omitted from the output!
Anyway, preg_split allows two further arguments:
- third argument: a number that sets how many splinters you want at most (that is, it sets a limit to the output). Passing -1 (or 0) is the same as saying: all the possible ones.
Numerical values exceeding the number of splinters that can be produced simply yield all of them.
- fourth argument: a constant, that is a bareword not within quotes. If such constant is PREG_SPLIT_NO_EMPTY, the returned output excludes every empty entry.
So in such case the [0] index won't be empty. Yeah, somewhat convoluted, I agree. Anyway, example:
$string='hallo world hallo folks';
$splitted=preg_split('/hallo/', $string, -1, PREG_SPLIT_NO_EMPTY);
print_r($splitted);
//prints: Array ( [0] => world [1] => folks )
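To see the third and fourth arguments at work, here is a small sketch under my own assumptions: the patterns and the names $firstTwo and $clean are just illustrative picks, not something the manual prescribes.

```php
<?php
$string = 'hallo world hallo folks';

// Limit of 2: split at the first run of white space only;
// the second splinter keeps the rest of the string untouched.
$firstTwo = preg_split('/\s+/', $string, 2);

// No limit (-1) and empty entries dropped.
$clean = preg_split('/hallo\s*/', $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($firstTwo);
print_r($clean);
```

Note that when the limit is reached the last splinter holds whatever is left of the string, splitters included.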
- split:
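This function splits by a POSIX regular expression (not by a Perl-style one as preg_split does); a plain known string works as a pattern too, as long as it holds no special chars. Splits globally by default (that is, splits by all instances found). Returns an array. Beware: split was deprecated in Php 5.3 and removed in Php 7; to split by a known string without resorting to regular expressions use explode, otherwise use preg_split.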
$string='hallo jim hallo mike';
$splitted=split('hallo',$string);
print_r($splitted);
//prints: Array ( [0] => [1] => jim [2] => mike )
- spliti:
case insensitive version of split (see above), deprecated and removed along with it.
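Since split and spliti are gone from Php 7 onward, this is a sketch of how the same two jobs might be done today. The input string (with one capitalized Hallo) and the variable names are assumptions of mine for the sake of the example.

```php
<?php
$string = 'hallo jim Hallo mike';

// Plain-string splitting: explode() needs no regular expression.
$byString = explode('hallo', $string);

// Case-insensitive splitting: the i modifier of preg_split
// stands in for what spliti() used to do.
$anyCase = preg_split('/hallo/i', $string);

print_r($byString);
print_r($anyCase);
```

explode is case sensitive, so it leaves the capitalized Hallo alone; the preg_split call catches both.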
- preg_replace:
$string='hallo jim hallo mike';
$replaced=preg_replace('/\w{5}/','',$string);
print($replaced);
//prints: jim mike (the white spaces that surrounded the removed words are still there)
The expression above '/\w{5}/' searches for runs of 5 word chars (in our example only 'hallo' fits the pattern) and replaces them, in our example, just with an empty string: ''.
Note that preg_replace needs three arguments: the regular expression, the replacement, and the input string last.
The original $string is not affected, a new one is produced.
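A short sketch of two things preg_replace can do beyond blanking a match: reusing a captured piece in the replacement, and capping the number of replacements with an optional fourth argument. Patterns and variable names are my own illustrative choices.

```php
<?php
$string = 'hallo jim hallo mike';

// $1 in the replacement stands for what the first parenthesis captured.
$replaced = preg_replace('/hallo (\w+)/', 'goodbye $1', $string);

// Optional fourth argument: replace at most this many matches.
$onlyFirst = preg_replace('/hallo/', 'goodbye', $string, 1);

print $replaced . "\n";
print $onlyFirst . "\n";
print $string . "\n";   // the input is left as it was
```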
- str_replace:
Use this function when you want to replace a known string; that is, without resorting to regular expressions. Replaces globally by default (that is, replaces all instances found). Returns a string; the original is not affected.
$string='hallo jim hallo mike';
$replaced=str_replace('hallo', 'goodbye', $string);
print($replaced);
//prints: goodbye jim goodbye mike
From Php 5 onward the case insensitive version, str_ireplace, is available.
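A minimal comparison of the two, assuming an input with one capitalized Hallo (my own tweak to the example string). The optional last argument, here named $count, receives the number of replacements performed.

```php
<?php
$string = 'Hallo jim hallo mike';

// Case sensitive: the capitalized Hallo is left alone.
$strict = str_replace('hallo', 'goodbye', $string);

// Case insensitive: both instances are replaced; $count reports how many.
$loose = str_ireplace('hallo', 'goodbye', $string, $count);

print $strict . "\n";
print $loose . "\n";
print $count . "\n";
```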
- substr_replace:
Use this function when you want to replace a segment located by its position; that is, without resorting to regular expressions. It replaces locally: being a substring function, it is selective by definition and sticks to the substring alone.
Returns a string, original not affected.
It differs from str_replace insofar as it replaces within a substring and thus affects only that segment. You locate the substring through the arguments: firstly the input string, secondly the replacement, thirdly the numerical index at which the replacement starts in the input string, lastly the number of chars to be replaced (if omitted, it replaces until the end). Do consider this example:
$string='hallo jim hallo mike';
$replaced=substr_replace($string, 'goodbye', 0, 1);
print($replaced);
//prints: goodbyeallo jim hallo mike
You may note that we've started at index 0 (substr_replace simply counts chars: it is not regular expression oriented, so it is not that sophisticated!) and that we've asked to replace just 1 char: namely, we said that we only wanted to replace the first h! Hence that funny goodbyeallo.
And yes, saying $replaced=substr_replace($string, 'goodbye', 0, 0); would have generated goodbyehallo jim hallo mike!
$string='hallo jim hallo mike';
$replaced=substr_replace($string, 'goodbye', 5, 4);
print($replaced);
//prints: hallogoodbye hallo mike
$replaced=substr_replace($string, 'goodbye', 5, 0);
print($replaced);
//prints: hallogoodbye jim hallo mike
$replaced=substr_replace($string, 'goodbye', 0, 0);
print($replaced);
//prints: goodbyehallo jim hallo mike
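The start and length arguments lend themselves to a few everyday tricks; the sketch below shows three of them on a made-up phone number (the value and the variable names are assumptions of mine).

```php
<?php
$phone = '5555551';

// Replace the 4 chars starting at index 0 with a mask.
$masked = substr_replace($phone, '****', 0, 4);

// A length of 0 replaces nothing: the new text is merely inserted.
$dashed = substr_replace($phone, '-', 3, 0);

// With the length omitted, everything from the index to the end is replaced.
$cut = substr_replace($phone, '...', 3);

print $masked . "\n";
print $dashed . "\n";
print $cut . "\n";
```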
- wordwrap: useful for formatting line lengths in outputs such as emails or similar.
Takes in a string and by default breaks it into lines at most 75 chars long. If you want to change this latter value, pass the desired number as second argument. The third argument is optional and is the delimiter added at the end of each line; it defaults to a new line.
$string='Hallo darling, I am here communicating to you that I am arriving tomorrow around 3pm with the first flight. Take care, Susan.';
print wordwrap($string, 20, '<br>');
/*prints: Hallo darling, I am
here communicating
to you that I am
arriving tomorrow
around 3pm with the
first flight. Take
care, Susan.
*/
Does not interrupt whole words:
$string='supercalifragilissimum is';
//the width of 10 is an arbitrary pick, shorter than the long word
print wordwrap($string, 10, '<br>');
/*prints: supercalifragilissimum
is
*/
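If you do want long words broken, wordwrap accepts a fourth argument that, when true, cuts any word longer than the width. A sketch, with a width of 10 chosen arbitrarily by me:

```php
<?php
$string = 'supercalifragilissimum is';

// Default behaviour: the over-long word is kept whole on its own line.
$kept = wordwrap($string, 10, "\n");

// Fourth argument true: no line may exceed the width, so the word is cut.
$cut = wordwrap($string, 10, "\n", true);

print $kept . "\n---\n" . $cut . "\n";
```

The cut version is handy for narrow columns where an unbroken word would wreck the layout.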
strtok
commonly used, char oriented
ARGUMENTS: strtok( string, splitter )
RETURNS: string
USAGE:
This function basically splits a string by its second argument (which is of course of the string data type).
Unlike chunk_split, which splits a string by a given number of chars, and unlike explode, which returns an array, strtok divides the string by the given splitter if it is found, and returns a string. If the splitter argument is nowhere to be found within the string, it just returns the whole string (that is: it ignores it). If the splitter argument is omitted upon your first call to strtok upon a given string, your php engine might even crash! So never omit it.
Do note that the splitter argument treats the passed chars as isolated chars: if you were to pass as splitter, say, 'abc', then strtok would not look for 'abc' but for a and b and c wherever they may be found as isolated chars.
If the splitter is found, it returns the string up to the first instance of such char, excluded:
$string='John Smith, avenue foo 15; phone 5555551, office 5555552';
print strtok($string,',');
//prints: John Smith
Remember that the splitter input is always removed from the output string.
It is a common practice to inspect a string piece by piece by calling strtok within a loop: in such case the subsequent calls within the loop must omit the string argument and pass the splitter alone. This won't crash your php engine as long as the one-argument call comes after a previous call made with both arguments on the same input string. Omitting the string argument tells strtok to carry on along the same input from where it stopped, until the instances of the splitter are exhausted; for this it is advisable to save the splitter in a variable of a wider scope than the loop. Compare the token against false with !==, or a token such as '0' would end the loop too early. Study this example:
$string='John Smith, avenue foo 15; phone 5555551, office 5555552';
$tokens=',';
$token=strtok($string,$tokens);
while( $token !== false ){
print $token.'<br>';
$token=strtok($tokens);
}
/*prints: John Smith
avenue foo 15; phone 5555551
office 5555552 */
If you add to the splitter chars more than one char, it would split after them all:
$string='John Smith, avenue foo 15; phone 5555551, office 5555552';
$tokens=',;';
$token=strtok($string,$tokens);
while( $token !== false ){
print $token.'<br>';
$token=strtok($tokens);
}
/*prints: John Smith
avenue foo 15
phone 5555551
office 5555552 */
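Be warned about this:
$string='http://www.foo.com/user/folder/htmls/myfile.html';
$tokens='/';
$token=strtok($string,$tokens);
while( $token ){
print $token.'<br>';
$token=strtok($tokens);
}
/*prints (on old Php versions): http: */
It was supposed to print the string split at all the instances of /, but it stops at the first. This has nothing to do with the double slashes being taken for a comment, as one might suspect: before Php 4.1.0 strtok returned an empty string for the empty part lying between the two adjacent slashes, and an empty string ends a while( $token ) loop at once. Since Php 4.1.0 strtok skips such empty parts, and testing with while( $token !== false ) is the safe habit in any case. In fact, without the double slash: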
$string='www.foo.com/user/folder/htmls/myfile.html';
$tokens='/';
$token=strtok($string,$tokens);
while( $token ){
print $token.'<br>';
$token=strtok($tokens);
}
/*prints: www.foo.com
user
folder
htmls
myfile.html
*/
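The loop condition deserves a closer look. The sketch below, on a made-up input of mine holding a lone 0, shows how a loose test stops too early while a strict comparison with false walks the whole string. The names $naive and $parts are illustrative.

```php
<?php
$string = '10,0,25';

// Loose test: the token '0' counts as false, so the loop ends early.
$naive = array();
$token = strtok($string, ',');
while ($token) {
    $naive[] = $token;
    $token = strtok(',');
}

// Strict test: only the real end-of-input value (false) stops the loop.
// Calling strtok with both arguments again restarts the scan.
$parts = array();
$token = strtok($string, ',');
while ($token !== false) {
    $parts[] = $token;
    $token = strtok(',');
}

print_r($naive);
print_r($parts);
```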
serialize & unserialize
used
ARGUMENTS: serialize( mixed ), unserialize( string )
RETURNS: serialize: string, unserialize: mixed
USAGE:
These functions are not much used but are quite useful and powerful.
serialize takes as its only argument a value of whatever data type, including arrays and nested matrixes, and encodes it into a string whose special chars represent the original value.
unserialize converts such a string back into the original value!
$foo1=array('a','b','c');
$foo2=array('one'=>1, 'two'=>2, 3=>3,'four'=>4);
$foo3=array($foo1, $foo2);
$foo3[1]['five']=5;
$foo3[1]['four']='is four';
$isSix=array(6);
$foo3[1]['six']=$isSix[0];
$storeMe=serialize($foo3);
print $storeMe;
that would print this strange thing stored into $storeMe (one single line, wrapped here):
a:2:{i:0;a:3:{i:0;s:1:"a";i:1;s:1:"b";i:2;s:1:"c";}i:1;
a:6:{s:3:"one";i:1;s:3:"two";i:2;i:3;i:3;
s:4:"four";s:7:"is four";s:4:"five";i:5;s:3:"six";i:6;}}
Please note that even the dynamically added 'five' element and the updated value of element 'four' are correctly included! The element passed through a variable ($isSix) contributes only its value to the serializing process: the variable name is not stored, which is arguably preferable. Had the string held variable names, unserializing it might have jeopardized your script whenever the file calling unserialize had not previously defined those names with compatible data types.
$object=unserialize($storeMe);
print $object[1]['five'];
//prints 5
That is, $object is now again a full fledged, perfectly valid matrix!
Very useful for storing in a database, as mere strings, complex values updated by users, without taking the pain to set up many dedicated database cells so that each of them holds one value of one entry!
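A compact round trip, as a sketch: the array below is a cut-down version of the one above, and the names $original, $stored and $restored are mine. It also shows that unserialize answers false when handed a string that was never serialized (the @ just silences the notice it raises).

```php
<?php
$original = array('a', 'b', array('one' => 1, 'four' => 'is four'));

$stored   = serialize($original);   // a plain string, fit for a database cell
$restored = unserialize($stored);   // back to an array

// A string that is not serialized data yields false.
$broken = @unserialize('not a serialized value');

print $stored . "\n";
var_dump($restored === $original);
var_dump($broken);
```

It is wise to check for that false before trusting what came out of the database.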
urlencode & urldecode & rawurlencode & rawurldecode
commonly used
ARGUMENTS:
urlencode( string )
urldecode( string )
rawurlencode( string )
rawurldecode( string )
RETURNS: string
USAGE:
- urlencode & rawurlencode:
given a url (a web address, that is; but it could actually be any string, it's just commonly used for urls) which is not encoded, namely which still contains chars such as punctuation and white spaces, it returns the url with such chars encoded according to the web specifications (namely as codes that start with a % sign):
$string="mailto: john_smith@foo.com";
print (rawurlencode($string));
that prints:
mailto%3A%20john_smith%40foo.com
The two differ in how they treat the white space: urlencode turns it into a + sign (the convention of form submissions), rawurlencode into %20. Prefer rawurlencode for paths and for anything that is not a form query.
Original not affected. This method is analogous to the javascript method named escape.
- urldecode & rawurldecode:
perform the opposite (of course, you'd use rawurldecode to decode something you coded by rawurlencode):
print rawurldecode('mailto%3A%20john_smith%40foo.com');
//prints: mailto: john_smith@foo.com
This method is analogous to the javascript method named unescape.
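To make the difference between the raw and the plain pair visible, here is a minimal sketch on the same mail address; the variable names are illustrative assumptions.

```php
<?php
$string = 'mailto: john_smith@foo.com';

$raw   = rawurlencode($string);   // white space becomes %20
$plain = urlencode($string);      // white space becomes +

print $raw . "\n";
print $plain . "\n";

// Each decoder undoes its own encoder.
print rawurldecode($raw) . "\n";
print urldecode($plain) . "\n";
```

Mixing the pairs is where trouble starts: rawurldecode leaves a + sign as it is, so a string coded by urlencode would come back with + in place of its white spaces.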
parse_url & parse_str
commonly used
ARGUMENTS:
parse_url( string )
parse_str( string )
RETURNS:
parse_url: associative array
parse_str: NOTHING
USAGE:
parse_url does the following: it breaks a URL (web address) into an associative array whose keys follow this scheme (the words within square brackets in the output below are the keys of the returned array):
$address = 'http://www.foo.com:80/folder/login.html?password=dunno&user=jimmy';
$parsed=parse_url($address);
print_r ($parsed);
/*prints:
Array ( [scheme] => http
[host] => www.foo.com
[port] => 80
[path] => /folder/login.html
[query] => password=dunno&user=jimmy )
*/
you could thus inspect the query by checking $parsed['query']; (always keep the quotes around 'query': old Php versions forgave a bareword key with a mere notice, but Php 8 raises an error).
parse_str is suited to get as its argument a query string alone: that is, it is useful only as long as the type of string you pass to it is in the format type of the query string section of a web query.
Such queries have the characteristic of being composed of sets of names and associated values which take the shape:
'name=value'
and which are joined to each other by an ampersand sign (&):
'name1=value1&name2=another value&name3=value3'
Consequently parse_str is particularly suited to be used upon the query value drawn by parse_url:
$parsed = parse_url( 'http://www.foo.com/login.html?password=dunno&user=jimmy&fullname=jimmy smith' );
$yourQueryIs=$parsed['query'];
parse_str($yourQueryIs);
What parse_str makes of such query string is this: it separates names from values and creates as many "global" variables, each named after a name in the query string and holding the value assigned to it there. (This one-argument form was deprecated in Php 7.2 and no longer works in Php 8, where you must pass a second argument: a variable that receives the names and values as an associative array.) Therefore in our example above the call to parse_str has created the following global variables:
$password; $user; $fullname;
In fact:
print($password);
//prints: dunno
print($user);
//prints: jimmy
print($fullname);
//prints: jimmy smith
Please do note that the scope of the variables created by parse_str is actually local: I said global to mean that it creates variables with a scope external to the call, but if the call to parse_str is performed, for instance, within a function, then the variables it creates last only within the scope of that function.
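On current Php versions the values are collected into an array rather than scattered into variables. A sketch of that form follows; $fields is a name of my choosing, and I wrote the white space of the full name as + so that the address is a properly encoded one.

```php
<?php
$address = 'http://www.foo.com/login.html?password=dunno&user=jimmy&fullname=jimmy+smith';

// Passing PHP_URL_QUERY makes parse_url return that single part as a string.
$query = parse_url($address, PHP_URL_QUERY);

// The second argument receives names and values as an associative array.
parse_str($query, $fields);

print $query . "\n";
print $fields['user'] . "\n";
print $fields['fullname'] . "\n";
```

Besides working on Php 8, this form cannot overwrite variables that your script already uses, which the one-argument form could do behind your back.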
nl2br & quotemeta & addslashes & stripslashes & strip_tags & htmlentities & htmlspecialchars
commonly used
ARGUMENTS:
nl2br( string )
quotemeta( string )
addslashes( string )
stripslashes( string )
strip_tags( string, string )
htmlentities( string )
htmlspecialchars( string )
RETURNS: string
USAGE:
- nl2br:
this function inserts an html line break tag (<br />) before each new line inside a string; the new lines themselves are kept.
- quotemeta:
I assume you know what escaping a char means. If you don't, have a look at this file section and locate by yourself the subsection that deals with escaping.
quotemeta escapes (that is, prepends a backward slash to) each of the following chars:
. \ + * ? [ ] ( ) $ ^
It is useful when you pass along text, Php code included, whose punctuation must be read literally.
print quotemeta('$myVar=5*2-4');
//prints: \$myVar=5\*2-4
- addslashes:
this function returns the input text prepending backward slashes to the following chars (and to the NUL byte):
' " \
$string='my name is "John Smith"';
print (addslashes($string));
//prints: my name is \"John Smith\"
- stripslashes:
complements addslashes and removes the backward slashes:
$string='my name is \"John Smith\" you \know \\';
print stripslashes($string);
//prints: my name is "John Smith" you know
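The two functions are meant to undo each other; the sketch below checks the round trip on a sample string of mine that ends with a lone backward slash.

```php
<?php
// In single quotes, \\ stands for one literal backward slash.
$original = 'my name is "John Smith" and this is a backslash: \\';

$escaped  = addslashes($original);    // quotes and the backslash get a \ in front
$restored = stripslashes($escaped);   // and here they lose it again

print $escaped . "\n";
print $restored . "\n";
```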
- strip_tags:
It removes all the html tags from an input string. Traditionally used before parsing form text fields filled in by a user.
If you want to allow certain tags, you can include them in the second argument, which is optional and which must be a string listing the opening (but not the corresponding closing) tags, one after another (and it doesn't matter whether capitalized or not, they work in both cases):
$string='I am <br><strong>John Smith </strong> <script>while(1){alert()}</script>';
print( strip_tags($string, '<strong><br>') );
/*prints:
I am <br><strong>John Smith </strong> while(1){alert()}
*/
- htmlentities:
returns the input string with the html special chars transformed into their matching html entities; most remarkably used to convert the < and > that open and close html tags into &lt; and &gt;
$string='you can use <html> - by happysite ©';
print htmlentities($string);
/*prints:
you can use &lt;html&gt; - by happysite &copy;
*/
- htmlspecialchars:
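same as htmlentities above but less powerful: it converts only the few chars that have a special meaning in html, namely & < > and the quotes (which quotes are converted depends on the flags argument and on the Php version).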
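To compare the two on the same input, here is a sketch that passes the flags and the charset explicitly, so that it behaves the same whatever the Php version; the copyright sign is written as \u{00A9}, which assumes Php 7 or later.

```php
<?php
$string = "you can use <html> - by happysite \u{00A9}";

// Only the chars with a special meaning in html are converted.
$special = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

// Every char that has a named entity is converted, the copyright sign too.
$entities = htmlentities($string, ENT_QUOTES, 'UTF-8');

print $special . "\n";
print $entities . "\n";
```

On a page served as UTF-8, htmlspecialchars is usually all it takes, since the other chars display fine as they are.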
md5
used
ARGUMENTS: md5( string )
RETURNS: string
USAGE:
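md5 is a hashing function rather than an encryption one: it takes in a string and returns a digest of 32 hexadecimal chars. It is one way (the input cannot be worked back from the output), and as Leon Atkinson declares in his PHP guide, "It is theorized that the algorithm for the md5 function will produce unique identifiers for all strings." Take this with a grain of salt: colliding inputs have since been found, so md5 is fine as a checksum but should not guard passwords or signatures.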
print md5('This file is fairly good');
//prints: 1ad93ca4c665e8cd87780752dae64211
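A small sketch of the properties that matter in practice: the digest has a fixed length, the same input always gives the same digest, and the slightest change gives a different one. The call to hash with 'sha256' is my own addition, shown as one stronger alternative that Php ships with.

```php
<?php
$text = 'This file is fairly good';

$digest = md5($text);
$again  = md5($text);          // same input, same digest
$other  = md5($text . '!');    // one more char, a different digest

// A longer digest from a stronger algorithm.
$longer = hash('sha256', $text);

print strlen($digest) . "\n";
print strlen($longer) . "\n";
var_dump($digest === $again);
var_dump($digest === $other);
```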