Regular expressions are used for complex string manipulation in PHP. The interface between
a script and the regular expression support is provided by the following functions:
split(), ereg() and ereg_replace() (editor's addition). The first argument to all three
functions is a string specifying the regular expression. This string consists of regular
and special characters. Regular characters have the same meaning as in other Unix commands,
while special characters have a special meaning. The following is the full list of special
characters and their meanings as the PHP parser interprets them:


"." is a special character that matches any character except a newline. Using
concatenation, we can build regular expressions such as "a.b", which matches any
three-character string that begins with "a" and ends with "b".


"*" is not a construct in itself; it is a suffix meaning that the preceding regular
expression may be repeated as many times as desired. In "fo*", the "*" applies to the "o",
so "fo*" matches "f" followed by any number of "o" characters.

Zero "o" characters are also allowed, so "fo*" also matches "f".

"*" always applies to the *smallest* possible preceding expression. Thus "fo*" repeats
"o", not "fo".

The matcher processes a "*" construct by first matching as many repetitions as it can
find. It then continues with the rest of the pattern. If the rest of the pattern fails to
match, the matcher backtracks, giving up some of the repetitions matched by "*" in case
that allows the rest of the pattern to match. For example, when the pattern "c[ad]*ar" is
matched against the string "caddaar", "[ad]*" first matches "addaa", but this leaves
nothing for the following "a" in the pattern to match. So the last match of "[ad]" is
undone and the following "a" is tried again. Now the pattern matches.
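
The repetition and backtracking behaviour described above can be checked with a short
sketch. It assumes a current PHP, where ereg() is no longer available (the ereg extension
was removed in PHP 7.0), so the PCRE function preg_match() stands in for it. The "/"
delimiters and the added "^" and "$" anchors belong to that substitution, not to the
original text.

```php
<?php
// preg_match() is used here as a stand-in for ereg(); it returns 1 on a match, 0 otherwise.
$matched = preg_match('/c[ad]*ar/', 'caddaar', $m);
$whole   = $m[0];                              // text matched by the whole pattern

// Anchored with ^ and $ so that the whole subject has to match.
$zero  = preg_match('/^fo*$/', 'f');           // zero "o" characters are allowed
$many  = preg_match('/^fo*$/', 'fooo');        // any number of "o" characters
$notFo = preg_match('/^fo*$/', 'fofo');        // "*" repeats "o", not "fo"
```

Here "[ad]*" first consumes "addaa" and then gives one character back so that the final
"ar" can match, which is why the whole subject ends up in $m[0].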


"+" is like "*" except that at least one match of the preceding expression is required.
Thus "c[ad]+r" does not match "cr", but matches anything else that "c[ad]*r" would match.


"?" is like "*" except that it allows either zero or one match of the preceding
expression. Thus "c[ad]?r" matches "cr", "car" or "cdr", and nothing else.


"[...]": "[" begins a "character set", which is terminated by "]". In the simplest case,
the characters between the two brackets form the set. Thus "[ad]" matches "a" or "d", and
"[ad]*" matches any sequence of "a" and "d" characters (including the empty string), from
which it follows that "c[ad]*r" matches "car", and so on.
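
The "+" and "?" suffixes and the simple "[ad]" set can be compared side by side. This is a
minimal sketch that assumes the PCRE function preg_match() in place of ereg(), which was
removed in PHP 7.0; the "^" and "$" anchors are added so that each subject must match as
a whole.

```php
<?php
// Compare the three suffixes on the same subjects (preg_match() stands in for ereg()).
$subjects = ['cr', 'car', 'cdr', 'caddr'];
$star = $plus = $question = [];
foreach ($subjects as $subject) {
    $star[$subject]     = preg_match('/^c[ad]*r$/', $subject);  // zero or more
    $plus[$subject]     = preg_match('/^c[ad]+r$/', $subject);  // one or more
    $question[$subject] = preg_match('/^c[ad]?r$/', $subject);  // zero or one
}
```

Only "cr" separates "+" from "*", and only "caddr" separates "?" from "*".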


A range of characters can also be included in a character set by placing "-" between two
characters. Thus "[a-z]" matches any lowercase letter. Ranges can be freely mixed with
single characters, as in "[a-z$%.]", which matches any lowercase letter, "$", "%" or a
period.

Note that characters that are normally special are no longer special inside a character
set. A completely different set of special characters applies inside a character set:
"]", "-" and "^".

To include "]" in a character set, make it the first character. For example, "[]a]"
matches "]" or "a". To include "-", use it in a context where it cannot indicate a range:
that is, either as the first character or immediately after a range.


"[^...]": "[^" begins a "complement character set", which matches any character except
those listed. Thus "[^a-z0-9A-Z]" matches any character *except* letters and digits. "^"
is not special in a character set unless it is the first character. The character that
follows "^" is treated as if it were first (it can be "-" or "]").
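
A few character sets in action, as a hedged sketch. It assumes the PCRE functions
preg_match() and preg_replace() in place of ereg() and ereg_replace(), which are not
available in current PHP; the subject strings are made up for illustration.

```php
<?php
// A range mixed with single characters: "$", "%" and "." are literal inside the set.
$mixed = preg_match('/^[a-z$%.]+$/', 'abc$%.');

// "]" placed first in the set is an ordinary member of the set.
$bracket = preg_match('/^[]a]+$/', ']a]');

// A complement set: delete everything that is not a letter or a digit.
$clean = preg_replace('/[^a-z0-9A-Z]/', '', 'PHP/FI 2.0!');
```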


"^" is a special character that matches the empty string, but only at the beginning of a
line. Otherwise it fails to match. Thus "^foo" matches "foo" at the beginning of a line.


"$" is similar to "^", but matches only at the end of a line. Thus "xx*$" matches a string
of one or more "x" characters at the end of a line.
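
The two anchors can be tried out as follows. The sketch assumes preg_match() from the PCRE
extension as a replacement for ereg(); the subject strings are invented for the example.

```php
<?php
// "^" ties the match to the beginning of the subject.
$atStart    = preg_match('/^foo/', 'foobar');
$notAtStart = preg_match('/^foo/', 'barfoo');

// "$" ties the match to the end of the subject.
$atEnd    = preg_match('/xx*$/', 'abcxxx', $m);
$tail     = $m[0];                       // the run of "x" at the end
$notAtEnd = preg_match('/xx*$/', 'xxa');
```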


"\" has two roles: it quotes the special characters listed above (including "\" itself),
and it introduces additional special constructs.

Because "\" quotes special characters, "\$" is a regular expression that matches only "$",
"\[" is a regular expression that matches only "[", and so on.

For the most part, "\" followed by any character matches only that character. However,
there are several exceptions: characters that form a special construct when preceded by
"\". Such characters normally match themselves when they appear on their own.

No new special characters will be defined. All extensions to the regular expression syntax
are made by defining new two-character constructs that begin with "\".
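
How "\" turns a special character into an ordinary one can be sketched like this. The
example assumes the PCRE functions preg_match() and preg_quote(); neither is mentioned in
the text, and preg_quote() is shown only as a convenient way to escape a whole string.

```php
<?php
// Single-quoted PHP strings keep the backslash, so '\$' reaches the regex engine intact.
$dollar  = preg_match('/\$/', 'cost: $5');   // a literal "$" is present
$none    = preg_match('/\$/', 'cost: 5');    // no literal "$"
$bracket = preg_match('/\[/', 'a[0]');       // a literal "["

// preg_quote() puts "\" in front of every character that is special in PCRE.
$quoted = preg_quote('a.b*c');
```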


"|" specifies an alternative. Two regular expressions A and B with "|" between them form
an expression that matches anything that either A or B matches.

Thus "foo|bar" matches either "foo" or "bar", but no other string.

"|" applies to the largest possible surrounding expressions. Only a surrounding "(...)"
grouping can limit the scope of "|".
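
The scope of "|" is easiest to see by comparing a pattern with and without a surrounding
group. This is a sketch under the assumption that preg_match() (PCRE) is used instead of
ereg(); the "^" and "$" anchors are added for the comparison.

```php
<?php
// Without a group, "|" splits the whole pattern into "^foo" and "bar$".
$wide = preg_match('/^foo|bar$/', 'foox');      // "^foo" alone is enough to match

// With a group, the anchors apply to both alternatives.
$narrow = preg_match('/^(foo|bar)$/', 'foox');

$alternatives = [];
foreach (['foo', 'bar', 'baz'] as $subject) {
    $alternatives[$subject] = preg_match('/^(foo|bar)$/', $subject);
}
```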


Full backtracking is available when several "|" are used.


"(...)" is a grouping construct that serves three purposes:

1. To enclose a set of "|" alternatives for other operations. Thus "(foo|bar)x" matches
either "foox" or "barx".

2. To enclose a complex expression for the postfix "*" to operate on. Thus "ba(na)*"
matches "bananana", etc., with any number (zero or more) of "na".

3. To mark a matched substring for later reference.

This last use is not a consequence of the idea of grouping with parentheses; it is a
separate feature that is assigned as a second meaning to the same "(...)" construct,
since in practice there is no conflict between the two meanings. An explanation of this
feature follows:


"\digit": After the end of a "(...)" construct, the matcher remembers the beginning and
end of the text matched by that construct. Later in the regular expression, you can use
"\" followed by a digit to match the same text again that the "(...)" construct with that
number matched. The "(...)" constructs are numbered in order of appearance in the regular
expression.

The strings matched by the first nine "(...)" constructs in a regular expression are
assigned the numbers 1 through 9. "\1" through "\9" can be used to refer to the text
matched by the corresponding "(...)" construct. These 9 saved constructs are also known
as registers.

For example, "(.*)\1" matches any string that consists of two identical halves. "(.*)"
matches the first half, which can be anything, but the following "\1" must match exactly
the same text.
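
The "two identical halves" pattern can be tried with a short sketch. It assumes
preg_match() from the PCRE extension rather than the functions named in the text, and
adds "^" and "$" so that the whole subject has to consist of the two halves.

```php
<?php
// "\1" must repeat exactly what the first group matched.
$doubled = preg_match('/^(.*)\1$/', 'abcabc', $m);
$half    = $m[1];                                   // contents of register 1

$different = preg_match('/^(.*)\1$/', 'abcabd');    // halves differ, so no match
```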


The saved constructs, or registers, can be used within a single expression, or they can
be extracted and used elsewhere. Adding a third parameter to reg_match() or reg_search()
specifies an array into which the 9 registers are written. An additional register (the
zeroth element) is also set, holding the string matched by the whole expression. For
example:



<? $string = "this is a test"; $cnt = reg_match("(\w*).*(\<.*\>)", $string, $regs); echo $cnt; echo $regs[0]; echo $regs[1]; echo $regs[2]; >


The above first prints the number of characters matched (14 in this case), then the whole
matched string, followed by the first word of the string and the last.
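
A rough present-day counterpart, offered as a sketch only: it assumes preg_match() (PCRE)
in place of reg_match(), and the pattern is written for PCRE purely to illustrate
capturing the first and last word. Unlike the call described above, preg_match() returns
1 or 0 rather than the number of characters matched, so the length is taken with
strlen().

```php
<?php
$string = 'this is a test';

// Group 1 captures the first word, group 2 the last one; $regs[0] is the whole match.
$found  = preg_match('/(\w*).*\b(\w+)\b/', $string, $regs);
$length = strlen($regs[0]);   // number of characters matched
$first  = $regs[1];
$last   = $regs[2];
```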


"\b" matches the empty string, but only at the beginning or end of a word. Thus "\bfoo\b"
matches any occurrence of "foo" as a separate word. "\bball(s|)\b" matches "ball" or
"balls" as a separate word.


"\B" matches the empty string, provided it is not at the beginning or end of a word.


"\<" matches the empty string, but only at the beginning of a word.


"\>" matches the empty string, but only at the end of a word.


"\w" matches any word-constituent character.


"\W" matches any character that is not a word constituent.
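
Some of the word constructs can be exercised as below. The sketch assumes the PCRE
functions preg_match(), preg_match_all() and preg_replace(), and it covers only "\b",
"\B" and "\W"; the subject strings are made up.

```php
<?php
// "\b" requires a word boundary on both sides of "foo".
$separate = preg_match('/\bfoo\b/', 'a foo b');
$embedded = preg_match('/\bfoo\b/', 'afoob');

// "ball" and "balls" match as separate words, "ballsy" does not.
$count = preg_match_all('/\bball(s|)\b/', 'ball balls ballsy', $m);
$words = $m[0];

// "\B" is the opposite: "oo" must sit strictly inside a word.
$inner = preg_match('/\Boo\B/', 'foob');

// "\W" matches anything that is not a word constituent.
$lettersOnly = preg_replace('/\W/', '', 'a-b c!');
```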



--------------------------------------------------------------------------------

Basic functions for working with arrays and strings

Today I will tell you about regular expressions, and also about the main functions for
working with strings and arrays. In this section you will meet functions that let you
replace certain elements of a string, search within a string, work with patterns, and
much more. There are not that many functions, but some of them can be tricky to use
because they take many different parameters. Today I will introduce you to the key
parameters, which let you perform the main operations. So, let us look at these functions
in order.


$s = implode($a, $c); We have already met this function in the previous issue. It joins
all elements of an array into one string. Here $s is the string in which the result is
placed, $a is the array, and $c is the separator. The separator is a set of characters
used to glue the strings together; it is inserted between all elements of the array. For
example, suppose we have this array:



$a[0] = "string1"; $a[1] = "string2"; $a[2] = "string3";


Accordingly, the call implode($a, "***") returns the string
"string1***string2***string3".


$a = explode($c, $s); The explode function is the inverse of implode. It splits the
string $s using the separator $c and places the elements in the array $a. For example, if
we take the string "string1*string2*string3" and execute $a = explode("*", $s), we get
this array:



$a[0] = "string1"; $a[1] = "string2"; $a[2] = "string3";
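
The round trip between the two functions can be sketched as follows. One assumption
differs from the text: the separator is passed first to implode(), because the
array-first order shown above is no longer accepted as of PHP 8.0.

```php
<?php
$a = ['string1', 'string2', 'string3'];

// Join the elements with "*" between them (separator first, then the array).
$s = implode('*', $a);

// Split the string again on the same separator.
$b = explode('*', $s);
```

explode() treats its separator as a plain string, so "*" needs no escaping here.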


$a = split($c, $s); This function works exactly like explode, except that regular
expressions can be used in it. This means that the character "*" can no longer be used on
its own to split a string, since it is a regular expression special character (see the
section above). So to split a string, some other character can be used, for example "~".
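
Splitting on a pattern can be sketched like this. The example assumes preg_split() from
the PCRE extension, since split() was removed in PHP 7.0; the "/" delimiters are PCRE
syntax, and the last subject string is invented for illustration.

```php
<?php
// An ordinary character such as "~" can be used as the pattern directly.
$byTilde = preg_split('/~/', 'string1~string2~string3');

// "*" is special, so it has to be quoted with "\" to be taken literally.
$byStar = preg_split('/\*/', 'string1*string2*string3');

// A real pattern: split on any run of digits.
$byDigits = preg_split('/[0-9]+/', 'a1b22c');
```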


ereg($c, $s); The ereg function returns true if a match for the regular expression $c is
found in the string $s. Here $c is any regular expression built from the constructs
described in the previous section. For example, take the string $s = "here is testing
string". The call ereg("^here.*", $s) returns true, because the regular expression
requires the word "here" to be at the beginning of the string (the special character "^"
specifies this), and any characters may follow that word (the ".*" construct). An example
program that checks this match:


<? $s = "here is testing string"; if (ereg("^here.*", $s)) echo "Found!"; else echo "Not found."; ?>


And a small example that searches for a pattern anywhere in the string:

<? $s = "here is testing string"; if (ereg(".*testing.*", $s)) echo "Found!"; else echo "Not found."; ?>
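
The same two checks, written as a sketch for a current PHP. It assumes preg_match()
(PCRE) in place of ereg(), which was removed in PHP 7.0. The leading and trailing ".*"
are dropped here because an unanchored pattern already matches anywhere in the string.

```php
<?php
$s = 'here is testing string';

// preg_match() returns 1 on a match and 0 otherwise, so compare against 1.
$atStart  = preg_match('/^here/', $s) === 1;      // "here" at the beginning
$anywhere = preg_match('/testing/', $s) === 1;    // "testing" somewhere in the string
$missing  = preg_match('/^testing/', $s) === 1;   // "testing" is not at the beginning

$message = $anywhere ? 'Found!' : 'Not found.';
```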


$s = ereg_replace($c, $c1, $s); This function replaces all characters in the string $s
that match the regular expression $c with the characters $c1. An example in which we
replace all digits in a string with "+" signs:



<? $s = "1 here 2 is 3 testing 4 string 5"; $s = ereg_replace("[0-9]", "+", $s); echo $s; ?>
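
For a current PHP the same replacement can be sketched with preg_replace(). This is an
assumed substitute from the PCRE extension, as ereg_replace() was removed in PHP 7.0.

```php
<?php
$s = '1 here 2 is 3 testing 4 string 5';

// Every single digit is replaced by "+"; the result is assigned to a new variable.
$replaced = preg_replace('/[0-9]/', '+', $s);
```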


As you can see, the function returns the result, which is assigned to the given variable.

$s = str_replace($c, $c1, $s); This function works like ereg_replace, except that regular
expressions cannot be used in the parameter $c. Use it when you have no complex pattern
to replace and just need a simple search and replace of a few characters. For example,
$s = str_replace("*", "+", "str1*str2*str3") replaces all "*" characters in the given
string with "+" characters.
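
A minimal sketch of the plain-string replacement. The second call, with an array of
search strings, goes beyond what the text describes and is included only as an
illustration.

```php
<?php
// "*" is taken literally because str_replace() does not use regular expressions.
$r = str_replace('*', '+', 'str1*str2*str3');

// Several search strings can be replaced in one call by passing an array.
$multi = str_replace(['*', '~'], '+', 'a*b~c');
```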