11 Replies Latest reply on Dec 2, 2010 11:56 AM by MurraySummers

    php preg_replace not working

    ExPluda Level 1

      Hello,

       

      I was asked to try to 'clean' a client database filled by an HTML rich text editor, but I'm having trouble with PHP preg_replace, which is not cleaning what I want...

       

      What can be my problem?

       

      This is what I'm using

       

      $patterns[0] = '/<div><p></div>/';
      $patterns[1] = '/<div><br></div>/';
      $patterns[2] = '/class=\"Apple-style-span\" /';
      //
      $replacements[0] = '<br>';
      $replacements[1] = '<br>';
      $replacements[2] = ' ';
      //
      if($texto_a_editar = preg_replace($patterns, $replacements, $texto_a_editar)) {;
          echo $texto_a_editar."";
      }

       

      the text remains the same as before preg_replace

       

      Thanks!

       

      Pluda

        • 1. Re: php preg_replace not working
          garywpaul Level 5

          Take a look at Example 2, which uses indexed arrays with this function, and see if that helps.

           

          http://php.net/manual/en/function.preg-replace.php

           

          If you don't get a suitable answer, you might want to try posting in the Dreamweaver Application Development forum. It is better suited for this type of question.

           

          Gary

          • 2. Re: php preg_replace not working
            David_Powers Adobe Community Professional (Moderator)

            ExPluda wrote:

             

            What can be my problem?

            Invalid regular expressions.

             

            You need to escape the forward slashes in </div>. I'm pretty sure you don't need the backslashes in front of the double quotes either.

             

            It's late at night here, so I haven't tested it, but I think this should fix your problem.

             

            $patterns[0] = '/<div><p><\/div>/';
            $patterns[1] = '/<div><br><\/div>/';
            $patterns[2] = '/class="Apple-style-span"/';
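A minimal sketch of the delimiter issue, using sample text made up for illustration: with `/` as the pattern delimiter, a literal `/` inside the pattern has to be escaped, while choosing another delimiter such as `#` avoids the escaping altogether.

```php
<?php
// Sample text invented for this sketch.
$text = 'one<div><br></div>two';

// "/" delimiters: the "/" in </div> must be escaped as \/.
$escaped = preg_replace('/<div><br><\/div>/', '<br>', $text);

// "#" delimiters: no escaping needed, because "#" does not occur in the pattern.
$altDelimiter = preg_replace('#<div><br></div>#', '<br>', $text);

echo $escaped, "\n";
echo $altDelimiter, "\n";
```

Both calls should produce the same cleaned string; which delimiter to use is mostly a matter of readability.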
            
            • 3. Re: php preg_replace not working
              garywpaul Level 5

              Looking back, the other thing that looks strange to me is the semicolon after your opening brace:

               

              if($texto_a_editar = preg_replace($patterns, $replacements, $texto_a_editar)) {;
                  echo $texto_a_editar."";

              }

               

              That does not seem to fit, but is it throwing an error?

               

              Gary
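The stray semicolon after the brace is only an empty statement, so on its own it should not raise an error. A possibly bigger trap is the `if` test: preg_replace returns null when a pattern is invalid, and an empty result string is also falsy, so in both cases nothing is echoed. A rough sketch of an explicit check follows; `safe_replace` is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: returns the cleaned text, or the untouched
// original if PCRE reports an error (for example an invalid pattern).
function safe_replace($patterns, $replacements, $text)
{
    // "@" hides the warning so the null return value can be handled here.
    $result = @preg_replace($patterns, $replacements, $text);
    if ($result === null) {
        return $text;
    }
    return $result;
}

$sample = 'a<div><br></div>b';

// Invalid: the unescaped "/" ends the pattern early.
$bad = safe_replace('/<div><br></div>/', '<br>', $sample);

// Valid: the "/" in </div> is escaped.
$good = safe_replace('/<div><br><\/div>/', '<br>', $sample);

echo $bad, "\n";
echo $good, "\n";
```

In real code it is probably better to log the failure than to fall back silently; the fallback here only keeps the sketch short.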

              • 4. Re: php preg_replace not working
                ExPluda Level 1

                Good morning,

                 

                Thanks to both of you for replying.

                 

                now just the class=\"Apple-style-span\" remains

                 

                echo "Before : ".$texto_a_editar;
                //
                $patterns = array();
                $patterns[0] = '/<div><p><\/div>/';
                $patterns[1] = '/<div><br><\/div>/';
                $patterns[2] = '/class=\"Apple-style-span\"/';
                $replacements = array();
                $replacements[0] = '<br>';
                $replacements[1] = '<br>';
                $replacements[2] = ' ';
                //
                if($texto_a_editar = preg_replace($patterns, $replacements, $texto_a_editar)) {
                    echo " After : ".$texto_a_editar."";
                }

                 

                 

                this is the output

                 

                Before preg_replace : This <font class=\"Apple-style-span\" color=\"#996600\"><font class=\"Apple-style-span\" face=\"helvetica\">is</font></font> just a text <b>example</b>

                 

                After : This <font class=\"Apple-style-span\" color=\"#996600\"><font class=\"Apple-style-span\" face=\"helvetica\">is</font></font> just a text <b>example</b>

                 

                And another thing: is there a way to use a regex (I don't understand regex...) to join the repeated tags? For example, this

                 

                <font class=\"Apple-style-span\" color=\"#996600\"><font class=\"Apple-style-span\" face=\"helvetica\">

                 

                should be

                 

                <font color=\"#996600\" face=\"helvetica\">

                 

                Thanks again

                 

                Pluda

                • 5. Re: php preg_replace not working
                  David_Powers Adobe Community Professional (Moderator)

                  ExPluda wrote:

                   

                  now just the class=\"Apple-style-span\" remains

                   

                  That's the same issue as before. You need to escape the backslashes. I didn't realize the backslashes were in the text.

                   

                  $patterns[2] = '/class=\\"Apple-style-span\\"/';
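If the stored text really does contain literal backslashes, there are two layers of escaping to keep in mind: the PHP string literal and the regex engine. In a single-quoted PHP string, `\\` collapses to a single backslash, so the regex engine receives `\"`, which matches a plain quote rather than a backslash followed by a quote. A sketch under that assumption, with sample text adapted from the output quoted earlier in the thread:

```php
<?php
// Single-quoted, so \" stays as a literal backslash followed by a quote.
$text = 'This <font class=\"Apple-style-span\" color=\"#996600\">is</font>';

// Two backslashes in the PHP literal reach PCRE as \" and match only a bare quote.
$twoSlashes = preg_replace('/ class=\\"Apple-style-span\\"/', '', $text);

// Four backslashes reach PCRE as \\" and match a literal backslash plus a quote.
$fourSlashes = preg_replace('/ class=\\\\"Apple-style-span\\\\"/', '', $text);

// Optional backslash: covers both escaped and unescaped attribute quotes.
$either = preg_replace('/ class=\\\\?"Apple-style-span\\\\?"/', '', $text);

echo $twoSlashes, "\n";
echo $fourSlashes, "\n";
echo $either, "\n";
```

If the backslashes are only an artifact of how the text was stored, running `stripslashes()` on it before any pattern matching may be the simpler route.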
                  

                   

                  To combine the two font tags, you also need to get rid of the duplicate font tag after the text. Although it would be possible to build a regular expression that deals with the Apple-style-span class at the same time, it's a lot easier to get rid of the class first, and then combine the tags by running preg_replace again. The regex you need as the pattern looks like this:

                   

                  (<font [^>]+)>\s*<font( [^>]+>[\s\S]+)(?=</font>)\s*</font>
                  

                   

                  Because it contains forward slashes, wrap it in curly braces instead of forward slashes when passing it to preg_replace.

                   

                  The replacement text is simply '$1$2'.

                   

                  $amended_text = preg_replace('{(<font [^>]+)>\s*<font( [^>]+>[\s\S]+)(?=</font>)\s*</font>}', '$1$2', $original);
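As a rough check of the pattern above, here is a sketch that assumes the class attributes have already been removed and that the text contains one nested pair of font tags with unescaped quotes. The sample string is adapted from the example earlier in the thread.

```php
<?php
// Sample with the Apple-style-span class already stripped.
$original = 'This <font color="#996600"><font face="helvetica">is</font></font>'
          . ' just a text <b>example</b>';

// Curly braces act as delimiters, so the "/" in </font> needs no escaping.
$pattern = '{(<font [^>]+)>\s*<font( [^>]+>[\s\S]+)(?=</font>)\s*</font>}';

// $1 is the outer tag without its ">", $2 is the inner attributes plus content.
$amended_text = preg_replace($pattern, '$1$2', $original);

echo $amended_text, "\n";
```

Because `[\s\S]+` is greedy, a string holding several separate nested pairs may be merged across pairs, so it seems safest to test against real data before running this over a whole database.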
                  

                   

                  If you're doing any programming, learning to use regular expressions is an important skill. Regular expressions are not easy, but it's well worthwhile learning at least the basics. I have written a two-part tutorial that should help you get going: http://www.adobe.com/devnet/dreamweaver/articles/regular_expressions_pt1.html.

                  • 6. Re: php preg_replace not working
                    ExPluda Level 1

                    Mr Powers, thanks for your help,

                     

                    but still doesn't work, would it be because of the '=' sign on class= ?

                     

                    this is the output

                     

                    Before : p0p0<font class=\"Apple-style-span\" face=\"impact\"><span class=\"Apple-style-span\" style=\"font-size: xx-large;\"><font class=\"Apple-style-span\" color=\"#003333\">p0p0p</font></span></font><br><div><font class=\"Apple-style-span\" face=\"impact\"><span class=\"Apple-style-span\" style=\"font-size: xx-large;\"><font class=\"Apple-style-span\" color=\"#003333\"></font></span></font>kpkokpko</div><br><div>0p0p</div><br />

                     

                    Amended_text : p0p0<font class=\"Apple-style-span\" face=\"impact\"><span class=\"Apple-style-span\" style=\"font-size: xx-large;\"><font class=\"Apple-style-span\" color=\"#003333\">p0p0p</font></span></font><br><div><font class=\"Apple-style-span\" face=\"impact\"><span class=\"Apple-style-span\" style=\"font-size: xx-large;\"><font class=\"Apple-style-span\" color=\"#003333\"></font></span></font>kpkokpko</div><br><div>0p0p</div>

                     

                    the complete script

                     

                    echo "Before : ".$texto_a_editar."<br />";
                    $patterns = array();
                    $patterns[0] = '/<div><p><\/div>/';
                    $patterns[1] = '/<div><br><\/div>/';
                    $patterns[2] = '/class=\\"Apple-style-span\\"/';
                    $replacements = array();
                    $replacements[0] = '<br>';
                    $replacements[1] = '<br>';
                    $replacements[2] = ' ';
                    //
                    $texto_a_editar = preg_replace($patterns, $replacements, $texto_a_editar);
                    //
                    $amended_text = preg_replace('{(<font [^>]+)>\s*<font( [^>]+>[\s\S]+)(?=</font>)\s*</font>}', '$1$2', $texto_a_editar);
                    echo "amended_text : ".$amended_text;

                     

                     

                    And thanks for the tutorial. I'm going to study it; I have tried many times, but regex is indeed difficult.

                    • 7. Re: php preg_replace not working
                      MurraySummers Level 8

                      David:

                       

                      I'm a little confused by the content under the head "Matching literal text".  You state that the following 12 characters have special meaning -

                       

                      $()*+.?[\^{|

                       

                      Then you say "This regex matches three digits and a literal space, followed by another three digits, a literal hyphen, and four more digits."  It seems that this statement applies to the previously mentioned 12 characters that have special meaning. Please read through that section.  Is something jumbled up there?

                      • 8. Re: php preg_replace not working
                        David_Powers Adobe Community Professional (Moderator)

                        ExPluda wrote:

                         

                        but still doesn't work, would it be because of the '=' sign on class= ?

                        No. I have just copied and pasted your code into Dreamweaver, and tested it in Live Code. The first and second patterns don't work because there are no matches in the $texto_a_editar. The third pattern (for the Apple-style-span) works perfectly.

                         

                        I'm not sure what you're trying to eliminate with the first pattern. It simply looks for a <p> tag sandwiched between opening and closing <div> tags, with no text or space in between.

                         

                        I suspect that your second pattern is back to front. It should be this:

                         

                        $patterns[1] = '/<\/div><br><div>/';
                        

                         

                        That searches for a closing and opening <div> tag with a <br> tag in between.

                         

                        The complex regex that I gave you doesn't work because there's a <span> tag sandwiched between the <font> tags.

                         

                        To deal with the <span>, you need to use this:

                         

                        $amended_text = preg_replace('{(<font [^>]+)>\s*<span([^>]+)><font( [^>]+>[\s\S]+?)(?=</font>)(</font>)}', '$1$2$3', $texto_a_editar);
                        

                         

                        I have tested this with the example code you have given, and it works. The value of $amended_text looks like this:

                         

                        p0p0<font face="impact" style="font-size: xx-large;" color="#003333">p0p0p</font><br><div><font face="impact" style="font-size: xx-large;" color="#003333"></font>kpkokpko<br>0p0p</div><br>

                         

                        However, you'll notice that the second <font> tag doesn't have any text in between (it doesn't in $texto_a_editar either). It looks as though you have got some very dirty code that needs cleaning up. You can do a lot with regular expressions, but if there are a lot of variations in the code that you're searching for, it's going to be a lengthy process devising the appropriate patterns to search for.
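Run on a reduced sample, the `<span>` pattern exactly as printed in the reply above appears to leave the closing `</span>` behind, because the replacement drops the inner `</font>` but nothing consumes the `</span>`. The sketch below compares it with a variant that also consumes the `</span>`. The variant is an assumption for illustration, not the original author's pattern, and the sample text is cut down from the thread's example with the class attributes already removed.

```php
<?php
// Reduced sample, with the class attributes already stripped.
$text = '<font face="impact"><span style="font-size: xx-large;">'
      . '<font color="#003333">p0p0p</font></span></font>';

// The pattern as printed in the reply above.
$asPrinted = preg_replace(
    '{(<font [^>]+)>\s*<span([^>]+)><font( [^>]+>[\s\S]+?)(?=</font>)(</font>)}',
    '$1$2$3',
    $text
);

// Variant (assumption): also consume the inner </font> and the closing
// </span>, so that only the outer </font> remains after the content.
$variant = preg_replace(
    '{(<font [^>]+)>\s*<span([^>]+)>\s*<font( [^>]+>)([\s\S]*?)</font>\s*</span>}',
    '$1$2$3$4',
    $text
);

echo $asPrinted, "\n";
echo $variant, "\n";
```

The variant uses `*?` rather than `+?` so that an empty inner font tag is also matched; whether that is desirable depends on the data being cleaned.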

                        • 9. Re: php preg_replace not working
                          ExPluda Level 1

                          Yes, I think this has a lot of dirty code...

                           

                          The first pattern sometimes appears, and I think it's better to have <br> than <div><br></div>.

                           

                          using exactly this code

                           

                          <?php
                          $texto_a_editar = '<div style="text-align: center;"><b>Testing</b> <font class="Apple-style-span" face="arial"><font class="Apple-style-span" color="#FF3333"><i>cleaning</i></font></font></div><div style="text-align: center;"><br></div><div style="text-align: center;"><u>this</u> <span class="Apple-style-span" style="font-size: x-large;">strange</span></div><div style="text-align: center;"><br></div><div style="text-align: center;"><s>rte</s></div>';
                          
                          echo "Before : ".$texto_a_editar."<br />";
                          //
                          $patterns = array();
                          $patterns[0] = '/<div><p><\/div>/';
                          $patterns[1] = '/<div><br><\/div>/';
                          $patterns[2] = '/class=\\"Apple-style-span\\"/';
                          $replacements = array();
                          $replacements[0] = '<br>';
                          $replacements[1] = '<br>';
                          $replacements[2] = '';
                          //
                          $texto_a_editar = preg_replace($patterns, $replacements, $texto_a_editar);
                          /*
                          if($texto_a_editar = preg_replace($patterns, $replacements, $texto_a_editar)) {
                               echo " After : ".$texto_a_editar."";
                          }
                          */
                          $amended_text = preg_replace('{(<font [^>]+)>\s*<span([^>]+)><font( [^>]+>[\s\S]+?)(?=</font>)(</font>)}', '$1$2$3', $texto_a_editar);
                          //$amended_text = preg_replace('{(<font [^>]+)>\s*<font( [^>]+>[\s\S]+)(?=</font>)\s*</font>}', '$1$2', $texto_a_editar);
                          echo "amended_text : ".$amended_text;
                          ?>
                          

                           

                          In view source I got this result:

                           

                          <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
                          <html xmlns="http://www.w3.org/1999/xhtml">
                          <head>
                          <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
                          <title>Untitled Document</title>
                          </head>
                          
                          <body>
                          
                          Before : <div style="text-align: center;"><b>Testing</b> <font class="Apple-style-span" face="arial"><font class="Apple-style-span" color="#FF3333"><i>cleaning</i></font></font></div><div style="text-align: center;"><br></div><div style="text-align: center;"><u>this</u> <span class="Apple-style-span" style="font-size: x-large;">strange</span></div><div style="text-align: center;"><br></div><div style="text-align: center;"><s>rte</s></div><br />amended_text : <div style="text-align: center;"><b>Testing</b> <font  face="arial"><font  color="#FF3333"><i>cleaning</i></font></font></div><div style="text-align: center;"><br></div><div style="text-align: center;"><u>this</u> <span  style="font-size: x-large;">strange</span></div><div style="text-align: center;"><br></div><div style="text-align: center;"><s>rte</s></div></body>
                          </html>
                          
                          

                          or, formatted for readability,

                          amended_text : 
                          
                          <div style="text-align: center;">
                               <b>Testing</b>
                               <font  face="arial">
                                    <font  color="#FF3333">
                                         <i>cleaning</i>
                                    </font>
                               </font>
                          </div>
                          <div style="text-align: center;">
                               <br>
                          </div>
                          <div style="text-align: center;">
                               <u>this</u>
                               <span  style="font-size: x-large;">strange</span>
                          </div>
                          <div style="text-align: center;">
                               <br>
                          </div>
                          <div style="text-align: center;">
                               <s>rte</s>
                          </div>
                          
                           
                          

                          This output complicates things even more, since now, for example, the <div><br></div> has a style inside, so it will not be replaced.

                           

                          I think this will be much more complicated than I was expecting...

                           

                          Many thanks for the help. I must study regular expressions carefully.
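For the styled `<div><br></div>` wrappers mentioned above, one possible approach is to let the pattern accept any attributes on the opening `<div>`. This is only a sketch: the sample text is shortened from the test string in the post, and the extra `<div><br /></div>` case is added for illustration.

```php
<?php
$text = '<div style="text-align: center;"><u>this</u></div>'
      . '<div style="text-align: center;"><br></div>'
      . '<div><br /></div>'
      . '<div style="text-align: center;"><s>rte</s></div>';

// A <div> with any attributes that wraps nothing but <br>, <br/> or <br />.
// [^>]* cannot cross a ">", so divs with other content are left alone.
$pattern = '#<div\b[^>]*>\s*<br\s*/?>\s*</div>#i';

$cleaned = preg_replace($pattern, '<br>', $text);

echo $cleaned, "\n";
```

This drops the style on the removed wrapper, which seems acceptable when the div holds only a line break.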

                          • 10. Re: php preg_replace not working
                            David_Powers Adobe Community Professional (Moderator)

                            Murray *ACP* wrote:

                             

                            "This regex matches three digits and a literal space, followed by another three digits, a literal hyphen, and four more digits."  It seems that this statement applies to the previously mentioned 12 characters that have special meaning. Please read through that section.  Is something jumbled up there?

                            Thanks for pointing that out, Murray. Two paragraphs from much further down the page have been duplicated. It should read as follows:

                            In the land of regular expressions, most characters match themselves. The only exceptions are the following 12 characters:

                             

                            $()*+.?[\^{|

                             

                            These characters have special meanings in regular expressions.

                            I have written to the editor to ask him to correct it. I think the error must have crept in when ADC reformatted all articles as single pages.
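To match any of the 12 special characters listed above as literal text, each one has to be escaped with a backslash. PHP's `preg_quote()` does this automatically; a small sketch, using the character list itself as the text to match:

```php
<?php
// The 12 characters from the list above. In a single-quoted PHP string
// the sequence \^ is a literal backslash followed by a caret.
$special = '$()*+.?[\^{|';

// preg_quote() backslash-escapes regex metacharacters; the second
// argument also escapes the chosen delimiter.
$pattern = '/^' . preg_quote($special, '/') . '$/';

// 1 when the escaped pattern matches the literal text.
$matchesItself = preg_match($pattern, $special);

// The same idea applied to a closing tag inside "/" delimiters.
$tagFound = preg_match('/' . preg_quote('</div>', '/') . '/', 'a</div>b');

echo $matchesItself, "\n";
echo $tagFound, "\n";
```

`preg_quote()` is meant for literal text that gets embedded in a pattern; applying it to a whole pattern would also escape the metacharacters that are supposed to stay special.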

                            • 11. Re: php preg_replace not working
                              MurraySummers Level 8

                              Thank goodness!  I was trying to figure out how those 12 characters could match a telephone number, and it gave me a serious brain cramp....