Thursday, April 28, 2011

Can regex fix this?

The page: /index.php?page=6&test=1&test2=2

The code below strips page=6 off of this so I can add our new page to the URL and then add $url_without_page_var back to the link:

$_SERVER['argv'][0]
// Displays:   page=6&test=1&test2=2

And

 $url_without_page_var=preg_replace('/page=(\d+)/i','',$_SERVER['argv'][0]);
// Displays: &test=1&test2=2

OK, so that code lets me take the page=6 off of page=6&test=1&test2=2. I can then change the page number and put it all back together.

My problem is that page will not always be in the same position in the URL; it may not be the first or last item. When that happens, the URL comes out wrong, like this:

/page.php?&test=1&test2=2 ERROR-HERE-ALSO page=9

Is it possible to fix this?

From stackoverflow
  • You can just add a new page number at the end of the URL without doing anything with the old one. The earlier value will automatically be overridden by the new one.

    If you have a URL

    page.php?page=1&var1=foo&var2=bar&page=100
    

    and you check your $_GET array in PHP, you will see that $_GET['page'] has a value of 100.

    If you want to replace the value, you can use this regex; instead of stripping the page number and adding it back later, you can do it all in one preg_replace:

    // ${1} rather than \1, so the digits of $newNumber are not read as part of the backreference
    preg_replace('/(page=)\d+/i', '${1}' . $newNumber, $_SERVER['argv'][0]);
    

    That will work no matter which position page is in.

    jasondavis : Yes I am hoping to avoid this though
    RaYell : Added alternative solution to my answer.
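
The in-place replacement described above can be sketched as a small function. The name replace_page_number is hypothetical, and the pattern is slightly stricter than the one in the answer: it assumes page should only match as a whole parameter name, so something like subpage=3 is left alone.

```php
<?php
// Hypothetical helper: swap the page number in place, wherever it sits in the query string.
function replace_page_number(string $query, int $newPage): string
{
    // (^|&) ties the match to the start of a parameter, so "subpage=3" is not touched.
    // ${1} keeps the backreference separate from the digits that follow it.
    return preg_replace('/(^|&)page=\d+/', '${1}page=' . $newPage, $query);
}

echo replace_page_number('page=6&test=1&test2=2', 9), "\n"; // page=9&test=1&test2=2
echo replace_page_number('test=1&page=6&test2=2', 9), "\n"; // test=1&page=9&test2=2
echo replace_page_number('test=1&test2=2&page=6', 9), "\n"; // test=1&test2=2&page=9
```

If the query string has no page parameter at all, this returns it unchanged, so the caller would still need to append one.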
  • Hi,

    you might want to take a look at parse_str, which extracts the information from the query string and gives it to you as an array.

    You then can

    • remove the page entry from the array, with unset, for instance
      • or re-define it
    • re-construct the query string

    For instance :

    $str = "first=value&arr[]=foo+bar&arr[]=baz";
    
    parse_str($str, $output);
    echo $output['first'];  // value
    echo $output['arr'][0]; // foo bar
    echo $output['arr'][1]; // baz
    

    And, to put the remaining pieces back together, http_build_query might be useful ;-)


    As an example :

    $str = 'page=6&test=1&test2=2';
    parse_str($str, $data);
    
    var_dump($data);
    $data['page'] = 10;
    
    $new_str = http_build_query($data);
    var_dump($new_str);
    

    The first var_dump gives you this :

    array
      'page' => string '6' (length=1)
      'test' => string '1' (length=1)
      'test2' => string '2' (length=1)
    

    => You know the page number.

    And the second gives you a nice query string :

    string 'page=10&test=1&test2=2' (length=22)
    

    Which should be what you need :-)
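
The example above re-defines page; the other option from the list, removing it with unset, might look like the sketch below. The function name without_param is an assumption made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: drop one parameter from a query string, wherever it appears.
function without_param(string $query, string $name): string
{
    parse_str($query, $params);        // query string -> associative array
    unset($params[$name]);             // does nothing if the key is absent
    return http_build_query($params);  // array -> URL-encoded query string
}

echo without_param('test=1&page=6&test2=2', 'page'), "\n"; // test=1&test2=2
echo without_param('page=6', 'page'), "\n";                // (empty string)
```

Because the string is rebuilt from the array, there are no leftover ampersands to clean up, whichever position page was in.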

  • Modify your regexp to look for an optional trailing ampersand and delete it as well. Then when you put the URL back together, put an ampersand where necessary.

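    // Remove page=N plus a trailing "&", then trim any "&" left at the end when page was the last item
    $url_without_page_var = rtrim(preg_replace('/\bpage=\d+&?/i', '', $_SERVER['argv'][0]), '&');
    // Only put an ampersand in front of the new page when other variables remain
    $new_url = ($url_without_page_var === '' ? '' : $url_without_page_var . '&') . 'page=9';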
    
    jasondavis : This is kind of what I was hoping to do, but what about when page is the first item? In that case I wouldn't want it to remove the &, and if I add a new & all the time, that could break the URL. Any idea?
    zombat : You would definitely want to remove the ampersand if page was the first item. Having a URL like "index.php?&var1=1" is semi-erroneous. It should be "index.php?var1=1" to be correct. Adding new ampersands to the URL cannot break it, however. There shouldn't really be a reason why you'd be piling ampersands on without a set of data values behind them either.
  • Why can't you just reconstruct the url?

     $query = $_GET;
     unset($query['page']);
     $length = ($length = strpos($_SERVER['REQUEST_URI'], '?')) ? $length : strlen($_SERVER['REQUEST_URI']); 
     $url = substr($_SERVER['REQUEST_URI'], 0, $length) . '?' . http_build_query($query);
    

    http_build_query() does require PHP5 but you could easily rewrite that function.

    EDIT: Added the $length variable to fix the code.

    Andrew Moore : +1 for the use of `http_build_query()`. I understand it requires PHP5, but PHP4 is the equivalent of IE6 in the PHP world... It's a damn sore that just doesn't want to go away.
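
A variation on the same idea, as a rough sketch: parse_url can pull the path out of the request URI, and setting page in the array (instead of unsetting it) gives the new link in one step. The function name url_for_page is hypothetical; in a real script it would be called with $_SERVER['REQUEST_URI'] and $_GET.

```php
<?php
// Hypothetical helper: rebuild the current URL with a different page number.
function url_for_page(string $requestUri, array $get, int $page): string
{
    $path = parse_url($requestUri, PHP_URL_PATH); // everything before the "?"
    $get['page'] = $page;                         // overwrite the old value, or add one
    return $path . '?' . http_build_query($get);
}

echo url_for_page(
    '/index.php?page=6&test=1&test2=2',
    ['page' => '6', 'test' => '1', 'test2' => '2'],
    9
), "\n"; // /index.php?page=9&test=1&test2=2
```

Since page is always set before the query string is built, the result never ends in a bare "?".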
  • The $_GET variable provides an array of all the variables set through the URL. $_REQUEST is an array of all the GET and POST variables set. Here's the fish:

    $url_string = "index.php?";
    foreach( $_GET as $k => $v ){
        if( $k != "key_you_dont_want"){ // <-- the key you don't want, ie "page"
            if( $url_string != "index.php?" )
                $url_string .= "&"; // Prepend ampersands nicely
            $url_string .= urlencode($k) . "=" . urlencode($v); // encode so special characters survive
        }
    }
    

    Regex is a bit overkill for this problem. Hope it makes sense. I've been writing Python and JavaScript for the past few weeks. PHP feels like a step backwards.

    EDIT: This code makes me happier. I actually tested it instead of blindly typing and crossing fingers.

    unset( $_GET["page"] );
    
    $c = count($_GET);
    $amp = "";
    $url_string = "test.php";
    
    if( $c > 0 ) {
     $url_string .= "?";
     foreach( $_GET as $k => $v ){
      $url_string .= $amp . urlencode($k) . "=" . urlencode($v); // encode so special characters survive
      $amp = "&";
     }
    }
    
    jasondavis : That almost works perfectly, except the output I get is index.php?&test=1&test2=2. See, it also adds a & right after .php? Do you know how to fix that?
    Zack : +1, agree that Regexp shouldn't be used here. I like this method you use too.
    jasondavis : Actually, looking at your code I see you have it coded to not add it for index.php; my actual file was test3.php. I changed it and it does work, thank you!
    Mike Meyer : Good, I'm glad it worked for you.
    jasondavis : Now I only have one problem; maybe you have an idea. Your code is flawless if there are two or more variables, but if page is the only one that exists, then when I add "&page=PAGENUMBER" the & ends up right behind the ? mark. On my site it is very rare that this would ever happen, but it would be nice to code it right so it will never break. Also, is there a way to fill in $url_string = "index.php?" dynamically for any page name?
    Mike Meyer : Updated for that fix.
    Gumbo : You don’t need to check for `!empty($_GET["page"])`. Just delete it.
    Mike Meyer : True that. I thought it would raise an undefined index notice but that doesn't seem to be the case. Thanks.
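
The two open points from the comments (page being the only variable, and not hard-coding the script name) could be handled roughly as below. Both function names are hypothetical, and the sketch assumes every value in the array is a scalar, not a nested array. In a real script the first argument might be basename($_SERVER['PHP_SELF']) and the second $_GET.

```php
<?php
// Hypothetical helper: script name plus every parameter except "page".
function url_without_page(string $script, array $params): string
{
    unset($params['page']);
    $url = $script;
    $sep = '?';                       // first separator is "?", the rest are "&"
    foreach ($params as $k => $v) {
        $url .= $sep . urlencode($k) . '=' . urlencode($v);
        $sep = '&';
    }
    return $url;
}

// Hypothetical helper: append the page with whichever separator the URL needs.
function append_page(string $url, int $page): string
{
    $sep = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $sep . 'page=' . $page;
}

echo append_page(url_without_page('test3.php', ['page' => '6']), 7), "\n";
// test3.php?page=7
echo append_page(url_without_page('test3.php', ['page' => '6', 'test' => '1', 'test2' => '2']), 7), "\n";
// test3.php?test=1&test2=2&page=7
```

Choosing the separator at append time is what avoids the "?&page=..." case when page was the only variable.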
  • 
    $url = 'index.php';
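    unset($_GET['page']); // no isset() check needed; unset() is silent if the key is missing
    if (count($_GET) > 0) $url .= "?" . http_build_query($_GET); // implode() would join only the values and lose the keys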
    
