In computing, regular expressions allow programmers to match complex patterns within text. I was fortunate to learn how to work with them in 2001, when I was 20, with the help of a Perl programmer named Larry, back when I was working for qode.com.
A friend of mine asked me how he could match different parts of a URL. He wanted:
[1] the entire URL
[2] the entire domain
[3] only the domain and TLD
This is what I came up with in PHP.
$string = 'http://www.thinkleandro.com/labels/politics.html';
$regExp  = "#";           // Opening delimiter (# so the slashes in the URL need no escaping)
$regExp .= "(";           // Start of [1] Entire URL
$regExp .= ".*?://";      // Match protocol (http://)
$regExp .= "(";           // Start of [2] Entire domain
$regExp .= ".*?";         // Match all subdomains (if any)
$regExp .= "(";           // Start of [3]
$regExp .= "[^.]*?";      // Match everything except a period (domain name without subdomain)
$regExp .= "\\..{3,6}?";  // Match a literal period and 3-6 characters (domain TLD)
$regExp .= ")";           // End of [3]
$regExp .= ")";           // End of [2]
$regExp .= "/.*";         // Match everything after the domain (/...)
$regExp .= ")";           // End of [1]
$regExp .= "#";           // Closing delimiter
preg_match($regExp,$string,$matches);
print_r($matches);
Result:
Array
(
[0] => http://www.thinkleandro.com/labels/politics.html
[1] => http://www.thinkleandro.com/labels/politics.html
[2] => www.thinkleandro.com
[3] => thinkleandro.com
)
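For anyone who wants to reuse this, here is a rough sketch of the same pattern wrapped in a function. The function name splitUrl and the group names url, host and domain are my own placeholders, not anything built into PHP. It assumes the x modifier, so the comments can sit inside the pattern, and named groups, so the results read by name instead of by number.

```php
<?php
// Hypothetical helper: same pattern as above, with named groups.
// Returns null when the string does not match.
function splitUrl(string $url): ?array
{
    // The x modifier ignores whitespace in the pattern and treats
    // everything from an unescaped hash sign to end of line as a comment.
    $pattern = '~
        (?<url>                  # [1] entire URL
            .*?://               # protocol (http://)
            (?<host>             # [2] entire domain
                .*?              # all subdomains (if any)
                (?<domain>       # [3] domain and TLD
                    [^.]*?       # domain name without subdomain
                    \. .{3,6}?   # literal period, then 3-6 characters
                )
            )
            /.*                  # everything after the domain
        )
    ~x';

    if (preg_match($pattern, $url, $m) !== 1) {
        return null; // no match, or a regex error
    }

    return [
        'url'    => $m['url'],
        'host'   => $m['host'],
        'domain' => $m['domain'],
    ];
}

print_r(splitUrl('http://www.thinkleandro.com/labels/politics.html'));
```

One thing to keep in mind: the pattern expects a slash after the domain, so in this sketch a bare address such as http://example.com returns null.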