Imagine you've got some user input that is supposed to be a valid URL, but it's user input, so you can't be sure of anything. It's not very consistent data, so you at least make sure to prepend a default scheme to it. It's a fairly common case. Sometimes I see it solved this way:

    my $url = 'example.com';
    $url = 'http://' . $url unless $url =~ m{http://}i;

This converts example.com to http://example.com, but it can be error-prone. For instance, what if I'd forgotten to make the regex case insensitive? Actually, I've already made a mistake. Did you spot it? In my haste I've neglected to deal with https URLs. Not good. URI::Heuristic can help here.

    use URI::Heuristic qw(uf_uristr);
    my $url = 'example.com';
    $url = uf_uristr( $url );

For this input, it produces the same result as the example above, but I've left the logic of checking for an existing scheme to the URI::Heuristic module. If you like this approach but would rather get a URI object back, try this:

    use feature 'say';
    use URI::Heuristic qw(uf_uri);
    my $url = 'example.com';
    $url = uf_uri( $url );   # uf_uri returns a URI object, uf_uristr a plain string
    say $url->as_string;

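If you want to see for yourself how the module treats input that already carries a scheme, a quick loop is enough. This is only a sketch: the list of inputs is my own, and the comments show what I'd expect from the module's documented behaviour rather than a guarantee for every version.

```perl
use strict;
use warnings;
use feature 'say';
use URI::Heuristic qw(uf_uristr);

# Every input below contains a dot, so the module should not need to
# fall back on guessing host names via DNS lookups.
my @inputs = (
    'example.com',            # no scheme: expect http://example.com
    'https://example.com',    # scheme already present: expect it unchanged
    'ftp.funet.fi',           # host starting with "ftp": expect ftp://ftp.funet.fi
);

for my $input (@inputs) {
    say "$input => ", uf_uristr($input);
}
```

Note the last case: the default scheme isn't always http, which is worth keeping in mind before you trust the result.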
Caveats

    use URI::Heuristic qw(uf_uri);
    my $url = uf_uri('/etc/passwd'); # file:/etc/passwd

A bare file path gets turned into a file: URI. Are we sure this is what we want? Checking the scheme is helpful, and even if we weren't using this module, we'd probably want to do it anyway.

    use List::AllUtils qw( any );
    use URI::Heuristic qw(uf_uri);

    my $url = uf_uri('/etc/passwd');
    unless ( $url->scheme && any { $url->scheme eq $_ } ('http', 'https') ) {
        die 'unsupported scheme: ' . $url->scheme;
    }

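If you find yourself doing this in more than one place, the guess and the scheme check can be wrapped up together. Here's a minimal sketch; `normalize_web_url` is a hypothetical helper name of my own, not something the module provides, and returning undef instead of dying is just one possible design choice.

```perl
use strict;
use warnings;
use feature 'say';
use URI::Heuristic qw(uf_uri);

# Hypothetical helper: returns a URI object for http/https input,
# and undef for anything else (including empty or undefined input).
sub normalize_web_url {
    my ($input) = @_;
    return unless defined $input && length $input;

    my $url    = uf_uri($input);
    my $scheme = $url->scheme // '';

    # Only web schemes get through; file:, mailto:, ftp: and friends are rejected.
    return unless $scheme eq 'http' || $scheme eq 'https';
    return $url;
}

for my $input ( 'example.com', 'https://example.com/login', '/etc/passwd' ) {
    my $url = normalize_web_url($input);
    say defined $url ? "ok: $url" : "rejected: $input";
}
```

Returning undef leaves it up to the caller to decide whether a rejected URL is fatal or just something to report back to the user.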
That's it! This module has been around for almost 18 years now, but it still solves some of today's problems.