Why does UrlValidator not work for some of my URLs?
B

7

11
String[] schemes = {"http","https"};
UrlValidator urlValidator = new UrlValidator(schemes, UrlValidator.ALLOW_ALL_SCHEMES);
System.out.println(urlValidator.isValid(myUrl));

The following URL is reported as invalid. Does anyone know why? localnet is a local network. Validation seems to work for any public domain.

http://aunt.localnet/songs/barnbeat.ogg
Bedizen answered 8/11, 2012 at 12:10 Comment(2)
UrlValidator presumably from apache commons? Have a look at the code and figure it out. Might be because it doesn't recognise localnet as a tld.Whither
org.apache.commons.validator.routines.UrlValidator;Bedizen
W
4

As I thought, it's failing on the top-level domain check:

String topLevel = domainSegment[segmentCount - 1];
if (topLevel.length() < 2 || topLevel.length() > 4) {
  return false;
}

Your top level is localnet, which is 8 characters long, so the length check rejects it.
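To see the effect in isolation, here is a minimal sketch that reproduces only the quoted length check using the JDK. The class name TopLevelLengthCheck and the method hasAcceptableTopLevel are made up for illustration and are not part of commons-validator; the real validator does more than this.

```java
final class TopLevelLengthCheck {
    // Hypothetical helper: mirrors only the quoted check, where the last
    // dot-separated label must be between 2 and 4 characters long.
    static boolean hasAcceptableTopLevel(String host) {
        String[] domainSegment = host.split("\\.");
        String topLevel = domainSegment[domainSegment.length - 1];
        return topLevel.length() >= 2 && topLevel.length() <= 4;
    }

    public static void main(String[] args) {
        // "localnet" has 8 characters, "com" has 3.
        System.out.println("aunt.localnet: " + hasAcceptableTopLevel("aunt.localnet"));
        System.out.println("example.com:   " + hasAcceptableTopLevel("example.com"));
    }
}
```

With this check, aunt.localnet is rejected and example.com is accepted, which matches the behaviour described in the question.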

Whither answered 8/11, 2012 at 12:35 Comment(1)
This was true for older versions of commons-validator, but is not true if you use org.apache.commons.validator.routines.UrlValidator from the current version (1.6)Osmo
C
5

The class you're using is deprecated. The replacement is

org.apache.commons.validator.routines.UrlValidator

which is more flexible. You can pass the flag ALLOW_LOCAL_URLS to the constructor, which allows most addresses like the one you are using. In our case, we had authentication data preceding the address, so we had to use the even more flexible UrlValidator(RegexValidator authorityValidator, long options) constructor.

Chancellery answered 17/4, 2013 at 18:30 Comment(4)
ALLOW_LOCAL_URLS is not enough to make such a host name pass validation.Buckler
I had to rebuild authority regular expression to skip domain segment validation code: new UrlValidator(new RegexValidator("^([\\p{Alnum}\\-\\.]*)(:\\d*)?(.*)?"))Buckler
Thanks Yves using RegexValidator worked for me too, otherwise http://fooserver.local/ isn't considered valid. I guess this regex would have to be changed to also support IPV6 addresses. I just asked on their mailing list markmail.org/thread/3ozz3azcnlpt4cxu if their list of LOCAL_TLDS should be expanded.Nne
@YvesMartin your regex seems to match everything in the third group, and that group seems to match everything.Southwesterly
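The point about the third group can be checked with java.util.regex alone. The sketch below compiles the regex from the comment unchanged and tests it with a full match, on the assumption that this is how RegexValidator applies it. The class name AuthorityRegexDemo and its helper methods are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AuthorityRegexDemo {
    // The authority regex quoted in the comment above, unchanged.
    static final Pattern AUTHORITY =
            Pattern.compile("^([\\p{Alnum}\\-\\.]*)(:\\d*)?(.*)?");

    // True if the whole input matches the pattern.
    static boolean accepts(String authority) {
        return AUTHORITY.matcher(authority).matches();
    }

    // Text captured by group 3, or null if the input does not match at all.
    static String trailingGroup(String authority) {
        Matcher m = AUTHORITY.matcher(authority);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        String[] samples = {"aunt.localnet", "aunt.localnet:8080", "!!! not a host"};
        for (String s : samples) {
            System.out.println(s + " accepted=" + accepts(s)
                    + " group3=[" + trailingGroup(s) + "]");
        }
    }
}
```

Because the first two groups may match nothing and the last group accepts any remaining characters, even a string such as "!!! not a host" is accepted, so this regex effectively switches authority validation off.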
G
1

This is fixed in the 1.4.1 release of the Apache Validator:

https://issues.apache.org/jira/browse/VALIDATOR-342
https://issues.apache.org/jira/browse/VALIDATOR/fixforversion/12320156

A simple upgrade to the latest version of the validator should fix things nicely.

Groth answered 17/8, 2015 at 19:20 Comment(2)
The VALIDATOR-342 bug is a different problem. It is caused by the .rocks TLD: "The .rocks TLD is quite recent and is not included in the list used by the Domain validator."Curtal
Thanks @drchuck! It helped in my case. I was tricked by Google. They're showing commons-validator 1.4.0 as the first result while current version is 1.5.1!Fieldstone
T
0

Check line 2 of your code; it should be:

new UrlValidator(schemes);

If you want to allow two slashes and disallow fragments:

new UrlValidator(schemes, ALLOW_2_SLASHES + NO_FRAGMENTS);
Tripitaka answered 8/11, 2012 at 12:23 Comment(0)
A
0

Here is the source code for isValid(String) method:

You can call each step manually and check its result to understand where validation fails.

Alfie answered 8/11, 2012 at 12:24 Comment(2)
oops, those methods are protected. But the docs says that isValid is deprecated: commons.apache.org/validator/apidocs/org/apache/commons/…Alfie
I use "validator.routines.UrlValidator", which is not deprecated.Bedizen
K
0

The library method fails on this URL:

  "http://en.wikipedia.org/wiki/3,2,1..._Frankie_Go_Boom"

This is a perfectly legal (and existing) URL.

I found by trial and error that the code below is more accurate:

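// Requires: java.net.MalformedURLException, java.net.URISyntaxException, java.net.URL
public static boolean isValidURL(String url)
{
    URL u = null;
    try
    {
        // Rejects strings with a missing or unknown protocol
        u = new URL(url);
    }
    catch (MalformedURLException e)
    {
        return false;
    }

    try
    {
        // Rejects characters that are illegal in a URI, such as unencoded spaces
        u.toURI();
    }
    catch (URISyntaxException e)
    {
        return false;
    }

    return true;
}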
Kook answered 30/9, 2013 at 11:30 Comment(1)
Note that this only works if your URL is already encoded! URI only accepts URLs with encoded query parameters, else it will throw an exception!Southwesterly
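The encoding caveat is easy to reproduce with the JDK alone. The following is a compact sketch of the same URL plus toURI() check. The class name UrlSyntaxCheck and the example.com URLs are assumptions for illustration, and note that the URL(String) constructor is deprecated in recent JDKs (20 and later), so it may produce a compiler warning.

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

final class UrlSyntaxCheck {
    // Same two-step check as the answer above, folded into one try block.
    static boolean isValidURL(String url) {
        try {
            new URL(url).toURI();
            return true;
        } catch (MalformedURLException | URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://aunt.localnet/songs/barnbeat.ogg",
            "http://en.wikipedia.org/wiki/3,2,1..._Frankie_Go_Boom",
            "http://example.com/search?q=a%20b",  // encoded space
            "http://example.com/search?q=a b",    // raw space
            "not a url"
        };
        for (String s : samples) {
            System.out.println(isValidURL(s) + "  " + s);
        }
    }
}
```

The raw space makes toURI() throw URISyntaxException, while the percent-encoded variant passes. This check is purely syntactic: it says nothing about whether the host exists or whether the TLD is known.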
I
0

You can use the following:

UrlValidator urlValidator = new UrlValidator(schemes, new RegexValidator("^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$"), 0L);
In answered 23/10, 2014 at 17:39 Comment(1)
This does not allow a port at all, and the last label can be at most 6 characters long, as set by {2,6} in your regex.Southwesterly
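Both limitations can be shown with java.util.regex, without the validator library. This sketch applies the regex from the answer as a full match against a few authorities, assuming that is how RegexValidator uses it. The class name HostRegexDemo and the sample hosts are hypothetical.

```java
import java.util.regex.Pattern;

final class HostRegexDemo {
    // The host regex from the answer above, unchanged.
    static final Pattern HOST = Pattern.compile(
            "^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$");

    // True if the whole authority string matches the pattern.
    static boolean accepts(String authority) {
        return HOST.matcher(authority).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "example.com",
            "fooserver.local",
            "aunt.localnet",     // last label has 8 letters
            "example.com:8080"   // contains a port
        };
        for (String s : samples) {
            System.out.println(accepts(s) + "  " + s);
        }
    }
}
```

Note that the host from the question, aunt.localnet, is still rejected by this regex because localnet is longer than 6 letters, so the quantifier would need to be widened for that case.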
