Java Pattern class doesn't have a public constructor, why?

6

11

I've been reviewing the Java regex library and was surprised to find that the Pattern class does not have a public constructor, something I had taken for granted for years.

One reason I suspected for favoring the static compile method over a constructor is that a constructor always returns a new object, while a static method might return a previously created (and cached) object when the pattern string is the same.

However, it is not the case as demonstrated by the following.

import java.util.regex.Pattern;

public class PatternCompiler {
    public static void main(String[] args) {
        Pattern first = Pattern.compile(".");
        Pattern second = Pattern.compile(".");
        // == compares references, so this checks whether compile reused an instance
        if (first == second) {
            System.out.println("The same object has been reused!");
        } else {
            System.out.println("Why not just use constructor?");
        }
    }
}

Any other strong rationales behind using static method over constructor?

Edit: I found a related question here. None of the answers there convinced me either. Reading through all answers, I get a feeling that a static method has quite a few advantages over a public constructor regarding creating an object but not the other way around. Is that true? If so, I'm gonna create such static methods for each one of my classes and safely assume that it's both more readable and flexible.

Cecilla answered 7/12, 2012 at 7:28 Comment(11)
Did you try to dive into the source code?Etiquette
Perhaps the reason is to ensure readability.Etiquette
@JanDvorak new Pattern(".") is as readable if not more so, isn't it :)Cecilla
What if, hypothetically, there was another way to build a Pattern than from a regex string, that also happens to be a String (say, from a JSON-encoded DFA transition table)?Etiquette
@JanDvorak Well, you can overload a constructor as often as you can a static method.Cecilla
not if both overloads have the same signature.Etiquette
@JanDvorak Good point. I think I got what you meant. compile may potentially have a mechanism to decide which constructor to use depending on the actual input provided that multiple constructors are available. Feel free to list it as an answer and I'll vote for it :)Cecilla
No, the point is you can have several methods with different names, but you can't have several constructors with different names. Example: Pattern.compile(str) and Pattern.compileFromJson(str) and Pattern.compileFromXML(str).Historiography
@PhilippWendler Thanks for correcting me. This is a good point deserving to be an answer I think.Cecilla
What would be gained by adding a constructor? This is an honest question.Fenugreek
This is related: #856018Jarlathus
15

Generally, a class won't have a public constructor for one of three reasons:

  • The class is a utility class and there is no reason to instantiate it (for example, java.lang.Math).
  • Instantiation can fail, and a constructor can't return null.
  • A static method clarifies the meaning behind what happens during instantiation.

In the case of Pattern, the third reason applies: the static compile method is used solely for clarity. Constructing a pattern via new Pattern(..) doesn't make sense from an explanatory point of view, because a sophisticated process goes on to create a new Pattern. To convey this process, the static method is named compile, because the regex is essentially compiled to create the pattern.

In short, there is no programmatic purpose for making Pattern only constructable via a static method.

Minneapolis answered 7/12, 2012 at 7:50 Comment(12)
Does the second apply here too?Alvord
@Alvord Nope, there's no case in which Pattern.compile returns null. Instantiation can still fail, but it just throws an exception.Minneapolis
look at this gist please: gist.github.com/sanandrea/686a5a3d762e177d74fe it returns null when it fails to compile the pattern. If I did it correctly...Alvord
Pattern.compile itself isn't returning null; you set pattern to null. And then nothing alters pattern if Pattern.compile throws an exception.Minneapolis
A constructor can't return null, so it can be necessary to create a static construction wrapper if the goal is to either return an instance or return null.Minneapolis
@Alvord Is this clear to you? I feel like I'm not getting my points across to you. What is unclear about the second point?Minneapolis
yes I am not understanding why the second point does not apply here. Could you please provide an example where the second point applies if this is not the case.Alvord
If you look at the source code of the pattern class, you can see that Pattern.compile can't return null, because it simply returns what calling the constructor does, and constructors can't return null, so the second point does not apply here. I can't think of an example of the second point off the top of my head, but it's something I've seen before.Minneapolis
So either the second point is not well defined or it applies also in this case.Alvord
How does it apply in the case of pattern? The source code clearly cannot return null, so I'm not sure why you think this. Your example creates a use case where null is the result of a failed pattern compilation, but this doesn't mean that Pattern.compile itself returns null (because it doesn't!).Minneapolis
SplashScreen is an example of the second point: docs.oracle.com/javase/8/docs/api/java/awt/SplashScreen.htmlMinneapolis
thank you for all the explanations in first place. I want to remove the downvote but yet your point does not convince me. SplashScreen example does not count too: This class cannot be instantiated. Only a single instance of this class can exist, and it may be obtained by using the getSplashScreen() static method. In case the splash screen has not been created at application startup via the command line or manifest file option, the getSplashScreen method returns null. So it is not an instantiation but a singleton pattern.Alvord
9

One possible reason is that this way, caching can later be added into the method.

Another possible reason is readability. Consider this (often cited) object:

class Point2d{
  static Point2d fromCartesian(double x, double y);
  static Point2d fromPolar(double abs, double arg);
}

Point2d.fromCartesian(1, 2) and Point2d.fromPolar(1, 2) are both perfectly readable and unambiguous (well... apart from the argument order).

Now, consider new Point2d(1, 2). Are the arguments cartesian coordinates, or polar coordinates? It's even worse if constructors with similar / compatible signatures have entirely different semantics (say, int, int is cartesian, double, double is polar).

This rationale applies to any object that can be constructed in multiple ways that differ in more than just the argument types. While Pattern can currently only be compiled from a regex, different representations of a Pattern may come in the future (admittedly, compile would then be a bad method name).
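To make the Point2d example concrete, here is a minimal sketch. The private constructor, the stored Cartesian fields, the x() and y() accessors, and the choice of radians for arg are all assumptions added for illustration; only the two factory signatures come from the answer.

```java
// Sketch only: the constructor is private, so callers must pick a named
// factory that states which coordinate system the arguments use.
final class Point2d {
    // Assumption: the point is stored internally in Cartesian form.
    private final double x;
    private final double y;

    private Point2d(double x, double y) {
        this.x = x;
        this.y = y;
    }

    static Point2d fromCartesian(double x, double y) {
        return new Point2d(x, y);
    }

    // Assumption: abs is the radius and arg is the angle in radians.
    static Point2d fromPolar(double abs, double arg) {
        return new Point2d(abs * Math.cos(arg), abs * Math.sin(arg));
    }

    double x() {
        return x;
    }

    double y() {
        return y;
    }
}
```

Both factories take (double, double), which two constructors could not do, and the call site reads as Point2d.fromPolar(2, 0) rather than an ambiguous new Point2d(2, 0).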

Another possible reason, mentioned by @Vulcan, is that a constructor should not fail.

If Pattern.compile encounters an invalid pattern, it throws a PatternSyntaxException. Some people may consider it bad practice to throw an exception from a constructor. Admittedly, FileInputStream does exactly that. Similarly, if the design decision had been to return null from the compile method, this would not have been possible with a constructor.
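The difference between throwing and returning null can be sketched with a small wrapper. The Patterns class and its compileOrNull method are hypothetical names and are not part of the JDK; Pattern.compile itself never returns null.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of java.util.regex.
final class Patterns {
    private Patterns() {
        // no instances; this class only hosts the static factory
    }

    // Returns the compiled pattern, or null if the regex is malformed.
    // A constructor could not do this: `new` either yields an object or throws.
    static Pattern compileOrNull(String regex) {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            return null;
        }
    }
}
```

With this wrapper, Patterns.compileOrNull("(") yields null for the unclosed group, whereas Pattern.compile("(") throws.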


In short, a constructor is not a good design choice if:

  • caching may take place, or
  • the constructor is semantically ambiguous, or
  • the creation may fail.
Etiquette answered 7/12, 2012 at 8:3 Comment(11)
+1, but I disagree about constructors not supposed to throw exceptions. I nearly included the same argument about exception throwing in my answer, but then I considered the counter-point of new FileInputStream(..) which can throw an IOException. While I agree that throwing exceptions in constructors is messy, it's not an uncommon practice, especially in the java.net and java.io packages.Minneapolis
Also, the compile method is also very often called from a static initialization block (static final Pattern p = Pattern.compile(".");), so there is no difference here.Historiography
@PhilippWendler au contraire. The constructors that are most often in a static initialisation block are the worst to throw an exception (but the same applies to other methods there, so you are right).Etiquette
@Vulcan what about "might be considered a bad practice"?Etiquette
Pattern.compile is nothing more than a named constructor. It can and will be called anywhere and everywhere someone would have said new Pattern if they could -- including static init blocks -- and would cause all the same problems. So any advice about avoiding exceptions in constructors should apply to it as well. The rules don't change just because the name does.Kilohertz
@Kilohertz I believe constructors are not expected to fail. Static methods are.Etiquette
@JanDvorak: Do you have a link or something that says this? I've heard it said in C++, but there are reasons for it there that don't quite apply in Java.Kilohertz
@Kilohertz I don't. Should I remove the statement from the answer?Etiquette
@JanDvorak: Personally, i would. But that's cause i disagree with it, so i'm a bit biased already. :) Keep it if you can back it up.Kilohertz
@Kilohertz I have used the "may be considered" clause. Is it enough to say it might be a matter of opinion?Etiquette
@JanDvorak: As long as the "matter of opinion" part is clarified. Maybe "Some people might consider it a bad practice...", or something like that. Still a bit weasel-wordy, but at least it has less of a "this is so" feel.Kilohertz
6

This is just a design decision. In this case there is no "real" advantage. However, this design allows optimisation (caching for instance) without changing the API. See http://gbracha.blogspot.nl/2007/06/constructors-considered-harmful.html
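As an illustration of the caching that a static factory leaves room for, here is a sketch of a wrapper that reuses compiled patterns. CachedPatterns and its of method are hypothetical; the question's own test shows that Pattern.compile does not currently behave this way.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK: a static factory that caches
// compiled patterns keyed by their regex string.
final class CachedPatterns {
    private static final ConcurrentMap<String, Pattern> CACHE = new ConcurrentHashMap<>();

    private CachedPatterns() {
        // no instances; this class only hosts the factory method
    }

    // Returns the cached Pattern for this regex, compiling it on first use.
    static Pattern of(String regex) {
        return CACHE.computeIfAbsent(regex, Pattern::compile);
    }

    public static void main(String[] args) {
        // The factory may hand back an existing instance for an equal regex.
        System.out.println(of(".") == of("."));
        // Two direct compile calls, for comparison with the question's test.
        System.out.println(Pattern.compile(".") == Pattern.compile("."));
    }
}
```

Callers see the same signature whether or not the cache exists, which is the API-stability point made above; sharing instances this way is reasonable here because Pattern is immutable. Note that this sketch never evicts entries, so it would need a bound before being used with unbounded input.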

Argon answered 7/12, 2012 at 7:46 Comment(1)
One more reason is that adding such an optimisation to a constructor later would be bad practice (it means passing this out while still inside the constructor). Doing the same in a static method is fine.Jehial
5

Factory methods have several advantages, some of which are already covered in other answers. The advice to consider static factory methods instead of constructors is even the very first item in the great book "Effective Java" by Joshua Bloch (a must-read for every Java programmer).


One advantage is that you can have several factory methods which have the same parameter signatures but different names. This you can't achieve with constructors.

For example, one might want to create a Pattern from several input formats, all of which are just Strings:

class Pattern {
  static Pattern compile(String regexp) { ... }
  static Pattern compileFromJson(String json) { ... }
  static Pattern compileFromXML(String xml) { ... }
}

Even if you are not doing this when you create the class, factory methods give you the ability to add such methods later without causing weirdness.

For example, I have seen classes where the need for a new constructor came later, and a special, meaningless second parameter had to be added to the second constructor in order to allow overloading. Obviously, this is very ugly:

class Ugly {
  Ugly(String str) { ... }

  /* This constructor interprets str in some other way.
   * The second parameter is ignored completely. */
  Ugly(String str, boolean ignored) { ... }
}

Unfortunately, I can't remember the name of such a class, but I think it even was in the Java API.


Another advantage which has not been mentioned before is that with factory methods in combination with package-private constructors you can prohibit sub-classing for others, but still use sub-classes yourself. In the case of Pattern, you might want to have private sub-classes like CompiledPattern, LazilyCompiledPattern, and InterpretedPattern, but still prohibit sub-classing to ensure immutability.

With a public constructor, you can either prohibit sub-classing for everybody, or not at all.
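A minimal sketch of this idea follows. It uses a private constructor with nested subclasses, which is a stricter variant of the package-private constructor described above. TextMatcher, Literal, and Regex are hypothetical names chosen for illustration and have nothing to do with how Pattern is actually implemented.

```java
import java.util.regex.Pattern;

// Hypothetical example: outside code cannot subclass TextMatcher because
// its only constructor is private, yet the class uses subclasses itself.
abstract class TextMatcher {
    private TextMatcher() {
        // reachable only from code nested inside TextMatcher
    }

    abstract boolean matches(String input);

    // Factory for exact, character-by-character comparison.
    static TextMatcher literal(String text) {
        return new Literal(text);
    }

    // Factory that treats its argument as a regular expression.
    static TextMatcher regex(String regex) {
        return new Regex(regex);
    }

    private static final class Literal extends TextMatcher {
        private final String text;

        Literal(String text) {
            this.text = text;
        }

        @Override
        boolean matches(String input) {
            return text.equals(input);
        }
    }

    private static final class Regex extends TextMatcher {
        private final Pattern pattern;

        Regex(String regex) {
            this.pattern = Pattern.compile(regex);
        }

        @Override
        boolean matches(String input) {
            return pattern.matcher(input).matches();
        }
    }
}
```

Callers only ever see the TextMatcher type, so the set of implementations can change without affecting them.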

Historiography answered 7/12, 2012 at 7:53 Comment(0)
2

If you really want to take the deep dive, plunge into the archives of JSR 51.

Regular expressions were introduced as part of JSR 51, so its archives are where you might still find the design decisions: http://jcp.org/en/jsr/detail?id=51

Billibilliard answered 7/12, 2012 at 7:51 Comment(0)
1

It has a private constructor.

/**
 * This private constructor is used to create all Patterns. The pattern
 * string and match flags are all that is needed to completely describe
 * a Pattern. An empty pattern string results in an object tree with
 * only a Start node and a LastNode node.
 */
private Pattern(String p, int f) {

and the compile method calls into that:

public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}

Since you are using the == comparison, which compares references, your check will never succeed: each call to compile creates a new object.

The only reason I can think of for this behaviour is that the match flags default to zero in the compile method, which acts as a factory method.
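The flag defaulting can be observed through the public API, since Pattern also offers a two-argument compile(String, int) overload and a flags() accessor. The class name CompileFlagsDemo and the sample regex are made up for this sketch.

```java
import java.util.regex.Pattern;

// Illustrative demo class; only the Pattern calls are real JDK API.
final class CompileFlagsDemo {
    public static void main(String[] args) {
        // One-argument overload: no match flags are set.
        Pattern plain = Pattern.compile("java");
        // Two-argument overload: the caller chooses the flags explicitly.
        Pattern relaxed = Pattern.compile("java", Pattern.CASE_INSENSITIVE);

        System.out.println(plain.flags());
        System.out.println(plain.matcher("JAVA").matches());
        System.out.println(relaxed.matcher("JAVA").matches());
    }
}
```

As the comments under this answer point out, an overloaded constructor could default the flags just as easily, so this alone does not explain the missing public constructor.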

Ionia answered 7/12, 2012 at 7:31 Comment(6)
Well, I didn't say it hasn't a constructor. I just said it does not have a PUBLIC constructor :)Cecilla
The question is why is it so, especially when the compile method is this simple?Etiquette
compile will behave as a factory method creating Pattern instances. The only reason I can think of is defaulting the match flag to zero in the compile method.Ionia
@AjayGeorge sure, but Pattern does not need a factory ;-)Etiquette
The defaulting behavior can just as easily happen in a constructor.Etiquette
true.. it is just hiding that part from most users who are not concerned about what that flag is. Maybe there is a better reason. Let us wait for more answers to be posted.Ionia
