Robustness in Java

Question:

Good day! I recently came across a very interesting Java puzzle that I would like to share.

Given the following class:

import java.util.HashSet;
import java.util.Set;

public class RobustCheck {

    private final char first;
    private final char second;

    public RobustCheck(char first, char second) {
        this.first = first;
        this.second = second;
    }

    public boolean equals (RobustCheck b) {
        return this.first == b.first && this.second == b.second;
    }

    public int hashCode() {
        return 31 * first + second;
    }

    public static void main(String[] args) {
        Set<RobustCheck> s = new HashSet<RobustCheck>();
        for(int i=0; i<10; i++)
            for(char ch='a'; ch<='z'; ch++) {
                s.add(new RobustCheck(ch, ch));
            }
        System.out.println(s.size());
    }
}

And the questions:

  1. Why does the program print 260 instead of 26? How do I fix it?
  2. What is missing in this program to make the compiler report the error?

Answer:

This puzzle tests knowledge of the cornerstones of the Java language and of the flaws in its design. Here, the method RobustCheck#equals overloads Object#equals instead of overriding it: it takes a RobustCheck parameter, so it is a separate method that sits alongside the inherited equals(Object). HashSet calls equals(Object), so RobustCheck objects end up being compared by the inherited Object#equals, which compares identity rather than equivalence. All 260 RobustCheck objects created in the program are distinct, because each one is a physically separate object in its own memory region. The HashSet therefore treats them as unequal and adds every one of them to the set, even though only 26 of them are logically distinct and each later object is equivalent to one of those 26. To change the comparison semantics we need to override Object#equals, which means the signature of RobustCheck#equals must match the signature of Object#equals. Let's rewrite the method like this:

public boolean equals (Object b) {
    if (b instanceof RobustCheck)
        return this.first == ((RobustCheck)b).first && this.second == ((RobustCheck)b).second;
    else
        return false;
}
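
To see the difference between the two methods in isolation, here is a minimal sketch. The names OverloadDemo, Pair, compareAsPair and compareAsObject are made up for this illustration; Pair simply mirrors the original RobustCheck with its overloaded equals. The compiler picks the method from the static type of the argument, and HashSet always passes an Object.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; Pair mirrors the original RobustCheck with the overloaded equals.
class OverloadDemo {

    static final class Pair {
        private final char first;
        private final char second;

        Pair(char first, char second) {
            this.first = first;
            this.second = second;
        }

        // Overload: a new method next to Object#equals(Object), not a replacement for it.
        public boolean equals(Pair other) {
            return first == other.first && second == other.second;
        }

        @Override
        public int hashCode() {
            return 31 * first + second;
        }
    }

    // The static type of the argument is Pair, so the compiler picks equals(Pair).
    static boolean compareAsPair(Pair a, Pair b) {
        return a.equals(b);
    }

    // The static type of the argument is Object, so the compiler picks
    // Object#equals(Object), which compares references.
    static boolean compareAsObject(Pair a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Pair a = new Pair('a', 'a');
        Pair b = new Pair('a', 'a');
        System.out.println(compareAsPair(a, b));   // true
        System.out.println(compareAsObject(a, b)); // false

        Set<Pair> set = new HashSet<>();
        set.add(a);
        set.add(b);
        System.out.println(set.size()); // 2
    }
}
```

The second call is the one that matters for the puzzle: the set only ever sees the Object version, so two equivalent objects both end up in it.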

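So that the compiler can catch such errors (overloading by mistake where overriding was intended), let us state our intention to override the Object#equals method in the derived RobustCheck class by adding the appropriate annotation. With the annotation in place, the original equals(RobustCheck) signature would be rejected at compile time, because it does not override any method of the superclass: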

@Override
public boolean equals (Object b) {
    if (b instanceof RobustCheck)
        return this.first == ((RobustCheck)b).first && this.second == ((RobustCheck)b).second;
    else
        return false;
}

Now everything works as it should:

>java RobustCheck
26
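
Putting the pieces together, a complete corrected class might look like the sketch below. The class name RobustCheckFixed and the helper countDistinct are made up for this example so that it can sit next to the original; the logic is the same as in the fragments above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical name; a corrected copy of RobustCheck from the question.
final class RobustCheckFixed {

    private final char first;
    private final char second;

    RobustCheckFixed(char first, char second) {
        this.first = first;
        this.second = second;
    }

    // If the parameter type were RobustCheckFixed, @Override would make javac reject
    // the method, because it would no longer override anything in Object.
    @Override
    public boolean equals(Object b) {
        if (!(b instanceof RobustCheckFixed)) {
            return false;
        }
        RobustCheckFixed other = (RobustCheckFixed) b;
        return first == other.first && second == other.second;
    }

    @Override
    public int hashCode() {
        return 31 * first + second;
    }

    // Helper added for this sketch: repeats the loop from the question
    // and returns the resulting set size.
    static int countDistinct() {
        Set<RobustCheckFixed> s = new HashSet<>();
        for (int i = 0; i < 10; i++) {
            for (char ch = 'a'; ch <= 'z'; ch++) {
                s.add(new RobustCheckFixed(ch, ch));
            }
        }
        return s.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct()); // 26
    }
}
```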

Unfortunately, this does not fix the flaws in Java's design. A more interesting question is how to avoid all this mumbo jumbo altogether when defining how two objects are compared. For ordinary Java classes there is unfortunately no answer (records, a standard feature since Java 16, do generate equals and hashCode automatically). But more elaborate languages can be used. For example, Scala defines correct equivalence semantics for immutable objects, relieving the programmer of this error-prone task:

object Main extends App {
    // Immutable objects are modeled in Scala with special
    // "case" classes.
    case class RobustCheck(first: Char, second: Char)

    val s = collection.mutable.HashSet.empty[RobustCheck]

    for (i <- 1 to 10)
        for (c <- 'a' to 'z')
            s += RobustCheck(c, c)

    println(s.size)
}

We get the expected result:

>scala Main
26

Without any problems.
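
For comparison, newer Java versions narrow this gap. The sketch below assumes Java 16 or later, where records are a standard language feature; the wrapper class RecordDemo and the helper countDistinct are hypothetical names used only for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; assumes Java 16 or later.
class RecordDemo {

    // The compiler generates equals(Object), hashCode() and toString()
    // from the record components.
    record RobustCheck(char first, char second) {}

    // Repeats the loop from the question and returns the resulting set size.
    static int countDistinct() {
        Set<RobustCheck> s = new HashSet<>();
        for (int i = 0; i < 10; i++) {
            for (char ch = 'a'; ch <= 'z'; ch++) {
                s.add(new RobustCheck(ch, ch));
            }
        }
        return s.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct()); // 26
    }
}
```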
