Back to hashCode() mutability

May 29, 2007

My «HashSet.contains(): does your basket contain something?» post got the expected responses: «There is no way to avoid this behavior, why should you expect something else?».

Sure, this behavior cannot be changed; it is in the nature of any hashed collection. I do not expect anyone to tilt at windmills. What I do expect is, first, to make sure I never get stuck with this bug again and, second, to prevent others from falling into the same trap. And this is definitely the point where I expect the software vendors to help me.
Let's look once more at the root of the problem: the hash code must not change while the entity is in the collection. Actually, since we can never be sure that it is not in some collection, the hash code should never change after the entity is created. Thus, hash code calculation should be independent of object state, i.e. of its mutable fields.
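
To make the trap concrete, here is a minimal sketch of it. The class names (MutableHashDemo, MutablePoint) are made up for illustration and do not come from the original post; the only assumption is the standard java.util.HashSet.

```java
import java.util.HashSet;
import java.util.Set;

class MutableHashDemo {
    // Returns true if the set still finds the element after it was mutated.
    static boolean foundAfterMutation() {
        Set<MutablePoint> basket = new HashSet<>();
        MutablePoint p = new MutablePoint(1);
        basket.add(p);
        p.x = 2; // the hash code changes while p is inside the set
        return basket.contains(p);
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
    }
}

// Hypothetical class: its hash code depends on a mutable field.
class MutablePoint {
    int x; // not final, so hashCode() can return a different value later

    MutablePoint(int x) { this.x = x; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutablePoint && ((MutablePoint) o).x == x;
    }

    @Override
    public int hashCode() { return Integer.hashCode(x); }
}
```

The set still holds a reference to p, but the lookup uses the new hash code and no longer finds the entry that was stored under the old one.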

Thus, the following suspicious code patterns deserve special attention in this context (I'll generally use Java notation; the C# variation is trivial):

  • Mutable hash code: hashCode() reads fields that are not final, or calls methods that read non-final fields or call methods that… and so on.
  • Using overridden hashCode(): an instance of a class with an overridden hashCode() is added to a hash-based collection (or to one of its interfaces). More generally, to any collection instance.
  • Broken contract: hashCode() and equals() do not access the same fields or do not call the same methods (which in turn do not access the same fields… and so on).
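
The third pattern can bite even when every field is final. The following sketch uses a hypothetical Account class (not from the original post) whose equals() looks at one field while hashCode() looks at another:

```java
import java.util.HashSet;
import java.util.Set;

class BrokenContractDemo {
    // Returns true when two objects are equal, yet the set cannot find one via the other.
    static boolean equalButNotFound() {
        Set<Account> accounts = new HashSet<>();
        Account stored = new Account(1, "alice");
        accounts.add(stored);
        Account probe = new Account(1, "bob"); // same id, different owner
        return probe.equals(stored) && !accounts.contains(probe);
    }

    public static void main(String[] args) {
        System.out.println("equal but not found: " + equalButNotFound());
    }
}

// Hypothetical class: immutable, but equals() and hashCode() disagree on fields.
final class Account {
    private final int id;
    private final String owner;

    Account(int id, String owner) {
        this.id = id;
        this.owner = owner;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Account && ((Account) o).id == id; // compares id only
    }

    @Override
    public int hashCode() {
        return owner.hashCode(); // hashes owner only: breaks the contract
    }
}
```

Here immutability does not help: equal objects produce different hash codes, so the lookup misses the stored entry.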

What can prevent or warn us about the patterns mentioned?

  • Language level: not really an option, since providing language-level object identity is almost equivalent to simply forbidding hashCode() overrides.
  • C# 3.0 anonymous types take a similar approach: the hash code of an object is immutable, since both Equals() and GetHashCode() are compiler-generated and the properties are read-only.

  • Compiler warning level: would be nice, although implementing recursive method inspection would require defining new paradigms and would add unnecessary complexity. In addition, this functionality would have to be implemented separately for each language on the platform.
  • Code inspection: the most desirable option; it should work at the bytecode level and can be easily integrated into existing IDEs.
  • The IntelliJ IDEA 6 inspections list offers a good inspection for mutable hash codes, and a less powerful one for the broken contract.
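
As a rough idea of what the "mutable hash code" inspection looks for, here is a deliberately crude sketch based on reflection rather than bytecode analysis. All names (HashCodeInspection, Counter, Label) are hypothetical. It over-approximates: it flags every non-final instance field of a class that declares its own hashCode(), whether or not hashCode() actually reads that field. A real inspection would have to follow field accesses and method calls.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class HashCodeInspection {
    // Names of non-final instance fields declared by a class that declares hashCode().
    static List<String> suspiciousFields(Class<?> type) {
        List<String> result = new ArrayList<>();
        if (!declaresHashCode(type)) {
            return result; // inherited hashCode(): nothing to report for this class
        }
        for (Field f : type.getDeclaredFields()) {
            int m = f.getModifiers();
            if (!Modifier.isStatic(m) && !Modifier.isFinal(m) && !f.isSynthetic()) {
                result.add(f.getName());
            }
        }
        return result;
    }

    static boolean declaresHashCode(Class<?> type) {
        try {
            type.getDeclaredMethod("hashCode");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Counter: " + suspiciousFields(Counter.class));
        System.out.println("Label: " + suspiciousFields(Label.class));
    }
}

// Hypothetical example: hash code built from a mutable field.
class Counter {
    int value;
    @Override public boolean equals(Object o) {
        return o instanceof Counter && ((Counter) o).value == value;
    }
    @Override public int hashCode() { return value; }
}

// Hypothetical example: hash code built from a final field only.
class Label {
    final String text;
    Label(String text) { this.text = text; }
    @Override public boolean equals(Object o) {
        return o instanceof Label && ((Label) o).text.equals(text);
    }
    @Override public int hashCode() { return text.hashCode(); }
}
```

A check like this only covers the first link of the "and so on" chain; following the methods that hashCode() calls is where the real work of an inspection lies.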

This small and annoying point is just a tiny part of the features missing from existing IDEs, which are expected to let the developer concentrate on application business logic rather than on language or infrastructure implementation details.

5 Responses to “Back to hashCode() mutability”

  1. virtualblackfox Says:

    Seems like a fun small rule to implement in Gendarme (http://www.mono-project.com/Gendarme)


  2. Yes, as well as in FindBugs.

    The problem is that tools like Gendarme and FindBugs will always lose the battle to new constraints introduced by new framework versions or updates, just because they are reactive rather than proactive in defining their inspections.


  3. […] 11th, 2007 Another example of problem similar to hashCode() mutability is an option for broken «consistent with equals» contract imposed by implementation of […]

  4. kostat Says:

    Let the IDE generate the equals() code, since its implementation is completely dictated by getHashCode(). In some very special cases the order of comparisons can be manually corrected to improve performance.


