Back to hashCode() mutability
May 29, 2007
My «HashSet.contains(): does your basket contain something?» post got exactly the responses I expected: «There is no way to avoid this behavior, why should you expect something else?».
Sure, this behavior cannot be changed; it is in the nature of any hashed collection. I do not expect anyone to tilt at windmills. What I do expect is, first, to make sure I never get stuck with this bug again and, second, to prevent others from falling into the same trap. And this is definitely the point where I expect the software vendors to help me.
Let’s look once more into the root of the problem: the hash code should not change while the entity exists in the collection. Actually, since we can never be sure it does not, the hash code should never change after the entity is created. Thus, hash code calculation should be independent of object state, i.e. of its mutable fields.
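To make the trap concrete, here is a minimal sketch in Java. The classes MutablePoint, FixedPoint and HashMutationDemo are hypothetical names I made up for illustration; they are not taken from the original post. The first class computes its hash code from a non-final field, the second only from a final one.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: hashCode() depends on a non-final field.
class MutablePoint {
    int x; // mutable, so the hash code can change after insertion

    MutablePoint(int x) { this.x = x; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutablePoint && ((MutablePoint) o).x == x;
    }

    @Override
    public int hashCode() { return Integer.hashCode(x); }
}

// Hypothetical counterpart: hashCode() reads final fields only.
final class FixedPoint {
    private final int x;

    FixedPoint(int x) { this.x = x; }

    // A "change" produces a new object instead of modifying this one.
    FixedPoint withX(int newX) { return new FixedPoint(newX); }

    @Override
    public boolean equals(Object o) {
        return o instanceof FixedPoint && ((FixedPoint) o).x == x;
    }

    @Override
    public int hashCode() { return Integer.hashCode(x); }
}

class HashMutationDemo {
    // True if the set can no longer find the element after it was mutated.
    static boolean lostAfterMutation() {
        Set<MutablePoint> basket = new HashSet<>();
        MutablePoint p = new MutablePoint(1);
        basket.add(p);
        p.x = 2; // the hash code changes while p sits in the set
        return !basket.contains(p);
    }

    // True if the immutable element is still found after a "change".
    static boolean stillFoundWhenImmutable() {
        Set<FixedPoint> basket = new HashSet<>();
        FixedPoint p = new FixedPoint(1);
        basket.add(p);
        FixedPoint moved = p.withX(2); // p itself is untouched
        return basket.contains(p) && !basket.contains(moved);
    }

    public static void main(String[] args) {
        System.out.println("lost after mutation: " + lostAfterMutation());
        System.out.println("immutable still found: " + stillFoundWhenImmutable());
    }
}
```

With the standard java.util.HashSet, the mutated element is looked up under its new hash code and is not found, although it is still physically in the set. The immutable variant avoids the problem by construction, at the price of creating a new object for every change.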
Thus, the following suspicious code patterns deserve special attention in this context (I’ll generally use Java notation; the C# variation is trivial):
- Mutable hash code: hashCode() accesses fields that are not final, or calls methods that access fields that are not final or call methods that… and so on.
- Using overridden hashCode(): an instance of a class with overridden hashCode() is added to a hash-based collection (or to one of its interfaces). More generally, to any collection instance.
- Broken contract: hashCode() and equals() do not access the same fields or do not call the same methods (that do not access the same fields… and so on).
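The third pattern in the list above can be sketched as follows. BrokenKey and BrokenContractDemo are hypothetical names used only for illustration: equals() compares one field, while hashCode() mixes in a second one, so two objects that are equal end up with different hash codes.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class: equals() and hashCode() do not read the same fields.
class BrokenKey {
    private final String name;
    private final int id;

    BrokenKey(String name, int id) {
        this.name = name;
        this.id = id;
    }

    // Compares name only...
    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenKey && name.equals(((BrokenKey) o).name);
    }

    // ...but hashes name and id, which breaks the equals/hashCode contract.
    @Override
    public int hashCode() { return Objects.hash(name, id); }
}

class BrokenContractDemo {
    // True if two equal keys are treated as different by a HashSet.
    static boolean equalButNotFound() {
        BrokenKey a = new BrokenKey("a", 1);
        BrokenKey b = new BrokenKey("a", 2);
        Set<BrokenKey> basket = new HashSet<>();
        basket.add(a);
        return a.equals(b) && !basket.contains(b);
    }

    public static void main(String[] args) {
        System.out.println("equal but not found: " + equalButNotFound());
    }
}
```

Note that every field here is final, so an inspection looking only for mutable hash codes would stay silent. That is why the broken contract deserves a check of its own.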
What can prevent or warn us about the patterns mentioned?
- Language level: cannot really be considered, since providing language-level object identity is almost equal to just forbidding hashCode() override.
- Compiler warning level: may be nice, but implementing recursive method inspection would require defining new paradigms and would add unnecessary complexity. In addition, this functionality would have to be implemented separately for each platform language.
- Code inspection: the most desirable option; it should act at bytecode level and can be easily integrated into existing IDEs.
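As a rough illustration of the code inspection option, here is a sketch that uses plain reflection rather than real bytecode analysis, so it is only a coarse approximation of what is described above. It flags every class that declares its own hashCode() and also declares non-final instance fields. It does not check whether hashCode() actually reads those fields, does not follow method calls, and ignores inherited fields. All class and method names (HashCodeInspection, suspiciousFields, Counter, Label) are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class HashCodeInspection {
    // True if the class itself declares hashCode(), i.e. overrides it.
    static boolean overridesHashCode(Class<?> type) {
        try {
            type.getDeclaredMethod("hashCode");
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Names of non-final instance fields of a class that overrides hashCode().
    // Heuristic only: a listed field may not be used by hashCode() at all.
    static List<String> suspiciousFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        if (!overridesHashCode(type)) {
            return names;
        }
        for (Field f : type.getDeclaredFields()) {
            int m = f.getModifiers();
            if (!Modifier.isStatic(m) && !Modifier.isFinal(m) && !f.isSynthetic()) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Counter: " + suspiciousFields(Counter.class));
        System.out.println("Label: " + suspiciousFields(Label.class));
    }
}

// Hypothetical sample: mutable field used in hashCode().
class Counter {
    int value;

    @Override
    public boolean equals(Object o) {
        return o instanceof Counter && ((Counter) o).value == value;
    }

    @Override
    public int hashCode() { return value; }
}

// Hypothetical sample: only final fields.
class Label {
    private final String text;

    Label(String text) { this.text = text; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Label && ((Label) o).text.equals(text);
    }

    @Override
    public int hashCode() { return text.hashCode(); }
}
```

A real inspection would have to walk the body of hashCode() and everything it calls, which is exactly the recursive analysis mentioned above; the sketch only shows where such a check could start.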
C# 3.0 anonymous types use a similar approach: the hash code of an object is immutable, since both Equals() and GetHashCode() are compiler-generated and both fields and properties are read-only.
The IntelliJ IDEA 6 inspections list offers a good inspection for a mutable hash code, and something less powerful for the broken contract.
This small and annoying point is just a tiny part of the features missing from existing IDEs, which are expected to let the developer concentrate on application business logic rather than on language or infrastructure implementation details.
May 29, 2007 at 2:56 pm
Seems like a fun small rule to implement in Gendarme (http://www.mono-project.com/Gendarme)
May 29, 2007 at 3:13 pm
Yes, as well as in FindBugs.
The problem is that tools like Gendarme and FindBugs will always lose the battle to new constraints introduced by new framework versions or updates, just because they are reactive rather than proactive in defining their inspections.
June 11, 2007 at 8:07 am
[…] 11th, 2007 Another example of problem similar to hashCode() mutability is an option for broken «consistent with equals» contract imposed by implementation of […]
July 15, 2007 at 8:25 pm
Let the IDE generate the equals() code, since its implementation is completely dictated by getHashCode(). In some very special cases the order of comparisons can be manually corrected to improve performance.
February 19, 2008 at 9:52 am
[…] example of problem similar to hashCode() mutability is an option for broken «consistent with equals» contract imposed by implementation of […]