Comparing Values for Equality in .NET: Identity and Equivalence (Part 1)

This article is now available on the Code Project at http://www.codeproject.com/dotnet/DotNetEquality.asp

Introduction

The various ways of comparing two values for equality in .NET can be very confusing. In fact, if we have two objects a and b in C#, there are at least four ways to compare them for equality, plus one operator that looks like an identity comparison to add to the confusion:

i) if (a.Equals(b)) {}
ii) if (object.Equals(a, b)) {}
iii) if (object.ReferenceEquals(a, b)) {}
iv) if (a == b) {}
v) if (a is b) {}

As if that isn’t confusing enough, these methods and operators behave differently depending on:

– whether a and b are reference types or value types
– whether they are reference types which are made to behave like value types for these purposes (System.String is one of these)

This post is an attempt to clarify why we have all these versions of equality, and what they all mean.

What does it mean to be the same?

Firstly, we have to understand that there are actually two basic types of equality for objects:

1. Identity (reference equality)
Two objects are identical if they actually are the same object in memory. That is, references to them point to the same memory address.
2. Equivalence (value equality)
Two objects are equivalent if the value or values they contain are the same.

So if we have two integers, a and b, both set to value 3, they are equivalent (they have the same value) but not necessarily identical (a and b can refer to different memory addresses).

However if two objects are identical (the same object) then they must be equivalent (have the same underlying values).
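The distinction can be sketched in a few lines of C#. The Box class below is a hypothetical type invented purely for this illustration; it wraps a single int and does not override Equals().

```csharp
using System;

// Hypothetical type for illustration only: a class holding one int.
class Box
{
    public int Value;
    public Box(int value) { Value = value; }
}

static class IdentityDemo
{
    static void Main()
    {
        Box first = new Box(3);
        Box second = new Box(3);   // same contents, but a separate object
        Box alias = first;         // a second reference to the same object

        // Identity: do the two references point at the same object?
        Console.WriteLine(object.ReferenceEquals(first, alias));   // True
        Console.WriteLine(object.ReferenceEquals(first, second));  // False

        // Equivalence: do the two objects contain the same value?
        Console.WriteLine(first.Value == second.Value);            // True
    }
}
```

Here first and second are equivalent but not identical, while first and alias are identical and therefore also equivalent.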

What type of Equality do we expect?

Clearly these notions of identity and equivalence are related to the concept of reference types and value types.

Value types are intended as lightweight objects that have value semantics: two objects are the same if they have the same value, and they can then be used interchangeably. So integers a and b are the same in the example above because their values are both 3; it doesn’t matter whether a and b actually occupy the same location in memory.

We don’t in general expect reference types to behave this way. Suppose we have two separate objects of type Book (a class). Book has one member variable called ‘title’ (a string). Do we necessarily consider these the ‘same’ Book if they have the same title? We might do so, but it isn’t clear.

To clarify the situation we might add an additional field ‘BookId’ which is unique for a given actual book. We could then say that two books are the same if they have the same BookId, even if they have different titles. But then we wouldn’t normally expect to have two separate Books with the same BookId in memory at the same time: there’s only one underlying book. So potentially we can just compare memory addresses to see if two Books are the same.

The point is that equality for reference types is trickier to define. Our default definition is going to be that two reference types are the same if they are identical.

Types of Equality

Now I’ll go through each of the types of equality listed in the introduction in turn and try to explain why they exist. I’ll also explain how they are implemented for value and reference types, and when you should override or overload them.

i) a.Equals(b)

a.Equals(b): Overview

Equals() is a virtual method on System.Object. This means every single object can call this, and in your own type definitions you can override it to give the behaviour you want.

The base System.Object implementation of Equals() is to do an identity comparison. However, Equals() is intended to test for identity or equivalence as appropriate (see the discussion in the section above).
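As a minimal sketch of the base behaviour, here is the Book class from the earlier discussion, reduced to its title field and with no Equals() override. The title "Dune" is just a placeholder value chosen for the example.

```csharp
using System;

// The Book class described earlier, with only its 'title' field.
// It inherits Equals() unchanged from System.Object.
class Book
{
    public string title;
    public Book(string title) { this.title = title; }
}

static class BaseEqualsDemo
{
    static void Main()
    {
        Book a = new Book("Dune");
        Book b = new Book("Dune");   // same title, separate object
        Book c = a;                  // same object as a

        Console.WriteLine(a.Equals(b)); // False: identity comparison, different objects
        Console.WriteLine(a.Equals(c)); // True: both references point at one object
    }
}
```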

a.Equals(b): Value Types

For value types this method is overridden to do a value (equivalence) comparison. In particular, System.ValueType itself, the root of all value types, contains an override that compares two values field by field, in the general case by reflecting over their fields to see if they are all equal. If you inherit this (by setting up a struct) your struct will get this override by default.
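To illustrate, the sketch below defines a hypothetical Point struct (not a framework type) with no Equals() of its own, so calls go to the inherited System.ValueType override.

```csharp
using System;

// Hypothetical struct for illustration; it does not override Equals().
struct Point
{
    public int X;
    public int Y;
    public Point(int x, int y) { X = x; Y = y; }
}

static class ValueTypeDemo
{
    static void Main()
    {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        Point r = new Point(1, 5);

        Console.WriteLine(p.Equals(q)); // True: all fields match
        Console.WriteLine(p.Equals(r)); // False: Y differs
    }
}
```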

a.Equals(b): Reference Types

For reference types, as discussed above, the situation is trickier. In general we expect Equals() for reference types to do an identity comparison (to check whether the objects actually are the same in memory).

However, certain reference types aren’t lightweight enough to work as value types, but nevertheless have value semantics. The canonical example is System.String. System.String is a reference type. However if we have a = “abc” and b = “abc” we expect a to be equal to b. So in the framework System.String overrides Equals() to do a value comparison.
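A small sketch of this behaviour follows. The second string is built from a char array rather than written as a literal; this is a device for the example, used so that a and b are definitely two separate objects in memory.

```csharp
using System;

static class StringDemo
{
    static void Main()
    {
        string a = "abc";
        // Built from a char array so that b is a separate object from the literal.
        string b = new string(new char[] { 'a', 'b', 'c' });

        Console.WriteLine(a.Equals(b));                  // True: same characters
        Console.WriteLine(object.ReferenceEquals(a, b)); // False: different objects
    }
}
```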

a.Equals(b): Override or not?

As mentioned above, for value types there is a default override of a.Equals(b) in the base class System.ValueType which will work for any structs you set up. This method uses reflection to iterate over all of the fields of the two value types you are trying to compare, checking that their values are equal. In general this is what you want for value type comparison.

However, the default System.ValueType implementation of Equals() uses reflection, which is slow, and involves a certain amount of boxing. For speed optimization it can be good to override this method in your own structs. For a more detailed discussion of this see Jeffrey Richter’s book ‘Applied Microsoft .NET Framework Programming’.
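One way such an override might look is sketched below. The Point struct, its fields and the hash formula are all assumptions made for this example, not code taken from the framework or from Richter’s book. The strongly typed Equals(Point) overload is an addition that lets callers who already hold a Point avoid boxing the argument.

```csharp
using System;

// Hypothetical struct that replaces the reflection-based default comparison.
struct Point
{
    public int X;
    public int Y;
    public Point(int x, int y) { X = x; Y = y; }

    // Override of the virtual method: the argument arrives boxed as object.
    public override bool Equals(object obj)
    {
        if (!(obj is Point)) return false;
        return Equals((Point)obj);
    }

    // Strongly typed overload: compares the fields directly, no boxing.
    public bool Equals(Point other)
    {
        return X == other.X && Y == other.Y;
    }

    // Equal points must produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }
}

static class StructOverrideDemo
{
    static void Main()
    {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        object boxed = q;

        Console.WriteLine(p.Equals(q));       // True: typed overload
        Console.WriteLine(p.Equals(boxed));   // True: object override
        Console.WriteLine(p.Equals("text"));  // False: not a Point
    }
}
```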

In general it is considered good practice to leave Equals() doing its default identity comparison when defining new reference types (classes). The exception is when you know you want value semantics for your class (like System.String), or when you want Equals to work in a specific way. In particular, if instances of your class are going to be used as keys in a Hashtable and you want lookups to match on value rather than identity, you need to override Equals (and GetHashCode).

Note that if you override a.Equals(b) you should also override GetHashCode() and should consider overriding IComparable.CompareTo().
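The sketch below shows why Equals() and GetHashCode() go together, using a Hashtable key. It reuses the Book idea from earlier with the BookId field as the basis for equality; the class layout, the id values and the stored strings are assumptions for illustration only.

```csharp
using System;
using System.Collections;

// Hypothetical Book class: two Books are treated as equal if their BookId matches.
sealed class Book
{
    private readonly int bookId;
    public Book(int bookId) { this.bookId = bookId; }

    public override bool Equals(object obj)
    {
        Book other = obj as Book;   // null if obj is null or not a Book
        return other != null && bookId == other.bookId;
    }

    // Must agree with Equals(): equal Books return equal hash codes.
    public override int GetHashCode()
    {
        return bookId.GetHashCode();
    }
}

static class HashtableDemo
{
    static void Main()
    {
        Hashtable shelf = new Hashtable();
        shelf[new Book(42)] = "top shelf";

        // A different Book object with the same BookId finds the entry.
        Console.WriteLine(shelf.ContainsKey(new Book(42))); // True
        Console.WriteLine(shelf.ContainsKey(new Book(7)));  // False
    }
}
```

If GetHashCode() were left at its default while Equals() was overridden, the first lookup would generally fail, because the Hashtable would look for the key in the wrong bucket.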

To be continued in part 2

References:

Jeffrey Richter “Applied Microsoft .NET Framework Programming”
http://www.microsoft.com/mspress/books/sampchap/5353.aspx#SampleChapter

4 thoughts on “Comparing Values for Equality in .NET: Identity and Equivalence (Part 1)”

  1. Great article explaining Equivalence and Identity in value and reference types. Very useful topic for Exam 70-536!!
    keep it up and keep writing more articles

  2. “Note that if you override a.Equals(b) you should also override GetHashCode() and should consider overriding IComparable.CompareTo().”

    While I’m still likely to consider implementing the IComparable interface when overriding bool Equals(object other), I’m almost guaranteed to implement IEquatable.

