Investigating a crash in Enumerable.LastOrDefault with a custom collection

 
 
  • Gérald Barré

This post is part of the series 'Crash investigations and code reviews'. Be sure to check out the rest of the blog posts of the series!

Have you ever seen the following thread-safety remark on some concurrent collections?

Thread-Safety remark for ConcurrentBag

In this post, I'll explain what it means. Indeed, we got a crash in an application when using Enumerable.LastOrDefault on a custom collection that has the same thread-safety remark. Here's the call stack of the crash:

System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values. (Parameter 'index')
   at System.Collections.Immutable.Requires.FailRange(String parameterName, String message)
   at System.Collections.Immutable.ImmutableList`1.Node.ItemRef(Int32 index)
   at System.Collections.Immutable.ImmutableList`1.get_Item(Int32 index)
   at System.Collections.Immutable.ImmutableList`1.System.Collections.Generic.IList<T>.get_Item(Int32 index)
   at MyApplication.CustomCollection`1.get_Item(Int32 index)
   at System.Linq.Enumerable.TryGetLast[TSource](IEnumerable`1 source, Boolean& found)
   at System.Linq.Enumerable.LastOrDefault[TSource](IEnumerable`1 source)
   at MyApplication.Program.DoJob[T](CustomCollection`1 items)

Here's the incriminated code:

C#
static void DoJob<T>(CustomCollection<T> items)
{
    var lastItem = list.LastOrDefault();
    Console.WriteLine(lastItem);
}

Here's the code for the custom collection:

C#
class CustomCollection<T> : IList<T>, IReadOnlyList<T>
{
    private readonly object _lock = new object();
    private ImmutableList<T> _list = ImmutableList.Create<T>();

    public T this[int index]
    {
        get => _list[index];
        set => throw new NotSupportedException();
    }

    public int Count => _list.Count;
    public bool IsReadOnly => false;
    public int IndexOf(T item) => _list.IndexOf(item);

    public IEnumerator<T> GetEnumerator() => _list.GetEnumerator();

    public void Add(T item)
    {
        lock (_lock)
        {
            _list = _list.Add(item);
        }
    }

    public bool Remove(T item)
    {
        lock (_lock)
        {
            var oldList = _list;
            _list = _list.Remove(item);
            return _list != oldList;
        }
    }

    // non-relevant methods omitted for brevity
}

The call-stack seems to indicate a thread-safety issue somewhere. There is no reason for a single-thread application to crash here. As you can see, all methods of the collection are thread-safe, so it doesn't seem to come from the collection itself. Also, there is no problem enumerating an ImmutableList<T> while modifying the collection from another thread.

The only remaining part is Enumerable.LastOrDefault. LastOrDefault calls TryGetLast. Let's look at the code of this method:

C#
// https://github.com/dotnet/runtime/blob/f86385a1a452a7ac08277f2ef66ebe302b3c90ca/src/libraries/System.Linq/src/System/Linq/Last.cs#L43
private static TSource TryGetLast<TSource>(this IEnumerable<TSource> source, out bool found)
{
    // null check and non-relevant branches omitted for brevity
    if (source is IList<TSource> list)
    {
        int count = list.Count;
        if (count > 0)
        {
            found = true;
            return list[count - 1];
        }
    }
    else
    {
        using (IEnumerator<TSource> e = source.GetEnumerator())
        {
            if (e.MoveNext())
            {
                TSource result;
                do
                {
                    result = e.Current;
                }
                while (e.MoveNext());

                found = true;
                return result;
            }
        }
    }

    found = false;
    return default!;
}

You can see that there is a special case when the collection implement IList<T>. In this case, it call Count and use the accessor to get the item. When multiple threads access the collection, there is no guarantee that the collection is not changed between the calls to list.Count and list[count - 1]. So, if you remove one item from the collection between those 2 calls, you'll get an IndexOutOfRangeException. This means that even if the GetEnumerator method is thread-safe, the LastOrDefault method is not when the collection implements IList<T>.

C#
int count = list.Count;     // The collection can change between this call
if (count > 0)
    return list[count - 1]; // and this one. If one item is removed, this will crash

There are multiple ways to fix this issue. One of them is to add a new method on the custom collection to get the last item and operate on the ImmutableList<T> directly.

C#
class CustomCollection<T> : IList<T>, IReadOnlyList<T>
{
    private ImmutableList<T> _list = ImmutableList.Create<T>();

    // This call is safe as it operates directly on the ImmutableList<T>
    // instead of the CustomCollection<T>
    public T LastOrDefault() => _list.LastOrDefault();
}

Or you can wrap the instance of the custom collection into a wrapper that only implements IEnumerable<T>:

C#
customCollection.AsOnlyEnumerable().LastOrDefault();

static class EnumerableWrapperExtensions
{
    // Note that AsEnumerable has a different behavior, thus the name AsOnlyEnumerable
    // https://www.meziantou.net/what-s-the-usage-of-asenumerable.htm
    public static IEnumerable<T> AsOnlyEnumerable<T>(this IEnumerable<T> items)
    {
        foreach(var item in items)
            yield return item;
    }
}

Or you can implement it manually:

C#
static void DoJob<T>(CustomCollection<T> items)
{
    T lastItem = default;
    using var enumerator = items.GetEnumerator();
    while (enumerator.MoveNext())
    {
        lastItem = enumerator.Current;
    }

    Console.WriteLine(lastItem);
}

To conclude, be careful when working on multi-threaded applications. There are so many traps…

#Additional resources

Do you have a question or a suggestion about this post? Contact me!

Follow me:
Enjoy this blog?Buy Me A Coffee💖 Sponsor on GitHub