Using Integer Document IDs in RavenDB Indexes

May 6, 2013

At work, we recently moved our database from MongoDB to RavenDB. As part of the migration, we set up indexes for our most frequent queries to speed up reading documents. We also adjusted quite a few data access methods in our application to query the indexes instead of directly loading documents by their ID.

#The Issue: Indexing Troubles

When we wrote the aforementioned indexes, we ran into a problem with integer document IDs. All of our entity POCOs use an ID property of type int. With integer IDs, RavenDB generates its standard document names: each one is composed of the collection name (derived from the POCO class name) and the value of the ID property, which makes it human-readable (e.g. comments/1337).
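To make the naming scheme concrete, here is a minimal sketch of how such a name is put together. The DocumentIdSketch class and its Compose method are hypothetical names used only for illustration; they are not part of RavenDB, which generates these names itself.

```csharp
using System;
using System.Globalization;

public static class DocumentIdSketch
{
    // Hypothetical helper (not a RavenDB API): joins a collection name
    // and an integer ID into the human-readable form described above.
    public static string Compose(string collectionName, int id)
    {
        if (string.IsNullOrEmpty(collectionName))
            throw new ArgumentException("Collection name is required.", nameof(collectionName));

        // The invariant culture keeps the number free of locale-specific formatting.
        return collectionName + "/" + id.ToString(CultureInfo.InvariantCulture);
    }
}
```

For example, DocumentIdSketch.Compose("comments", 1337) returns "comments/1337".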

Before we start, let me give you a quick overview of one of the indexes we're using in our application.

#Our Scenario: Indexing Comments by Topic

In our application, we have a pretty simple Comment class, which looks as follows. Note that it actually has a couple more properties, which I omitted here for the sake of brevity.

public class Comment
{
    public int ID { get; set; }
    public int TopicID { get; set; }
    public string Author { get; set; }
    public string Text { get; set; }
}

The TopicID holds information about the topic that was commented on. Since our application requires comments to be queried by topic, we created an index which, well, indexes the TopicID property:

public class Comments_ByTopic
    : AbstractIndexCreationTask<Comment, Comments_ByTopic.QueryResult>
{
    public class QueryResult
    {
        public int ID { get; set; }
        public int TopicID { get; set; }
        public string Author { get; set; }
        public string Text { get; set; }

        // More properties (omitted)
    }

    public Comments_ByTopic()
    {
        Map = comments =>
            from comment in comments
            select new QueryResult
            {
                ID = comment.ID,
                Author = comment.Author,
                TopicID = comment.TopicID,
                Text = comment.Text,

                // More stuff happening here (loading documents, ...)
            };

        Index(x => x.TopicID, FieldIndexing.NotAnalyzed);

        StoreAllFields(FieldStorage.Yes);
    }
}

Actually, our index does a little more than shown here. We don't store the author as a string, for example, but instead as an ID referencing the corresponding user document. The index then makes use of RavenDB's LoadDocument<T> feature to pull in the author document for each comment. However, I left out this part since this post isn't about LoadDocument<T>.

With the index defined as above, there was one problem, though: The ID property was never part of the indexed fields and, consequently, was always 0 when queried. So, what do you do? Let's have a look at the workaround that solved the issue for us.

#Our Solution: Two ID Properties

We tried different things to make the index work correctly. In the end, we created a separate DocumentID property of type string and told RavenDB to treat it as the document ID:

var documentStore = new DocumentStore
{
    ConnectionStringName = "RavenDB",
    Conventions =
    {
        FindIdentityProperty = prop => prop.Name == "DocumentID"
    }
};
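The predicate in that convention is an ordinary delegate over member metadata. The sketch below uses plain reflection, without RavenDB, to show which property such a predicate matches. CommentSketch and IdentityPropertySketch are hypothetical names, and how RavenDB itself evaluates the convention is not shown here.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for an entity with both ID properties.
public class CommentSketch
{
    public string DocumentID { get; set; }
    public int ID { get; set; }
    public int TopicID { get; set; }
}

public static class IdentityPropertySketch
{
    // Same check as in the convention above: match on the property name only.
    public static readonly Func<MemberInfo, bool> IsIdentity =
        prop => prop.Name == "DocumentID";

    // Returns the name of the single matching public property, or null if none matches.
    public static string FindIdentityPropertyName(Type type)
    {
        return type.GetProperties()
                   .Where(p => IsIdentity(p))
                   .Select(p => p.Name)
                   .SingleOrDefault();
    }
}
```

With this predicate, the integer ID property no longer matches, so only DocumentID is picked for CommentSketch.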

This DocumentID property contains the full (!) ID of each document, e.g. comments/1337. Because we're only interested in the 1337 part — which is the actual integer ID we deal with in our application — we split the string when indexing the documents:

Map = comments =>
    from comment in comments
    select new QueryResult
    {
        ID = int.Parse(comment.DocumentID.ToString().Split('/')[1]),
        Author = comment.Author,
        TopicID = comment.TopicID,
        Text = comment.Text,

        // More stuff happening here (loading documents, ...)
    };

The Map expression as listed above made the index work for us. We also wrote a little wrapper around the integer ID property because we didn't want to change our codebase to use strings as document IDs:

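// Backing property for DocumentID; excluded from the serialized document.
[JsonIgnore]
private string _documentID { get; set; }

// The full RavenDB document ID, e.g. "comments/1337".
public string DocumentID
{
    get { return _documentID; }
    set
    {
        _documentID = value;

        // Keep the integer ID in sync with the numeric part of the document ID.
        ID = int.Parse(value.Split('/')[1]);
    }
}

// Derived from DocumentID, so it is not serialized either.
[JsonIgnore]
public int ID { get; private set; }
```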

While the solution may seem a little hacky, it works smoothly. Please note that if you're using this DocumentID property, the corresponding documents' names all have to follow the <collectionName>/<ID> pattern.
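If you can't guarantee that pattern everywhere, a more defensive parse is one option. The following is a minimal sketch rather than what we run in production: DocumentIdParser and TryGetIntegerId are hypothetical names, and the helper is meant for application code, not for use inside the index definition.

```csharp
using System;
using System.Globalization;

public static class DocumentIdParser
{
    // Hypothetical helper: extracts the integer part of an ID such as "comments/1337".
    // Returns false instead of throwing when the ID does not follow <collectionName>/<ID>.
    public static bool TryGetIntegerId(string documentId, out int id)
    {
        id = 0;

        if (string.IsNullOrEmpty(documentId))
            return false;

        // Expect exactly two parts: a non-empty collection name and the numeric ID.
        string[] parts = documentId.Split('/');
        if (parts.Length != 2 || parts[0].Length == 0)
            return false;

        // NumberStyles.None accepts digits only (no sign, no whitespace).
        return int.TryParse(parts[1], NumberStyles.None, CultureInfo.InvariantCulture, out id);
    }
}
```

The DocumentID setter could call such a helper and decide for itself how to handle IDs that don't match, instead of failing inside int.Parse.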