MongoDB indexing tutorial with examples

Farshad Tofighi
6 min read · Feb 9, 2024

Even if you haven't worked with MongoDB, you probably know that it is a document database often chosen for large datasets. What is interesting is that MongoDB supports several types of indexes that can make your queries much faster.

In this article, I’ll walk you through each type of index step by step with examples, so let’s go.

How does indexing work in MongoDB?

[Figure: MongoDB indexing behind the scenes]

First off, let’s look at what an index does behind the scenes. In this example, we have a collection called books, and the rating key of its documents is indexed. As you can see in the picture above, MongoDB created a “Book_Rating” list that stores the rating value of each document, and each entry has a pointer to its document. The list is kept in sorted order, so whenever you search by rating, MongoDB looks up the value in the index and fetches the documents it points to instead of scanning every document in the collection.
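To make the idea concrete, here is a minimal in-memory sketch of such a sorted index list. It is only a conceptual model with made-up data and hypothetical helper names (buildIndex, lowerBound, findByIndex), not how MongoDB is implemented internally.

```javascript
// Hypothetical sample data standing in for the books collection.
const books = [
  { _id: 1, title: "Book A", rating: 7 },
  { _id: 2, title: "Book B", rating: 9 },
  { _id: 3, title: "Book C", rating: 4 },
  { _id: 4, title: "Book D", rating: 9 },
];

// Build the index: one entry per document, sorted by the indexed value,
// each entry pointing back to its document through _id.
function buildIndex(docs, field) {
  return docs
    .map((doc) => ({ key: doc[field], id: doc._id }))
    .sort((a, b) => a.key - b.key);
}

// Binary search for the first entry whose key is >= target.
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Equality lookup: jump to the first match, then walk while the key matches.
function findByIndex(docs, index, value) {
  const byId = new Map(docs.map((d) => [d._id, d]));
  const result = [];
  for (let i = lowerBound(index, value); i < index.length && index[i].key === value; i++) {
    result.push(byId.get(index[i].id));
  }
  return result;
}

const ratingIndex = buildIndex(books, "rating");
const found = findByIndex(books, ratingIndex, 9);
console.log(found.map((d) => d.title)); // [ 'Book B', 'Book D' ]
```

Because the entries are sorted, the lookup only touches a handful of index entries and the matching documents, rather than every document.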

Alright, so far we’ve understood how indexing works with a simple example. Now, let’s move on to the types of indexes.

MongoDB indexing types:

1. Single field

2. Compound index

3. Multikey index

4. Text index

5. Hashed index

6. Geospatial index

Single Field:

The simplest type of index is the single-field index, which indexes a single key of a document, like the rating example in the picture above.

It stores the values of one field from the documents in a collection. By default, every collection has an index on the _id field. You can add more indexes to speed up important queries and operations.

db.books.createIndex({rating: 1})

The number 1 indicates ascending order: the entries in the index are arranged in ascending order.

The number -1 indicates descending order: the entries in the index are arranged in descending order.

For a single-field index, the direction makes little practical difference, because MongoDB can traverse the index in either direction. For example, a query that sorts books by rating from highest to lowest can use the ascending index above just as well as a descending one. The direction starts to matter with compound indexes.

Compound Index:

A compound index is like a single-field index, except that it indexes multiple keys. The index entries are ordered by the first field in the index and then, within each value of that field, by each subsequent field.

db.books.createIndex({rating: 1, publish_date: -1})

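The command above creates a compound index on two fields, “rating” and “publish_date”. This index is useful when most of your queries filter or sort on both keys. It can also serve queries on “rating” alone, since a compound index supports queries on a prefix of its fields.

As in the previous example, -1 means descending order for that key. Here the direction matters: this index can support a sort on rating ascending with publish_date descending, or the exact reverse, but not a sort where both fields are ascending.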
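The ordering of a compound index can be illustrated with a small in-memory sketch. The sample documents and the compareKeys helper are made up for illustration; the comparator simply mirrors the key pattern {rating: 1, publish_date: -1}.

```javascript
// Hypothetical sample data; dates are ISO strings so they compare correctly as text.
const books = [
  { title: "A", rating: 8, publish_date: "2001-05-01" },
  { title: "B", rating: 9, publish_date: "1999-01-15" },
  { title: "C", rating: 8, publish_date: "2015-09-30" },
  { title: "D", rating: 9, publish_date: "2020-03-10" },
];

// Order by rating ascending, then by publish_date descending.
function compareKeys(a, b) {
  if (a.rating !== b.rating) return a.rating - b.rating;
  if (a.publish_date < b.publish_date) return 1;
  if (a.publish_date > b.publish_date) return -1;
  return 0;
}

const compoundIndex = [...books].sort(compareKeys);

// Walking the index forward gives { rating: 1, publish_date: -1 }.
const forward = compoundIndex.map((b) => b.title);

// Walking it backward gives { rating: -1, publish_date: 1 }.
const backward = [...compoundIndex].reverse().map((b) => b.title);

console.log(forward);  // [ 'C', 'A', 'D', 'B' ]
console.log(backward); // [ 'B', 'D', 'A', 'C' ]
```

Neither walk yields rating ascending together with publish_date ascending, which is why the directions in a compound index should match the sort orders you actually need.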

Multikey Index:

A multikey index is an index on a field that holds arrays. When you create an index on such a field, MongoDB creates a separate index entry for each element of the array. This enables efficient querying and retrieval of documents based on the elements within arrays.

[Figure: MongoDB multikey indexing]

Assume we have the document above. We can create an index on its “genres” array:

db.books.createIndex({"genres": 1})

This index allows for efficient querying based on individual genres. For example, you can find all books in the “Fiction” genre using the following query:

db.books.find({"genres": "Fiction"})
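As a rough illustration of "one index entry per array element", here is an in-memory sketch. The documents and the buildMultikeyIndex helper are hypothetical, and the model ignores everything MongoDB does beyond producing the entries.

```javascript
// Hypothetical sample documents with a genres array.
const books = [
  { _id: 1, title: "The Great Gatsby", genres: ["Fiction", "Classic"] },
  { _id: 2, title: "Dune", genres: ["Fiction", "Science Fiction"] },
  { _id: 3, title: "Sapiens", genres: ["History"] },
];

// Emit one index entry per distinct array element, sorted by key and then by _id.
function buildMultikeyIndex(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of new Set(keys)) {
      entries.push({ key, id: doc._id });
    }
  }
  return entries.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a.id - b.id
  );
}

const genreIndex = buildMultikeyIndex(books, "genres");

// Matching a single genre only needs the entries with that key.
const fictionIds = genreIndex
  .filter((entry) => entry.key === "Fiction")
  .map((entry) => entry.id);

console.log(genreIndex.length); // 5
console.log(fictionIds);        // [ 1, 2 ]
```

Three documents produce five entries here, which is also why multikey indexes on large arrays can grow quickly.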

Text Index:

A text index enables full-text search on string content within documents. It lets you run text search queries efficiently, with features such as keyword search, phrase search, and language-aware matching (stemming and stop-word removal).

You can create a text index on the “title” field like this:

db.books.createIndex({title: "text"})

After creating the text index, you can perform text search queries using the $text operator. Here's an example of how you might use the text index to search for books containing specific keywords:

db.books.find({$text: {$search: "Great Gatsby"}})

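This query returns all documents in the “books” collection whose “title” field contains the word “Great” or the word “Gatsby”, because $text treats space-separated terms as a logical OR. To match the exact phrase instead, wrap it in escaped double quotes inside the search string: {$search: "\"Great Gatsby\""}.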
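Keep in mind that $text treats space-separated search terms as a logical OR, while a quoted phrase must appear as a whole. The sketch below models that difference in plain JavaScript with made-up titles and hypothetical helpers (tokenize, matchAnyTerm, matchPhrase); it leaves out stemming, stop words, and relevance scoring.

```javascript
// Hypothetical titles standing in for the title field of the books collection.
const titles = ["The Great Gatsby", "Great Expectations", "Moby Dick"];

// Lowercase and split on non-word characters.
const tokenize = (text) => text.toLowerCase().split(/\W+/).filter(Boolean);

// Unquoted search: a title matches if it contains ANY of the terms.
function matchAnyTerm(title, search) {
  const words = new Set(tokenize(title));
  return tokenize(search).some((term) => words.has(term));
}

// Quoted phrase: the terms must appear next to each other, in order.
function matchPhrase(title, phrase) {
  const haystack = " " + tokenize(title).join(" ") + " ";
  const needle = " " + tokenize(phrase).join(" ") + " ";
  return haystack.includes(needle);
}

const anyTerm = titles.filter((t) => matchAnyTerm(t, "Great Gatsby"));
const exactPhrase = titles.filter((t) => matchPhrase(t, "Great Gatsby"));

console.log(anyTerm);     // [ 'The Great Gatsby', 'Great Expectations' ]
console.log(exactPhrase); // [ 'The Great Gatsby' ]
```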

Hashed Index:

When you create a hashed index on a field, MongoDB hashes the values of that field and stores the hashed values in the index. This makes queries that match on the hashed field fast, as MongoDB only needs to perform a hash computation on the query value to find matching documents.

[Figure: MongoDB Hashed Index]

Here’s how you create a hashed index:

db.books.createIndex({author: "hashed"})

After creating the hashed index, you can perform equality matches on the hashed field efficiently. For example:

db.books.find({author: "F. Scott Fitzgerald" })

It’s important to note that hashed indexes are not suitable for range queries or sort operations. They are specifically designed for equality matches on fields with high cardinality. Additionally, because hashed indexes store hashed values rather than the original values, you cannot directly retrieve the original values from the index.
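The following in-memory sketch shows why a hashed index suits equality matches but not ranges. The fnv1a function is only a stand-in chosen for illustration (MongoDB uses its own hash function), and the documents and findByAuthor helper are hypothetical.

```javascript
// Stand-in string hash (FNV-1a style, 32-bit). Not the hash MongoDB uses.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hypothetical sample documents.
const books = [
  { _id: 1, title: "The Great Gatsby", author: "F. Scott Fitzgerald" },
  { _id: 2, title: "Dune", author: "Frank Herbert" },
  { _id: 3, title: "Tender Is the Night", author: "F. Scott Fitzgerald" },
];

// The index maps hash(author) to the ids of matching documents.
const hashedIndex = new Map();
for (const book of books) {
  const h = fnv1a(book.author);
  if (!hashedIndex.has(h)) hashedIndex.set(h, []);
  hashedIndex.get(h).push(book._id);
}

// Equality lookup: hash the query value, fetch the candidates,
// then re-check the real value in case two values share a hash.
function findByAuthor(author) {
  const ids = hashedIndex.get(fnv1a(author)) || [];
  return books.filter((b) => ids.includes(b._id) && b.author === author);
}

console.log(findByAuthor("F. Scott Fitzgerald").map((b) => b._id)); // [ 1, 3 ]
```

The hash values bear no relation to the alphabetical order of the authors, so a question like "all authors between A and F" cannot be answered by walking this index.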

Geospatial Index:

A geospatial index is a specialized index type that enables efficient querying of spatial data. It allows you to store and query data based on their geographical coordinates, such as points, lines, and polygons. Geospatial indexes use specific geometric algorithms to perform spatial queries, allowing applications to find and analyze data based on their proximity to certain locations, spatial containment, or other geometric relationships.

There are two types of geospatial indexes:

2d index: This index is for data stored as points on a flat, two-dimensional plane. It is intended for legacy coordinate pairs and indexes points only.

2dsphere index: This index is for geometries on an Earth-like sphere. It supports GeoJSON objects such as points, lines, and polygons, and is the usual choice for representing real-world locations on the Earth’s surface.

Here’s an example of how you might use a geospatial index on a “books” collection to store and query books based on their locations:

db.books.createIndex({ shops_location: "2dsphere" })

db.books.insertOne({
  title: "The Great Gatsby",
  author: "F. Scott Fitzgerald",
  shops_location: {
    type: "Point",
    // GeoJSON order is [longitude, latitude]
    coordinates: [-73.961389, 40.781111]
  }
})

// Query book shops near a specific location
db.books.find({
  shops_location: {
    $near: {
      $geometry: {
        type: "Point",
        coordinates: [-73.9667, 40.7833]
      },
      $maxDistance: 10000 // in meters
    }
  }
})
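To get a feel for what $maxDistance means here, the sketch below computes the great-circle distance in meters between the query point and the stored shop using the haversine formula. The Earth radius constant, the second sample shop, and the helper names are assumptions made for illustration; MongoDB's own distance calculation may differ slightly.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in meters (assumed for this sketch)

const toRad = (deg) => (deg * Math.PI) / 180;

// Haversine distance between two [longitude, latitude] pairs, in meters.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

const queryPoint = [-73.9667, 40.7833];
const shops = [
  { _id: 1, coordinates: [-73.961389, 40.781111] }, // the shop inserted above
  { _id: 2, coordinates: [-118.2437, 34.0522] },    // hypothetical shop far away
];

// Keep only shops within 10000 meters, like $maxDistance: 10000.
const maxDistance = 10000;
const nearbyIds = shops
  .filter((s) => haversineMeters(queryPoint, s.coordinates) <= maxDistance)
  .map((s) => s._id);

console.log(nearbyIds); // [ 1 ]
```

The stored shop is roughly half a kilometer from the query point, so it falls well inside the 10000 meter limit.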

So far, we’ve covered the main index types in MongoDB with examples. However, two other important and practical kinds of index remain, which I’d like to explain to you.

1. TTL index

2. Sparse index

Let’s find out what these are and how to use them.

TTL Index:

A TTL (Time-To-Live) index is a special type of index that automatically removes documents from a collection after a specified amount of time. This feature is particularly useful for managing data that has a limited lifespan or needs periodic cleanup, such as session data, logs, or temporary data.

When you create a TTL index on a field, a background task in MongoDB periodically checks the indexed date in each document. A document expires once the time stored in the indexed field plus the number of seconds given in expireAfterSeconds has passed, and MongoDB then removes that document from the collection.

Here’s how you create a TTL index:

// documents expire 1 hour after the date stored in publish_date
db.books.createIndex({publish_date: 1}, { expireAfterSeconds: 3600})

It’s important to note that the indexed field must hold a BSON date or an array of BSON dates for a document to expire; documents where the field is missing or has another type are never removed. Additionally, the background task that removes expired documents runs about every 60 seconds, so documents may stay in the collection for a short while after their expiration time.
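The expiry rule can be sketched in a few lines of plain JavaScript. This is a simplified model with a hypothetical isExpired helper and made-up documents; it uses a fixed clock so the result is repeatable and does not cover arrays of dates.

```javascript
const expireAfterSeconds = 3600; // same value as in the createIndex call above

// A document is expired once its date plus the TTL is in the past.
function isExpired(doc, field, ttlSeconds, now) {
  const value = doc[field];
  // Documents whose field is missing or not a Date never expire.
  if (!(value instanceof Date)) return false;
  return value.getTime() + ttlSeconds * 1000 <= now.getTime();
}

const now = new Date("2024-02-09T12:00:00Z"); // fixed clock for this sketch
const docs = [
  { _id: 1, publish_date: new Date("2024-02-09T10:30:00Z") }, // 90 minutes old
  { _id: 2, publish_date: new Date("2024-02-09T11:30:00Z") }, // 30 minutes old
  { _id: 3, publish_date: "2024-02-09" },                     // a string, not a Date
  { _id: 4 },                                                 // field missing
];

const expiredIds = docs
  .filter((doc) => isExpired(doc, "publish_date", expireAfterSeconds, now))
  .map((doc) => doc._id);

console.log(expiredIds); // [ 1 ]
```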

Sparse Index:

A sparse index only contains entries for documents that have the indexed field. Sparse indexes are particularly beneficial when indexing fields that are not present in all documents, as they can help reduce index size and improve query performance by excluding documents that do not contain the indexed field.

Here’s how you create a sparse index:

db.books.createIndex({rating: -1 }, {sparse: true})

After creating the sparse index, only documents containing the “rating” field will be indexed. Documents without the “rating” field will not be included in the index.
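The difference between a regular and a sparse index can be modeled like this. The documents and the buildIndex helper are hypothetical; the point is only which documents get an entry.

```javascript
// Hypothetical sample documents; only some of them have a rating.
const books = [
  { _id: 1, title: "Book A", rating: 9 },
  { _id: 2, title: "Book B" },
  { _id: 3, title: "Book C", rating: 6 },
  { _id: 4, title: "Book D" },
];

// A regular index stores a null key for documents missing the field;
// a sparse index skips those documents entirely.
function buildIndex(docs, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) continue;
    entries.push({ key: hasField ? doc[field] : null, id: doc._id });
  }
  return entries;
}

const regularIndex = buildIndex(books, "rating");
const sparseIndex = buildIndex(books, "rating", { sparse: true });

console.log(regularIndex.length); // 4
console.log(sparseIndex.length);  // 2
```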

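However, a sparse index offers little benefit for a field that is present in most documents, because the index ends up almost as large as a regular one. Also keep in mind that MongoDB will not use a sparse index for a query or sort if doing so would return incomplete results, unless you explicitly hint it.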

I hope this article has been helpful for you. If you have any questions about this article, please leave a comment, and I’ll be happy to respond. I look forward to your valuable feedback✅
