World of Identification Numbers (IDs)

Background

Identification Numbers (IDs) are widely used in both the real world and the computing world.

Today let’s explore some libraries that help us generate IDs.

What kind of IDs do we want?

There are many ways to generate an ID, but there are a few properties that we care about:

  • Uniqueness
  • Size
  • Ease of generation
  • Maximum number of keys

ELI5 different ways we can generate an ID

Imagine a world where every newborn is given an ID.

The simplest way is to start counting from 1 and let someone, say James, keep track of the number.

Every time a new baby is born, they have to bring the baby to James to get an ID for the baby.

This might work for a small village, but not for a country where villages could be miles apart.

On top of that, if something unfortunate happens to James, who will carry on his duty of issuing IDs?

One day, a mathematical genius had a great idea!

Every village will have an 8-sided die with faces ABCDEFGH. For each newborn, they shall roll the die 32 times to generate an ID. Example: ABCDEFGHABCDEFGHABCDEFGHABCDEFGH

Magically, everyone had a unique ID for centuries.

In the story, the villages first used an auto-incremental mechanism to generate IDs, then switched to random generation.

Auto-incremental

A simple way to do auto-incremental could be:

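// globalCounter holds the last ID handed out; it lives only in memory.
var globalCounter = 0

// GenerateID returns the next ID, starting from 1.
// It is not safe for concurrent use.
func GenerateID() int {
	globalCounter++
	return globalCounter
}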

But there are several issues. For example:

  1. Whenever the server/process restarts, the counter resets and newly generated IDs overlap with ones already issued.
  2. When called concurrently without synchronization, the same ID could be given out twice.
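The second issue can be handled inside a single process. The sketch below is one possible approach, not the only one: it uses the standard library's sync/atomic package so that concurrent callers never receive the same value. The name NextID is made up for this example, and the first issue remains, because the counter still lives only in memory.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counter holds the last ID handed out. It is still in-memory only,
// so it starts from zero again whenever the process restarts.
var counter int64

// NextID atomically increments the counter and returns the new value,
// so two goroutines never receive the same ID.
func NextID() int64 {
	return atomic.AddInt64(&counter, 1)
}

func main() {
	var wg sync.WaitGroup
	var seen sync.Map // records every ID handed out in this demo

	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, dup := seen.LoadOrStore(NextID(), true); dup {
				panic("duplicate ID")
			}
		}()
	}
	wg.Wait()
	fmt.Println("last ID issued:", atomic.LoadInt64(&counter))
}
```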

Hence in production, we would normally use a database like MySQL.

Whenever we need a new ID, we insert a new row into a table that has an auto-incremental ID field.

CREATE TABLE Persons (
    PersonID int NOT NULL AUTO_INCREMENT,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Age int,
    PRIMARY KEY (PersonID)
);

The database makes the counter persistent and concurrency-safe.

Persistent: The counter outlives the process that created it, because the data is saved to disk.

Concurrency-safe: The same ID is never given out twice.

However, there are downsides too:

  1. Depending on the database used, it could become a single choke point, since every request for an ID goes through it.
  2. The database needs extra maintenance.

Random Generation

There are numerous ways to randomly generate an ID, usually with a mixture of rules.

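For example, a UUID is a 128-bit value, usually written as 32 hexadecimal digits in five hyphen-separated groups (8-4-4-4-12). How the bits are filled depends on the version: version 1 combines a timestamp with a node identifier, while version 4 is almost entirely random.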
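To make the layout concrete, here is a minimal sketch that builds a version 4 (random) UUID by hand using only the Go standard library. The function name newUUIDv4 is made up for this example. In real code a maintained library is the safer choice; the point here is only to show where the version and variant bits sit.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 fills 16 bytes with random data, then overwrites the few
// bits that mark the value as a version 4 UUID.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4 in the high nibble of byte 6
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant in the top two bits of byte 8

	// Format as five hex groups of 8-4-4-4-12 digits.
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

After the version and variant bits are fixed, 122 of the 128 bits are random.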

There are several UUID-inspired libraries and most of them are easy to use:


// go get github.com/gofrs/uuid/v5
id, err := uuid.NewV4()
if err != nil {
    return err
}
// 77a20b61-3f8f-47f1-ac3b-e8db2e9b3541

// go get github.com/rs/xid
id := xid.New()
// clfmvq6nr2ki9h6ssqp0

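// go get github.com/sony/sonyflake
// Zero-value Settings uses the library defaults for start time and machine ID.
sf := sonyflake.NewSonyflake(sonyflake.Settings{})
if sf == nil {
    // NewSonyflake returns nil on failure; errors is the standard library package.
    return errors.New("failed to create sonyflake instance")
}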

id, err := sf.NextID()
if err != nil {
    return err
}
// 1722610988622020608

With UUIDs and similar schemes, we do not need to maintain a separate database, and different servers can all generate IDs by themselves, just like each village having its own die.

Apart from UUID, many companies have come up with their own ID generation logic, like Snowflake by Twitter and MongoID (ObjectId) by MongoDB. The ID format plays a big part in how we store and search for data. As companies scale to millions or billions of users, their use cases evolve and their data storage methods have to evolve with them.

For example, here are some disadvantages of UUIDs compared to auto-incremental IDs:

  • UUIDs (128 bits) take up more space than auto-incremental IDs (32 or 64 bits).
  • UUIDs are usually used as keys, and a large key size affects database performance.
  • Some versions of UUIDs, such as the random version 4, are not sortable by creation time.

Rabbit Hole

Over to you: what other things should we look out for when designing IDs?

UUID, Snowflake, MongoID, XID: when to use which?

Alright, that is all for today. Let’s get one post smarter every day, and see you tomorrow!

https://github.com/gofrs/uuid

https://github.com/sony/sonyflake/tree/master

https://github.com/rs/xid
