How Instagram Scales: Inside Their Brilliant 64-Bit ID Generation System

    When you're building an application that processes billions of new records every day, conventional database design patterns start to fall apart.

    Traditional auto-incrementing IDs create bottlenecks and single points of failure. UUIDs solve uniqueness problems but often come at the cost of larger indexes, increased storage usage, and slower database performance.

    To address these challenges, Instagram engineered a custom 64-bit ID generation system that is globally unique, chronologically sortable, and completely free from network coordination.

    The result is an elegant architecture that scales effortlessly across thousands of database shards while maintaining exceptional performance.

    Let's explore how it works, why it's so effective, and what lessons developers can apply to their own systems.


    The Problem with Traditional ID Generation

    Before diving into Instagram's approach, it's important to understand why common ID generation strategies become problematic at large scale.

    Auto-Incrementing IDs

    Auto-incrementing integers work perfectly in a single database instance.

    id SERIAL PRIMARY KEY
    

    However, once data is distributed across multiple database servers, collisions become inevitable. Two different shards can generate the same ID simultaneously, creating conflicts when data is merged or queried globally.

    For example:

    • Shard A generates Post #105
    • Shard B also generates Post #105

    Without additional coordination, uniqueness is lost.
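
To make the collision concrete, here is a minimal Python sketch. The two `itertools.count` objects stand in for per-shard auto-increment counters; the variable names and the starting value of 105 are illustrative and not taken from any real system.

```python
from itertools import count

# Each shard keeps its own independent auto-increment counter (hypothetical setup).
shard_a = count(start=105)
shard_b = count(start=105)

post_on_a = next(shard_a)  # Shard A assigns its next ID
post_on_b = next(shard_b)  # Shard B assigns its next ID, unaware of Shard A

# Both shards hand out the same number, so it no longer identifies one post globally.
ids_collide = post_on_a == post_on_b
print(post_on_a, post_on_b, ids_collide)
```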


    UUIDs

    UUIDs solve the uniqueness problem with 128-bit identifiers; the widely used version 4 form fills 122 of those bits with random data.

    Example:

    550e8400-e29b-41d4-a716-446655440000
    

    While collisions become practically impossible, UUIDs introduce new challenges:

    • They require 16 bytes instead of 8 bytes for a standard 64-bit integer.
    • Random insertion patterns fragment database indexes.
    • Indexes become larger and less cache-friendly.
    • Query performance degrades as datasets grow.
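
The size difference is easy to check with Python's standard library. This sketch only compares raw key sizes and shows that random UUIDs do not arrive in sorted order; it does not measure index behavior in any particular database.

```python
import struct
import uuid

# A UUID is 16 bytes in binary form; a signed 64-bit integer is 8 bytes.
uuid_size = len(uuid.uuid4().bytes)
bigint_size = struct.calcsize("<q")

# Generate UUIDv4 values in creation order and compare with their sorted order.
created = [uuid.uuid4() for _ in range(1000)]
arrives_sorted = created == sorted(created)

print(uuid_size, bigint_size, arrives_sorted)
```

Because new random keys land at arbitrary positions in the sorted sequence, a B-tree index on them is written all over rather than appended at one end.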

    At Instagram's scale, these costs become significant.


    Centralized ID Services

    Another approach is to run a dedicated ID generation service, along the lines of Twitter's Snowflake.

    In this model:

    1. Applications request IDs from a central service.
    2. The service guarantees uniqueness and ordering.
    3. Applications use the returned IDs when writing data.

    Although effective, this creates a new dependency:

    • Additional infrastructure to maintain.
    • Network latency on every ID request.
    • A potential single point of failure.

    If the ID service becomes unavailable, data creation across the platform may stop entirely.

    Instagram wanted something simpler.


    The Design Goals

    Instagram's engineering team wanted IDs that were:

    • Globally unique
    • Chronologically sortable
    • Compact (64-bit integers)
    • Fast to generate
    • Independent of centralized coordination
    • Easy to route across shards

    Their solution was to encode multiple pieces of information directly into a single 64-bit integer.


    Anatomy of an Instagram ID

    Instagram divides a 64-bit integer into three distinct sections:

    +---------------------------------------------+---------------+------------+
    |               Timestamp                     |   Shard ID    |  Sequence  |
    |               (41 bits)                     |   (13 bits)   |  (10 bits) |
    +---------------------------------------------+---------------+------------+
    

    Each section serves a specific purpose.
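
The layout above can be sketched with plain shifts and masks. This is an illustrative Python version, not Instagram's code; `pack_id` and `unpack_id` are hypothetical helper names.

```python
TIMESTAMP_BITS, SHARD_BITS, SEQUENCE_BITS = 41, 13, 10

def pack_id(timestamp_ms: int, shard_id: int, sequence: int) -> int:
    """Pack the three fields into one 64-bit integer, timestamp in the high bits."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard ID does not fit in 13 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence does not fit in 10 bits")
    return (
        (timestamp_ms << (SHARD_BITS + SEQUENCE_BITS))
        | (shard_id << SEQUENCE_BITS)
        | sequence
    )

def unpack_id(packed: int) -> tuple:
    """Split a packed ID back into (timestamp_ms, shard_id, sequence)."""
    sequence = packed & ((1 << SEQUENCE_BITS) - 1)
    shard_id = (packed >> SEQUENCE_BITS) & ((1 << SHARD_BITS) - 1)
    timestamp_ms = packed >> (SHARD_BITS + SEQUENCE_BITS)
    return timestamp_ms, shard_id, sequence

example = pack_id(timestamp_ms=1, shard_id=42, sequence=7)
largest = pack_id((1 << 41) - 1, (1 << 13) - 1, (1 << 10) - 1)
print(example, unpack_id(example), largest.bit_length())
```

The three widths add up to exactly 64 bits, so the largest possible value fills the integer with no spare bits.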


    1. Timestamp (41 Bits)

    The first 41 bits store the number of milliseconds elapsed since a custom epoch.

    Instead of using the Unix epoch, Instagram defines its own starting point.

    For example:

    January 1, 2011
    

    Using 41 bits allows approximately:

    • 2.2 trillion milliseconds
    • Around 69 years of operation

    Because the timestamp sits in the most significant bits, IDs stay time-sortable for decades before the field runs out.
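
The arithmetic behind those figures can be checked directly. The sketch assumes the example epoch of January 1, 2011 at midnight UTC. One caveat: in a signed 64-bit column such as PostgreSQL's `bigint`, the top timestamp bit doubles as the sign bit, so IDs stay positive for only half of the full range.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_BITS = 41
max_millis = 2 ** TIMESTAMP_BITS  # 2,199,023,255,552 distinct millisecond values

years = max_millis / 1000 / 60 / 60 / 24 / 365.25

# Assumed epoch, taken from the example date above (midnight UTC).
epoch = datetime(2011, 1, 1, tzinfo=timezone.utc)
rollover = epoch + timedelta(milliseconds=max_millis)

# With signed storage the sign bit flips halfway through the 41-bit range.
signed_years = years / 2

print(f"{max_millis:,} ms is about {years:.1f} years, ending in {rollover.year}")
print(f"IDs stay positive in a signed column for about {signed_years:.1f} years")
```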


    2. Shard ID (13 Bits)

    The next 13 bits identify the logical database shard.

    With 13 bits available:

    2^13 = 8,192 shards
    

    This gives Instagram enormous horizontal scaling capacity.

    Every generated ID permanently records the logical shard on which the row was created.


    3. Sequence Number (10 Bits)

    The final 10 bits store a local counter.

    This counter increments whenever multiple records are created during the same millisecond.

    Capacity per millisecond:

    2^10 = 1,024 IDs
    

    That means every shard can generate, without collisions:

    • Up to 1,024 IDs per millisecond
    • Just over 1 million IDs per second (1,024 × 1,000 = 1,024,000)
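
One caveat is worth checking. If the counter is reduced modulo 1,024, as in the simplified PostgreSQL function later in this article, the guarantee holds only while a shard stays under 1,024 IDs in any single millisecond. The sketch below freezes the timestamp at a made-up value to show the 1,025th ID repeating an earlier one; `make_id` is an illustrative helper, not Instagram's code.

```python
from itertools import count

SEQUENCE_BITS = 10
IDS_PER_MS = 1 << SEQUENCE_BITS       # 1,024
IDS_PER_SECOND = IDS_PER_MS * 1000    # 1,024,000

def make_id(timestamp_ms: int, shard_id: int, counter_value: int) -> int:
    """Illustrative helper: the counter is wrapped into the low 10 bits."""
    return (timestamp_ms << 23) | (shard_id << 10) | (counter_value % IDS_PER_MS)

counter = count(1)
frozen_ms = 500_000  # pretend the clock does not advance (hypothetical value)

# Request one more ID than the sequence field can distinguish in one millisecond.
ids = [make_id(frozen_ms, 42, next(counter)) for _ in range(IDS_PER_MS + 1)]
unique_ids = len(set(ids))
print(IDS_PER_SECOND, len(ids), unique_ids)
```

A primary key or unique constraint on the ID column would turn such a duplicate into a failed insert rather than silent corruption.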


    How the System Works

    The truly clever aspect of Instagram's design is where the logic executes.

    Instead of generating IDs on application servers, the entire process runs inside PostgreSQL as a PL/pgSQL function.

    When a new photo, comment, or like is inserted:

    1. PostgreSQL reads the current time in milliseconds.
    2. The custom epoch is subtracted.
    3. The timestamp is shifted into the upper 41 bits.
    4. The shard ID is inserted into the middle 13 bits.
    5. A local sequence number is added to the lower 10 bits.
    6. The final 64-bit integer is returned.

    Because every database shard knows its own shard identifier and maintains its own local sequence counter, no communication with other servers is required.

    The result is instant, globally unique ID generation.
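
The six steps can be mirrored in a few lines of Python. This is a sketch for illustration only: the real logic lives in the database, and the class name `ShardIdGenerator`, its `next_id` method, and the optional `now_ms` argument are assumptions made here so the example is repeatable.

```python
import itertools
import time
from typing import Optional

CUSTOM_EPOCH_MS = 1293840000000  # 2011-01-01 00:00:00 UTC, the example epoch

class ShardIdGenerator:
    """Illustrative per-shard generator; the name and API are hypothetical."""

    def __init__(self, shard_id: int) -> None:
        if not 0 <= shard_id < 2 ** 13:
            raise ValueError("shard ID must fit in 13 bits")
        self.shard_id = shard_id
        self._counter = itertools.count(1)  # local sequence, never shared

    def next_id(self, now_ms: Optional[int] = None) -> int:
        # Step 1: read the current time in milliseconds.
        if now_ms is None:
            now_ms = time.time_ns() // 1_000_000
        # Step 2: subtract the custom epoch.
        elapsed_ms = now_ms - CUSTOM_EPOCH_MS
        # Step 3: shift the timestamp into the upper 41 bits.
        result = elapsed_ms << 23
        # Step 4: place the shard ID in the middle 13 bits.
        result |= self.shard_id << 10
        # Step 5: add the local sequence number in the lower 10 bits.
        result |= next(self._counter) % 1024
        # Step 6: return the final 64-bit integer.
        return result

shard_a = ShardIdGenerator(shard_id=1)
shard_b = ShardIdGenerator(shard_id=2)
same_instant = 1700000000000  # fixed clock reading so the output is repeatable
id_a = shard_a.next_id(same_instant)
id_b = shard_b.next_id(same_instant)
print(id_a, id_b, id_a != id_b)
```

The two generators never talk to each other, yet their IDs differ even for the same millisecond because the shard bits differ.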


    PostgreSQL Implementation

    A simplified version of Instagram's approach can be implemented directly inside PostgreSQL.

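    -- Create a sequence for the local counter
    CREATE SEQUENCE photo_id_sequence;
    
    -- Custom ID generation function: packs timestamp, shard ID, and sequence into one bigint
    CREATE OR REPLACE FUNCTION generate_instagram_id(shard_id int)
    RETURNS bigint AS $$
    DECLARE
        our_epoch bigint := 1293840000000;  -- 2011-01-01 00:00:00 UTC, in milliseconds
        seq_id bigint;
        now_millis bigint;
        result bigint;
    BEGIN
        -- Current wall-clock time in milliseconds since the Unix epoch
        SELECT floor(extract(epoch FROM clock_timestamp()) * 1000)
        INTO now_millis;
    
        -- Lower 10 bits: local counter, wrapping every 1,024 values
        seq_id := nextval('photo_id_sequence') % 1024;
    
        -- Upper 41 bits: milliseconds since the custom epoch, shifted past the 13 + 10 lower bits
        result := (now_millis - our_epoch) << 23;
    
        -- Middle 13 bits: the shard ID (expected range 0 to 8191)
        result := result | (shard_id::bigint << 10);
    
        result := result | seq_id;
    
        RETURN result;
    END;
    $$ LANGUAGE plpgsql;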
    

    Using it in a table is straightforward:

    CREATE TABLE user_photos (
        id BIGINT NOT NULL DEFAULT generate_instagram_id(42),
        user_id INT NOT NULL,
        image_url TEXT NOT NULL
    );
    

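    Every inserted row automatically receives an ID that is unique across the cluster, provided each shard's default passes its own shard ID (42 here is only an example).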


    Three Major Engineering Advantages

    This architecture delivers several powerful benefits.

    1. Chronological Sorting for Free

    Since the timestamp occupies the highest-order bits, IDs naturally sort by creation time.

    This means:

    ORDER BY id DESC
    

    produces essentially the same ordering, to millisecond resolution and assuming shard clocks are reasonably in sync, as:

    ORDER BY created_at DESC
    

    without requiring an additional timestamp index.

    At Instagram's scale, reducing index count translates into massive savings in memory and storage.
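
A small sketch shows why this works, plus a side benefit: the creation time can be recovered from the ID itself. The epoch constant matches the example used in this article; `make_id` and `created_at` are illustrative helper names. Within a single millisecond, IDs from different shards sort by shard ID rather than by arrival order.

```python
from datetime import datetime, timezone

CUSTOM_EPOCH_MS = 1293840000000  # 2011-01-01 00:00:00 UTC (example epoch)

def make_id(unix_ms: int, shard_id: int, sequence: int) -> int:
    """Illustrative helper that packs the three fields."""
    return ((unix_ms - CUSTOM_EPOCH_MS) << 23) | (shard_id << 10) | sequence

def created_at(packed_id: int) -> datetime:
    """Recover the creation time (millisecond precision) from an ID."""
    unix_ms = (packed_id >> 23) + CUSTOM_EPOCH_MS
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)

# Three records created 1 ms apart on different shards, listed out of order.
older = make_id(1700000000000, shard_id=900, sequence=5)
middle = make_id(1700000000001, shard_id=3, sequence=0)
newest = make_id(1700000000002, shard_id=1, sequence=9)

# Sorting by the raw integer is enough to get newest-first order.
newest_first = sorted([middle, newest, older], reverse=True)
print(newest_first == [newest, middle, older])
print(created_at(older).isoformat())
```

The older record has the largest shard ID and still sorts last, because the timestamp bits outrank the shard bits.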


    2. Built-In Data Routing

    The shard identifier is embedded directly into every ID.

    Any backend service can inspect the ID, extract the shard bits, and immediately determine where the data resides.

    No lookup service is required.

    The ID itself becomes the routing map.

    This significantly simplifies distributed system architecture.
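
Extracting the shard takes one shift and one mask. In the sketch below, the `SHARD_HOSTS` map and its host names are made up for illustration; a real deployment would still need some static mapping from logical shards to physical servers, which is assumed here to live in configuration.

```python
SHARD_BITS = 13
SEQUENCE_BITS = 10

# Hypothetical mapping of logical shards to database hosts (names are made up).
SHARD_HOSTS = {
    42: "db-host-a.internal",
    43: "db-host-b.internal",
}

def shard_of(packed_id: int) -> int:
    """Extract the 13-bit logical shard ID from the middle of the ID."""
    return (packed_id >> SEQUENCE_BITS) & ((1 << SHARD_BITS) - 1)

def host_for(packed_id: int) -> str:
    """Resolve which host holds the row, using only the ID and a static map."""
    return SHARD_HOSTS[shard_of(packed_id)]

photo_id = (123_456_789 << 23) | (42 << 10) | 7  # example ID on logical shard 42
print(shard_of(photo_id), host_for(photo_id))
```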


    3. Near-Zero Latency

    Because IDs are generated locally within PostgreSQL:

    • No network requests are required.
    • No central service is contacted.
    • No coordination occurs between shards.

    Record creation happens at native database speed.

    Compared with a round trip to a separate ID service, this removes a network hop from every write.


    Instagram IDs vs UUIDv7

    Today, many engineers ask:

    Why not simply use UUIDv7?

    UUIDv7 was designed to address the weaknesses of older UUID versions by making identifiers time-sortable.

    A UUIDv7 contains:

    • A 48-bit Unix timestamp in milliseconds
    • Version and variant bits, with the remaining 74 bits available as random data for uniqueness

    This makes database indexing much more efficient than traditional UUIDv4.
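
For comparison, here is a hedged sketch of the UUIDv7 layout as described in RFC 9562: a 48-bit Unix timestamp in milliseconds at the top, then version and variant bits, with the rest random. The `make_uuid7` helper is written by hand for illustration and is not a library API.

```python
import os
import uuid

MASK_128 = (1 << 128) - 1

def make_uuid7(unix_ms: int) -> uuid.UUID:
    """Hand-rolled UUIDv7 for illustration: 48-bit timestamp, then random bits."""
    value = (unix_ms & ((1 << 48) - 1)) << 80        # timestamp in the top 48 bits
    value |= int.from_bytes(os.urandom(10), "big")   # 80 random bits below it
    value &= ~(0xF << 76) & MASK_128                 # clear the 4 version bits
    value |= 0x7 << 76                               # set version 7
    value &= ~(0x3 << 62) & MASK_128                 # clear the 2 variant bits
    value |= 0x2 << 62                               # set the RFC variant (binary 10)
    return uuid.UUID(int=value)

def uuid7_timestamp_ms(value: uuid.UUID) -> int:
    """Read the millisecond timestamp back from the top 48 bits."""
    return value.int >> 80

earlier = make_uuid7(1700000000000)
later = make_uuid7(1700000000001)
print(earlier, len(earlier.bytes), earlier < later)
```

Like the 64-bit scheme, the timestamp occupies the most significant bits, which is what makes the values sort by creation time.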

    However, Instagram's 64-bit approach still offers several advantages.

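    +------------------------+------------------------------+----------------------------------+
    | Feature                | Instagram 64-Bit ID          | UUIDv7                           |
    +------------------------+------------------------------+----------------------------------+
    | Size                   | 64 bits (8 bytes)            | 128 bits (16 bytes)              |
    | Index Overhead         | Smaller and highly cacheable | Approximately double the storage |
    | Chronological Ordering | Yes                          | Yes                              |
    | Data Routing           | Built-in shard information   | Not included                     |
    | Readability            | Compact numeric value        | Long hexadecimal string          |
    +------------------------+------------------------------+----------------------------------+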

    For most modern SaaS applications, UUIDv7 is an excellent choice because it requires almost no custom infrastructure.

    For ultra-high-scale systems where every byte of index memory matters, Instagram's packed 64-bit format remains a notably efficient design: half the key size of a UUID, with routing information included.


    Key Lessons for Developers

    Instagram's ID generation system is a masterclass in pragmatic engineering.

    Instead of introducing complex coordination layers, the team leveraged simple bit manipulation and existing PostgreSQL capabilities to create a solution that scales to billions of records.

    The most valuable takeaway is not the exact bit layout itself.

    It's the design philosophy:

    • Push complexity to the edges.
    • Eliminate unnecessary coordination.
    • Make data self-describing whenever possible.
    • Use simple mathematical structures before adding infrastructure.

    Sometimes, a carefully designed 64-bit integer can replace an entire distributed system.

    And that's exactly what Instagram did.