The Minecraft protocol

The Minecraft Protocol is the set of rules and conventions that define how data is exchanged between the client and server in the Minecraft game. The protocol is used to establish a connection, authenticate users, and exchange game data, such as player movements, chat messages, and block updates.

In this post, we will explore the Minecraft Protocol and learn how to read and write packets to communicate with the Minecraft server.

Data types

The Minecraft Protocol uses various data types to represent different values, such as integers, strings, and arrays. Understanding these data types is essential for reading and writing packets correctly.

Data Types Overview

Here are some common data types used in the Minecraft Protocol:

| Name | Size (bytes) | Encodes | Notes |
| --- | --- | --- | --- |
| Boolean | 1 | true or false | Encoded as 0x00 for false and 0x01 for true. |
| Byte | 1 | Signed 8-bit integer | Range from -128 to 127. |
| Unsigned Byte | 1 | Unsigned 8-bit integer | Range from 0 to 255. |
| Short | 2 | Signed 16-bit integer | Range from -32,768 to 32,767. |
| Unsigned Short | 2 | Unsigned 16-bit integer | Range from 0 to 65,535. |
| Int | 4 | Signed 32-bit integer | Range from -2,147,483,648 to 2,147,483,647. |
| Long | 8 | Signed 64-bit integer | Range from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. |
| Float | 4 | 32-bit floating-point number | IEEE 754 single-precision floating-point format. |
| Double | 8 | 64-bit floating-point number | IEEE 754 double-precision floating-point format. |
| Text Component | Varies | NBT Tag | A String tag for plain text or a Compound tag for formatted text. |
| JSON Text | Varies | JSON string | Encoded as a String containing JSON. |
| VarInt | 1 to 5 | Signed 32-bit integer | Variable-length encoding for integers. |
| VarLong | 1 to 10 | Signed 64-bit integer | Variable-length encoding for long integers. |
| BitSet | Varies | Bit flags | Encoded as a VarInt length followed by that many Longs of packed bits. |
| Fixed BitSet | ceil(n / 8) | Fixed-length bit flags | Encoded as a byte array of fixed length holding n bits. |
| String | Varies | UTF-8 string | Encoded as a VarInt byte length followed by UTF-8 bytes. |
| Identifier | Varies | String | Encoded as a String in namespace:path form. |
| Entity Metadata | Varies | Entity metadata | Miscellaneous information about an entity, in its own entry-based format. |
| Slot | Varies | Item stack | Represents an item stack in a slot. |
| NBT Tag | Varies | Compound Tag | Represents a compound tag with key-value pairs. |
| Position | 8 | Block position | X (26 bits), Z (26 bits), and Y (12 bits) packed into one 64-bit integer. |
| Angle | 1 | Rotation angle | One byte, in steps of 1/256th of a full turn. |
| UUID | 16 | Universally Unique Identifier | Encoded as 128 bits for a unique identifier. |
| Optional | Varies | Optional value | Encoded as a Boolean flag followed by the value if present. |
| Array | Varies | List of values | Encoded as a VarInt length followed by the array elements. |
| Enum | Varies | Enumerated value | Encoded as a VarInt index for the enum value. |
| Byte Array | Varies | Array of bytes | Encoded as a VarInt length followed by the byte array. |

VarInt

A VarInt (Variable Integer) is a method of serializing integers using a variable number of bytes. This technique is often used in protocols and file formats to efficiently encode integer values of varying lengths, optimizing the amount of storage or bandwidth required.

VarInt Encoding

VarInt encodes an integer into one or more bytes. The number of bytes used depends on the size of the integer. Each byte has a continuation bit (most significant bit, MSB), which indicates whether the next byte is part of the integer. If the MSB is 1, the next byte is part of the integer. If the MSB is 0, the current byte is the last byte of the integer.

VarInt Structure

  • The integer is divided into 7-bit groups.
  • Each group is stored in a byte, with the MSB (8th bit) used as the continuation flag.

Encoding Steps

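  1. Start with the integer value, treated as an unsigned 32-bit pattern (negative values use their two's complement bits).
  2. Extract the least significant 7 bits and store them in a byte.
  3. Shift the integer right by 7 bits, filling with zeros (a logical shift, so negative values also terminate).
  4. Set the continuation bit (MSB) of the byte to 1 if the remaining value is non-zero; otherwise, leave it at 0.
  5. Write the byte and repeat from step 2 until the remaining value is zero.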

Decoding Steps

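  1. Read the first byte.
  2. Extract the lower 7 bits and place them in the result at bit offset 7 × n, where n is the number of bytes already read.
  3. If the continuation bit (MSB) is set to 1, read the next byte and repeat step 2.
  4. Stop at the first byte whose continuation bit is 0. A VarInt is never longer than 5 bytes, so a fifth byte with the continuation bit set is an error.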

VarInt Example

Let's consider encoding and decoding the integer value 300:

Binary Representation:

300 in binary is 100101100.

Split into 7-bit Groups:

The binary representation is split into two 7-bit groups, starting from the least significant bits: 0101100 (low group) and 0000010 (high group).

Encoding:

  • First byte: low group 0101100 (7 bits) with MSB 1 -> 10101100 (binary) -> 0xAC (hex).
  • Second byte: high group 0000010 (7 bits) with MSB 0 -> 00000010 (binary) -> 0x02 (hex).
  • The VarInt encoding of 300 is 0xAC 0x02.

Decoding:

  • Read the first byte 0xAC (binary 10101100), extract 0101100 (value 44). The MSB is 1, so continue.
  • Read the second byte 0x02 (binary 00000010), extract 0000010 (value 2). The MSB is 0, so stop.
  • Combine the values: (2 << 7) + 44 = 256 + 44 = 300.

VarInt Samples

Here are some examples of VarInt encoding and decoding:

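| Value | Hex Bytes | Decimal Bytes |
| --- | --- | --- |
| 0 | 0x00 | 0 |
| 1 | 0x01 | 1 |
| 127 | 0x7F | 127 |
| 128 | 0x80 0x01 | 128 1 |
| 255 | 0xFF 0x01 | 255 1 |
| 300 | 0xAC 0x02 | 172 2 |
| 2097151 | 0xFF 0xFF 0x7F | 255 255 127 |
| 2147483647 | 0xFF 0xFF 0xFF 0xFF 0x07 | 255 255 255 255 7 |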

VarInt as code

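A VarInt reader and writer follow the encoding and decoding steps above directly: the writer loops until no bits remain, and the reader stops at the first byte whose continuation bit is 0, rejecting input longer than 5 bytes.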
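As a minimal sketch of the encoding and decoding steps above, the pair of functions below writes a VarInt into a `Vec<u8>` and reads one back from a byte slice. The names `write_varint` and `read_varint`, and the choice to return the number of bytes consumed, are assumptions made for this illustration rather than part of the protocol.

```rust
/// Appends `value` to `buf` as a VarInt (1 to 5 bytes).
pub fn write_varint(buf: &mut Vec<u8>, value: i32) {
    // Work on the unsigned bit pattern so the right shift is logical;
    // a negative value then always ends after exactly 5 bytes.
    let mut v = value as u32;
    loop {
        let low7 = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 {
            buf.push(low7); // last byte: continuation bit clear
            return;
        }
        buf.push(low7 | 0x80); // more bytes follow
    }
}

/// Reads a VarInt from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or runs past 5 bytes.
pub fn read_varint(buf: &[u8]) -> Option<(i32, usize)> {
    let mut result: u32 = 0;
    for (i, &byte) in buf.iter().enumerate().take(5) {
        // Each byte contributes 7 bits, least significant group first.
        result |= ((byte & 0x7F) as u32) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((result as i32, i + 1));
        }
    }
    None
}
```

Writing 300 with this sketch produces the two bytes 0xAC 0x02, and reading those bytes back yields 300 with 2 bytes consumed.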

VarLong

A VarLong (Variable Length Long) is essentially a 64-bit version of a VarInt, which you already understand. Like VarInt, VarLong encodes integers using a variable number of bytes to optimize space.

Key Points

  1. Similarities to VarInt:
    • Both VarInt and VarLong use the continuation bit (MSB) to indicate if more bytes follow.
    • The encoding and decoding process is similar, with the main difference being the number of bits they handle (VarInt for 32-bit, VarLong for 64-bit).
  2. Encoding and Decoding:
    • Encoding:
      • Divide the 64-bit integer into 7-bit groups.
      • Store each group in a byte, setting the MSB to 1 if more bytes follow, or 0 if it is the last byte.
    • Decoding:
      • Read bytes one by one, combining their lower 7 bits until a byte with the MSB of 0 is encountered.

VarLong Example

Let’s encode and decode the integer value 9223372036854775807 (max 64-bit signed integer):

  1. Binary Representation:
    • 9223372036854775807 in binary is a 0 sign bit followed by 63 ones.
  2. Split into 7-bit Groups:
    • The 63 one-bits split evenly into nine 7-bit groups, each 1111111.
  3. Encoding:
    • First eight bytes: each 1111111 (7 bits) with MSB 1 -> 11111111 -> 0xFF.
    • Ninth byte: 1111111 (7 bits) with MSB 0 -> 01111111 -> 0x7F.
    • Result: 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0x7F.
  4. Decoding:
    • Read byte 0xFF eight times -> 1111111 each time, with the continuation bit set.
    • Read byte 0x7F -> 1111111, with the continuation bit clear, so stop.
    • Combine the values by shifting the n-th 7-bit segment left by 7 × n bits, yielding 9223372036854775807.

VarLong Samples

Here are some examples of VarLong encoding and decoding:

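| Value | Hex Bytes | Decimal Bytes |
| --- | --- | --- |
| 0 | 0x00 | 0 |
| 1 | 0x01 | 1 |
| 127 | 0x7F | 127 |
| 128 | 0x80 0x01 | 128 1 |
| 255 | 0xFF 0x01 | 255 1 |
| 300 | 0xAC 0x02 | 172 2 |
| 2097151 | 0xFF 0xFF 0x7F | 255 255 127 |
| 2147483647 | 0xFF 0xFF 0xFF 0xFF 0x07 | 255 255 255 255 7 |
| 9223372036854775807 | 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0x7F | 255 255 255 255 255 255 255 255 127 |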

VarLong as code

A VarLong reader and writer are identical to their VarInt counterparts, except that they operate on 64-bit values and allow up to 10 bytes.
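The sketch below applies the same loop to 64-bit values. The function names `write_varlong` and `read_varlong` are assumptions for this example; the only differences from a VarInt version are the integer width and the 10-byte limit.

```rust
/// Appends `value` to `buf` as a VarLong (1 to 10 bytes).
pub fn write_varlong(buf: &mut Vec<u8>, value: i64) {
    // Use the unsigned bit pattern so the shift fills with zeros.
    let mut v = value as u64;
    loop {
        let low7 = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 {
            buf.push(low7); // last byte: continuation bit clear
            return;
        }
        buf.push(low7 | 0x80); // more bytes follow
    }
}

/// Reads a VarLong from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or runs past 10 bytes.
pub fn read_varlong(buf: &[u8]) -> Option<(i64, usize)> {
    let mut result: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        result |= ((byte & 0x7F) as u64) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((result as i64, i + 1));
        }
    }
    None
}
```

With this sketch, `i64::MAX` encodes to eight 0xFF bytes followed by 0x7F, and -1 takes the full 10 bytes.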

BitSet

Bit sets in the context of Minecraft protocol represent packed lists of bits used for efficiently storing and transmitting boolean data. There are two types of bit sets: BitSet and Fixed BitSet. Each serves different purposes and has unique encoding schemes.

A BitSet is a dynamically sized array of bits, typically used for more flexible bit storage.

BitSet Structure

  • Length: A VarInt indicating the number of longs in the following array. It can be 0 if no bits are set.
  • Data: An array of longs, representing the packed bit set.

BitSet Encoding

  1. Write the length of the long array as a VarInt.
  2. Write each long in the array. Each bit in the bit set corresponds to a specific bit position within these longs.

BitSet Decoding

To check if a bit is set:

$$\text{Data}[i / 64] \mathbin{\&} (1 \ll (i \mathbin{\%} 64)) \neq 0$$

where $i$ is the bit position starting from 0.

BitSet Example

Reading and writing a BitSet from/to a buffer takes two pieces:

  1. A BitSet structure that wraps the array of longs.
  2. Read/write functions that handle the VarInt length prefix followed by the longs themselves.
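A possible shape for those two pieces is sketched below. The `BitSet` struct, its `get`/`set` helpers, and the function names are assumptions for illustration. The longs are stored as `u64` for convenient bit operations (the bytes on the wire are the same as for a signed Long) and are written big-endian, like the protocol's other multi-byte integers. A small VarInt helper is repeated so the block stands alone.

```rust
/// Dynamically sized bit set backed by 64-bit words ("longs").
#[derive(Debug, Default, PartialEq, Eq)]
pub struct BitSet {
    pub data: Vec<u64>,
}

impl BitSet {
    /// True if bit `i` is set: Data[i / 64] & (1 << (i % 64)) != 0.
    pub fn get(&self, i: usize) -> bool {
        self.data.get(i / 64).map_or(false, |w| (w & (1u64 << (i % 64))) != 0)
    }
    /// Sets bit `i`, growing the word array if needed.
    pub fn set(&mut self, i: usize) {
        if self.data.len() <= i / 64 {
            self.data.resize(i / 64 + 1, 0);
        }
        self.data[i / 64] |= 1u64 << (i % 64);
    }
}

fn write_varint(buf: &mut Vec<u8>, value: i32) {
    let mut v = value as u32;
    loop {
        let low7 = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 { buf.push(low7); return; }
        buf.push(low7 | 0x80);
    }
}

fn read_varint(buf: &[u8]) -> Option<(i32, usize)> {
    let mut result: u32 = 0;
    for (i, &b) in buf.iter().enumerate().take(5) {
        result |= ((b & 0x7F) as u32) << (7 * i);
        if b & 0x80 == 0 { return Some((result as i32, i + 1)); }
    }
    None
}

/// Writes the number of longs as a VarInt, then each long big-endian.
pub fn write_bitset(buf: &mut Vec<u8>, bits: &BitSet) {
    write_varint(buf, bits.data.len() as i32);
    for word in &bits.data {
        buf.extend_from_slice(&word.to_be_bytes());
    }
}

/// Reads a BitSet; returns it with the number of bytes consumed.
pub fn read_bitset(buf: &[u8]) -> Option<(BitSet, usize)> {
    let (len, mut pos) = read_varint(buf)?;
    if len < 0 { return None; }
    let mut data = Vec::new();
    for _ in 0..len {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(buf.get(pos..pos + 8)?);
        data.push(u64::from_be_bytes(bytes));
        pos += 8;
    }
    Some((BitSet { data }, pos))
}
```

Setting bits 0 and 65 in an empty set yields two longs (1 and 2), which serialize to 17 bytes: the length prefix 0x02 followed by two 8-byte words.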

Fixed BitSet Example

A Fixed BitSet is a bit set with a fixed number of bits, typically used for storing a specific number of flags or options.

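A Fixed BitSet of n bits is encoded as a byte array of length ceil(n / 8), with no length prefix, because both sides already know n. Bit i is set when Data[i / 8] & (1 << (i % 8)) is non-zero.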
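A Fixed BitSet could be sketched as follows, assuming the usual layout of ceil(n / 8) bytes with bit i stored in byte i / 8 under the mask 1 << (i % 8). The `FixedBitSet` type and its method names are hypothetical and chosen only for this example.

```rust
/// Bit set with a fixed number of bits, stored in ceil(bits / 8) bytes.
#[derive(Debug, PartialEq, Eq)]
pub struct FixedBitSet {
    bits: usize,
    data: Vec<u8>,
}

impl FixedBitSet {
    /// Creates a set of `bits` flags, all cleared.
    pub fn new(bits: usize) -> Self {
        FixedBitSet { bits, data: vec![0; (bits + 7) / 8] }
    }

    /// Sets or clears bit `i`. Panics if `i` is out of range.
    pub fn set(&mut self, i: usize, on: bool) {
        assert!(i < self.bits, "bit index out of range");
        if on {
            self.data[i / 8] |= 1 << (i % 8);
        } else {
            self.data[i / 8] &= !(1 << (i % 8));
        }
    }

    /// Returns whether bit `i` is set; out-of-range bits read as false.
    pub fn get(&self, i: usize) -> bool {
        i < self.bits && (self.data[i / 8] & (1 << (i % 8))) != 0
    }

    /// The bytes to write to the wire (no length prefix).
    pub fn as_bytes(&self) -> &[u8] {
        &self.data
    }

    /// Reads a set of `bits` flags from the front of `buf`.
    /// Returns the set and the number of bytes consumed.
    pub fn from_bytes(bits: usize, buf: &[u8]) -> Option<(Self, usize)> {
        let n = (bits + 7) / 8;
        let data = buf.get(..n)?.to_vec();
        Some((FixedBitSet { bits, data }, n))
    }
}
```

For a 20-bit set, three bytes are used; setting bits 0, 9, and 19 produces the bytes 0x01 0x02 0x08.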

Strings

A String is encoded as its length in bytes, written as a VarInt, followed by that many bytes of UTF-8 data. Note that the prefix counts bytes, not characters.

Strings Decoding

  1. Read a VarInt from the buffer to determine the length of the string in bytes.
  2. Read that many bytes from the buffer.
  3. Decode the bytes as UTF-8 to get the final string value.

Strings Encoding

  1. Convert the string to a UTF-8 byte array.
  2. Encode the length of the byte array as a VarInt.
  3. Write the byte array to the buffer.

Strings Example

Functions that read and write a string from/to a buffer follow the decoding and encoding steps above directly.
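The string steps above can be sketched as below. `write_string` and `read_string` are names assumed for this example, and a compact VarInt helper is repeated so the block is self-contained. The sketch does not enforce any maximum string length.

```rust
fn write_varint(buf: &mut Vec<u8>, value: i32) {
    let mut v = value as u32;
    loop {
        let low7 = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 { buf.push(low7); return; }
        buf.push(low7 | 0x80);
    }
}

fn read_varint(buf: &[u8]) -> Option<(i32, usize)> {
    let mut result: u32 = 0;
    for (i, &b) in buf.iter().enumerate().take(5) {
        result |= ((b & 0x7F) as u32) << (7 * i);
        if b & 0x80 == 0 { return Some((result as i32, i + 1)); }
    }
    None
}

/// Writes the UTF-8 byte length as a VarInt, then the bytes themselves.
pub fn write_string(buf: &mut Vec<u8>, s: &str) {
    // `str::len` is already the length in bytes, which is what the prefix counts.
    write_varint(buf, s.len() as i32);
    buf.extend_from_slice(s.as_bytes());
}

/// Reads a length-prefixed UTF-8 string from the front of `buf`.
/// Returns the string and the total number of bytes consumed.
pub fn read_string(buf: &[u8]) -> Option<(String, usize)> {
    let (len, start) = read_varint(buf)?;
    if len < 0 {
        return None;
    }
    let end = start.checked_add(len as usize)?;
    let bytes = buf.get(start..end)?; // None if the buffer is too short
    let s = String::from_utf8(bytes.to_vec()).ok()?; // None on invalid UTF-8
    Some((s, end))
}
```

For instance, the string "héllo" has five characters but six UTF-8 bytes, so its prefix is 0x06.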

Identifier

An Identifier is a unique string that represents a resource in the Minecraft game. It consists of two parts: a namespace and a path. Identifiers are used to reference items, blocks, entities, and other game resources.

namespace:path
  • Namespace: The namespace defines the scope or category of the resource. It typically represents the mod or plugin that defines the resource. Common namespaces include minecraft for vanilla resources and custom namespaces for mods.
  • Path: The path is the unique name or identifier of the resource within the namespace. It specifies the specific item, block, or entity being referenced.
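As a small illustration of the namespace:path form, the sketch below splits an identifier at the first colon. The function name `parse_identifier` is hypothetical, and falling back to the `minecraft` namespace when no colon is present is an assumption based on how vanilla resources are usually referenced. The sketch does not validate which characters are allowed.

```rust
/// Splits an identifier into (namespace, path).
/// Assumption: an identifier without a colon belongs to the
/// `minecraft` namespace.
pub fn parse_identifier(s: &str) -> (&str, &str) {
    match s.split_once(':') {
        Some((namespace, path)) => (namespace, path),
        None => ("minecraft", s),
    }
}
```

With this sketch, `minecraft:stone` and plain `stone` both resolve to the namespace `minecraft` and the path `stone`.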

Position

A Position packs three signed integers (X, Y, Z) into a single 64-bit value. To read or write one, you need to know how the components are laid out:

  • X: 26 most significant bits (MSBs)
  • Z: 26 middle bits
  • Y: 12 least significant bits (LSBs)

Position Encoding

To encode the coordinates into a 64-bit value:

$$\text{val} = ((x \mathbin{\&} \text{0x3FFFFFF}) \ll 38) \mid ((z \mathbin{\&} \text{0x3FFFFFF}) \ll 12) \mid (y \mathbin{\&} \text{0xFFF})$$
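The encoding formula translates almost literally into code. The sketch below assumes the coordinates arrive as `i32` and widens them to `i64` before masking; the function name `encode_position` is made up for this example.

```rust
/// Packs block coordinates into one 64-bit value:
/// X in the top 26 bits, Z in the middle 26 bits, Y in the low 12 bits.
pub fn encode_position(x: i32, y: i32, z: i32) -> i64 {
    // Masking keeps only the two's complement bits that fit each field,
    // so negative coordinates are stored correctly.
    ((x as i64 & 0x3FF_FFFF) << 38) | ((z as i64 & 0x3FF_FFFF) << 12) | (y as i64 & 0xFFF)
}
```

For the position (100, 64, 300) this gives `(100 << 38) | (300 << 12) | 64`, which is 0x000019000012C040.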

Position Decoding

To decode the 64-bit value back into X, Y, and Z:

val = read_long();
x = val >> 38;
y = (val << 52) >> 52;
z = (val << 26) >> 38;

These shifts assume arithmetic (signed) shifts to preserve the sign of the coordinates. If your language does not support arithmetic shifts, you need to handle potential negative values explicitly.

Position Example

Let's consider decoding the position (100, 64, 300), which packs into the 64-bit value 0x000019000012C040. Decoding takes three pieces:

  1. A Position structure that holds the three coordinates.
  2. A function to read a Long (64-bit value) from the buffer.
  3. A function to decode a Position from that value.
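One way to write these three pieces is sketched below. The names `Position`, `read_long`, and `decode_position` are assumptions for this example. It relies on Rust's `>>` being an arithmetic shift on `i64`, which sign-extends each field, and it reads the Long in big-endian byte order like the protocol's other fixed-size integers.

```rust
/// Block coordinates decoded from a packed 64-bit position.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Position {
    pub x: i32,
    pub y: i32,
    pub z: i32,
}

/// Reads a big-endian signed 64-bit integer from the first 8 bytes of `buf`.
pub fn read_long(buf: &[u8]) -> Option<i64> {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(buf.get(..8)?); // None if fewer than 8 bytes
    Some(i64::from_be_bytes(bytes))
}

/// Unpacks X (top 26 bits), Z (middle 26 bits), and Y (low 12 bits).
pub fn decode_position(val: i64) -> Position {
    Position {
        // Shifting left first moves the field's sign bit to bit 63;
        // the arithmetic right shift then sign-extends it.
        x: (val >> 38) as i32,
        y: ((val << 52) >> 52) as i32,
        z: ((val << 26) >> 38) as i32,
    }
}
```

Decoding the eight bytes 00 00 19 00 00 12 C0 40 with this sketch returns the position (100, 64, 300).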

Detailed Explanation

  1. Reading the Long Value:
    • The function read_long reads 8 bytes from the buffer and converts them into a 64-bit integer (i64).
  2. Decoding the Position:
    • X Coordinate: Extracted from the top 26 bits.
    • Y Coordinate: Extracted from the bottom 12 bits, handled by shifting left 52 bits and then right 52 bits to get the signed value.
    • Z Coordinate: Extracted from the middle 26 bits, handled by shifting left 26 bits and then right 38 bits to get the signed value.
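  3. Sign Correction:
    • With arithmetic shifts on a signed 64-bit integer, the decoded coordinates already carry the correct sign and need no further correction. If the value is decoded with masks or logical shifts instead, a field can come out as a large positive number: subtract 2^26 from X or Z when it is 2^25 or greater, and subtract 2^12 from Y when it is 2^11 or greater.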

Packets

Packets are the fundamental units of data exchanged between the client and server in the Minecraft Protocol. Each packet has a specific purpose and structure defined by its packet ID and data fields. Understanding how to read and write packets is essential for implementing custom functionality in the game.

Uncompressed Packet Structure

| Field | Type | Description |
| --- | --- | --- |
| Length | VarInt | Length of ID + Data in bytes (the Length field itself is not counted). |
| ID | VarInt | Packet ID that identifies the type of packet being sent. |
| Data | Byte Array | Packet data fields specific to the packet type. |

Compressed Packet Structure

if size >= threshold

| Field | Type | Description | Compressed? |
| --- | --- | --- | --- |
| Packet Length | VarInt | Length of Data Length plus the compressed ID and Data, in bytes (the Packet Length field itself is not counted). | No |
| Data Length | VarInt | Length of the uncompressed ID + Data, in bytes. | No |
| ID | VarInt | Packet ID that identifies the type of packet being sent. | Yes |
| Data | Byte Array | Packet data fields specific to the packet type. | Yes |

if size < threshold

| Field | Type | Description | Compressed? |
| --- | --- | --- | --- |
| Packet Length | VarInt | Length of Data Length plus the ID and Data, in bytes (the Packet Length field itself is not counted). | No |
| Data Length | VarInt | Always 0, which marks the packet as uncompressed. | No |
| ID | VarInt | Packet ID that identifies the type of packet being sent. | No |
| Data | Byte Array | Packet data fields specific to the packet type. | No |

Packet Decoding

If compression is disabled

  1. Read the packet length as a VarInt.
  2. Read the packet ID as a VarInt.
  3. Create a buffer for the packet data with the remaining bytes.
  4. Decode the packet data based on the packet ID.
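The uncompressed framing can be sketched as a pair of functions. `write_packet` and `read_packet` are names assumed for this example; the packet data is treated as an opaque byte slice, since interpreting it depends on the packet ID. The VarInt helpers are repeated so the block stands alone.

```rust
fn write_varint(buf: &mut Vec<u8>, value: i32) {
    let mut v = value as u32;
    loop {
        let low7 = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 { buf.push(low7); return; }
        buf.push(low7 | 0x80);
    }
}

fn read_varint(buf: &[u8]) -> Option<(i32, usize)> {
    let mut result: u32 = 0;
    for (i, &b) in buf.iter().enumerate().take(5) {
        result |= ((b & 0x7F) as u32) << (7 * i);
        if b & 0x80 == 0 { return Some((result as i32, i + 1)); }
    }
    None
}

/// Frames an uncompressed packet: Length, then ID, then Data.
pub fn write_packet(out: &mut Vec<u8>, id: i32, data: &[u8]) {
    // Build ID + Data first, because Length counts both of them.
    let mut body = Vec::new();
    write_varint(&mut body, id);
    body.extend_from_slice(data);
    write_varint(out, body.len() as i32);
    out.extend_from_slice(&body);
}

/// Parses one uncompressed packet from the front of `buf`.
/// Returns (packet ID, packet data, total bytes consumed), or `None`
/// if the buffer does not yet hold a complete packet.
pub fn read_packet(buf: &[u8]) -> Option<(i32, &[u8], usize)> {
    let (len, header) = read_varint(buf)?;
    if len < 0 {
        return None;
    }
    let end = header.checked_add(len as usize)?;
    let body = buf.get(header..end)?;
    let (id, id_len) = read_varint(body)?;
    Some((id, &body[id_len..], end))
}
```

Returning the number of bytes consumed lets a caller that reads from a stream drop the parsed packet from its buffer and try again on whatever remains.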

If compression is enabled

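  1. Read the packet length as a VarInt.
  2. Read the data length as a VarInt.
  3. If the data length is 0, the rest of the packet is uncompressed; otherwise, it is compressed.
  4. If compressed, decompress the remaining bytes using zlib to recover the Packet ID and Data. The result should be exactly data length bytes long.
  5. Read the packet ID as a VarInt, then decode the packet data based on the packet ID.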
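The header handling for the compressed format can be sketched as below. Because Rust's standard library has no zlib implementation, this sketch stops at classifying the frame and hands the still-compressed bytes back to the caller, who would inflate them with a zlib library of their choice. The `Frame` enum and `read_frame` function are hypothetical names for this example.

```rust
fn read_varint(buf: &[u8]) -> Option<(i32, usize)> {
    let mut result: u32 = 0;
    for (i, &b) in buf.iter().enumerate().take(5) {
        result |= ((b & 0x7F) as u32) << (7 * i);
        if b & 0x80 == 0 { return Some((result as i32, i + 1)); }
    }
    None
}

/// One frame read while compression is enabled.
#[derive(Debug, PartialEq, Eq)]
pub enum Frame<'a> {
    /// Data Length was 0: ID and Data are stored as-is.
    Uncompressed { id: i32, data: &'a [u8] },
    /// Data Length was positive: `zlib_bytes` must be inflated to
    /// `uncompressed_len` bytes before the ID and Data can be read.
    Compressed { uncompressed_len: usize, zlib_bytes: &'a [u8] },
}

/// Parses one frame from the front of `buf`.
/// Returns the frame and the total number of bytes consumed.
pub fn read_frame(buf: &[u8]) -> Option<(Frame<'_>, usize)> {
    let (packet_len, header) = read_varint(buf)?;
    if packet_len < 0 {
        return None;
    }
    let end = header.checked_add(packet_len as usize)?;
    let body = buf.get(header..end)?;
    let (data_len, n) = read_varint(body)?;
    let rest = &body[n..];
    if data_len == 0 {
        let (id, id_len) = read_varint(rest)?;
        Some((Frame::Uncompressed { id, data: &rest[id_len..] }, end))
    } else if data_len > 0 {
        let frame = Frame::Compressed {
            uncompressed_len: data_len as usize,
            zlib_bytes: rest,
        };
        Some((frame, end))
    } else {
        None // a negative Data Length is malformed
    }
}
```

After inflating a `Compressed` frame, the caller would check that the output length equals `uncompressed_len` and then read the packet ID as a VarInt from the start of the inflated bytes.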

Some packet notes

  1. Size limitation:
    • Maximum size: Packets cannot exceed 2^21 − 1 or 2097151 bytes, which is the maximum size representable by a 3-byte VarInt.
    • Length Field: The length field itself must not exceed 3 bytes, even if the encoded value falls within this limit. Encodings longer than necessary, up to 3 bytes, are permitted.
  2. Serverbound packets:
    • The uncompressed length of (Packet ID + Data) must not exceed 2^23 or 8388608 bytes. A length of exactly 2^23 is permissible, which differs from the compressed length limit. Notchian clients do not impose a limit on the uncompressed length of incoming compressed packets.
  3. Threshold Handling:
    • If the buffer size containing packet data and ID (as VarInt) is smaller than the specified threshold in Set Compression packet, it is sent uncompressed by setting data length as 0. This mimics a non-compressed format with an extra 0 between length and packet data.
  4. Compression management:
    • The Notchian server rejects compressed packets smaller than the threshold but accepts uncompressed packets that exceed the threshold. Compression can be disabled by sending Set Compression with a negative Threshold or omitting the packet altogether.