packed_data - Documentation for Ruby 3.4 (2024)

Certain Ruby core methods deal with packing and unpacking data:

  • Method Array#pack: Formats each element in array self into a binary string; returns that string.

  • Method String#unpack: Extracts data from string self, forming objects that become the elements of a new array; returns that array.

  • Method String#unpack1: Does the same, but unpacks and returns only the first extracted object.

Each of these methods accepts a string template, consisting of zero or more directive characters, each followed by zero or more modifier characters.

Examples (directive 'C' specifies ‘unsigned character’):

[65].pack('C')      # => "A"  # One element, one directive.
[65, 66].pack('CC') # => "AB" # Two elements, two directives.
[65, 66].pack('C')  # => "A"  # Extra element is ignored.
[65].pack('')       # => ""   # No directives.
[65].pack('CC')     # Extra directive raises ArgumentError.

'A'.unpack('C')   # => [65]      # One character, one directive.
'AB'.unpack('CC') # => [65, 66]  # Two characters, two directives.
'AB'.unpack('C')  # => [65]      # Extra character is ignored.
'A'.unpack('CC')  # => [65, nil] # Extra directive generates nil.
'AB'.unpack('')   # => []        # No directives.

The string template may contain any mixture of valid directives (directive 'c' specifies ‘signed character’):

[65, -1].pack('cC')  # => "A\xFF"
"A\xFF".unpack('cC') # => [65, 255]

The string template may contain whitespace (which is ignored) and comments; each comment begins with character '#' and continues up to and including the next newline:

[0,1].pack(" C #foo \n C ")    # => "\x00\x01"
"\0\1".unpack(" C #foo \n C ") # => [0, 1]
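
Because whitespace and comments are ignored, a longer template can be written across several lines with one annotated directive per line. The following is a minimal sketch; the header layout and the constant name HEADER_TEMPLATE are hypothetical and serve only as an illustration:

```ruby
# Hypothetical header layout, invented for this illustration.
HEADER_TEMPLATE = <<~TEMPLATE
  n  # message type, 16-bit big-endian
  n  # flags, 16-bit big-endian
  N  # payload length, 32-bit big-endian
TEMPLATE

# The same annotated template drives both packing and unpacking.
packed = [1, 0, 258].pack(HEADER_TEMPLATE)
packed.bytes                   # => [0, 1, 0, 0, 0, 0, 1, 2]
packed.unpack(HEADER_TEMPLATE) # => [1, 0, 258]
```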

Any directive may be followed by either of these modifiers:

  • '*' - The directive is to be applied as many times as needed:

    [65, 66].pack('C*') # => "AB"
    'AB'.unpack('C*')   # => [65, 66]
  • Integer count - The directive is to be applied count times:

    [65, 66].pack('C2') # => "AB"
    [65, 66].pack('C3') # Raises ArgumentError.
    'AB'.unpack('C2')   # => [65, 66]
    'AB'.unpack('C3')   # => [65, 66, nil]

    Note: Directives in %w[A a Z m] use count differently; see String Directives.

If an element doesn’t fit the provided directive, only its least significant bits are encoded:

[257].pack("C").unpack("C") # => [1]
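
Since out-of-range values are truncated silently, a caller that wants an error instead has to check the range first. The sketch below assumes a hypothetical helper named pack_bytes_strict; it is not part of Ruby core:

```ruby
# Hypothetical helper: raise instead of silently truncating.
def pack_bytes_strict(values)
  values.each do |value|
    unless (0..255).cover?(value)
      raise RangeError, "#{value} does not fit in one byte"
    end
  end
  values.pack('C*')
end

# Truncation keeps the low 8 bits, which is the same as masking with 0xFF.
[257].pack('C') == [257 & 0xFF].pack('C') # => true
pack_bytes_strict([65, 66])               # => "AB"
```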

Packing Method

Method Array#pack accepts optional keyword argument buffer that specifies the target string (instead of a new string):

[65, 66].pack('C*', buffer: 'foo') # => "fooAB"
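
One possible use of buffer is to accumulate several packed records in a single string rather than allocating a new string per record. A small sketch, with the record values chosen arbitrarily:

```ruby
buffer = String.new # Empty, mutable binary string.

# Each call appends its packed bytes to the same buffer.
[[1, 2], [3, 4]].each do |first, second|
  [first, second].pack('CC', buffer: buffer)
end

buffer.bytes # => [1, 2, 3, 4]
```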

The method can accept a block:

# Packed string is passed to the block.
[65, 66].pack('C*') {|s| p s } # => "AB"

Unpacking Methods

Methods String#unpack and String#unpack1 each accept an optional keyword argument offset that specifies an offset into the string:

'ABC'.unpack('C*', offset: 1)  # => [66, 67]
'ABC'.unpack1('C*', offset: 1) # => 66

Both methods can accept a block:

# Each unpacked object is passed to the block.
ret = []
"ABCD".unpack("C*") {|c| ret << c }
ret # => [65, 66, 67, 68]

# The single unpacked object is passed to the block.
'AB'.unpack1('C*') {|ele| p ele } # => 65
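
The offset argument makes it possible to walk through a string of variable-length records without slicing it. The sketch below assumes a hypothetical wire format in which each record is a one-byte length followed by that many bytes:

```ruby
# Hypothetical format: 1-byte length prefix, then the payload bytes.
data = [3, 'foo', 2, 'hi'].pack('Ca3Ca2')

records = []
offset = 0
while offset < data.bytesize
  length = data.unpack1('C', offset: offset)                 # Read the length prefix.
  records << data.unpack1("a#{length}", offset: offset + 1)  # Read the payload.
  offset += 1 + length                                       # Advance to the next record.
end

records # => ["foo", "hi"]
```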

Integer Directives

Each integer directive specifies the packing or unpacking for one element in the input or output array.

8-Bit Integer Directives

  • 'c' - 8-bit signed integer (like C signed char):

    [0, 1, 255].pack('c*')    # => "\x00\x01\xFF"
    s = [0, 1, -1].pack('c*') # => "\x00\x01\xFF"
    s.unpack('c*')            # => [0, 1, -1]
  • 'C' - 8-bit unsigned integer (like C unsigned char):

    [0, 1, 255].pack('C*')    # => "\x00\x01\xFF"
    s = [0, 1, -1].pack('C*') # => "\x00\x01\xFF"
    s.unpack('C*')            # => [0, 1, 255]

16-Bit Integer Directives

  • 's' - 16-bit signed integer, native-endian (like C int16_t):

    [513, -514].pack('s*')      # => "\x01\x02\xFE\xFD"
    s = [513, 65022].pack('s*') # => "\x01\x02\xFE\xFD"
    s.unpack('s*')              # => [513, -514]
  • 'S' - 16-bit unsigned integer, native-endian (like C uint16_t):

    [513, -514].pack('S*')      # => "\x01\x02\xFE\xFD"
    s = [513, 65022].pack('S*') # => "\x01\x02\xFE\xFD"
    s.unpack('S*')              # => [513, 65022]
  • 'n' - 16-bit network integer, big-endian:

    s = [0, 1, -1, 32767, -32768, 65535].pack('n*')
    # => "\x00\x00\x00\x01\xFF\xFF\x7F\xFF\x80\x00\xFF\xFF"
    s.unpack('n*')
    # => [0, 1, 65535, 32767, 32768, 65535]
  • 'v' - 16-bit VAX integer, little-endian:

    s = [0, 1, -1, 32767, -32768, 65535].pack('v*')
    # => "\x00\x00\x01\x00\xFF\xFF\xFF\x7F\x00\x80\xFF\xFF"
    s.unpack('v*')
    # => [0, 1, 65535, 32767, 32768, 65535]

32-Bit Integer Directives

  • 'l' - 32-bit signed integer, native-endian (like C int32_t):

    s = [67305985, -50462977].pack('l*')
    # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
    s.unpack('l*')
    # => [67305985, -50462977]
  • 'L' - 32-bit unsigned integer, native-endian (like C uint32_t):

    s = [67305985, 4244504319].pack('L*')
    # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
    s.unpack('L*')
    # => [67305985, 4244504319]
  • 'N' - 32-bit network integer, big-endian:

    s = [0,1,-1].pack('N*')
    # => "\x00\x00\x00\x00\x00\x00\x00\x01\xFF\xFF\xFF\xFF"
    s.unpack('N*')
    # => [0, 1, 4294967295]
  • 'V' - 32-bit VAX integer, little-endian:

    s = [0,1,-1].pack('V*')
    # => "\x00\x00\x00\x00\x01\x00\x00\x00\xFF\xFF\xFF\xFF"
    s.unpack('V*')
    # => [0, 1, 4294967295]
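
As one illustration of the 32-bit network directive, a dotted IPv4 address can be converted to and from its big-endian integer form by combining 'C4' with 'N'. The helper names below are hypothetical and not part of Ruby core:

```ruby
# Hypothetical helpers for illustration only.
def ipv4_to_integer(dotted)
  # Four octets packed as bytes, then read back as one big-endian 32-bit integer.
  dotted.split('.').map(&:to_i).pack('C4').unpack1('N')
end

def integer_to_ipv4(number)
  # One big-endian 32-bit integer, read back as four separate octets.
  [number].pack('N').unpack('C4').join('.')
end

ipv4_to_integer('192.168.0.1') # => 3232235521
integer_to_ipv4(3232235521)    # => "192.168.0.1"
```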

64-Bit Integer Directives

  • 'q' - 64-bit signed integer, native-endian (like C int64_t):

    s = [578437695752307201, -506097522914230529].pack('q*')
    # => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8"
    s.unpack('q*')
    # => [578437695752307201, -506097522914230529]
  • 'Q' - 64-bit unsigned integer, native-endian (like C uint64_t):

    s = [578437695752307201, 17940646550795321087].pack('Q*')
    # => "\x01\x02\x03\x04\x05\x06\a\b\xFF\xFE\xFD\xFC\xFB\xFA\xF9\xF8"
    s.unpack('Q*')
    # => [578437695752307201, 17940646550795321087]
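
The signed and unsigned directives of a given width produce and consume the same bytes; they differ only in how those bytes are interpreted. A short sketch of reinterpreting 64-bit values by packing with one directive and unpacking with the other:

```ruby
# Same 8 bytes, read once as unsigned and once as signed.
[2**64 - 1].pack('Q').unpack1('q') # => -1
[-1].pack('q').unpack1('Q')        # => 18446744073709551615
[2**63].pack('Q').unpack1('q')     # => -9223372036854775808
```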

Platform-Dependent Integer Directives

  • 'i' - Platform-dependent width signed integer, native-endian (like C int):

    s = [67305985, -50462977].pack('i*')
    # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
    s.unpack('i*')
    # => [67305985, -50462977]
  • 'I' - Platform-dependent width unsigned integer, native-endian (like C unsigned int):

    s = [67305985, -50462977].pack('I*')
    # => "\x01\x02\x03\x04\xFF\xFE\xFD\xFC"
    s.unpack('I*')
    # => [67305985, 4244504319]
  • 'j' - Pointer-width signed integer, native-endian (like C intptr_t):

    s = [67305985, -50462977].pack('j*')
    # => "\x01\x02\x03\x04\x00\x00\x00\x00\xFF\xFE\xFD\xFC\xFF\xFF\xFF\xFF"
    s.unpack('j*')
    # => [67305985, -50462977]
  • 'J' - Pointer-width unsigned integer, native-endian (like C uintptr_t):

    s = [67305985, 4244504319].pack('J*')
    # => "\x01\x02\x03\x04\x00\x00\x00\x00\xFF\xFE\xFD\xFC\x00\x00\x00\x00"
    s.unpack('J*')
    # => [67305985, 4244504319]
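
The outputs shown for the platform-dependent directives come from one particular platform and may differ elsewhere. A rough way to inspect the current platform is to pack a known value and look at the result; the variable names here are illustrative, and no expected values are given because they vary by platform:

```ruby
# Native byte order: compare native 16-bit packing with the little-endian 'v'.
native_order = [1].pack('S') == [1].pack('v') ? :little : :big

# Native widths, in bytes, of C int and of a pointer-sized integer.
int_bytes     = [0].pack('i').bytesize
pointer_bytes = [0].pack('j').bytesize
```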

Other Integer Directives

  • 'U' - UTF-8 character:

    s = [4194304].pack('U*')
    # => "\xF8\x90\x80\x80\x80"
    s.unpack('U*')
    # => [4194304]
  • 'w' - BER-encoded integer (see BER encoding):

    s = [1073741823].pack('w*')
    # => "\x83\xFF\xFF\xFF\x7F"
    s.unpack('w*')
    # => [1073741823]
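
To make the 'w' output easier to read: a BER-compressed integer stores seven bits per byte, most significant group first, with the high bit set on every byte except the last. The following hand-written encoder is a sketch for comparison only (the method name ber_encode is hypothetical, and it handles non-negative integers only):

```ruby
# Hypothetical reference encoder for non-negative integers.
def ber_encode(number)
  bytes = [number & 0x7F]                   # Last byte: low 7 bits, high bit clear.
  while (number >>= 7) > 0
    bytes.unshift((number & 0x7F) | 0x80)   # Earlier bytes: 7 bits, high bit set.
  end
  bytes.pack('C*')
end

ber_encode(300).bytes                              # => [130, 44]
ber_encode(300) == [300].pack('w')                 # => true
ber_encode(1073741823) == [1073741823].pack('w')   # => true
```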

Modifiers for Integer Directives

For the following directives, a '!' or '_' modifier may be suffixed to select the underlying platform’s native size.

  • 'i', 'I' - C int, always native size.

  • 's', 'S' - C short.

  • 'l', 'L' - C long.

  • 'q', 'Q' - C long long, if available.

  • 'j', 'J' - C intptr_t, always native size.

Native size modifiers are silently ignored for directives that are always native size.

An endian modifier may also be suffixed to the directives above:

  • '>' - Big-endian.

  • '<' - Little-endian.
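
A brief sketch of the endian modifiers applied to otherwise native-endian directives, followed by the native size modifier. The last two results are left unstated because they depend on the platform:

```ruby
# Explicit byte order, independent of the platform.
[0x0102].pack('S>').bytes     # => [1, 2]
[0x0102].pack('S<').bytes     # => [2, 1]
[0x01020304].pack('L>').bytes # => [1, 2, 3, 4]
[-2].pack('l<').bytes         # => [254, 255, 255, 255]
"\xFF\xFE".b.unpack1('s>')    # => -2

# 'S>' yields the same bytes as the network directive 'n'.
[0x0102].pack('S>') == [0x0102].pack('n') # => true

# Native sizes of C short and C long on the current platform.
short_bytes = [0].pack('s!').bytesize
long_bytes  = [0].pack('l!').bytesize
```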

Float Directives

Each float directive specifies the packing or unpacking for one element in the input or output array.

Single-Precision Float Directives

  • 'F' or 'f' - Native format:

    s = [3.0].pack('F') # => "\x00\x00@@"
    s.unpack('F')       # => [3.0]
  • 'e' - Little-endian:

    s = [3.0].pack('e') # => "\x00\x00@@"
    s.unpack('e')       # => [3.0]
  • 'g' - Big-endian:

    s = [3.0].pack('g') # => "@@\x00\x00"
    s.unpack('g')       # => [3.0]

Double-Precision Float Directives

  • 'D' or 'd' - Native format:

    s = [3.0].pack('D') # => "\x00\x00\x00\x00\x00\x00\b@"
    s.unpack('D')       # => [3.0]
  • 'E' - Little-endian:

    s = [3.0].pack('E') # => "\x00\x00\x00\x00\x00\x00\b@"
    s.unpack('E')       # => [3.0]
  • 'G' - Big-endian:

    s = [3.0].pack('G') # => "@\b\x00\x00\x00\x00\x00\x00"
    s.unpack('G')       # => [3.0]

A float directive may pack or unpack infinity or not-a-number:

inf = 1.0/0.0 # => Infinity
[inf].pack('f') # => "\x00\x00\x80\x7F"
"\x00\x00\x80\x7F".unpack('f') # => [Infinity]

nan = inf/inf # => NaN
[nan].pack('f') # => "\x00\x00\xC0\x7F"
"\x00\x00\xC0\x7F".unpack('f') # => [NaN]
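
The examples above use 3.0, which is exactly representable in both precisions. For most values, a round trip through a single-precision directive loses precision, whereas a double-precision round trip preserves the Ruby Float. A small sketch using 0.1:

```ruby
single = [0.1].pack('e').unpack1('e') # Round trip through 32 bits.
double = [0.1].pack('E').unpack1('E') # Round trip through 64 bits.

single == 0.1             # => false
double == 0.1             # => true
(single - 0.1).abs < 1e-7 # => true
```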

String Directives

Each string directive specifies the packing or unpacking for one byte in the input or output string.

Binary String Directives

  • 'A' - Arbitrary binary string (space padded; count is width); nil is treated as the empty string:

    ['foo'].pack('A')  # => "f"
    ['foo'].pack('A*') # => "foo"
    ['foo'].pack('A2') # => "fo"
    ['foo'].pack('A4') # => "foo "
    [nil].pack('A')    # => " "
    [nil].pack('A*')   # => ""
    [nil].pack('A2')   # => "  "
    [nil].pack('A4')   # => "    "

    "foo\0".unpack('A')      # => ["f"]
    "foo\0".unpack('A4')     # => ["foo"]
    "foo\0bar".unpack('A10') # => ["foo\x00bar"] # Reads past "\0".
    "foo ".unpack('A')       # => ["f"]
    "foo ".unpack('A4')      # => ["foo"]
    "foo".unpack('A4')       # => ["foo"]

    russian = "\u{442 435 441 442}" # => "тест"
    russian.size                    # => 4
    russian.bytesize                # => 8
    [russian].pack('A')             # => "\xD1"
    [russian].pack('A*')            # => "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82"
    russian.unpack('A')             # => ["\xD1"]
    russian.unpack('A2')            # => ["\xD1\x82"]
    russian.unpack('A4')            # => ["\xD1\x82\xD0\xB5"]
    russian.unpack('A*')            # => ["\xD1\x82\xD0\xB5\xD1\x81\xD1\x82"]
  • 'a' - Arbitrary binary string (null padded; count is width):

    ["foo"].pack('a')    # => "f"
    ["foo"].pack('a*')   # => "foo"
    ["foo"].pack('a2')   # => "fo"
    ["foo\0"].pack('a4') # => "foo\x00"
    [nil].pack('a')      # => "\x00"
    [nil].pack('a*')     # => ""
    [nil].pack('a2')     # => "\x00\x00"
    [nil].pack('a4')     # => "\x00\x00\x00\x00"

    "foo\0".unpack('a')     # => ["f"]
    "foo\0".unpack('a4')    # => ["foo\x00"]
    "foo ".unpack('a4')     # => ["foo "]
    "foo".unpack('a4')      # => ["foo"]
    "foo\0bar".unpack('a4') # => ["foo\x00"] # Reads past "\0".
  • 'Z' - Same as 'a', except that null is added or ignored with '*':

    ["foo"].pack('Z*')       # => "foo\x00"
    [nil].pack('Z*')         # => "\x00"
    "foo\0".unpack('Z*')     # => ["foo"]
    "foo".unpack('Z*')       # => ["foo"]
    "foo\0bar".unpack('Z*')  # => ["foo"] # Does not read past "\0".
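
The binary string directives combine naturally with integer directives to describe fixed-width records and null-terminated fields. The sketch below uses a hypothetical record layout (an 8-byte space-padded name followed by a 16-bit big-endian id) chosen only for illustration:

```ruby
# Hypothetical layout: 8-byte space-padded name, then a 16-bit big-endian id.
record = ['alice', 42].pack('A8n')
record.bytesize # => 10

# 'A' strips the trailing padding again when unpacking.
name, id = record.unpack('A8n')
name # => "alice"
id   # => 42

# With 'Z*', each directive consumes one null-terminated field.
"alice\0bob\0".unpack('Z*Z*') # => ["alice", "bob"]
```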

Bit String Directives

  • 'B' - Bit string (most significant bit first):

    ['11111111' + '00000000'].pack('B*') # => "\xFF\x00"
    ['10000000' + '01000000'].pack('B*') # => "\x80@"

    ['1'].pack('B0') # => ""
    ['1'].pack('B1') # => "\x80"
    ['1'].pack('B2') # => "\x80\x00"
    ['1'].pack('B3') # => "\x80\x00"
    ['1'].pack('B4') # => "\x80\x00\x00"
    ['1'].pack('B5') # => "\x80\x00\x00"
    ['1'].pack('B6') # => "\x80\x00\x00\x00"

    "\xff\x00".unpack("B*") # => ["1111111100000000"]
    "\x01\x02".unpack("B*") # => ["0000000100000010"]

    "".unpack("B0")     # => [""]
    "\x80".unpack("B1") # => ["1"]
    "\x80".unpack("B2") # => ["10"]
    "\x80".unpack("B3") # => ["100"]
  • 'b' - Bit string (least significant bit first):

    ['11111111' + '00000000'].pack('b*') # => "\xFF\x00"
    ['10000000' + '01000000'].pack('b*') # => "\x01\x02"

    ['1'].pack('b0') # => ""
    ['1'].pack('b1') # => "\x01"
    ['1'].pack('b2') # => "\x01\x00"
    ['1'].pack('b3') # => "\x01\x00"
    ['1'].pack('b4') # => "\x01\x00\x00"
    ['1'].pack('b5') # => "\x01\x00\x00"
    ['1'].pack('b6') # => "\x01\x00\x00\x00"

    "\xff\x00".unpack("b*") # => ["1111111100000000"]
    "\x01\x02".unpack("b*") # => ["1000000001000000"]

    "".unpack("b0")     # => [""]
    "\x01".unpack("b1") # => ["1"]
    "\x01".unpack("b2") # => ["10"]
    "\x01".unpack("b3") # => ["100"]
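
The difference between 'B' and 'b' is easiest to see on a single byte whose bit pattern is not symmetric. A short sketch, with the flag value chosen arbitrarily:

```ruby
flags = 0b1010_0001
byte  = [flags].pack('C')

byte.unpack1('B8') # => "10100001" # Bits listed from the highest to the lowest.
byte.unpack1('b8') # => "10000101" # Bits listed from the lowest to the highest.

# Packing the 'B' string restores the original byte.
['10100001'].pack('B8').unpack1('C') == flags # => true
```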

Hex String Directives

  • 'H' - Hex string (high nibble first):

    ['10ef'].pack('H*') # => "\x10\xEF"
    ['10ef'].pack('H0') # => ""
    ['10ef'].pack('H3') # => "\x10\xE0"
    ['10ef'].pack('H5') # => "\x10\xEF\x00"

    ['fff'].pack('H3') # => "\xFF\xF0"
    ['fff'].pack('H4') # => "\xFF\xF0"
    ['fff'].pack('H5') # => "\xFF\xF0\x00"
    ['fff'].pack('H6') # => "\xFF\xF0\x00"
    ['fff'].pack('H7') # => "\xFF\xF0\x00\x00"
    ['fff'].pack('H8') # => "\xFF\xF0\x00\x00"

    "\x10\xef".unpack('H*') # => ["10ef"]
    "\x10\xef".unpack('H0') # => [""]
    "\x10\xef".unpack('H1') # => ["1"]
    "\x10\xef".unpack('H2') # => ["10"]
    "\x10\xef".unpack('H3') # => ["10e"]
    "\x10\xef".unpack('H4') # => ["10ef"]
    "\x10\xef".unpack('H5') # => ["10ef"]
  • 'h' - Hex string (low nibble first):

    ['10ef'].pack('h*') # => "\x01\xFE"
    ['10ef'].pack('h0') # => ""
    ['10ef'].pack('h3') # => "\x01\x0E"
    ['10ef'].pack('h5') # => "\x01\xFE\x00"

    ['fff'].pack('h3') # => "\xFF\x0F"
    ['fff'].pack('h4') # => "\xFF\x0F"
    ['fff'].pack('h5') # => "\xFF\x0F\x00"
    ['fff'].pack('h6') # => "\xFF\x0F\x00"
    ['fff'].pack('h7') # => "\xFF\x0F\x00\x00"
    ['fff'].pack('h8') # => "\xFF\x0F\x00\x00"

    "\x01\xfe".unpack('h*') # => ["10ef"]
    "\x01\xfe".unpack('h0') # => [""]
    "\x01\xfe".unpack('h1') # => ["1"]
    "\x01\xfe".unpack('h2') # => ["10"]
    "\x01\xfe".unpack('h3') # => ["10e"]
    "\x01\xfe".unpack('h4') # => ["10ef"]
    "\x01\xfe".unpack('h5') # => ["10ef"]
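
A common use of 'H*' is converting between binary data and its hexadecimal text form. The helper names below are hypothetical wrappers added for illustration:

```ruby
# Hypothetical convenience wrappers around the 'H*' directive.
def to_hex(binary)
  binary.unpack1('H*')
end

def from_hex(hex)
  [hex].pack('H*')
end

to_hex("\xDE\xAD\xBE\xEF".b) # => "deadbeef"
from_hex('deadbeef').bytes   # => [222, 173, 190, 239]
```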

Pointer String Directives

  • 'P' - Pointer to a structure (fixed-length string):

    s = ['abc'].pack('P')  # => "\xE0O\x7F\xE5\xA1\x01\x00\x00"
    s.unpack('P*')         # => ["abc"]
    ".".unpack("P")        # => []
    ("\0" * 8).unpack("P") # => [nil]
    [nil].pack("P")        # => "\x00\x00\x00\x00\x00\x00\x00\x00"
  • 'p' - Pointer to a null-terminated string:

    s = ['abc'].pack('p')  # => "(\xE4u\xE5\xA1\x01\x00\x00"
    s.unpack('p*')         # => ["abc"]
    ".".unpack("p")        # => []
    ("\0" * 8).unpack("p") # => [nil]
    [nil].pack("p")        # => "\x00\x00\x00\x00\x00\x00\x00\x00"

Other String Directives

  • 'M' - Quoted printable, MIME encoding; text mode, but input must use LF and output LF; (see RFC 2045):

    ["a b c\td \ne"].pack('M') # => "a b c\td =\n\ne=\n"
    ["\0"].pack('M')           # => "=00=\n"

    ["a"*1023].pack('M') == ("a"*73+"=\n")*14+"a=\n"     # => true
    ("a"*73+"=\na=\n").unpack('M') == ["a"*74]           # => true
    (("a"*73+"=\n")*14+"a=\n").unpack('M') == ["a"*1023] # => true

    "a b c\td =\n\ne=\n".unpack('M')  # => ["a b c\td \ne"]
    "=00=\n".unpack('M')              # => ["\x00"]

    "pre=31=32=33after".unpack('M') # => ["pre123after"]
    "pre=\nafter".unpack('M')       # => ["preafter"]
    "pre=\r\nafter".unpack('M')     # => ["preafter"]
    "pre=".unpack('M')              # => ["pre="]
    "pre=\r".unpack('M')            # => ["pre=\r"]
    "pre=hoge".unpack('M')          # => ["pre=hoge"]
    "pre==31after".unpack('M')      # => ["pre==31after"]
    "pre===31after".unpack('M')     # => ["pre===31after"]
  • 'm' - Base64 encoded string; count specifies input bytes between each newline, rounded down to nearest multiple of 3; if count is zero, no newlines are added; (see RFC 4648):

    [""].pack('m')             # => ""
    ["\0"].pack('m')           # => "AA==\n"
    ["\0\0"].pack('m')         # => "AAA=\n"
    ["\0\0\0"].pack('m')       # => "AAAA\n"
    ["\377"].pack('m')         # => "/w==\n"
    ["\377\377"].pack('m')     # => "//8=\n"
    ["\377\377\377"].pack('m') # => "////\n"

    "".unpack('m')       # => [""]
    "AA==\n".unpack('m') # => ["\x00"]
    "AAA=\n".unpack('m') # => ["\x00\x00"]
    "AAAA\n".unpack('m') # => ["\x00\x00\x00"]
    "/w==\n".unpack('m') # => ["\xFF"]
    "//8=\n".unpack('m') # => ["\xFF\xFF"]
    "////\n".unpack('m') # => ["\xFF\xFF\xFF"]
    "A\n".unpack('m')    # => [""]
    "AA\n".unpack('m')   # => ["\x00"]
    "AA=\n".unpack('m')  # => ["\x00"]
    "AAA\n".unpack('m')  # => ["\x00\x00"]

    [""].pack('m0')             # => ""
    ["\0"].pack('m0')           # => "AA=="
    ["\0\0"].pack('m0')         # => "AAA="
    ["\0\0\0"].pack('m0')       # => "AAAA"
    ["\377"].pack('m0')         # => "/w=="
    ["\377\377"].pack('m0')     # => "//8="
    ["\377\377\377"].pack('m0') # => "////"

    "".unpack('m0')     # => [""]
    "AA==".unpack('m0') # => ["\x00"]
    "AAA=".unpack('m0') # => ["\x00\x00"]
    "AAAA".unpack('m0') # => ["\x00\x00\x00"]
    "/w==".unpack('m0') # => ["\xFF"]
    "//8=".unpack('m0') # => ["\xFF\xFF"]
    "////".unpack('m0') # => ["\xFF\xFF\xFF"]
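
The effect of the count on line length can be seen with a longer input. In this sketch, a count of 30 puts 30 input bytes (40 Base64 characters) on each line, while a count of zero produces a single unbroken line; the input is arbitrary sample data:

```ruby
data = 'a' * 90 # Arbitrary sample input, 90 bytes.

wrapped = [data].pack('m30')  # 30 input bytes per line.
wrapped.lines.map(&:length)   # => [41, 41, 41] # 40 characters plus newline.

unwrapped = [data].pack('m0') # No newlines at all.
unwrapped.length              # => 120

# Both forms decode back to the original input.
wrapped.unpack1('m') == data    # => true
unwrapped.unpack1('m0') == data # => true
```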
  • 'u' - UU-encoded string:

    [""].pack("u")        # => ""
    ["a"].pack("u")       # => "!80``\n"
    ["aaa"].pack("u")     # => "#86%A\n"
    "".unpack("u")        # => [""]
    "#86)C\n".unpack("u") # => ["abc"]

Offset Directives

  • '@' - Begin packing at the given byte offset; for packing, null fill if necessary:

    [1, 2].pack("C@0C") # => "\x02"
    [1, 2].pack("C@1C") # => "\x01\x02"
    [1, 2].pack("C@5C") # => "\x01\x00\x00\x00\x00\x02"
    "\x01\x00\x00\x02".unpack("C@3C") # => [1, 2]
    "\x00".unpack("@1C") # => [nil]
  • 'X' - Back up a byte:

    [0, 1, 2].pack("CCXC")  # => "\x00\x02"
    [0, 1, 2].pack("CCX2C") # => "\x02"
    "\x00\x02".unpack("CCXC") # => [0, 2, 2]

Null Byte Directive

  • 'x' - Null byte:

    [].pack("x0") # => ""
    [].pack("x")  # => "\x00"
    [].pack("x8") # => "\x00\x00\x00\x00\x00\x00\x00\x00"
    "\x00\x00\x02".unpack("CxC") # => [0, 2]
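
The null byte and offset directives are often used for padding between fields. The sketch below assumes a hypothetical layout (a one-byte tag, three padding bytes, then a 32-bit little-endian value) and reads it back both by skipping the padding with 'x' and by jumping to an absolute offset with '@':

```ruby
# Hypothetical layout: 1-byte tag, 3 padding bytes, 32-bit little-endian value.
packed = [7, 1000].pack('Cx3V')
packed.bytes # => [7, 0, 0, 0, 232, 3, 0, 0]

packed.unpack('Cx3V') # => [7, 1000] # Skip three bytes after the tag.
packed.unpack('C@4V') # => [7, 1000] # Jump to byte offset 4 instead.
```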