Decoder

Constructor

Decoder(data: bytes, encoding: Encoding)

Creates a streaming decoder over data. The internal position starts at 0. Each decode_* call advances the position past the decoded element.
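To make the position-advancing behaviour concrete, here is a minimal pure-Python sketch of reading consecutive TLVs from a buffer. It is not synta's implementation: the helper name read_tlv is hypothetical, and the sketch only covers single-byte tags and definite lengths.

```python
def read_tlv(data: bytes, pos: int) -> tuple[int, bytes, int]:
    """Read one TLV starting at pos and return (tag_byte, value, new_pos).

    Hypothetical helper for illustration; not part of synta.
    """
    if pos >= len(data):
        raise EOFError("no data remains")
    if pos + 1 >= len(data):
        raise ValueError("truncated header")
    tag = data[pos]
    first_len = data[pos + 1]
    pos += 2
    if first_len < 0x80:
        length = first_len                      # short form: the byte is the length
    else:
        n = first_len & 0x7F                    # long form: next n bytes hold the length
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    value = data[pos:pos + length]
    if len(value) != length:
        raise ValueError("truncated element")
    return tag, value, pos + length             # new_pos points at the next element


data = b"\x02\x01\x2a\x05\x00"                  # INTEGER 42 followed by NULL
tag1, value1, pos = read_tlv(data, 0)           # position moves past the INTEGER
tag2, value2, pos = read_tlv(data, pos)         # position moves past the NULL
```

After the second call, pos equals len(data), which is the condition is_empty() reports on a Decoder.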

Primitive decode methods

| Method | Returns | ASN.1 type | Tag |
| --- | --- | --- | --- |
| decode_integer() | Integer | INTEGER | 0x02 |
| decode_octet_string() | OctetString | OCTET STRING | 0x04 |
| decode_oid() | ObjectIdentifier | OBJECT IDENTIFIER | 0x06 |
| decode_bit_string() | BitString | BIT STRING | 0x03 |
| decode_boolean() | Boolean | BOOLEAN | 0x01 |
| decode_utc_time() | UtcTime | UTCTime | 0x17 |
| decode_generalized_time() | GeneralizedTime | GeneralizedTime | 0x18 |
| decode_null() | Null | NULL | 0x05 |
| decode_real() | Real | REAL | 0x09 |
| decode_utf8_string() | Utf8String | UTF8String | 0x0c |
| decode_printable_string() | PrintableString | PrintableString | 0x13 |
| decode_ia5_string() | IA5String | IA5String | 0x16 |
| decode_numeric_string() | NumericString | NumericString | 0x12 |
| decode_teletex_string() | TeletexString | TeletexString / T61String | 0x14 |
| decode_visible_string() | VisibleString | VisibleString | 0x1a |
| decode_general_string() | GeneralString | GeneralString | 0x1b |
| decode_universal_string() | UniversalString | UniversalString | 0x1c |
| decode_bmp_string() | BmpString | BMPString | 0x1e |
| decode_any() | any Python object | any element | |
| decode_any_str() | str | any string type | |

decode_any() dispatch table

decode_any() dispatches on the tag at the current position:

| ASN.1 type | Python value |
| --- | --- |
| BOOLEAN | Boolean |
| INTEGER | Integer |
| BIT STRING | BitString |
| OCTET STRING | OctetString |
| NULL | Null |
| OBJECT IDENTIFIER | ObjectIdentifier |
| UTF8String | Utf8String |
| PrintableString | PrintableString |
| IA5String | IA5String |
| NumericString | NumericString |
| TeletexString | TeletexString |
| VisibleString | VisibleString |
| GeneralString | GeneralString |
| UniversalString | UniversalString |
| BMPString | BmpString |
| UTCTime | UtcTime |
| GeneralizedTime | GeneralizedTime |
| SEQUENCE / SET | list of the above |
| Tagged | TaggedElement |
| Unknown universal | RawElement |
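The recursive shape of this dispatch (containers become lists of decoded children) can be sketched in plain Python. This is an illustration under simplifying assumptions, not synta's code: plain Python values stand in for the wrapper types, only a few tags are handled, and the names read_tlv and decode_any_sketch are hypothetical.

```python
def read_tlv(data: bytes, pos: int) -> tuple[int, bytes, int]:
    """Return (tag_byte, value, new_pos) for a TLV with a short-form length."""
    if pos + 1 >= len(data):
        raise ValueError("truncated header")
    tag, length = data[pos], data[pos + 1]
    if length >= 0x80:
        raise ValueError("long-form lengths are omitted from this sketch")
    end = pos + 2 + length
    if end > len(data):
        raise ValueError("truncated element")
    return tag, data[pos + 2:end], end


def decode_any_sketch(data: bytes, pos: int = 0):
    """Decode one element at pos and return (python_value, new_pos)."""
    tag, value, pos = read_tlv(data, pos)
    if tag == 0x01:                       # BOOLEAN
        result = value != b"\x00"
    elif tag == 0x02:                     # INTEGER (two's complement, big-endian)
        result = int.from_bytes(value, "big", signed=True)
    elif tag == 0x05:                     # NULL
        result = None
    elif tag in (0x30, 0x31):             # SEQUENCE / SET: recurse over the children
        result, inner = [], 0
        while inner < len(value):
            item, inner = decode_any_sketch(value, inner)
            result.append(item)
    else:                                 # stand-in for the remaining table rows
        result = (tag, value)
    return result, pos


decoded, end = decode_any_sketch(b"\x30\x06\x02\x01\x2a\x01\x01\xff")
```

Here decoded is the list [42, True], mirroring the "SEQUENCE / SET" row of the table.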

decode_any_str() encoding table

decode_any_str() reads one TLV and decodes it as a native Python str, applying the correct encoding for each of the nine ASN.1 string types:

| Tag | Type | Decoding |
| --- | --- | --- |
| 12 | UTF8String | UTF-8 (lossy) |
| 18 | NumericString | UTF-8 |
| 19 | PrintableString | UTF-8 |
| 20 | TeletexString / T61String | Latin-1 (each byte → U+0000–U+00FF) |
| 22 | IA5String | UTF-8 |
| 26 | VisibleString | UTF-8 |
| 27 | GeneralString | UTF-8 |
| 28 | UniversalString | UCS-4 big-endian |
| 30 | BMPString | UCS-2 big-endian |

Raises ValueError for any other tag; raises EOFError if the decoder is empty. This is the single-call replacement for the duck-typing probe on decode_any():

# Before — three-way probe:
val = decoder.decode_any()
if hasattr(val, 'as_str'):
    s = val.as_str()
elif hasattr(val, 'to_bytes'):
    s = val.to_bytes().decode('latin-1')
else:
    raise ValueError(f"not a string: {type(val)}")

# After — one call, correct encoding for all nine types:
s = decoder.decode_any_str()
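For readers who want to see what the encoding table amounts to, the following sketch maps each tag number to a Python codec. It approximates the table rather than reproducing synta's behaviour: the function name decode_string_value is hypothetical, utf-16-be is used as a stand-in for UCS-2, and strict error handling for every type except UTF8String is an assumption.

```python
# Tag number -> (Python codec, error handler), following the table above.
STRING_CODECS = {
    12: ("utf-8", "replace"),     # UTF8String, lossy
    18: ("utf-8", "strict"),      # NumericString
    19: ("utf-8", "strict"),      # PrintableString
    20: ("latin-1", "strict"),    # TeletexString / T61String
    22: ("utf-8", "strict"),      # IA5String
    26: ("utf-8", "strict"),      # VisibleString
    27: ("utf-8", "strict"),      # GeneralString
    28: ("utf-32-be", "strict"),  # UniversalString (UCS-4 big-endian)
    30: ("utf-16-be", "strict"),  # BMPString (UCS-2 approximated by UTF-16)
}


def decode_string_value(tag_number: int, value: bytes) -> str:
    """Decode the value bytes of an ASN.1 string element as a Python str."""
    try:
        codec, errors = STRING_CODECS[tag_number]
    except KeyError:
        raise ValueError(f"tag {tag_number} is not an ASN.1 string type") from None
    return value.decode(codec, errors)


teletex = decode_string_value(20, b"caf\xe9")        # Latin-1 byte 0xE9 is "é"
bmp = decode_string_value(30, b"\x00h\x00i")         # two bytes per character
universal = decode_string_value(28, b"\x00\x00\x00A")  # four bytes per character
```

The Latin-1 fallback in the "before" probe gives the right answer only for TeletexString; for BMPString and UniversalString it would interleave NUL characters, which is why a per-type codec is needed.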

Structured / container decode methods

| Method | Signature | Returns | Description |
| --- | --- | --- | --- |
| decode_sequence | () | Decoder | Consume a SEQUENCE TLV; return child decoder over its contents. |
| decode_set | () | Decoder | Consume a SET TLV; return child decoder over its contents. |
| decode_explicit_tag | (tag_num: int) | Decoder | Strip an explicit context-specific tag [tag_num]; return child decoder over the content. |
| decode_implicit_tag | (tag_num: int, tag_class: str) | Decoder | Strip an implicit tag; return child decoder over the value bytes only (no tag/length). tag_class is "Context", "Application", "Private", or "Universal". |
| decode_raw_tlv | () | bytes | Read the next complete TLV (tag + length + value) as raw bytes and advance past it. |
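The difference between explicit and implicit tagging is easiest to see at the byte level. The sketch below builds both encodings from the same inner TLV; the helper names wrap_explicit and retag_implicit are hypothetical and only cover context-class tags with low tag numbers and short lengths.

```python
def wrap_explicit(tag_num: int, inner_tlv: bytes) -> bytes:
    """[tag_num] EXPLICIT: add a constructed context tag around the whole TLV."""
    if not 0 <= tag_num < 31 or len(inner_tlv) >= 0x80:
        raise ValueError("sketch covers low tag numbers and short lengths only")
    return bytes([0xA0 | tag_num, len(inner_tlv)]) + inner_tlv


def retag_implicit(tag_num: int, inner_tlv: bytes) -> bytes:
    """[tag_num] IMPLICIT: replace the original tag, keep length and value."""
    if not 0 <= tag_num < 31:
        raise ValueError("sketch covers low tag numbers only")
    constructed_bit = inner_tlv[0] & 0x20      # preserved from the original type
    return bytes([0x80 | constructed_bit | tag_num]) + inner_tlv[1:]


integer_99 = b"\x02\x01\x63"                   # INTEGER 99
explicit = wrap_explicit(1, integer_99)        # a1 03 02 01 63
implicit = retag_implicit(1, integer_99)       # 81 01 63
```

In the explicit form the original INTEGER tag (0x02) survives inside the wrapper, so a child decoder can call decode_integer(). In the implicit form the original tag is gone, which is why decode_implicit_tag() hands back bare value bytes and the caller has to know the underlying type.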

Introspection helpers

| Method | Returns | Description |
| --- | --- | --- |
| peek_tag() | tuple[int, str, bool] | Returns (tag_number, tag_class, is_constructed) without advancing the position. Raises EOFError if no data remains. |
| remaining_bytes() | bytes | All bytes from the current position to the end. Useful after decode_implicit_tag to retrieve bare primitive value bytes. |
| is_empty() | bool | True when the current position equals the data length. |
| position() | int | Current byte offset. |
| remaining() | int | Number of bytes left. |
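The tuple returned by peek_tag() corresponds to the three fields packed into the identifier octet. The following pure-Python sketch shows that decomposition; peek_tag_sketch is a hypothetical name, not a synta function, and it only handles the low-tag-number form.

```python
# Class bits 00, 01, 10, 11, spelled the way tag_class is documented above.
TAG_CLASSES = ("Universal", "Application", "Context", "Private")


def peek_tag_sketch(data: bytes, pos: int = 0) -> tuple[int, str, bool]:
    """Return (tag_number, tag_class, is_constructed) without consuming bytes."""
    if pos >= len(data):
        raise EOFError("no data remains")
    octet = data[pos]
    tag_class = TAG_CLASSES[octet >> 6]        # top two bits
    is_constructed = bool(octet & 0x20)        # bit 6
    tag_number = octet & 0x1F                  # low five bits
    if tag_number == 0x1F:
        raise ValueError("high-tag-number form is omitted from this sketch")
    return tag_number, tag_class, is_constructed


sequence_tag = peek_tag_sketch(b"\x30\x00")    # SEQUENCE
context_0 = peek_tag_sketch(b"\xa0\x03\x02\x01\x2a")  # [0], constructed
context_2 = peek_tag_sketch(b"\x82\x01\x41")   # [2], primitive
```

This also explains why the tag bytes in the primitive table (for example 0x0c) match the decimal tag numbers in the decode_any_str() table (12): for primitive universal types the class and constructed bits are all zero.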

Full class stub

class Decoder:
    def __init__(self, data: bytes, encoding: Encoding) -> None: ...

    # Primitive types
    def decode_integer(self) -> Integer: ...
    def decode_octet_string(self) -> OctetString: ...
    def decode_oid(self) -> ObjectIdentifier: ...
    def decode_bit_string(self) -> BitString: ...
    def decode_boolean(self) -> Boolean: ...
    def decode_real(self) -> Real: ...
    def decode_null(self) -> Null: ...
    def decode_utc_time(self) -> UtcTime: ...
    def decode_generalized_time(self) -> GeneralizedTime: ...

    # String types
    def decode_utf8_string(self) -> Utf8String: ...
    def decode_printable_string(self) -> PrintableString: ...
    def decode_ia5_string(self) -> IA5String: ...
    def decode_numeric_string(self) -> NumericString: ...       # tag 18
    def decode_teletex_string(self) -> TeletexString: ...      # tag 20
    def decode_visible_string(self) -> VisibleString: ...      # tag 26
    def decode_general_string(self) -> GeneralString: ...      # tag 27
    def decode_universal_string(self) -> UniversalString: ...  # tag 28
    def decode_bmp_string(self) -> BmpString: ...              # tag 30

    # Constructed / tagged
    def decode_sequence(self) -> Decoder: ...
    # Reads a SEQUENCE TLV, advances past it, and returns a new Decoder over
    # the content bytes.  Raises ValueError if the next element is not a SEQUENCE.

    def decode_explicit_tag(self, tag_num: int) -> Decoder: ...
    # Reads an explicit context-specific tag [tag_num], advances past it, and
    # returns a new Decoder over the tagged content.
    # Raises ValueError if the tag number does not match.

    def decode_set(self) -> Decoder: ...
    # Reads a SET TLV (tag 0x31), advances past it, and returns a new Decoder
    # over the content bytes.  Raises ValueError if the next element is not a SET.

    def decode_implicit_tag(self, tag_num: int, tag_class: str) -> Decoder: ...
    # Strips an implicit tag of the given number and class and returns a new
    # Decoder over the raw value bytes.  tag_class must be "Universal",
    # "Context", "Application", or "Private".  Raises ValueError on mismatch.
    # The caller must know the original type.  For a constructed type the value
    # bytes are a series of child TLVs, so call the appropriate decode_* methods
    # on the returned Decoder; for a primitive type, read the bare value with
    # remaining_bytes() instead.
    #
    # Example ([0] IMPLICIT SEQUENCE whose first element is an INTEGER):
    #   raw_decoder = decoder.decode_implicit_tag(0, "Context")
    #   value = raw_decoder.decode_integer()

    def peek_tag(self) -> tuple[int, str, bool]: ...
    # Returns (tag_number, tag_class, is_constructed) of the next element without
    # consuming any bytes.  Raises EOFError if the decoder is empty.
    # Use for CHOICE dispatch or optional-field detection:
    #   tag_num, tag_class, _ = decoder.peek_tag()
    #   if tag_class == "Context" and tag_num == 0:
    #       version = decoder.decode_explicit_tag(0)

    def decode_raw_tlv(self) -> bytes: ...
    # Reads the complete next TLV (tag + length + value bytes) as a bytes object
    # and advances past it.  Useful when the element type is unknown or when
    # decoding should be deferred:
    #   tlv = decoder.decode_raw_tlv()
    #   inner = synta.Decoder(tlv, synta.Encoding.DER)

    def remaining_bytes(self) -> bytes: ...
    # Returns all remaining bytes from the current position without advancing.
    # Primarily useful after decode_implicit_tag() for **primitive** implicit
    # types whose raw value bytes cannot be decoded with the decode_* methods
    # (those expect a full TLV, but implicit stripping leaves only the value):
    #
    #   # Decode dNSName [2] IMPLICIT IA5String
    #   child = decoder.decode_implicit_tag(2, "Context")
    #   dns_name = child.remaining_bytes().decode("ascii")
    #
    #   # Decode iPAddress [7] IMPLICIT OCTET STRING
    #   child = decoder.decode_implicit_tag(7, "Context")
    #   ip_bytes = child.remaining_bytes()   # 4 or 16 raw bytes

    # Dynamic decoding
    def decode_any(self) -> object: ...
    # Returns a typed Python object.  Sequence/Set → list.
    # Tagged elements → TaggedElement.
    # Unknown universal tags → RawElement.

    def decode_any_str(self) -> str: ...
    # Decode any ASN.1 string type as a Python str (correct encoding per type).
    # Raises ValueError for non-string tags; EOFError if empty.

    # State
    def is_empty(self) -> bool: ...
    def position(self) -> int: ...
    def remaining(self) -> int: ...
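Once remaining_bytes() has returned the 4 or 16 raw bytes of an iPAddress value, as in the comment on remaining_bytes() above, the standard library can turn them into a printable address. A small sketch, where the function name format_ip_address is hypothetical:

```python
import ipaddress


def format_ip_address(ip_bytes: bytes) -> str:
    """Format the raw value bytes of an iPAddress [7] IMPLICIT OCTET STRING."""
    if len(ip_bytes) not in (4, 16):
        raise ValueError(f"expected 4 or 16 bytes, got {len(ip_bytes)}")
    # ipaddress.ip_address() accepts packed bytes: 4 for IPv4, 16 for IPv6.
    return str(ipaddress.ip_address(ip_bytes))


ipv4_text = format_ip_address(b"\xc0\x00\x02\x01")
ipv6_text = format_ip_address(bytes(15) + b"\x01")
```

With synta, the input would typically be child.remaining_bytes() taken right after decoder.decode_implicit_tag(7, "Context").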

Usage examples

Decoding ASN.1 data

import synta

# Decode an INTEGER
data = b'\x02\x01\x2A'  # DER-encoded INTEGER 42
decoder = synta.Decoder(data, synta.Encoding.DER)
integer = decoder.decode_integer()
print(integer.to_int())  # Output: 42

# Decode an OBJECT IDENTIFIER
oid_data = b'\x06\x09\x2a\x86\x48\x86\xf7\x0d\x01\x01\x01'
decoder = synta.Decoder(oid_data, synta.Encoding.DER)
oid = decoder.decode_oid()
print(str(oid))  # Output: 1.2.840.113549.1.1.1

# Decode an OCTET STRING
octet_data = b'\x04\x05hello'
decoder = synta.Decoder(octet_data, synta.Encoding.DER)
octet_string = decoder.decode_octet_string()
print(octet_string.to_bytes())  # Output: b'hello'

# Decode a NULL
null_data = b'\x05\x00'
decoder = synta.Decoder(null_data, synta.Encoding.DER)
null = decoder.decode_null()

# Decode a REAL (IEEE 754 double)
real_data = b'\x09\x01\x40'  # PLUS-INFINITY
decoder = synta.Decoder(real_data, synta.Encoding.DER)
r = decoder.decode_real()
import math
assert math.isinf(float(r))

# Decode any element dynamically
data = b'\x02\x01\x2A'
decoder = synta.Decoder(data, synta.Encoding.DER)
obj = decoder.decode_any()  # Returns Integer, OctetString, list (Sequence/Set), etc.
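To see why the OBJECT IDENTIFIER bytes above print as 1.2.840.113549.1.1.1, the value bytes can be decoded by hand. The sketch below is independent of synta (decode_oid_value is a hypothetical helper) and follows the standard base-128 rule: seven payload bits per byte, with the high bit marking continuation.

```python
def decode_oid_value(value: bytes) -> str:
    """Convert OBJECT IDENTIFIER value bytes to dotted notation."""
    if not value or value[-1] & 0x80:
        raise ValueError("empty or truncated OID value")
    subidentifiers, current = [], 0
    for byte in value:
        current = (current << 7) | (byte & 0x7F)   # append seven payload bits
        if not byte & 0x80:                        # high bit clear ends the group
            subidentifiers.append(current)
            current = 0
    # The first subidentifier packs the first two arcs as 40 * arc1 + arc2.
    first = subidentifiers[0]
    arc1 = min(first // 40, 2)
    arc2 = first - 40 * arc1
    return ".".join(str(arc) for arc in (arc1, arc2, *subidentifiers[1:]))


# Value bytes of the example above, without the 06 09 tag and length header.
rsa_encryption = decode_oid_value(b"\x2a\x86\x48\x86\xf7\x0d\x01\x01\x01")
```

The leading 0x2a is 42 = 40 * 1 + 2, giving the arcs 1.2; the pair 86 48 yields 840 and the triple 86 f7 0d yields 113549.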

Decoding SEQUENCE structures

Use decode_sequence() to enter a SEQUENCE and get a child Decoder positioned over the content bytes. Iterate with typed decode_* methods and is_empty().

import synta

# Encoded SEQUENCE { INTEGER 42, BOOLEAN TRUE }
data = b'\x30\x06\x02\x01\x2a\x01\x01\xff'

decoder = synta.Decoder(data, synta.Encoding.DER)
child = decoder.decode_sequence()    # advances past the outer SEQUENCE TLV
assert decoder.is_empty()

while not child.is_empty():
    obj = child.decode_any()         # INTEGER, then BOOLEAN

# Decode an explicit context tag [1] wrapping an INTEGER
tagged_data = b'\xa1\x03\x02\x01\x63'  # [1] EXPLICIT INTEGER 99 (minimal DER encoding)
decoder = synta.Decoder(tagged_data, synta.Encoding.DER)
child = decoder.decode_explicit_tag(1)   # raises ValueError if tag != [1]
integer = child.decode_integer()
assert integer.to_int() == 99
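The loops above can be packaged as small helpers. The sketch below uses only the methods documented on this page (is_empty(), decode_any(), peek_tag(), decode_explicit_tag()) and accepts any object that provides them; the helper names decode_all and decode_optional_explicit are hypothetical and not part of synta.

```python
def decode_all(decoder) -> list:
    """Drain a decoder, returning every remaining element via decode_any()."""
    items = []
    while not decoder.is_empty():
        items.append(decoder.decode_any())
    return items


def decode_optional_explicit(decoder, tag_num: int):
    """Return a child decoder for an OPTIONAL [tag_num] EXPLICIT field, or None.

    The position is left untouched when the field is absent.
    """
    if decoder.is_empty():
        return None
    number, tag_class, _ = decoder.peek_tag()
    if tag_class == "Context" and number == tag_num:
        return decoder.decode_explicit_tag(tag_num)
    return None
```

A typical use would be fields = decode_all(decoder.decode_sequence()) for a SEQUENCE whose element types are not known in advance, or decode_optional_explicit(child, 0) for a leading optional version field.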