Appligent Labs

PDF Object Types

Written by Mark Gavin | Apr 2, 2013 4:00:00 AM

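A PDF file is built from eight (8) basic types of objects, called “CosObjects”. The table below has nine rows only because it lists the two numeric types, Integer and Real, separately. These objects are the core building blocks that make up the body of a PDF file.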

Objects can be either direct or indirect. Direct objects are just “inline values”. For example, in /Filter /FlateDecode the key is Filter and its value is the direct name FlateDecode. Indirect objects have an object ID and a generation number, and can be referenced by other objects within the PDF file. For example, in /Contents 2 0 R the key is Contents and 2 0 R is an indirect reference to a contents stream or a contents array. Please see the PDF Hello World post for examples of how direct and indirect objects are used.
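To make the "2 0 R" notation concrete, here is a minimal Python sketch that pulls indirect references out of a fragment of PDF syntax. The function name find_indirect_refs and the sample dictionary are assumptions made for illustration. A real parser must also skip strings, comments, and stream data, which this regular expression does not attempt.

```python
import re

# An indirect reference is: object ID, generation number, then the keyword R.
INDIRECT_REF = re.compile(rb"(\d+)\s+(\d+)\s+R\b")

def find_indirect_refs(data: bytes) -> list[tuple[int, int]]:
    """Return (object ID, generation number) pairs found in a PDF fragment."""
    return [(int(num), int(gen)) for num, gen in INDIRECT_REF.findall(data)]

# Hypothetical page dictionary with two indirect references.
fragment = b"<< /Type /Page /Parent 1 0 R /Contents 2 0 R >>"
print(find_indirect_refs(fragment))  # [(1, 0), (2, 0)]
```

The direct values in the fragment (/Type /Page) are not matched, since they carry no object ID or generation number.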

Type        Description                                   Examples
Null        This really is a valid object                 null
Boolean     True or False                                 true, false
Integer     Integers                                      1, 2, 3, 100, 208
Real        Real Numbers                                  0.05, 0.25, 130.23
Name        Key Names and Labels                          /Type, /Page, /ThisIsName37, /UTF8Name#007
String      PDFDocEncoding, UTF-16BE or Hex               (Testing), <FEFF0040>, <1C2D3F>
Array       Heterogeneous Ordered Collection of Objects   [ 0 0 612 792 ], [ (T) -20.5 (H) 4 (E) ]
Dictionary  Key Name and Value Pairs                      << /Type /Page /Author (Mark Gavin) /Resources << /Font [ /F1 /F2 ] >> >>
Stream      Dictionary + Data                             << /Type /XObject /Subtype /Image /Filter /FlateDecode >> stream … endstream
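As a rough illustration of how the types in the table can be told apart by their leading characters, the following Python sketch classifies a single, already-isolated token. The function classify is hypothetical and deliberately simplistic: it does not tokenize a file, validate nesting, or handle comments, and it only recognizes a stream by the keyword that follows its dictionary.

```python
import re

def classify(token: str) -> str:
    """Guess the PDF object type of one isolated token (illustrative only)."""
    token = token.strip()
    if token == "null":
        return "Null"
    if token in ("true", "false"):
        return "Boolean"
    if re.fullmatch(r"[+-]?\d+", token):
        return "Integer"
    if re.fullmatch(r"[+-]?(\d+\.\d*|\.\d+)", token):
        return "Real"
    if token.startswith("/"):
        return "Name"
    if token.startswith("<<"):
        # A stream is a dictionary followed by the keyword "stream".
        return "Stream" if re.search(r">>\s*stream\b", token) else "Dictionary"
    if token.startswith("(") or token.startswith("<"):
        return "String"  # literal string or hex string
    if token.startswith("["):
        return "Array"
    return "Unknown"

for sample in ["null", "true", "208", "130.23", "/Type", "(Testing)",
               "<1C2D3F>", "[ 0 0 612 792 ]", "<< /Type /Page >>"]:
    print(f"{sample!r:22} {classify(sample)}")
```

Note that the check for "<<" has to come before the check for "<", otherwise every dictionary would be mistaken for a hex string.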

Streams are large blocks of data that commonly hold content operators and/or images. Each stream is preceded by a dictionary that describes the data, such as its length and any filters used to encode it.
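The pairing of a descriptive dictionary with a block of data can be sketched in Python using the standard zlib module, whose output format is what the FlateDecode filter expects. The helper make_stream_object and the sample content operators are assumptions for illustration; a complete writer would also have to record the object's byte offset in the xref table.

```python
import zlib

def make_stream_object(obj_id: int, payload: bytes) -> bytes:
    """Serialize payload as an indirect stream object compressed with Flate."""
    compressed = zlib.compress(payload)
    # /Length is the number of bytes between "stream" and "endstream".
    dictionary = b"<< /Length %d /Filter /FlateDecode >>" % len(compressed)
    return b"%d 0 obj\n%s\nstream\n%s\nendstream\nendobj\n" % (
        obj_id, dictionary, compressed)

# Hypothetical page content: draw the text "Hello World" with font /F1.
content = b"BT /F1 24 Tf 72 720 Td (Hello World) Tj ET"
obj = make_stream_object(2, content)
print(obj[:60])
```

The generation number is fixed at 0 here, which is the usual value for a newly created object.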

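Indirect object references have both an object ID and a generation number. The object ID (called the object number in ISO 32000) is the index into the xref (cross-reference) table, which records where each object is located in the file. The generation number acts as a version number for that entry: it starts at 0 and is incremented when an object is freed, so that a later object reusing the same ID can be told apart from the old one.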
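To show how an object ID indexes the xref table, here is a small Python sketch that parses one classic xref subsection. The function parse_xref_subsection and the byte offsets in the sample are made up for illustration, and cross-reference streams (introduced in PDF 1.5) are not covered.

```python
def parse_xref_subsection(lines: list[str]) -> dict[int, tuple[int, int, bool]]:
    """Map object ID -> (offset field, generation number, in_use)."""
    # The header line gives the first object ID and the number of entries.
    first, count = (int(x) for x in lines[0].split())
    table = {}
    for i, line in enumerate(lines[1:1 + count]):
        offset, gen, flag = line.split()
        # "n" marks an in-use entry whose first field is a byte offset;
        # "f" marks a free entry whose first field is the next free object ID.
        table[first + i] = (int(offset), int(gen), flag == "n")
    return table

# Hypothetical subsection covering object IDs 0 through 2.
xref = [
    "0 3",
    "0000000000 65535 f",
    "0000000017 00000 n",
    "0000000081 00000 n",
]
for obj_id, entry in parse_xref_subsection(xref).items():
    print(obj_id, entry)
```

With this table, resolving a reference such as 2 0 R means looking up entry 2, checking that its generation number is 0, and reading the object at the recorded offset.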

More information on PDF object types can be found in ISO 32000-1, Section 7.3 (Objects).