Documents

pdf_doc_open()
Open a PDF document
pdf_doc_close()
Close a PDF document
pdf_page_count()
Count pages in a PDF document
pdf_doc_info()
Document-level metadata for a PDF
pdf_doc_meta()
Read one entry from a PDF's Info dictionary
pdf_doc_summary()
One-call summary of a PDF document
summary(<pdfium_doc>)
Document-level summary
pdf_parse_date()
Parse a PDF date string into POSIXct
pdf_doc_text()
Read every page's text in one call
pdf_doc_fonts()
Document-level rollup of every embedded / referenced font
pdf_doc_file_id()
Read the document's file identifier from its trailer
pdf_doc_page_mode()
Read the document's PageMode entry from its catalog
pdf_doc_permissions()
Permission flags from a PDF's encryption dictionary
pdf_doc_user_permissions()
User-level document permissions
pdf_doc_security()
Document security handler revision
pdf_doc_xref_valid()
Cross-reference table validity flag
pdf_doc_trailer_ends()
Byte offsets of every %%EOF trailer marker
pdf_doc_is_tagged()
Is the document marked as tagged?
pdf_doc_language()
Get the document's declared language
pdf_doc_javascript()
Enumerate document-level JavaScript actions
pdf_install_unsupported_handler()
Install PDFium's unsupported-feature event handler
pdf_drain_unsupported_features()
Read and clear the PDFium unsupported-feature event buffer
pdf_doc_focusable_subtypes()
Annotation subtypes registered as keyboard-focusable
pdf_doc_viewer_preferences()
Read the document's viewer preferences
pdf_doc_viewer_preference_by_name()
Look up a /ViewerPreferences name-typed entry by key
pdf_doc_named_dests()
Enumerate the document's named destinations
pdf_doc_named_dest_by_name()
Resolve a named destination by name
pdf_doc_bookmarks()
List the bookmark outline (table of contents) of a PDF
pdf_doc_bookmark_find()
Find a bookmark by its title
pdf_page_label()
Read the logical page label of a PDF page
pdf_page_labels()
Read every page's logical label in one call
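
For readers unfamiliar with the PDF date syntax that pdf_parse_date() targets, here is a minimal base-R sketch of the conversion. It assumes the standard `D:YYYYMMDDHHmmSSOHH'mm'` layout with optional trailing fields. The helper name `parse_pdf_date_sketch()` is hypothetical and handles a single string; it is not the package's implementation and says nothing about the arguments pdf_parse_date() accepts.

```r
# Hypothetical helper (not part of the package): parse one PDF date string.
parse_pdf_date_sketch <- function(x) {
  x <- sub("^D:", "", x)                               # drop the optional "D:" prefix
  stamp <- regmatches(x, regexpr("^[0-9]{4,14}", x))   # leading run of date-time digits
  if (length(stamp) == 0L) {
    return(.POSIXct(NA_real_, tz = "UTC"))             # not a recognisable PDF date
  }
  rest <- substring(x, nchar(stamp) + 1L)              # whatever follows: "", "Z", or an offset

  # Omitted trailing fields default to month 01, day 01, time 00:00:00.
  defaults <- "00000101000000"
  stamp <- paste0(stamp, substring(defaults, nchar(stamp) + 1L))
  local <- as.POSIXct(stamp, format = "%Y%m%d%H%M%S", tz = "UTC")

  sign_chr <- substr(rest, 1L, 1L)
  if (!sign_chr %in% c("+", "-")) {
    return(local)                                      # "Z" or no offset: treat as UTC
  }
  off <- gsub("[^0-9]", "", rest)                      # "+02'00'" becomes "0200"
  hh <- as.integer(substr(off, 1L, 2L))
  mm <- if (nchar(off) >= 4L) as.integer(substr(off, 3L, 4L)) else 0L
  offset_secs <- (hh * 3600L + mm * 60L) * (if (sign_chr == "+") 1L else -1L)
  local - offset_secs                                  # local time minus offset gives UTC
}

parse_pdf_date_sketch("D:20240131123456+02'00'")
```

The sketch normalises everything to UTC, which is one reasonable choice; the package may preserve or report the original offset differently.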

Attachments

pdf_attachments()
List the files attached to a PDF document
as_pdfium_attachment_list()
Coerce input to a pdfium_attachment_list
as_tibble(<pdfium_attachment_list>)
Tibble view of a pdfium_attachment_list
summary(<pdfium_attachment_list>)
Tibble-shaped summary of an attachment list
pdf_attachment_name()
Attachment file name
pdf_attachment_mime_type()
Attachment MIME / subtype
pdf_attachment_size_bytes()
Attachment decompressed size in bytes
pdf_attachment_data()
Read the raw bytes of an embedded file attachment
pdf_attachment_dict_value()
Look up an attachment-dictionary entry by key

Signatures

pdf_signatures()
List the digital signatures attached to a PDF document
as_pdfium_signature_list()
Coerce input to a pdfium_signature_list
as_tibble(<pdfium_signature_list>)
Tibble view of a pdfium_signature_list
summary(<pdfium_signature_list>)
Tibble-shaped summary of a signature list
pdf_signature_sub_filter()
Signature /SubFilter value
pdf_signature_reason()
Signature reason / comment text
pdf_signature_time()
Signing time (raw PDF date string)
pdf_signature_doc_mdp_permission()
Signature DocMDP permission level
pdf_signature_contents()
Read the raw bytes of a PDF signature's contents blob
pdf_signature_byte_range()
Read the signed byte ranges of a PDF signature
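
A signature's /ByteRange is an array of (offset, length) pairs that together cover the file bytes the signature digest was computed over. The base-R sketch below shows how such pairs select bytes from a raw vector. It assumes the ranges arrive as a flat numeric vector of 0-based offsets and lengths, which may not match the shape pdf_signature_byte_range() actually returns; `signed_bytes_sketch()` is a hypothetical helper.

```r
# Hypothetical helper (not part of the package): collect the bytes covered
# by a /ByteRange array given as c(offset1, length1, offset2, length2, ...).
signed_bytes_sketch <- function(file_bytes, byte_range) {
  stopifnot(is.raw(file_bytes), length(byte_range) %% 2L == 0L)
  pairs <- matrix(byte_range, ncol = 2L, byrow = TRUE)  # columns: offset, length
  chunks <- lapply(seq_len(nrow(pairs)), function(i) {
    offset <- pairs[i, 1L]             # 0-based, as stored in the PDF
    len <- pairs[i, 2L]
    file_bytes[offset + seq_len(len)]  # shift to R's 1-based indexing
  })
  do.call(c, chunks)
}

# Toy "file" of 20 bytes; the gap (bytes 5..14) stands in for the
# signature contents blob, which is excluded from the digest.
toy <- as.raw(0:19)
signed_bytes_sketch(toy, c(0, 5, 15, 5))
```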

Bookmarks

as_pdfium_bookmark_list()
Coerce input to a pdfium_bookmark_list
as_tibble(<pdfium_bookmark_list>)
Tibble view of a pdfium_bookmark_list
summary(<pdfium_bookmark_list>)
Tibble-shaped summary of a bookmark list
pdf_bookmark_title()
Bookmark display title
pdf_bookmark_page_num()
Bookmark destination page number
pdf_bookmark_action_type()
Bookmark action type
pdf_bookmark_uri()
Bookmark URI (for URI actions)
pdf_bookmark_filepath()
Bookmark external file path
pdf_bookmark_dest_view()
Bookmark destination view mode
pdf_bookmark_dest_x()
Bookmark destination x coordinate
pdf_bookmark_dest_y()
Bookmark destination y coordinate
pdf_bookmark_dest_zoom()
Bookmark destination zoom factor

Pages

pdf_page_load()
Load a single page from an open PDF document
pdf_page_close()
Close a page handle
pdf_page_size()
Page dimensions in PDF points
pdf_page_rotation()
Page rotation in degrees
pdf_page_box()
Read a page's bounding box
pdf_pages_summary()
One-call summary of every page in a document
summary(<pdfium_page>)
Page-level summary
pdf_page_links()
List the clickable links on a page
pdf_link_at_point()
Hit-test for the link annotation under a point
pdf_link_annot_at_point()
Hit-test for a link annotation, returning the annotation handle
pdf_form_field_at_point()
Form-field hit-test for a point
pdf_page_actions()
Page additional actions (open / close handlers)
pdf_page_thumbnail()
Page embedded thumbnail
pdf_text_weblinks()
Auto-detected web links in a page's text

Annotations and form fields

pdf_annotations()
List the annotations on a PDF page
pdf_annot_at()
Construct a pdfium_annot handle for one annotation
as_pdfium_annot_list()
Coerce input to a pdfium_annot_list
as_tibble(<pdfium_annot_list>)
Tibble view of a pdfium_annot_list
summary(<pdfium_annot_list>)
Tibble-shaped summary of an annotation list
pdf_annot_subtype()
Annotation subtype (string)
pdf_annot_subtype_code()
Annotation subtype code (integer enum)
pdf_annot_flags()
Annotation flag bitmask
pdf_annot_flags_decoded()
Annotation flags decoded as named logicals
pdf_annot_bounds()
Annotation bounding rectangle
pdf_annot_contents()
Annotation /Contents text
pdf_annot_title()
Annotation /T title (author) text
pdf_annot_subject()
Annotation /Subj subject text
pdf_annot_color()
Annotation /C colour (RGBA, 0..1)
pdf_annot_interior_color()
Annotation /IC interior colour (RGBA, 0..1)
pdf_annot_border_width()
Annotation border width
pdf_annot_font_size()
Annotation font size (FreeText / Widget subtypes)
pdf_annot_font_color()
Annotation font colour (RGB, 0..1)
pdf_annot_dict_value()
Read an annotation-dict entry by key
pdf_annot_appearance()
Appearance-stream string for an annotation
pdf_annot_quad_points()
Annotation quad points (attachment points)
pdf_annot_vertices()
Annotation vertices (polygon / polyline)
pdf_annot_ink_paths()
Annotation ink paths (ink strokes)
pdf_annot_popup()
Annotation popup (/Popup linked annot)
pdf_annot_in_reply_to()
Annotation reply-to (/IRT linked annot)
pdf_annot_file_attachment_name()
Name of the file attached to a file-attachment annotation
pdf_form_fields()
Enumerate AcroForm fields across the whole document
as_pdfium_form_field_list()
Coerce input to a pdfium_form_field_list
as_tibble(<pdfium_form_field_list>)
Tibble view of a pdfium_form_field_list
summary(<pdfium_form_field_list>)
Tibble-shaped summary of a form-field list
pdf_form_field_type()
Form-field type (string)
pdf_form_field_type_code()
Form-field type code (integer enum)
pdf_form_field_page_num()
Form-field page number
pdf_form_field_name()
Form-field name (/T)
pdf_form_field_alternate_name()
Form-field alternate (tooltip) name (/TU)
pdf_form_field_value()
Form-field current value (/V)
pdf_form_field_export_value()
Form-field export value
pdf_form_field_flags()
Form-field flag bitmask (/Ff)
pdf_form_field_flags_decoded()
Form-field universal flag bits, decoded
pdf_form_field_is_checked()
Form-field checked state
pdf_form_field_control_count()
Number of controls in this radio group (or NA)
pdf_form_field_control_index()
1-based index of this control within its radio group
pdf_form_field_options()
Form-field option labels (combobox / listbox)
pdf_form_field_is_option_selected()
Form-field option selected-state (combobox / listbox)
pdf_form_field_additional_actions_js()
Form-field JavaScript additional-action sources
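
The flag readers above come in raw-bitmask and decoded forms. As an illustration of what decoding involves, the base-R sketch below expands an annotation /F bitmask into named logicals using the bit positions defined by the PDF specification. The element names and the helper `decode_flags_sketch()` are assumptions for illustration; the names pdf_annot_flags_decoded() actually returns may differ.

```r
# Annotation flag bits from the PDF specification (bit 1 is the lowest bit).
# The snake_case names are illustrative, not the package's.
annot_flag_bits <- c(
  invisible = 1L, hidden = 2L, print = 4L, no_zoom = 8L, no_rotate = 16L,
  no_view = 32L, read_only = 64L, locked = 128L, toggle_no_view = 256L,
  locked_contents = 512L
)

# Hypothetical helper: test each bit of an integer bitmask.
decode_flags_sketch <- function(flags, bits = annot_flag_bits) {
  vapply(bits, function(b) bitwAnd(flags, b) != 0L, logical(1))
}

# 68 = 4 (print) + 64 (read_only)
decode_flags_sketch(68L)
```

The same pattern would apply to form-field /Ff masks with a different bit table (for example ReadOnly = 1, Required = 2, NoExport = 4).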

Page objects

pdf_page_objects()
Enumerate the objects on a page
as_pdfium_obj_list()
Coerce input to a pdfium_obj_list
as_tibble(<pdfium_obj_list>)
Tibble view of a pdfium_obj_list
summary(<pdfium_obj_list>)
Tibble-shaped summary of a page-object list
pdf_obj_type()
Report the type of a page object
pdf_obj_bounds()
Axis-aligned bounding box of a page object
pdf_obj_rotated_bounds()
Rotated bounding quadpoints of a page object
pdf_obj_matrix()
Transformation matrix of a page object
pdf_obj_has_transparency()
Does a page object use alpha blending?
pdf_obj_is_active()
Active flag of a page object
pdf_obj_marks()
Content marks attached to a page object
pdf_obj_marked_content_id()
Direct marked-content ID for a page object

Paths

pdf_path_segments()
Path segments of a path page-object
pdf_path_stroke()
Stroke style of a path page-object
pdf_path_fill()
Fill color of a path page-object
pdf_path_dash()
Dash pattern of a path page-object
pdf_path_line_cap()
Stroke line-cap style of a path page-object
pdf_path_line_join()
Stroke line-join style of a path page-object
pdf_path_draw_mode()
Path draw mode (fill rule + stroke flag)

Text

pdf_text_font_size()
Font size of a text page-object
pdf_text_content()
Text content of a text page-object
pdf_text_runs()
Extract every text run on a page
pdf_text_font()
Font metadata of a text page-object
pdf_text_font_metrics()
Font ascent and descent for a text page-object's font
pdf_text_chars()
Per-character text extraction
pdf_text_colors()
Per-character fill and stroke colors and text-index mapping
pdf_text_render_mode()
Text-rendering mode of a text page-object
pdf_text_search()
Find every occurrence of a query string in a PDF
pdf_text_char_at_point()
Locate the character index nearest a (x, y) point on a page
pdf_text_index_from_char() pdf_text_char_from_text_index()
Map between PDFium's "all characters" and "extractable text" indices
pdf_text_char_obj_index()
Reverse-map a character index to its page-object index
pdf_text_obj_rendered_bitmap()
Rendered bitmap of a single text page-object
pdf_glyph_path()
Glyph outline for a single glyph in a text page-object's font
pdf_glyph_width()
Width of a glyph in a text page-object's font

Rendering

pdf_render_page()
Render a PDF page to a bitmap
pdf_render_page_with_matrix()
Render a PDF page with an arbitrary affine transformation
pdf_render_to_png()
Render a PDF page directly to a PNG file
plot(<pdfium_bitmap>)
Plot a pdfium_bitmap
as.raster(<pdfium_bitmap>)
Convert a pdfium_bitmap to base R's "raster" (character hex)
as.array(<pdfium_bitmap>)
Convert a pdfium_bitmap to a 3D RGBA array of doubles in 0..1
as.matrix(<pdfium_bitmap>)
Convert a pdfium_bitmap to a hex-color matrix
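
Page sizes are reported in PDF points (1/72 inch), while rendered bitmaps are measured in pixels. The base-R sketch below shows the arithmetic for choosing a bitmap size from a target resolution. Whether the rendering functions take a DPI, a scale factor, or explicit pixel dimensions is not stated in this index, so `points_to_pixels_sketch()` is a hypothetical helper.

```r
# Hypothetical helper: pixel dimensions for a page rendered at a given DPI.
points_to_pixels_sketch <- function(width_pt, height_pt, dpi = 72) {
  # 72 points per inch, so pixels = points * dpi / 72.
  c(width = round(width_pt * dpi / 72),
    height = round(height_pt * dpi / 72))
}

# US Letter (612 x 792 pt) at 144 DPI doubles each dimension.
points_to_pixels_sketch(612, 792, dpi = 144)
```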

Images

pdf_image_info()
Inspect metadata for an embedded image
pdf_image_size()
Pixel size of an embedded image
pdf_image_bitmap()
Decoded image bitmap
pdf_image_rendered()
Rendered image bitmap (page CTM applied)
pdf_image_data()
Raw bytes of an embedded image stream
pdf_image_filters()
Filter chain for an embedded image stream
pdf_image_icc_profile()
Decoded ICC color profile bytes for an embedded image

Form XObjects

pdf_form_objects()
List the page objects nested inside a Form XObject

Clip paths

pdf_obj_clip_path()
Get the clip path attached to a page object
pdf_clip_path_count()
Count sub-paths in a clip path
pdf_clip_path_segments()
Read all segments of a clip path as a tibble

Structure tree (tagged PDF / accessibility)

pdf_structure_tree()
Read the tagged-PDF structure tree for a page

One-call extraction

pdf_extract_paths()
Extract all path geometry on a page into a single tibble

Document creation and serialisation

pdf_doc_new()
Create a new, empty PDF document
pdf_save()
Save a PDF document to disk
pdf_save_to_raw()
Save a PDF document to a raw vector

Structural mutation

Open a document with readwrite = TRUE (or build one with pdf_doc_new()) to enable these. See ADRs 011-018 for the writer-surface conventions.

pdf_page_new()
Add a new blank page
pdf_page_delete()
Delete a page from the document
pdf_pages_reorder()
Reorder pages
pdf_docs_merge()
Merge documents into a new PDF
pdf_n_up()
Combine N pages of a document into one
pdf_page_set_rotation()
Set a page's rotation
pdf_page_set_box()
Set one of a page's named bounding boxes
pdf_doc_set_language()
Set the document's declared language
pdf_page_flush()
Force-flush a page's pending content edits

Page-object styling

Setters for page-object attributes. Each takes a pdfium_obj handle from pdf_page_objects() (parent doc must be readwrite) and marks the parent page dirty so pdf_save() / pdf_render_*() see the change.

pdf_obj_set_matrix()
Set the affine transformation matrix of a page object
pdf_obj_set_active()
Set whether a page object renders
pdf_obj_set_blend_mode()
Set the blend mode of a page object
pdf_path_set_stroke()
Set the stroke style of a path page object
pdf_path_set_fill()
Set the fill color of a path page object
pdf_path_set_line_cap()
Set the line cap style of a path stroke
pdf_path_set_line_join()
Set the line join style of a path stroke
pdf_path_set_dash()
Set the dash array + phase of a path stroke
pdf_path_set_draw_mode()
Set the draw mode of a path page object
pdf_text_set_content()
Replace the text content of a text page object
pdf_text_set_render_mode()
Set the render mode of a text page object
pdf_obj_add_mark()
Add a content mark to a page object
pdf_obj_remove_mark()
Remove a content mark from a page object
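
When working with object matrices it helps to recall how a PDF affine matrix acts on a point. PDF matrices are written as six numbers (a, b, c, d, e, f) and map (x, y) to (a·x + c·y + e, b·x + d·y + f). The base-R sketch below applies that mapping. The form in which pdf_obj_set_matrix() expects its matrix is not shown here, so the six-number vector and the helper `apply_pdf_matrix_sketch()` are assumptions.

```r
# Hypothetical helper: apply a PDF matrix c(a, b, c, d, e, f) to points.
apply_pdf_matrix_sketch <- function(m, x, y) {
  stopifnot(length(m) == 6L)
  m <- unname(m)
  cbind(
    x = m[1] * x + m[3] * y + m[5],
    y = m[2] * x + m[4] * y + m[6]
  )
}

translate <- c(1, 0, 0, 1, 10, 20)   # shift by (10, 20)
rotate_90 <- c(0, 1, -1, 0, 0, 0)    # counter-clockwise quarter turn

apply_pdf_matrix_sketch(translate, 1, 2)
apply_pdf_matrix_sketch(rotate_90, 1, 0)
```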

Path geometry

Appenders for path page-objects. PDFium’s public API is append-only — there is no segment-removal or -replacement symbol. Compose with pdf_path_new() and pdf_obj_delete() below for the full read → edit → write workflow.

pdf_path_move_to()
Append a MoveTo command to a path object
pdf_path_line_to()
Append a LineTo command to a path object
pdf_path_bezier_to()
Append a cubic Bezier curve to a path object
pdf_path_close()
Close the current subpath of a path object
pdf_path_append()
Append a sequence of path segments in one call

Page-object creation

Create fresh page-objects (paths, rectangles, text, JPEG images) on a page that’s been opened with readwrite = TRUE or built via pdf_doc_new(). Use pdf_obj_delete() for the inverse: it removes and destroys a page-object. PNG / TIFF / raw-bitmap embedding stays deferred to a later release pending FPDF_BITMAP plumbing.

pdf_path_new()
Create a new path page-object on a page
pdf_rect_new()
Create a closed rectangle path on a page
pdf_text_new()
Create a new text page-object on a page
pdf_image_new()
Create a new image page-object from JPEG bytes
pdf_obj_delete()
Remove a page object and destroy it

Font loading

Load a font for use in pdf_text_new(). The 14 PDF standard fonts need no embedding; arbitrary TrueType / Type1 fonts get their bytes copied into the document via FPDFText_LoadFont. pdf_font_close() is idempotent and matches the explicit-release pattern of the other handle classes.

pdf_font_load_standard()
Load one of the 14 PDF standard fonts
pdf_font_load()
Load a TrueType or Type1 font from bytes
pdf_font_close()
Close a font handle

Annotation authoring

Create / delete annotations and mutate their properties. Mirrors the pdf_annot_* readers; each setter takes a pdfium_annot whose parent doc is readwrite. PDFium supports creating these subtypes: circle, fileattachment, freetext, highlight, ink, link, popup, square, squiggly, stamp, strikeout, text, underline.

pdf_annot_new()
Create a new annotation on a page
pdf_annot_delete()
Remove an annotation and invalidate the handle
pdf_annot_set_bounds()
Set the bounding rectangle of an annotation
pdf_annot_set_color()
Set the stroke / line color of an annotation
pdf_annot_set_interior_color()
Set the interior / fill color of an annotation
pdf_annot_set_flags()
Set the flags bitmask of an annotation
pdf_annot_set_contents()
Set the /Contents text of an annotation
pdf_annot_set_title()
Set the /T (title / author) of an annotation
pdf_annot_set_subject()
Set the /Subj (subject) of an annotation
pdf_annot_set_dict_value()
Set an arbitrary string-valued entry on an annotation dict
pdf_annot_append_quad()
Append a quad to an annotation's /QuadPoints array

Form filling

Write /V (the field value) on AcroForm widget annotations and flatten the page when the form-fill workflow finishes. pdf_form_field_set_value() dispatches by the field’s type: character for text / choice fields, logical-or-character for checkable fields. pdf_page_flatten() bakes both form widgets and annotations into the page’s content stream — irreversible and intended as the final step before saving a non-editable copy.

pdf_form_field_set_value()
Set the value of a form field
pdf_form_field_clear()
Clear a form field to its default value
pdf_form_reset()
Reset every form field in the document to its default value
pdf_page_flatten()
Flatten form fields and annotations into the page content stream

Attachment authoring

Add, delete, and mutate the document’s embedded-file attachments. The natural sequence for a fresh attachment is pdf_attachment_new() → pdf_attachment_set_data() (to populate the file bytes and materialise the /Params subdict) → pdf_attachment_set_dict_value() (for any extra dictionary metadata).

pdf_attachment_new()
Add a new embedded file attachment to a document
pdf_attachment_delete()
Delete an embedded file attachment from a document
pdf_attachment_set_dict_value()
Set an entry in an attachment's /Params dictionary
pdf_attachment_set_data()
Set the raw bytes of an embedded file attachment

API-completion additions

The v0.1.0 “complete the relevant PDFium surface” pass picks up the last batch of single-call wrappers that pair with the existing readers + setters. Grouped by topic below; all live in R/api_completion.R.

pdf_doc_form_type()
Form-type flavour of the document
pdf_bookmark_child_count()
Number of children for a bookmark
pdf_page_has_transparency()
Does the page contain transparency?
pdf_page_bounding_box()
Page bounding box (cropbox ∩ mediabox)
pdf_page_transform_annots()
Transform every annotation on a page in one shot
pdf_annot_index()
Find an annotation's page-relative index by handle
pdf_device_to_page()
Convert device (screen) coordinates to PDF page coordinates
pdf_page_to_device()
Convert PDF page coordinates to device (screen) coordinates
pdf_text_rects()
Rectangles occupied by a character range
pdf_text_bounded()
Extract text inside a bounding rectangle
pdf_text_char_geometry()
Per-character geometry: transformation matrix, rotation angle, font weight
pdf_path_set_dash_phase()
Set just the dash phase of a path object
pdf_obj_mark_set_blob()
Set a binary-blob content-mark parameter
pdf_obj_mark_remove_param()
Remove a content-mark parameter
pdf_font_data()
Extract the bytes of an embedded font
pdf_font_load_cidtype2()
Load a CID Type 2 (composite TrueType) font with explicit mappings
pdf_text_set_charcodes()
Populate a text object with explicit glyph charcodes
pdf_annot_add_ink_stroke()
Append an ink stroke to an ink annotation
pdf_annot_remove_ink_list()
Remove all ink strokes from an ink annotation
pdf_annot_object_count()
Number of embedded page-objects inside an annotation
pdf_annot_objects()
Page-objects embedded inside an annotation
pdf_annot_append_object()
Append a page-object to an annotation
pdf_annot_remove_object()
Remove a page-object from an annotation
pdf_annot_update_object()
Update an embedded page-object after mutating it
pdf_annot_set_uri()
Set the URI of a link annotation
pdf_annot_set_appearance()
Set the appearance stream content for an annotation
pdf_annot_add_file_attachment()
Attach a file to a file-attachment annotation
pdf_annot_line()
Line endpoints of a line annotation
pdf_annot_link()
Link metadata for a link annotation
pdf_annot_set_border()
Set the border of an annotation
pdf_clip_path_new()
Create a clip path covering a rectangle
pdf_clip_path_close()
Release a clip-path handle
pdf_page_insert_clip_path()
Insert a clip path into a page
pdf_obj_transform_clip_path()
Transform the clip path of a page object
pdf_page_transform_with_clip()
Apply a transform to a page's content stream with an optional clip
pdf_xobject_from_page()
Create an XObject (reusable form) from a source-doc page
pdf_xobject_close()
Close an XObject handle
pdf_obj_form_from_xobject()
Instantiate an XObject as a form page-object on a page
pdf_form_obj_remove_object()
Remove a child page-object from a form-xobject
pdf_docs_import_pages()
Import page ranges from a source doc into a destination doc
pdf_docs_copy_viewer_preferences()
Copy /ViewerPreferences from one document to another
pdf_bitmap_new()
Create a fresh in-memory bitmap
pdf_bitmap_close()
Release a bitmap handle
pdf_bitmap_info()
Bitmap dimensions and format
pdf_bitmap_fill_rect()
Fill a rectangle of the bitmap with a solid color
pdf_bitmap_buffer() pdf_bitmap_set_buffer()
Read or write the bitmap's raw pixel bytes
pdf_image_set_bitmap()
Set a bitmap on an image page-object
pdf_image_new_from_bitmap()
Embed a bitmap as an image page-object
pdf_image_extract()
Extract an embedded image to a file, picking a sensible format
pdf_bitmap_to_page()
Convert bitmap pixel coordinates to PDF page-space points
pdf_bitmap_from_page()
Convert PDF page-space points to bitmap pixel coordinates
pdf_text_obj_at_char()
Look up the text page-object owning a given char by direct accessor
pdf_system_fonts_default_ttf_map()
PDFium's default charset → TTF substitution map
pdf_system_fonts_install_default()
Install PDFium's default system-font-info provider
pdf_annot_set_font_color()
Set the font color of an annotation
pdf_form_field_set_flags()
Set the form-field flag bitmask on a form-field widget
pdf_doc_set_focusable_subtypes()
Set the doc-wide list of annotation subtypes that participate in tab focus
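
The coordinate converters in this group translate between PDF page space (origin at the bottom-left, y pointing up) and device or bitmap space (origin at the top-left, y pointing down). The base-R sketch below shows the arithmetic for the simplest case: an unrotated page whose box starts at (0, 0), drawn into a device rectangle. Both helper names and all parameter names are hypothetical, and the package's functions may round to whole pixels or handle rotation, which this sketch does not.

```r
# Hypothetical helpers for an unrotated page with its origin at (0, 0).
# page_w / page_h are in points; size_x / size_y are the device rectangle
# in pixels; start_x / start_y are the rectangle's top-left corner.
page_to_device_sketch <- function(x, y, page_w, page_h, size_x, size_y,
                                  start_x = 0, start_y = 0) {
  list(
    x = start_x + x * size_x / page_w,
    y = start_y + (page_h - y) * size_y / page_h   # flip the y axis
  )
}

device_to_page_sketch <- function(x, y, page_w, page_h, size_x, size_y,
                                  start_x = 0, start_y = 0) {
  list(
    x = (x - start_x) * page_w / size_x,
    y = page_h - (y - start_y) * page_h / size_y   # flip back
  )
}

# US Letter page drawn into a 1224 x 1584 pixel bitmap.
dev <- page_to_device_sketch(100, 200, 612, 792, 1224, 1584)
device_to_page_sketch(dev$x, dev$y, 612, 792, 1224, 1584)
```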

Enum code <-> name helpers

Bidirectional converters between PDFium’s integer enum codes and their short string names. Paired _name() and _code() functions for each enum: annotation subtype, page-object type, path-segment type, form-field type, page / link action type, and named-destination view mode. Useful when filtering a tibble by code, passing programmatic input into a setter, or round-tripping codes through a CSV that’s lost the names.

pdfium_annot_subtype_name() pdfium_annot_subtype_code()
PDF annotation subtype codes <-> names
pdfium_obj_type_name() pdfium_obj_type_code()
PDF page-object type codes <-> names
pdfium_segment_type_name() pdfium_segment_type_code()
Path-segment type codes <-> names
pdfium_form_field_type_name() pdfium_form_field_type_code()
Form-field type codes <-> names
pdfium_action_type_name() pdfium_action_type_code()
Link / page action type codes <-> names
pdfium_dest_view_name() pdfium_dest_view_code()
Named-destination view-mode codes <-> names
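
To illustrate the kind of round trip these helpers perform, here is a base-R sketch of a bidirectional lookup built on a named integer vector. The numeric codes follow PDFium's FPDF_PAGEOBJ_* constants; the short string names and both `*_sketch()` functions are assumptions, and the package's own names and vectorisation behaviour may differ.

```r
# Page-object type codes as defined by PDFium's FPDF_PAGEOBJ_* constants.
# The lower-case names are illustrative.
obj_type_codes <- c(unknown = 0L, text = 1L, path = 2L,
                    image = 3L, shading = 4L, form = 5L)

# Hypothetical helper: name to code (NA for unrecognised names).
obj_type_code_sketch <- function(name) {
  unname(obj_type_codes[name])
}

# Hypothetical helper: code to name (NA for unrecognised codes).
obj_type_name_sketch <- function(code) {
  names(obj_type_codes)[match(code, obj_type_codes)]
}

# Filter a small table by code, then recover the names.
objs <- data.frame(index = 1:4, type_code = c(1L, 2L, 2L, 3L))
paths <- objs[objs$type_code == obj_type_code_sketch("path"), ]
obj_type_name_sketch(paths$type_code)
```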