pdf

NAME

pdf — PDF creation, manipulation, and text extraction in SLICC

DESCRIPTION

SLICC provides a PDF toolkit that runs entirely in-browser via WebAssembly. Agents can read, create, merge, split, rotate, burst, and extract text from PDF files without Python or native binaries. The primary commands are pdftk (manipulation and extraction) and create-pdf (generation from scratch). Page-to-image conversion is available via convert, on tray runtimes only.

PDFTK

pdftk is the main PDF toolkit. It uses positional syntax — input file(s) come before the operation.

Synopsis

pdftk <input.pdf> <operation> [options]
pdftk A=first.pdf B=second.pdf cat A B output merged.pdf

Page Ranges

Page ranges are 1-based. The keyword end refers to the last page.

3 — single page
1-5 — range of pages
3-end — from page 3 to the last page
1-endright — pages 1 to end, rotated 90° clockwise
3left — page 3 rotated 90° counterclockwise (270° clockwise)
1-5down — pages 1–5 rotated 180°
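
The range grammar above can be sketched as a small parser. This is only an illustration of how the examples read, not SLICC's actual implementation: parsePageRange is a hypothetical helper, and rejecting descending ranges is a simplification chosen for the sketch.

```javascript
// Hypothetical helper (not part of SLICC): parse one page range such as
// "3", "1-5", "3-end", or "1-endright" into page numbers plus a rotation.
// `pageCount` is the number of pages in the source PDF.
function parsePageRange(spec, pageCount) {
  var match = /^(\d+|end)(?:-(\d+|end))?(right|left|down)?$/.exec(spec);
  if (!match) throw new Error('Unrecognized page range: ' + spec);

  // The keyword "end" resolves to the last page.
  function resolve(token) {
    return token === 'end' ? pageCount : Number(token);
  }

  var first = resolve(match[1]);
  var last = match[2] === undefined ? first : resolve(match[2]);
  // Simplification for this sketch: descending ranges are rejected.
  if (first < 1 || last > pageCount || first > last) {
    throw new Error('Page range out of bounds: ' + spec);
  }

  // Rotation expressed in clockwise degrees.
  var degrees = { right: 90, down: 180, left: 270 };
  var rotation = match[3] ? degrees[match[3]] : 0;

  var pages = [];
  for (var p = first; p <= last; p++) pages.push(p);
  return { pages: pages, rotation: rotation };
}

parsePageRange('3-end', 5);      // { pages: [3, 4, 5], rotation: 0 }
parsePageRange('1-endright', 3); // { pages: [1, 2, 3], rotation: 90 }
```

Expressing left as 270° clockwise keeps all three directions on one scale, which matches the 90° counterclockwise reading used under Rotate Pages.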

Extract Text

Dump the UTF-8 text content of every page in a PDF.

pdftk /mnt/file.pdf dump_data_utf8

Returns the full text content per page, suitable for piping into further analysis.

Extract Metadata

Print metadata including page count, title, author, and other info fields.

pdftk /mnt/file.pdf dump_data

Output includes NumberOfPages and InfoKey/InfoValue pairs.
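
To use the metadata in a script, the output can be parsed into an object. The sketch below assumes the output follows the classic pdftk layout of one "Field: value" pair per line, with each InfoKey line followed by its InfoValue line. That layout is an assumption about SLICC's output, and parseDumpData is a hypothetical helper.

```javascript
// Hypothetical helper: turn dump_data output into { pageCount, info }.
// Assumes one "Field: value" pair per line (classic pdftk layout).
function parseDumpData(text) {
  var result = { pageCount: null, info: {} };
  var pendingKey = null;

  text.split(/\r?\n/).forEach(function (line) {
    var sep = line.indexOf(': ');
    if (sep === -1) return; // skip lines without a value, e.g. section markers

    var field = line.slice(0, sep);
    var value = line.slice(sep + 2);

    if (field === 'NumberOfPages') {
      result.pageCount = Number(value);
    } else if (field === 'InfoKey') {
      pendingKey = value; // remember the key until its value arrives
    } else if (field === 'InfoValue' && pendingKey !== null) {
      result.info[pendingKey] = value;
      pendingKey = null;
    }
  });

  return result;
}

// Illustrative sample input, not captured from a real run.
var sample = 'InfoKey: Title\nInfoValue: Q1 Report\nNumberOfPages: 3';
parseDumpData(sample); // { pageCount: 3, info: { Title: 'Q1 Report' } }
```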

Merge PDFs

Combine multiple PDFs into a single file by assigning handle labels.

pdftk A=/mnt/a.pdf B=/mnt/b.pdf C=/mnt/c.pdf cat A B C output /shared/merged.pdf

Handles (A, B, C…) allow you to interleave pages from different sources:

pdftk A=/mnt/report.pdf B=/mnt/appendix.pdf cat A1-5 B A6-end output /shared/combined.pdf
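
To make the resulting page order explicit, here is a rough sketch of how cat arguments with handles expand. resolveCat is a hypothetical helper, the page counts are made up for illustration, and rotation suffixes are left out to keep it short.

```javascript
// Hypothetical helper: expand cat arguments such as ["A1-5", "B", "A6-end"]
// into an ordered list of "handle:page" entries.
// `pageCounts` maps each handle to the page count of its PDF.
function resolveCat(args, pageCounts) {
  var out = [];
  args.forEach(function (arg) {
    var m = /^([A-Z]+)(?:(\d+|end)(?:-(\d+|end))?)?$/.exec(arg);
    if (!m || !(m[1] in pageCounts)) throw new Error('Bad cat argument: ' + arg);

    var handle = m[1];
    var count = pageCounts[handle];
    function resolve(token) {
      return token === 'end' ? count : Number(token);
    }

    // A bare handle means the whole document; a single number means one page.
    var first = m[2] === undefined ? 1 : resolve(m[2]);
    var last = m[3] !== undefined ? resolve(m[3])
      : (m[2] === undefined ? count : first);

    for (var p = first; p <= last; p++) out.push(handle + ':' + p);
  });
  return out;
}

// Assuming report.pdf has 8 pages and appendix.pdf has 2:
resolveCat(['A1-5', 'B', 'A6-end'], { A: 8, B: 2 });
// ['A:1','A:2','A:3','A:4','A:5','B:1','B:2','A:6','A:7','A:8']
```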

Split — Extract Page Ranges

Extract specific pages or ranges from a PDF.

# Extract pages 2–5
pdftk /mnt/file.pdf cat 2-5 output /shared/pages2to5.pdf

# Extract a single page
pdftk /mnt/file.pdf cat 3 output /shared/page3.pdf

# Extract from page 4 to the end
pdftk /mnt/file.pdf cat 4-end output /shared/from4.pdf

Split — Burst Into Individual Pages

Split every page into its own file. Use %02d or %03d for zero-padded numbering.

pdftk /mnt/file.pdf burst output /shared/page_%02d.pdf

Creates /shared/page_01.pdf, /shared/page_02.pdf, etc.
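
When a script needs to predict the burst filenames (for example, to open a specific page afterwards), the pattern can be expanded as follows. burstName is a hypothetical helper and handles only the %d, %02d, and %03d style of placeholder mentioned above.

```javascript
// Hypothetical helper: expand a printf-style page placeholder in a burst
// output pattern. Supports "%d" and zero-padded forms such as "%02d".
function burstName(pattern, pageNumber) {
  return pattern.replace(/%(?:0(\d+))?d/, function (_, width) {
    var digits = String(pageNumber);
    // Pad with leading zeros only when the pattern asks for a width.
    return width ? digits.padStart(Number(width), '0') : digits;
  });
}

burstName('/shared/page_%02d.pdf', 1);  // '/shared/page_01.pdf'
burstName('/shared/page_%03d.pdf', 12); // '/shared/page_012.pdf'
```

Note that %02d pads to two digits only, so a document with 100 or more pages is better served by %03d if the files should sort correctly by name.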

Rotate Pages

Rotate pages by 90°, 180°, or 270°. Directions: right (90° CW), left (90° CCW), down (180°).

# Rotate all pages 90° clockwise
pdftk /mnt/file.pdf rotate 1-end right output /shared/rotated.pdf

# Rotate only page 3
pdftk /mnt/file.pdf rotate 3 right output /shared/rotated-page3.pdf

# Rotate pages 1–5 upside-down
pdftk /mnt/file.pdf rotate 1-5 down output /shared/flipped.pdf

CREATE-PDF

Generate a new PDF document from scratch using pdf-lib. Produces a formatted US Letter (612×792pt) document with headers, footers, and wrapped text.

Synopsis

create-pdf [title] [--output=/shared/out.pdf]

Called with no arguments, create-pdf produces a sample document. The title argument sets the document title and, in slugified form, the output filename.

Output Format

Centered gray header with document title and hairline rule on every page
24pt bold Helvetica titles, 13pt Helvetica body text
72pt margins on all sides
"Page N of M" footer centered at the bottom of every page
Automatic word-wrapping to fit content within margins
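
The layout numbers above imply a text column 612 − 2 × 72 = 468pt wide. A greedy word-wrap that fits text into such a column might look like the sketch below. This is not create-pdf's actual code: wrapText is a hypothetical helper, and the measure callback stands in for whatever width function is used.

```javascript
var PAGE_WIDTH = 612; // US Letter width in points
var MARGIN = 72;      // margin on each side in points
var TEXT_WIDTH = PAGE_WIDTH - 2 * MARGIN; // 468pt usable column

// Hypothetical helper: greedy word-wrap. `measure(str)` must return the
// rendered width of `str` in points.
function wrapText(text, maxWidth, measure) {
  var lines = [];
  var current = '';
  text.split(/\s+/).filter(Boolean).forEach(function (word) {
    var candidate = current ? current + ' ' + word : word;
    if (current && measure(candidate) > maxWidth) {
      lines.push(current); // line is full: start a new one with this word
      current = word;
    } else {
      current = candidate; // word still fits (or is the first on the line)
    }
  });
  if (current) lines.push(current);
  return lines;
}

// Stand-in measure for demonstration: every character is 10pt wide.
wrapText('aaaa bbbb cccc', 100, function (s) { return s.length * 10; });
// ['aaaa bbbb', 'cccc']
```

With pdf-lib, the measure callback could be function (s) { return font.widthOfTextAtSize(s, 13); } for 13pt body text. A word wider than the column is placed on its own line rather than split.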

Programmatic Use

For custom PDF generation, use pdf-lib directly in a .jsh script:

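// Load pdf-lib as an ES module
var lib = await import('https://esm.sh/pdf-lib');
var doc = await lib.PDFDocument.create();
// US Letter page, 612×792pt
var page = doc.addPage([612, 792]);
var font = await doc.embedFont(lib.StandardFonts.Helvetica);
// PDF coordinates start at the bottom-left corner, so y: 700 is near the top
page.drawText('Hello, world!', { x: 72, y: 700, size: 18, font: font });
// save() resolves to a Uint8Array containing the PDF bytes
var bytes = await doc.save();
await writeFile('/shared/hello.pdf', bytes);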

READING PDFs

Agents read PDF content by extracting text, then processing the result. The standard workflow:

# 1. Extract text from the PDF
pdftk /mnt/upload.pdf dump_data_utf8

# 2. Get page count and metadata
pdftk /mnt/upload.pdf dump_data

For visual inspection or OCR-like tasks, convert a page to an image and view it:

# Convert page 1 to PNG (convert uses 0-based page indices, so [0] is page 1)
convert /mnt/file.pdf[0] /shared/page1.png
open /shared/page1.png --view

Note: convert requires a tray runtime and is not available in browser-only mode. OCR and PDF form filling are not currently available.
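
Because pdftk counts pages from 1 while convert counts from 0, an off-by-one error is easy to make when moving between the two. A minimal sketch of the conversion, using a hypothetical helper named convertPageArg:

```javascript
// Hypothetical helper: build the "file.pdf[index]" argument for convert
// from a 1-based page number, as used by pdftk page ranges.
function convertPageArg(pdfPath, pageNumber) {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new Error('Page numbers start at 1: ' + pageNumber);
  }
  return pdfPath + '[' + (pageNumber - 1) + ']';
}

convertPageArg('/mnt/file.pdf', 1); // '/mnt/file.pdf[0]'
convertPageArg('/mnt/file.pdf', 4); // '/mnt/file.pdf[3]'
```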

EXAMPLES

Merge Monthly Reports

# Combine Q1 reports into a single document
pdftk A=/mnt/jan.pdf B=/mnt/feb.pdf C=/mnt/mar.pdf cat A B C output /shared/q1-report.pdf
open /shared/q1-report.pdf --download

Extract Text for Analysis

# Extract all text, then search for keywords
pdftk /mnt/contract.pdf dump_data_utf8 | grep -i "termination"

Split a Large Document

# Extract the executive summary (pages 1–3)
pdftk /mnt/whitepaper.pdf cat 1-3 output /shared/executive-summary.pdf

# Burst into individual pages for review
pdftk /mnt/whitepaper.pdf burst output /shared/wp_page_%02d.pdf

Fix Landscape Pages

# Rotate pages 4–6 which were scanned sideways
pdftk /mnt/scan.pdf rotate 4-6 left output /shared/fixed-scan.pdf

Create a Report and Deliver It

# Generate a titled PDF
create-pdf "Q2 Sales Summary" --output=/shared/q2-sales-summary.pdf
open /shared/q2-sales-summary.pdf --download

Combine and Reorder

# Take cover from one file, body from another, appendix from a third
pdftk A=/mnt/cover.pdf B=/mnt/body.pdf C=/mnt/appendix.pdf \
  cat A1 B C output /shared/final-document.pdf

NOTES

All output files should go to /shared/ or /mnt/ — not /tmp/.
The underlying libraries are @cantoo/pdf-lib and unpdf (for text extraction).
For PPTX-to-PDF conversion with full font embedding and layout fidelity, use the pptx2pdf skill.
There is no form-filling or digital signature support in the current environment.
