NAME
pdf — PDF creation, manipulation, and text extraction in SLICC
DESCRIPTION
SLICC provides a complete PDF toolkit that runs entirely in-browser via WebAssembly. Agents can read, create, merge, split, rotate, burst, and extract text from PDF files without Python or native binaries. The primary commands are pdftk (manipulation and extraction) and create-pdf (generation from scratch). Page-to-image conversion is available via convert on tray runtimes.
PDFTK
pdftk is the main PDF toolkit. It uses positional syntax — input file(s) come before the operation.
Synopsis
pdftk <input.pdf> <operation> [options]
pdftk A=first.pdf B=second.pdf cat A B output merged.pdf
Page Ranges
Page ranges are 1-based. The keyword end refers to the last page.
- 3: single page
- 1-5: range of pages
- 3-end: from page 3 to the last page
- 1-endright: pages 1 to end, rotated 90° clockwise
- 3left: page 3 rotated 90° counterclockwise (equivalent to 270° clockwise)
- 1-5down: pages 1–5 rotated 180°
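To make the range semantics concrete, here is a minimal sketch of how a range token maps to first and last page numbers. The resolve_range function is a hypothetical helper written for illustration only; it is not part of pdftk and says nothing about how pdftk parses ranges internally.

```shell
# Hypothetical helper (not part of pdftk): expand a range token into
# "first last" page numbers, given the document's total page count.
resolve_range() {
  token=$1
  total=$2
  # Drop an optional rotation suffix (right, left, down).
  case $token in
    *right) token=${token%right} ;;
    *left)  token=${token%left} ;;
    *down)  token=${token%down} ;;
  esac
  # Split "first-last"; a token without "-" is a single page.
  first=${token%%-*}
  last=${token#*-}
  # The keyword "end" stands for the last page.
  if [ "$first" = "end" ]; then first=$total; fi
  if [ "$last" = "end" ]; then last=$total; fi
  echo "$first $last"
}

# Examples for a 10-page document.
resolve_range 3 10
resolve_range 3-end 10
resolve_range 1-5down 10
```

The sketch only resolves page numbers; the rotation suffix is stripped and ignored.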
Extract Text
Dump the UTF-8 text content of every page in a PDF.
pdftk /mnt/file.pdf dump_data_utf8
Returns the full text content per page, suitable for piping into further analysis.
Extract Metadata
Print metadata including page count, title, author, and other info fields.
pdftk /mnt/file.pdf dump_data
Output includes NumberOfPages and InfoKey/InfoValue pairs.
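A script often needs only the page count from this output. The sketch below assumes the output uses "Key: Value" lines such as "NumberOfPages: 12", as upstream pdftk prints them; the exact format in SLICC is not confirmed here. The page_count function and the sample input are hypothetical.

```shell
# Hypothetical helper: read dump_data output on stdin and print the page count.
# Assumes a line of the form "NumberOfPages: <n>" (upstream pdftk format).
page_count() {
  awk -F': ' '$1 == "NumberOfPages" { print $2; exit }'
}

# Sample input standing in for: pdftk /mnt/file.pdf dump_data
printf '%s\n' \
  'InfoBegin' \
  'InfoKey: Title' \
  'InfoValue: Annual Report' \
  'NumberOfPages: 12' | page_count
```

In practice the function would sit at the end of a pipe that starts with the dump_data command.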
Merge PDFs
Combine multiple PDFs into a single file by assigning handle labels.
pdftk A=/mnt/a.pdf B=/mnt/b.pdf C=/mnt/c.pdf cat A B C output /shared/merged.pdf
Handles (A, B, C…) allow you to interleave pages from different sources:
pdftk A=/mnt/report.pdf B=/mnt/appendix.pdf cat A1-5 B A6-end output /shared/combined.pdf
Split — Extract Page Ranges
Extract specific pages or ranges from a PDF.
# Extract pages 2–5
pdftk /mnt/file.pdf cat 2-5 output /shared/pages2to5.pdf
# Extract a single page
pdftk /mnt/file.pdf cat 3 output /shared/page3.pdf
# Extract from page 4 to the end
pdftk /mnt/file.pdf cat 4-end output /shared/from4.pdf
Split — Burst Into Individual Pages
Split every page into its own file. Use %02d or %03d for zero-padded numbering.
pdftk /mnt/file.pdf burst output /shared/page_%02d.pdf
Creates /shared/page_01.pdf, /shared/page_02.pdf, etc.
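The zero-padding behaves like a printf format. As a rough preview of the names a pattern yields, the sketch below uses a hypothetical burst_names helper; it assumes the burst pattern is expanded the way the shell's printf expands %02d and %03d.

```shell
# Hypothetical helper: list the file names a burst pattern would produce
# for a document with the given number of pages.
burst_names() {
  pattern=$1
  pages=$2
  i=1
  while [ "$i" -le "$pages" ]; do
    # The pattern is used as the printf format on purpose.
    printf "$pattern\n" "$i"
    i=$((i + 1))
  done
}

burst_names '/shared/page_%02d.pdf' 3
```

With %02d, a document of 100 or more pages produces names such as page_100.pdf that sort before page_11.pdf, so %03d is the safer choice for long documents.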
Rotate Pages
Rotate pages by 90°, 180°, or 270°. Directions: right (90° CW), left (90° CCW), down (180°).
# Rotate all pages 90° clockwise
pdftk /mnt/file.pdf rotate 1-end right output /shared/rotated.pdf
# Rotate only page 3
pdftk /mnt/file.pdf rotate 3 right output /shared/rotated.pdf
# Rotate pages 1–5 upside-down
pdftk /mnt/file.pdf rotate 1-5 down output /shared/flipped.pdf
CREATE-PDF
Generate a new PDF document from scratch using pdf-lib. Produces a formatted US Letter (612×792pt) document with headers, footers, and wrapped text.
Synopsis
create-pdf [title] [--output=/shared/out.pdf]
If called with no arguments, creates a sample document. The title argument sets both the document title and the filename (slugified).
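The exact slug rules are not documented here. As an illustration only, the sketch below shows one common approach: lowercase the title, replace each run of non-alphanumeric characters with a hyphen, and trim hyphens from both ends. The slugify function is hypothetical and may differ from what create-pdf actually does.

```shell
# Hypothetical slugify, assuming a lowercase-and-hyphenate rule.
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

slugify "Q2 Sales Summary"
```

Passing --output explicitly avoids depending on the derived filename.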
Output Format
- Centered gray header with document title and hairline rule on every page
- 24pt bold Helvetica titles, 13pt Helvetica body text
- 72pt margins on all sides
- "Page N of M" footer centered at the bottom of every page
- Automatic word-wrapping to fit content within margins
Programmatic Use
For custom PDF generation, use pdf-lib directly in a .jsh script:
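// Minimal pdf-lib example for a .jsh script.
var lib = await import('https://esm.sh/pdf-lib');
var doc = await lib.PDFDocument.create();
// US Letter page size in points (612×792).
var page = doc.addPage([612, 792]);
// Embed a standard font; no font file is needed.
var font = await doc.embedFont(lib.StandardFonts.Helvetica);
// pdf-lib measures y from the bottom edge, so y: 700 is near the top of the page.
page.drawText('Hello, world!', { x: 72, y: 700, size: 18, font: font });
// save() resolves to a Uint8Array containing the PDF bytes.
var bytes = await doc.save();
await writeFile('/shared/hello.pdf', bytes);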
READING PDFs
Agents read PDF content by extracting text, then processing the result. The standard workflow:
# 1. Extract text from the PDF
pdftk /mnt/upload.pdf dump_data_utf8
# 2. Get page count and metadata
pdftk /mnt/upload.pdf dump_data
For visual inspection or OCR-like tasks, convert a page to an image and view it:
# Convert page 1 to PNG (0-indexed for convert)
convert /mnt/file.pdf[0] /shared/page1.png
open /shared/page1.png --view
Note: convert requires a tray runtime and is not available in browser-only mode. OCR and PDF form filling are not currently available.
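Because pdftk counts pages from 1 and convert counts from 0, off-by-one mistakes are easy to make when moving between the two. The sketch below builds the convert command from a 1-based page number; convert_page_cmd is a hypothetical helper that only prints the command and does not run it.

```shell
# Hypothetical helper: print the convert command for a 1-based page number.
convert_page_cmd() {
  pdf=$1
  page=$2                # 1-based, as used by pdftk
  out=$3
  index=$((page - 1))    # convert counts pages from 0
  printf 'convert %s[%d] %s\n' "$pdf" "$index" "$out"
}

convert_page_cmd /mnt/file.pdf 1 /shared/page1.png
```

Printing the command first makes it easy to check the index before running it on a tray runtime.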
EXAMPLES
Merge Monthly Reports
# Combine Q1 reports into a single document
pdftk A=/mnt/jan.pdf B=/mnt/feb.pdf C=/mnt/mar.pdf cat A B C output /shared/q1-report.pdf
open /shared/q1-report.pdf --download
Extract Text for Analysis
# Extract all text, then search for keywords
pdftk /mnt/contract.pdf dump_data_utf8 | grep -i "termination"
Split a Large Document
# Extract the executive summary (pages 1–3)
pdftk /mnt/whitepaper.pdf cat 1-3 output /shared/executive-summary.pdf
# Burst into individual pages for review
pdftk /mnt/whitepaper.pdf burst output /shared/wp_page_%02d.pdf
Fix Landscape Pages
# Rotate pages 4–6 which were scanned sideways
pdftk /mnt/scan.pdf rotate 4-6 left output /shared/fixed-scan.pdf
Create a Report and Deliver It
# Generate a titled PDF
create-pdf "Q2 Sales Summary" --output=/shared/q2-sales-summary.pdf
open /shared/q2-sales-summary.pdf --download
Combine and Reorder
# Take cover from one file, body from another, appendix from a third
pdftk A=/mnt/cover.pdf B=/mnt/body.pdf C=/mnt/appendix.pdf \
cat A1 B C output /shared/final-document.pdf
NOTES
- All output files should go to /shared/ or /mnt/, not /tmp/.
- The underlying libraries are @cantoo/pdf-lib and unpdf (for text extraction).
- For PPTX-to-PDF conversion with full font embedding and layout fidelity, use the pptx2pdf skill.
- There is no form-filling or digital signature support in the current environment.
SEE ALSO
- pptx-lib: PowerPoint generation
- pptx2pdf: PPTX to PDF conversion
- convert: ImageMagick (magick-wasm) for PDF-to-image conversion
- open: view or download files from the agent environment