Parse the data provided by the loader using PyMuPDF. Compared with docling, PyMuPDF is a faster, simpler document parser that:
  • Processes PDF documents with basic structure preservation
  • Supports e-book formats like EPUB and MOBI
  • Is generally faster than docling for simpler documents
  • Works well for documents with straightforward layouts
Choose pymupdf when processing speed is more important than perfect structure preservation.

Samples

SELECT ai.create_vectorizer(
    'my_table'::regclass,
    parsing => ai.parsing_pymupdf(),
    -- other parameters...
);
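
For context, a fuller call might look like the sketch below, which pairs the PyMuPDF parser with a loader and an embedding configuration. The column name `file_uri`, the embedding model, and its dimension count are illustrative assumptions, not values taken from this page; substitute whatever your table and provider actually use.

```sql
-- Minimal sketch, assuming my_table has a text column named file_uri
-- that holds the location of each document (hypothetical column name).
SELECT ai.create_vectorizer(
    'my_table'::regclass,
    -- Load each document from the URI stored in the row.
    loading => ai.loading_uri(column_name => 'file_uri'),
    -- Parse the loaded bytes with PyMuPDF instead of docling.
    parsing => ai.parsing_pymupdf(),
    -- Embedding model and dimensions are assumptions for illustration.
    embedding => ai.embedding_openai('text-embedding-3-small', 768)
);
```

A parser is only relevant when the loader returns a document file such as a PDF or EPUB, which is why the sketch sets both `loading` and `parsing` together.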

Arguments

This function takes no arguments.

Returns

A JSON configuration object that you can use in ai.create_vectorizer.
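
To see what the function returns, you can call it on its own. This is a quick inspection sketch; the exact shape of the JSON may vary between pgai versions, so no output is shown here.

```sql
-- Inspect the parsing configuration produced by the function.
SELECT ai.parsing_pymupdf();
```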