An HTML-to-PDF library for the JVM, based on Flying Saucer and Apache PDFBox 2, with SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!
Read and extract text and other content from PDFs in C# (port of PDFBox)
Converts a PDF file into a text file while keeping the layout of the original PDF. Useful for extracting the content of a table in a PDF file, for instance. This is a subclass of PDFTextStripper class (f...
Remove textual watermarks in any font, encoding, and language with pdf-unstamper.
Boxable is a library for easily creating tables in PDF documents.
(Java) A method to extract tabular content from PDF files
pdf2html is a module that converts PDF files to HTML pages using Apache Tika. It can also generate thumbnail images for PDF files using Apache PDFBox.
Test area for public PDFBox v2 issues on Stack Overflow etc.
Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV
Extracts the text content of Word (doc, docx), Excel, PDF, PPT, CSV, and TXT files, and can also extract the table of contents from Word and PDF files
Java library for creating fluid page layouts with Apache PDFBox. Supports multi-page tables, different page layouts, etc.
Checks the PDFs submitted to a conference, e.g., for formatting violations and double anonymous violations
📄◻️ Create, Manipulate and Extract Data from PDF Files (R Apache PDFBox wrapper)