Hi Don,

it depends.

From the "command line" you can use Ghostscript. The latest PASE version from the IBM repository should be OK. With

gs -sDEVICE=txtwrite -o output.txt input.pdf

you should get output, but you may have to experiment with the encoding, as it is not fixed in PDF documents.
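
If the Ghostscript call has to be started from a program rather than typed by hand, it could be wrapped roughly like this. This is only a minimal sketch: the class name GsTextExtract and its method names are made up for illustration, and it assumes that gs is on the PATH of the environment the JVM runs in.

```java
import java.io.IOException;
import java.util.List;

class GsTextExtract {

    // Builds the argument list for the Ghostscript text extraction call.
    // Assumption: "gs" can be found on the PATH (e.g. the PASE version).
    static List<String> buildCommand(String inputPdf, String outputTxt) {
        return List.of(
            "gs",
            "-sDEVICE=txtwrite",   // select the text output device
            "-o", outputTxt,       // output file; -o also implies -dBATCH -dNOPAUSE
            inputPdf);
    }

    // Starts Ghostscript, waits for it to finish and returns its exit code.
    static int run(String inputPdf, String outputTxt)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(inputPdf, outputTxt))
            .inheritIO()           // pass Ghostscript messages through to our stdout/stderr
            .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: GsTextExtract <input.pdf> <output.txt>");
            System.exit(2);
        }
        System.exit(run(args[0], args[1]));
    }
}
```

Passing the arguments as a list instead of one command string avoids quoting problems with file names that contain blanks.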

From RPG I would always use PDFBox (https://pdfbox.apache.org/). With it, you have complete control over the PDF processing.

But you can also use PDFBox from the command line using

java -jar pdfbox-app-3.y.z.jar export:text [OPTIONS] -i=<infile>

But make sure to use a reasonably new 64-bit JVM. I'm using Java 21 64-bit, and it's quite fast. In fact, after the initial JVM loading, Java is near native performance.

I had the task of splitting PDF files. For up to 5 or 6 pages, Ghostscript (PASE) was faster, but with 10 or more pages, PDFBox (Java 21 64-bit) was always faster. And it got even better if more than one file had to be split in the same job/session: PDFBox was always faster then, as the JVM stayed in memory and even the JAR file was kept loaded.
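
To check the warm-JVM effect on your own files, something like the following could be used. It is just a rough timing sketch, not the code behind the numbers above: the class BatchTimer and its method timeEach are hypothetical names, and the per-file work is only a placeholder that you would replace with the real PDFBox call.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchTimer {

    // Runs the task once per file inside the current JVM and returns
    // the elapsed time of each call in milliseconds, in input order.
    static List<Long> timeEach(List<Path> files, Consumer<Path> task) {
        List<Long> millis = new ArrayList<>();
        for (Path file : files) {
            long start = System.nanoTime();
            task.accept(file);
            millis.add((System.nanoTime() - start) / 1_000_000);
        }
        return millis;
    }

    public static void main(String[] args) {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Path.of(arg));
        }
        // Placeholder task: replace the lambda body with the real per-file
        // work, for example a PDFBox split or text export.
        List<Long> millis = timeEach(files, file -> System.out.println("processing " + file));
        for (int i = 0; i < files.size(); i++) {
            System.out.println(files.get(i) + ": " + millis.get(i) + " ms");
        }
    }
}
```

If the first file takes noticeably longer than the following ones, that is the class loading and JIT warm-up, which is paid only once per job as long as the JVM stays alive.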

So as I said, it really depends on what exactly you want to do, and how. For example, if this text should go into a database table, I would recommend going the RPG/Java/PDFBox way.

I'm in the process of writing a bit about RPG, Java and PDFBox on my blog in the next few weeks. If you like, I can give you a sneak peek. It's a bit overwhelming at the beginning with JNI, JVM initialization and RPG-to-Java prototypes, but once you've got it, pack everything you need into a service program and be happy.

HTH and kind regards,
Daniel


On 16.06.2025 at 10:37, Patrik Schindler <poc@xxxxxxxxxx> wrote:
Hello Don,

On 16.06.2025 at 09:12, Don Brown via MIDRANGE-L <midrange-l@xxxxxxxxxxxxxxxxxx> wrote:

1. Does anyone have a recommended solution for converting a PDF to text? I am after a PHP or native RPG-ish solution. Not Python, please.

I'd use the pdftotext command from the poppler-utils package in PASE. I assume the poppler-utils package is available for installation via yum.

https://en.wikipedia.org/wiki/Poppler_(software)

:wq! PoC

--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: https://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxxxxxxxx
Before posting, please take a moment to review the archives
at https://archive.midrange.com/midrange-l.

