


Hi Vern,

Thank you for your responses.

PDFBox, a Java solution, was suggested.

I have downloaded the jar files, modified my classpath and, woohoo, I have
text output from the PDF.

Interestingly, I needed to add the -sort=true command line option to get
usable text out of the PDF. Even so, the whole thing took less than an hour
to install and test.
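
To script the same step, the command-line call can be wrapped in a small
Java launcher. This is a minimal sketch, not the exact command used above:
only -sort=true comes from this message. The jar file name, the export:text
subcommand and the -i/-o options are assumptions based on the PDFBox 3.x
command-line app (PDFBox 2.x uses ExtractText with positional file
arguments instead), and the class and method names are made up for
illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

class PdfBoxExtract {

    // Builds the command line for the PDFBox app jar.
    // ASSUMPTION: "export:text", "-i=" and "-o=" follow the PDFBox 3.x CLI;
    // adjust them if your PDFBox version differs.
    static List<String> buildCommand(Path appJar, Path pdf, Path txt) {
        return List.of(
                "java", "-jar", appJar.toString(),
                "export:text",
                "-sort=true",      // the option reported in this message
                "-i=" + pdf,
                "-o=" + txt);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 3) {
            System.err.println("usage: PdfBoxExtract <pdfbox-app.jar> <in.pdf> <out.txt>");
            System.exit(2);
        }
        List<String> command = buildCommand(
                Path.of(args[0]), Path.of(args[1]), Path.of(args[2]));

        // Run PDFBox as a child process and pass its messages straight through.
        Process process = new ProcessBuilder(command).inheritIO().start();
        System.exit(process.waitFor());
    }
}
```

With Java 11 or later this can be run directly from source, for example:
java PdfBoxExtract.java pdfbox-app.jar invoice.pdf invoice.txt
(the file names here are placeholders).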

Obviously, I have only tested with one PDF, but the results are promising.

As an experiment, I passed the text to an AI API to return the required
information from the PDF text file in JSON, and that worked very nicely.
So now I have a PDF extracted to text, and the text converted to JSON
providing only the requested fields.
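
The text-to-JSON step could look roughly like the sketch below. Everything
specific in it is hypothetical: the message does not say which AI API was
used, so the endpoint URL, the {"prompt": ...} payload shape, the
AI_API_KEY environment variable and the field names are placeholders to be
replaced with whatever your provider documents. The sketch only shows the
general idea: ask for the wanted fields by name and request JSON only.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ExtractFields {

    // HYPOTHETICAL endpoint; replace with your provider's URL.
    static final String ENDPOINT = "https://ai.example.com/v1/extract";

    // Escapes a string so it can sit inside a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Asks for the named fields only, as JSON, followed by the PDF text.
    static String buildPrompt(String pdfText, List<String> fields) {
        return "Return only a JSON object with these fields: "
                + String.join(", ", fields)
                + ". Use null for any field not found.\n\n"
                + pdfText;
    }

    // HYPOTHETICAL payload shape: a single "prompt" property.
    static String buildRequestBody(String pdfText, List<String> fields) {
        return "{\"prompt\":\"" + jsonEscape(buildPrompt(pdfText, fields)) + "\"}";
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: ExtractFields <extracted.txt>");
            System.exit(2);
        }
        String text = Files.readString(Path.of(args[0]));
        // Field names are examples only.
        String body = buildRequestBody(text, List.of("invoiceNumber", "invoiceDate", "total"));

        HttpRequest request = HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + System.getenv("AI_API_KEY"))
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Naming the fields in the prompt and asking for null when a field is missing
keeps the reply predictable enough to parse, but the reply should still be
validated before it is trusted.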

Cheers
Don

 

Don Brown

Senior Consultant
 
[1]OneTeam IT Pty Ltd
P: 1300 088 400

-----Original Message-----
From: MIDRANGE-L <midrange-l-bounces@xxxxxxxxxxxxxxxxxx> On Behalf Of Vern
Hamberg via MIDRANGE-L
Sent: Tuesday, 17 June 2025 8:59 AM
To: midrange-l@xxxxxxxxxxxxxxxxxx
Cc: Vern Hamberg <vhamberg@xxxxxxxxxxxxxxx>
Subject: Re: Convert PDF to text

Hi Don

Probably I was the one who said Poppler will not install with yum - I
looked at the list of installed and available packages in Open Source
Package Management (OSPM) when connected with a 7.6 machine - Poppler does
not seem to be there, unless it uses a different name.

Someone else mentioned Ghostscript, as did I. It can do some of what you
need, I believe. We used it to generate PCL from PDF files and it did fine,
though we didn't have any weird PDF-ish stuff. That person mentioned
another tool; I can't say if it's in the OSPM.

Jack Woehr mentioned pypdf, some Python component, I assume. I did not
see it listed separately in OSPM; maybe it's part of another Python
package.

*Regards*

*Vern Hamberg*

IBM Champion 2025, CAAC (COMMON Americas Advisory Council), IBM
Influencer 2023

On 6/16/2025 4:26 PM, Don Brown via MIDRANGE-L wrote:
> Thanks Patrik,
>
> I thought I read somewhere that the Poppler tools would not install with
> yum on IBM i?
>
> Thanks for the link, I will give it a go.
>
> Have you, or has anyone else, installed and used these tools?
>
> Thanks
> Don
>
>
>
> Don Brown
>
> Senior Consultant
>
> [1]OneTeam IT Pty Ltd
> P: 1300 088 400
>
> -----Original Message-----
> From: MIDRANGE-L <midrange-l-bounces@xxxxxxxxxxxxxxxxxx> On Behalf Of
> Patrik Schindler
> Sent: Monday, 16 June 2025 6:38 PM
> To: Midrange Systems Technical Discussion <midrange-l@xxxxxxxxxxxxxxxxxx>
> Subject: Re: Convert PDF to text
>
> Hello Don,
>
> On 16.06.2025 at 09:12, Don Brown via MIDRANGE-L
> <midrange-l@xxxxxxxxxxxxxxxxxx> wrote:
>
> > 1. Does anyone have a recommended solution for converting a PDF to
> > text? I am after a PHP or native RPG-ish solution. Not Python please.
>
> I'd use the pdftotext command from the poppler-utils package in PASE. I
> assume the poppler-utils package is available for installation via yum.
>
> [2]https://en.wikipedia.org/wiki/Poppler_(software)
>
> :wq! PoC
>
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: [3]https://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxxxxxxxx
Before posting, please take a moment to review the archives at
[4]https://archive.midrange.com/midrange-l.

Please contact support@xxxxxxxxxxxxxxxxxxxx for any subscription related
questions.

--
Message protected by MailGuard: e-mail anti-virus, anti-spam and content
filtering.
[5]https://www.mailguard.com.au

References

Visible links
1. https://www.oneteamit.com.au/
2. https://en.wikipedia.org/wiki/Poppler_(software)
3. https://lists.midrange.com/mailman/listinfo/midrange-l
4. https://archive.midrange.com/midrange-l
5. https://www.mailguard.com.au/


This mailing list archive is Copyright 1997-2025 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
