


Jack

I tried QSH and the command works. The only problem is that my output has unrecognizable characters instead of the text I was looking for. Is there something else I can do to get the text in a proper format?
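One way to narrow this down (a diagnostic sketch only, since the thread does not show what the characters look like) is to dump the first bytes of the output file in hex. A common cause of unreadable output on IBM i is a mismatch between the encoding of the bytes in the file and the CCSID the file is tagged or viewed with; that cause is an assumption here, not something confirmed in the thread. The helper name hex_head and the two sample files are hypothetical stand-ins for the real Tika output, and the sketch assumes od is available in your shell.

```shell
#!/bin/sh
# Hypothetical helper: print up to the first 16 bytes of a file as one hex string.
hex_head() {
    od -An -tx1 -N 16 "$1" | tr -d ' \n'
    echo
}

# Scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# The word "Invoice" stored as ASCII/UTF-8 bytes.
printf 'Invoice' > "$workdir/ascii.txt"

# The same word stored as EBCDIC (code page 37) bytes, written as octal escapes.
printf '\311\225\245\226\211\203\205' > "$workdir/ebcdic.txt"

ascii_hex=$(hex_head "$workdir/ascii.txt")
ebcdic_hex=$(hex_head "$workdir/ebcdic.txt")

echo "ASCII/UTF-8: $ascii_hex"
echo "EBCDIC 37:   $ebcdic_hex"
```

If the real output file starts with bytes in the 0x40-0x7a range for ordinary letters, it is ASCII or UTF-8; if letters show up as 0x81-0xe9, it is EBCDIC. Once you know which one you have, you can decide whether the file's CCSID tag or the encoding Tika writes needs to change. The Tika command-line app appears to accept an --encoding option for its output, but check its --help text for your version before relying on that.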


David

I see your option to output the text to a database file. That could work, and then I would write an RPG program to read the data and parse out what I need. I will keep that solution as an option. Thanks.

Gord


-----Original Message-----
From: JAVA400-L [mailto:java400-l-bounces@xxxxxxxxxxxx] On Behalf Of java400-l-request@xxxxxxxxxxxx
Sent: January 20, 2017 11:00 AM
To: java400-l@xxxxxxxxxxxx
Subject: JAVA400-L Digest, Vol 15, Issue 13


Today's Topics:

1. Output results from tika to a file (Gordon Schneider)
2. Re: Output results from tika to a file (Jack Woehr)
3. Re: Output results from tika to a file (David Gibbs)


----------------------------------------------------------------------

message: 1
date: Fri, 20 Jan 2017 17:29:18 +0000
from: Gordon Schneider <schneiderg@xxxxxxxxxxxxxxxxx>
subject: Output results from tika to a file

I have installed the Apache Tika Java application from apache.org on our Power server running IBM i 7.2 and on my PC running Windows 10. We receive a Word document from one of our vendors and need to extract its text so we can process it.


I have this working on my PC by using the following command:

C:\Users\gord\Downloads>java -jar tika-app-1.14.jar -t "C:\Users\gord\Documents\Goralta Invoice.doc" > "C:\Users\gord\Documents\Goralta Invoice.txt"


The next step was to get the same result on our Power server. I first ran the following command:

JAVA CLASS('/java/Tika/tika-app-1.14.jar') PARM('-t' '/java/Tika/Goralta Invoice.doc')

It displays the extracted text on the screen. Great, so I know the Java program works on our system. The last step is to get the results written to a file.


I have tried many different combinations but I cannot get it to work. Here is an example of what I have tried and the error we are getting.

JAVA CLASS('/java/Tika/tika-app-1.14.jar') PARM('-t' '"/java/Tika/Goralta Invoice.doc" > "/java/Tika/Goralta Invoice.txt"')

Exception in thread "main" java.net.MalformedURLException: no protocol: "/java/Tika/Goralta Invoice.doc" > "/java/Tika/Goralta Invoice.txt"
    at java.net.URL.<init>(URL.java:609)
    at java.net.URL.<init>(URL.java:506)
    at java.net.URL.<init>(URL.java:455)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:472)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:145)
Java program completed with exit code 1


I am not a Java programmer. We use Java tools that make things easier for us as RPG programmers. Any help you can provide will be very much appreciated.



Gordon Schneider
403-236-0601
Trans Am Piping Products Ltd.



------------------------------

message: 2
date: Fri, 20 Jan 2017 10:34:39 -0700
from: Jack Woehr <jwoehr@xxxxxxxxxxxxxxxxxxxxxxxx>
subject: Re: Output results from tika to a file

Redirection with the > symbol is a feature of the shell, not of Java.

Why not run it from QSH or QP2SHELL? Then what you are doing will work.
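To illustrate the point about redirection, here is a small sketch that runs in any POSIX-style shell. The function show_args is a hypothetical stand-in for the Tika jar: it only prints the arguments it receives, so you can see what the program gets in each case. The scratch directory is also an assumption made so the sketch does not write into /java/Tika.

```shell
#!/bin/sh
# Hypothetical stand-in for "java -jar tika-app-1.14.jar": prints each argument it receives.
show_args() {
    for arg in "$@"; do
        printf 'arg: %s\n' "$arg"
    done
}

# Scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# Case 1: the > sits inside a quoted string, so it is passed to the program
# as part of one long argument. This mirrors the failing PARM(...) attempt.
literal=$(show_args -t '"/java/Tika/Goralta Invoice.doc" > "/java/Tika/Goralta Invoice.txt"')
echo "$literal"

# Case 2: the > is outside the quotes, so the shell handles it. The shell opens
# the output file and the program only ever sees the two real arguments.
show_args -t "/java/Tika/Goralta Invoice.doc" > "$workdir/Goralta Invoice.txt"
redirected=$(cat "$workdir/Goralta Invoice.txt")
echo "$redirected"
```

In the first case the program receives the > and both file names as a single argument, which is why Tika tried to open the whole string as a URL. In the second case the shell consumes the > itself. Run from QSH, the equivalent of the Windows command would presumably look like the line below; it is untested here and the paths are taken from the original post:

java -jar /java/Tika/tika-app-1.14.jar -t "/java/Tika/Goralta Invoice.doc" > "/java/Tika/Goralta Invoice.txt"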







This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
