


Hey Mike,
Can't you use some sort of style sheet to handle this? When you specify a
CSS style you can specify the media=print and have a special style sheet
for printing.

<link rel="stylesheet" type="text/css" media="print" href="print.css" />



Thanks
Bryce Martin
Programmer/Analyst I
570-546-4777



Mike Cunningham <mike.cunningham@xxxxxxx>
Sent by: rpg400-l-bounces@xxxxxxxxxxxx
09/22/2010 08:46 AM
Please respond to: RPG programming on the IBM i / System i <rpg400-l@xxxxxxxxxxxx>
To: RPG programming on the IBM i / System i <rpg400-l@xxxxxxxxxxxx>
cc:
Subject: RE: Convert HTML to plain text






I wish, but no. This will be something that will be with us for years. What
would really be nice is a printer that understands HTML. Then just generate
the HTML using your favorite way to do HTML, send it to a spool file, and
let the driver format and print it. For the printer manufacturer it would
probably not be any harder than PJL or PDF or PostScript.

-----Original Message-----
From: rpg400-l-bounces@xxxxxxxxxxxx [mailto:rpg400-l-bounces@xxxxxxxxxxxx]
On Behalf Of John McKay
Sent: Tuesday, September 21, 2010 10:31 AM
To: RPG programming on the IBM i / System i
Subject: Re: Convert HTML to plain text

If this is a once-off request, life gets simpler ...

... display the HTML page as normal in a browser, copy the page using
ctrl-c then paste and the work is done for you.


Regards,
John McKay mba

----- Original Message -----
From: "Mike Cunningham" <mike.cunningham@xxxxxxx>
To: "RPG programming on the IBM i / System i" <rpg400-l@xxxxxxxxxxxx>
Sent: Tuesday, September 21, 2010 3:01 PM
Subject: RE: Convert HTML to plain text


Thanks Vern. Looked at the samples, might go that way, although while it
does work I find it very difficult to get the code to work, and coming back
to look at it 6 months later I get totally lost with what it is doing and
how/why it works. /<\s*\/?\s*span\s*.*?>/g makes no sense to my brain. But
that is my brain's problem and it might be the best solution. I usually
prefer to go with code that is a bit more "wordy".
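
One way to make a pattern like that easier to come back to in 6 months is to write it in "verbose" form, one piece per line with a comment on each. The following is only a sketch in Python (not RPG, and not code from the linked articles); the names SPAN_TAG and strip_span_tags are made up for illustration, and the case-insensitive flag is an added assumption so that <SPAN> is caught as well.

```python
import re

# The same pattern as /<\s*\/?\s*span\s*.*?>/g, spelled out piece by piece.
SPAN_TAG = re.compile(
    r"""
    <       # opening angle bracket
    \s*     # optional whitespace
    /?      # optional slash, so closing tags match too
    \s*     # optional whitespace
    span    # the tag name
    \s*     # optional whitespace
    .*?     # attributes, taking as few characters as possible
    >       # closing angle bracket
    """,
    re.VERBOSE | re.IGNORECASE,  # IGNORECASE is an assumption, not in the original
)


def strip_span_tags(text):
    """Remove every <span ...> and </span> tag, keeping the text between them."""
    return SPAN_TAG.sub("", text)
```

Calling strip_span_tags('Hello <span style="color:red">big</span> world') leaves only the text around and between the tags. The pattern behaves the same as the compact form; only the layout changes.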

Here is a very simple sample of what is in the field in the database. It is
not a full HTML page from <html> to </html>; it is just a snippet of code in
HTML format. This one only has <BR> to deal with, but the code could have
<span>s and <li>s and a few other goodies if they format the text to look
nicer (indents, bullet lists, etc.). It will never be the full suite of
HTML, only a small subset, which makes it easier.

"Students in baccalaureate programs will take coursework in each of the
following categories: Cultural Diversity,Science/Technology/Society and
Writing Enriched. Requirements are met in this program with:<br />DIV: MGT
410<br />WEC: MGT 410, MGT 490<br />STS: Approved STS elective in Liberal
Arts core (see catalog)<br />Directed Electives in this program are: ACC
230, 310, CIM 428, MGT 116, 247, 315, 320, 340, 350, MKT 243, 251, 260,
QAL 101<br />(** At least 3 credits of the 6 must be 300-400 level
courses)<br /><br />STUDENTS WILL ENTER THIS PROGRAM WITH AN ASSOCIATE
DEGREE WITH A TECHNOLOGY EMPHASIS. Current programs ineligible for this
degree are BM, RM, GS, &amp; IS. Courses used in the associate degree
cannot be used in the final 4 semesters."

I already wrote the code to break this into 80-character chunks, or at a
<br>, whichever occurs first, and to write a line to the print file for each
chunk, which resulted in this.

Students in baccalaureate programs will take coursework in each of the following
categories: Cultural Diversity,Science/Technology/Society and Writing Enriched.
Requirements are met in this program with:
DIV: MGT 410
WEC: MGT 410, MGT 490
STS: Approved STS elective in Liberal Arts core (see catalog)
Directed Electives in this program are: ACC 230, 310, CIM 428, MGT 116, 247, 315
, 320, 340, 350, MKT 243, 251, 260, QAL 101
(** At least 3 credits of the 6 must be 300-400 level courses)

STUDENTS WILL ENTER THIS PROGRAM WITH AN ASSOCIATE DEGREE WITH A TECHNOLOGY EMPH
ASIS. Current programs ineligible for this degree are BM, RM, GS, &amp; IS. Co
urses used in the associate degree cannot be used in the final 4 semesters.

Can easily deal with the &amp; but I think the fun will be the word wrap.
I may need to parse the string into words and join words together until
the length of what is joined plus the length of the next word exceeds 80,
then write a line.
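
The word-joining idea described above can be sketched roughly as follows. This is an illustration in Python rather than RPG, under a few assumptions: the names wrap_paragraph and html_snippet_to_lines are hypothetical, <br> is the only tag handled, and a single word longer than the line width is simply left on its own over-long line.

```python
import html
import re

LINE_WIDTH = 80  # print file line length mentioned in the post


def wrap_paragraph(text, width=LINE_WIDTH):
    """Greedy word wrap: add words to a line until the next one would not fit."""
    lines = []
    current = ""
    for word in text.split():
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= width:
            current += " " + word
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


def html_snippet_to_lines(snippet, width=LINE_WIDTH):
    """Split on <br> tags, decode entities such as &amp;, then wrap each piece."""
    lines = []
    for piece in re.split(r"<\s*br\s*/?\s*>", snippet, flags=re.IGNORECASE):
        wrapped = wrap_paragraph(html.unescape(piece), width)
        # An empty piece (two <br> tags in a row) becomes one blank print line.
        lines.extend(wrapped or [""])
    return lines
```

Each string returned by html_snippet_to_lines would correspond to one write to the print file. Python's standard textwrap module does a similar job; the loop is written out by hand here only to mirror the parse-and-join approach in the message.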

-----Original Message-----
From: rpg400-l-bounces@xxxxxxxxxxxx
[mailto:rpg400-l-bounces@xxxxxxxxxxxx]
On Behalf Of Vern Hamberg
Sent: Tuesday, September 21, 2010 8:51 AM
To: RPG programming on the IBM i / System i
Subject: Re: Convert HTML to plain text

Mike

No code, just questions!

Is the HTML in a PF or in a STMF? The latter is preferable, methinks. I
can see a couple of things to do - first look for closing tags - scan for
"</" - and then scan back for the matching opening tag. Then take on the
unary (my term) tags like <br>.

There's also the need, perhaps, to take out <html> and <head> and not the
contents of <body>.

Or maybe, as I get from another site, it's enough to strip everything
between "<" and ">" in that order - unless you have comparison operators
in there! Sites in the google below discuss these issues.
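
The "strip everything between < and >" idea can be sketched without regular expressions as a simple character scan. This is a hypothetical Python illustration (strip_tags is a made-up name), not RPG and not code from the sites mentioned, and it has exactly the weakness noted above: a literal "<" used as a comparison operator swallows text up to the next ">".

```python
def strip_tags(text):
    """Drop every character from a '<' through the next '>'."""
    out = []
    inside_tag = False
    for ch in text:
        if ch == "<":
            inside_tag = True       # start skipping at the opening bracket
        elif ch == ">" and inside_tag:
            inside_tag = False      # resume copying after the closing bracket
        elif not inside_tag:
            out.append(ch)          # ordinary text is kept as-is
    return "".join(out)
```

For example, strip_tags("<ul><li>One</li><li>Two</li></ul>") leaves only the words with nothing between them, so list items and <br> tags would still need their own handling if line breaks are to be kept.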

I did a quick google on "strip html tags". One link -
http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx -
discusses using regular expressions. Another -
http://nadeausoftware.com/articles/2007/09/php_tip_how_strip_html_tags_web_page
- discusses issues about text you still want inside some tags.

Looks as if grep or sed or the like could do the work, with an appropriate
expression. And those are callable from RPG or CL through QSH.

HTH
Vern

On 9/21/2010 7:13 AM, Mike Cunningham wrote:
Would anyone happen to have RPG code to take HTML and strip off all the
tags and just have plain text that would be printed using normal print
files? I have a form that needs to be displayed on a web page and also
printed from an RPG application. Part of the form is data collected using
a rich-text editor on a web page that is stored as HTML in a variable
length field. Works great when the form is on a webpage as it is a
what-you-see-is-what-you-get function. Any special editing put in the
rich-text editor shows on the web page exactly as entered. Problem is
taking that html code and printing it using a normal print file to an
outq then the printer. Stripping out the html tags might not be too bad.
Dealing with <br> tags and <p> tags and <ul><li> can be a bit more
challenging but I think word wrap is going to be the hardest. The print
file line is 80 characters and I need to be sure to not break a word
between lines. Some tricky code and I thought I would just see if anyone
might have done this already and would share their code.

Thanks

--
This is the RPG programming on the IBM i / System i (RPG400-L) mailing
list To post a message email: RPG400-L@xxxxxxxxxxxx To subscribe,
unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/rpg400-l
or email: RPG400-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives at
http://archive.midrange.com/rpg400-l.





This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited.
