Aaron,

Why then not simply:

start = 1

repeat while not end of string
   end = start + 40
   CHECKR from end to the first space (found)
   SUBSTR(start, found)
   start = found + 1

or something similar?

I know, the check would be on all digits, letters (twice) and punctuation
(and some more)
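
For illustration, the loop sketched above might look like this in Java (used here only because the thread already compares against a Java test). The class name ScanBackWrap, the wrap helper, and the width parameter are made up for this sketch. lastIndexOf stands in for the backward scan, and a window that contains no space at all is split hard at the width, which is an assumption the pseudocode does not address.

```java
import java.util.ArrayList;
import java.util.List;

class ScanBackWrap {
    // Hypothetical helper: splits text into pieces of at most `width`
    // characters, breaking at the last space in each window when one exists.
    static List<String> wrap(String text, int width) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + width, text.length());
            if (end == text.length()) {          // the remainder fits in one record
                lines.add(text.substring(start));
                break;
            }
            // Scan backwards from the end of the window for a space,
            // the role CHECKR plays in the pseudocode.
            int space = text.lastIndexOf(' ', end);
            if (space <= start) {                // assumption: no space, so split hard
                lines.add(text.substring(start, end));
                start = end;
            } else {
                lines.add(text.substring(start, space));
                start = space + 1;               // skip the space itself
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : wrap("The quick brown fox jumps over the lazy dog", 10)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

With a width of 10 the sample prints [The quick], [brown fox], [jumps over], [the lazy] and [dog], one per line.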

With regards,
Carel Teijgeler

*********** REPLY SEPARATOR ***********

On 19-9-2008 at 7:35 Aaron Bartell wrote:

What is your pattern intended to do? I know you want to split the line up
every 40 characters, but... why not use %subst() to do that? I assume you
want to do something more, like split it up on a whitespace boundary, right?


Thanks for calling that out, Scott. After re-reading my post I do realize I
left out key information. Making this code work is actually an attempt to
keep all the processing in RPG vs. Java (which my customer is thinking of
doing if I can't get this working).

The key criteria for what I need the regular expression to accomplish would
be:

1) I want to split a long text field into 40 byte records.

2) I don't want splits to happen in the middle of a word and if the 40th
byte is in the middle of a word it should go back to the previous space.

3) If a carriage return is found, that should also break the string so that
everything after the carriage return starts on a new line.
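
As a rough sketch of these three criteria on the Java side, the snippet below uses the same pattern shape that the Java test reports as compiled further down in this message. The names RegexWrap, split, and width are hypothetical, and splitting on line breaks before applying the pattern is an assumption of this sketch, not something the posted code is known to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWrap {
    // Hypothetical helper: `width` would be 40 for the records described above.
    static List<String> split(String text, int width) {
        // Either an over-long run of non-whitespace kept whole, or up to
        // `width` characters, followed by whitespace or the end of the segment.
        Pattern chunk = Pattern.compile(
                "(\\S\\S{" + width + ",}|.{1," + width + "})(\\s+|$)");
        List<String> records = new ArrayList<>();
        // Criterion 3: a line break always starts a new record.
        for (String segment : text.split("\\r\\n|\\r|\\n")) {
            if (segment.isEmpty()) {             // keep blank lines as blank records
                records.add("");
                continue;
            }
            Matcher m = chunk.matcher(segment);
            while (m.find()) {
                records.add(m.group(1));         // criteria 1 and 2: group 1 is the record
            }
        }
        return records;
    }

    public static void main(String[] args) {
        for (String r : split("The quick brown fox\njumps over the lazy dog", 10)) {
            System.out.println("[" + r + "]");
        }
    }
}
```

Note that a word longer than the width is kept whole by the first alternative rather than cut, so such a record can exceed the width.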

Concerning the un-modern coding, I rarely use occurs so I didn't want to
change that line of code for fear of breaking something that would send me
on a wild goose chase to fix.

I should also note that this same regular expression has been declared to
work from the Java environment. After digging around in the archives though
I see a number of people have found the IBMi notation might not be the same
as all other platforms (i.e. \w doesn't seem to work per others in the
archives, though I haven't tried it myself).
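
For reference, this is what the shorthand classes stand for in Java's own regex dialect, checked character by character over ASCII. The class and method names are made up for this sketch. Whether the IBM i regex functions accept any of these spellings is not verified here; in POSIX-style bracket expressions a backslash is normally taken literally, so on that side the explicit form would more likely need the literal characters or a class such as [[:space:]].

```java
import java.util.regex.Pattern;

class ShorthandCheck {
    // Returns true when both patterns agree on every single ASCII character.
    static boolean sameOnAscii(String shorthand, String longhand) {
        Pattern a = Pattern.compile(shorthand);
        Pattern b = Pattern.compile(longhand);
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Each line prints true with Java's default (non-Unicode) classes.
        System.out.println(sameOnAscii("\\w", "[a-zA-Z0-9_]"));
        System.out.println(sameOnAscii("\\s", "[ \\t\\n\\x0B\\f\\r]"));
        System.out.println(sameOnAscii("\\S", "[^ \\t\\n\\x0B\\f\\r]"));
    }
}
```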


so it'll take the 40 characters rather than the one) followed by either
the end of the line ($) or the "s" character.

I believe the \s is short-hand for \t\r\n per this page:
http://www.regular-expressions.info/examples.html (look at the Trimming
Whitespace section). So maybe the short-hand doesn't work on the IBMi?
Here is the Java code that is working as expected on my PC:
http://code.midrange.com/db7764d7d5.html

That Java code produces the below output:
[[Begin]]
Compiled pattern:(\S\S{40,}|.{1,40})(\s+|$)
This is a line of text 111111. T2his i2s
al2so a2 lin2ewill eventuallyrunoverats
string is longer than normal. Somwersome
more text

over at some point simple stri.This is
the last sentence in the paragraph.
[[End]]

The modified RPG code (i.e. %occur and options(*string) added) has the
following output for three MODS entries:
[[Begin]]
This is a line of text 111111. T2his i2s
This is a line of text 111111. T2his i2
s
[[End]]

Modified RPG code is here: http://code.midrange.com/492f7222d1.html

So given your evaluation of the regex results I am not sure which way to go
here because it works in Java but not on the IBMi. That tells me there are
some notation discrepancies between the two. So I am thinking I should
change the short hand stuff to be "long hand". Here is what I *believe* to
be the long hand version:

([a-zA-Z0-9]{40,}|.{1,40})(^[a-zA-Z0-9]+|$)

I replaced \S (note capital S is negating whitespace) with [a-zA-Z0-9]. Note
I couldn't do two of those sequences of negated bracketed expressions
because the regex compiler didn't like the second one.

I replaced \s (lower case s) with [a-zA-Z0-9].
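
A note on the long-hand version above, checked only in Java and not on the IBM i: outside of brackets ^ is a start-of-input anchor rather than a negation, and [a-zA-Z0-9] matches letters and digits, not whitespace. In the sketch below (hypothetical class and method names, width shortened to 10 so the sample stays small) the pattern as posted can only match where the chunk runs to the end of the text, so only the tail comes back, which resembles the RPG output shown below. A bracket form with the whitespace characters spelled out behaves like the shorthand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LonghandCheck {
    // Collects group 1 of every match, i.e. the text of each record.
    static List<String> chunks(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        // Shorthand form, as in the working Java test (width 10 instead of 40).
        String shorthand = "(\\S\\S{10,}|.{1,10})(\\s+|$)";
        // Long-hand form as posted: ^ is an anchor here, not a negation.
        String asPosted = "([a-zA-Z0-9]{10,}|.{1,10})(^[a-zA-Z0-9]+|$)";
        // Assumed correction: \t, \r, \n are Java string escapes, so the
        // brackets contain the literal space, tab, CR and LF characters.
        String explicit = "([^ \t\r\n]{11,}|.{1,10})([ \t\r\n]+|$)";

        System.out.println(chunks(shorthand, text));
        System.out.println(chunks(asPosted, text));
        System.out.println(chunks(explicit, text));
    }
}
```

The first and third lines print [The quick, brown fox, jumps over, the lazy, dog]; the second prints only [e lazy dog], the last 10 characters of the text.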

Doing that and re-running my RPG program gave me the following results (note
the blank last line is literal).
[[Begin]]
s is the last sentence in the paragraph.
s is the last sentence in the paragraph.

[[End]]



In the end I am after what I described in the three points at the top of
this email and any assistance is greatly appreciated as I am not well versed
in regex outside of the simple ones I find in XSDs.

TIA,
Aaron Bartell
http://mowyourlawn.com
--
This is the RPG programming on the AS400 / iSeries (RPG400-L) mailing list

