Hi again, Aaron:

string = 'This is a line of text 111111. T2his i2s al2so a2 lin2e' +
'will eventuallyrunoverats string is longer than normal. Somwer' +
'some more text'+ x'0D25' + 'over at some point simple stri.'+
'This is the last sentence in the paragraph.' + x'00';
pattern = '(\S\S{40,}|.{1,40})(\s+|$)' + x'00';

FWIW, if you want to know what your code is doing... The first element of the regmatch_t MODS shows where the regular expression as a whole matched.

I haven't tested it, but if I'm reading your pattern right, the first element in the array should point to this:

This is a line of text 111111. T2his i2s

Why? Because your expression looks for up to 40 characters (that's what .{1,40} does -- one to forty of any character. REs are "greedy", so it'll take as many of the 40 as it can rather than just one) followed by either the end of the line ($) or the "s" character. So it's showing you a match with 39 characters followed by an "s", since 39 is the most characters it can grab (that's what I mean by "greedy") that are immediately followed by an "s".

The next element of the array matches your first subexpression. Subexpressions are the things in parentheses. Your first one is (\S\S{40,}|.{1,40}) -- which means either an S followed by 40 or more S's (so 41 or more of the letter S in a row), or 1-40 of any character.

Since, as we already know, the whole expression matched 39 characters followed by an s, the first subexpression will contain those 39 characters.
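One detail on the counting in that first alternative: the {40,} applies only to the S right before it, so the alternative as a whole needs a run of at least 41. Here is a scaled-down sketch in C (the regcomp/regexec calls are the same POSIX API you'd prototype from RPG). The helper name ere_matches is made up for illustration, and the pattern uses {2,} instead of {40,} to keep the test strings short.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches the POSIX extended
 * pattern, 0 if it does not (or if the pattern fails to compile). */
int ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, not the offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the anchored pattern "^SS{2,}$", the string "SS" does not match but "SSS" does: one S, then two or more. Scale that up and SS{40,} wants 41.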

The 3rd (and final) element of your array should just be the 's', because your 2nd (and final) subexpression matches one or more of the letter s, or the end of the line. If I'm right about the whole expression matching 39 chars followed by an 's', this last element will just contain that 's'.

Since I don't see multiple 'ssssss' in a row anywhere in your string, the parts of the expression that require multiple 'ssss' to match won't match. So that last bit HAS to be either a single s or the end of the string...
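If you want to check the three array elements for yourself, here is a minimal sketch in C of the same regcomp/regexec call. A few assumptions: match_offsets is a made-up helper name, and the pattern you pass it should spell the letters out as S and s rather than \S and \s, because some regex libraries treat \s and \S as whitespace classes instead of the literal letters I'm describing above. Treat it as a way to test my reading, not as what your program does on your system.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compiles a POSIX extended pattern, runs it against
 * text, and fills m[0..n-1] with the match offsets.
 *   m[0] = the whole match
 *   m[1] = first parenthesized subexpression
 *   m[2] = second parenthesized subexpression
 * Returns 0 on a match, REG_NOMATCH or a regcomp error code otherwise. */
int match_offsets(const char *pattern, const char *text,
                  regmatch_t *m, size_t n)
{
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED);

    if (rc != 0)
        return rc;

    rc = regexec(&re, text, n, m, 0);
    regfree(&re);
    return rc;
}
```

Call it with the pattern "(SS{40,}|.{1,40})(s+|$)", the start of your string ("This is a line of text 111111. T2his i2s al2so a2 lin2e"), and a three-element regmatch_t array. If my reading is right, rm_so/rm_eo should come back as 0/40 for the whole match, 0/39 for the first subexpression, and 39/40 for the second (the lone 's').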

I have to admit, trying to figure out what your expression would match has given me a bit of a headache :) After thinking that hard, I'm sure there's smoke coming out of my ears :) But, anyway... let me know if I'm right, and if it makes sense to you now...

And then let me know what you're really trying to accomplish and I'll see if I can think of a better way.


This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].
