On Tue, May 19, 2015 at 11:08 AM, Dan <dan27649@xxxxxxxxx> wrote:
> John: My motivation to do this is that I have to scan about 200 source
> members, and there is the likelihood that this will turn into a system-wide
> endeavor. (I.e. thousands of source members.)

OK. But (1) you asked on here and the RDi list, suggesting that it's
something that, if RDi could do it, would satisfy your use case; and
(2) no matter if it's tens of thousands, this kind of thing is still
*usually* something that humans (not programs) ultimately act on. For
example, when we were doing our Y2K conversion, we found who knows how
many instances where something had to be fixed. But we did all those
fixes by hand (yes, it was one of the most tedious things imaginable).
So for that project, it would not have made much of a difference
whether the searches ignored comments or not.

> I wonder if IBM could be persuaded to share the code it uses to validate
> code syntax, specifically how it distinguishes between comments and live
> code.

I didn't fully understand Jon's response either, but from a purely
technical standpoint (regardless of IBM's "willingness"), I doubt
their code would be of particular use to us. I would expect it to be
in C, and I would expect it to be fairly tightly coupled to and
integrated with other code (for example, it might also parse strings,
keywords, numbers, operators, etc. in the same pass).

> If this is a wheel that needs to be invented, I'd be happy to be involved
> in an open source project to do it, but from the looks of the regular
> expressions using grep / egrep, I'd be more of a cheerleader than a main
> contributor.

I wouldn't judge the difficulty of "regular coding" based on regex
coding. Regular expressions are both denser and less capable than
any Turing-complete language. (APL is a Turing-complete language
that is arguably even denser than regex.)
Honestly, writing an RPGLE comment parser (in almost any programming
language) doesn't strike me as particularly difficult. The trickiest
thing, in my mind, would be NOT interpreting the double-slashes as
comment-initiators when they appear inside string literals, which can
contain quotes and span multiple lines.
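
To give a feel for the size of the problem, here is a rough Python sketch of
the comment/literal distinction described above. It is a sketch under
assumptions, not a full RPGLE parser: it assumes free-form source only, that
literals are delimited by apostrophes with an embedded apostrophe written as
two apostrophes, and that a literal still open at the end of a line simply
continues on the next line (the real continuation rules are more involved).
The function name strip_comments is made up for illustration.

```python
def strip_comments(lines):
    """Remove // comments from free-form source lines.

    Assumptions (simplifications, not the full language rules):
      - literals are delimited by apostrophes
      - '' inside a literal is an escaped apostrophe
      - a literal left open at end of line continues on the next line
    """
    result = []
    in_literal = False  # carried across lines for continued literals
    for line in lines:
        kept = []
        i = 0
        while i < len(line):
            ch = line[i]
            if in_literal:
                kept.append(ch)
                if ch == "'":
                    if line[i + 1:i + 2] == "'":
                        # doubled apostrophe: still inside the literal
                        kept.append("'")
                        i += 1
                    else:
                        in_literal = False
            elif ch == "'":
                in_literal = True
                kept.append(ch)
            elif line.startswith("//", i):
                break  # rest of the line is a comment
            else:
                kept.append(ch)
            i += 1
        text = "".join(kept)
        # Only trim trailing blanks when we are not inside a literal.
        result.append(text if in_literal else text.rstrip())
    return result


if __name__ == "__main__":
    sample = [
        "x = 1; // set x",
        "url = 'http://a.b'; // real comment",
        "msg = 'it''s // not a comment';",
    ]
    for cleaned in strip_comments(sample):
        print(cleaned)
```

The whole thing is one pass with a single flag, which is why I say it is
manageable; the work is in getting the language's actual literal and
continuation rules right, not in the scanning itself.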
All in all, it's actually quite a manageable programming project.
It's just nontrivial enough that most people seem to not bother. And
also just nontrivial enough that it's a little frustrating that most
editors don't provide easy hooks to read their syntax highlighting.
John Y.