Hi Jim,
<snip>
This is an XML file that I am getting via the httpapi via http_url_post
function [Thanks Scott!], in an RPG program. The program I'm trying to
replace is a Delphi program that took a block of data and did a replace
in each block of data with some Delphi function that would replace "<"
with 'CRLF' +"<" so that it could be later copied into a fix length
record format to process on the iSeries. (the xml is all one long
string, yuck!)
</snip>
Firstly, does it matter that the XML is one long string? Surely it is being processed by a program, and the program won't care. If you want it formatted so that you can read it, open it in a browser: it will format it for humans, and most browsers show it as a tree structure that makes it really easy to read. Or maybe your process has some side-effects that the fixed-length record format build relies on. I say 'side-effects' because the process does more than 'format' the XML.
This brings me onto the second point:
We are talking XML. The '<' character is part of the XML markup and is found EVERYWHERE in an XML file. Replacing every '<' with CRLF + '<' will put a CRLF in places you may not expect.
Consider this simple document:
<?xml version="1.0" ?>
<!DOCTYPE PUBLIC "http://some.dtd.url" >
<!-- This is a comment -->
<root>
<element1>Value1</element1>
<element2>Value2</element2>
<element3>Value3</element3>
</root>
In the 'yukky' state it would look like this:
<?xml version="1.0" ?><!DOCTYPE PUBLIC "http://some.dtd.url" ><!-- This is a comment --><root><element1>Value1</element1><element2>Value2</element2><element3>Value3</element3></root>
Which is perfectly fine, if not pretty for humans to read.
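To illustrate that a program reading the XML does not care about the layout, here is a minimal sketch in Python (not the RPG or Delphi code under discussion). It assumes the standard library's xml.etree.ElementTree parser, and it leaves the DOCTYPE line out of the sample to keep the snippet self-contained. The helper name leaf_values is made up for this example.

```python
import xml.etree.ElementTree as ET

# The sample document, one tag per line (DOCTYPE omitted for this sketch).
pretty = (
    '<?xml version="1.0" ?>\n'
    '<!-- This is a comment -->\n'
    '<root>\n'
    '<element1>Value1</element1>\n'
    '<element2>Value2</element2>\n'
    '<element3>Value3</element3>\n'
    '</root>\n'
)

# The same document as one long string.
one_line = pretty.replace("\n", "")


def leaf_values(xml_text):
    """Return {tag: text} for each child element of the root."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


pretty_values = leaf_values(pretty)
one_line_values = leaf_values(one_line)

# Both layouts give the parser the same element values.
print(pretty_values == one_line_values)
```

The line breaks in the formatted version sit between the child elements, not inside them, so the values of element1 to element3 are the same either way.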
Using the replace scenario above would make it look like this:
CRLF<?xml version="1.0" ?>CRLF<!DOCTYPE PUBLIC "http://some.dtd.url" >CRLF<!-- This is a comment -->CRLF<root>CRLF<element1>Value1CRLF</element1>CRLF<element2>Value2CRLF</element2>CRLF<element3>Value3CRLF</element3>CRLF</root>
Replacing CRLF with actual carriage returns and line feeds:
<?xml version="1.0" ?>
<!DOCTYPE PUBLIC "http://some.dtd.url" >
<!-- This is a comment -->
<root>
<element1>Value1
</element1>
<element2>Value2
</element2>
<element3>Value3
</element3>
</root>
The important point is to look at the values INSIDE the elements within the root element. They have a trailing CRLF in the data:
<element1>Value1CRLF</element1>
<element2>Value2CRLF</element2>
<element3>Value3CRLF</element3>
The close tag also starts with a '<' and will be replaced with CRLF + '<'. Thus you are putting whitespace into the data content of the elements. element1 now has a value of Value1CRLF.
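The effect can be checked with a short script. This is only a sketch in Python, standing in for the Delphi replace rather than reproducing it. It assumes the standard library's xml.etree.ElementTree parser and again omits the DOCTYPE line from the sample.

```python
import xml.etree.ElementTree as ET

CRLF = "\r\n"

# The sample document as one long string (DOCTYPE omitted for this sketch).
one_line = (
    '<?xml version="1.0" ?><!-- This is a comment --><root>'
    '<element1>Value1</element1><element2>Value2</element2>'
    '<element3>Value3</element3></root>'
)

# Same idea as the Delphi replace: put CRLF in front of every '<'.
# lstrip() drops the CRLF that lands in front of the XML declaration,
# because the declaration has to be the first thing in the document.
broken_up = one_line.replace("<", CRLF + "<").lstrip()

before = {el.tag: el.text for el in ET.fromstring(one_line)}
after = {el.tag: el.text for el in ET.fromstring(broken_up)}

# Show each element value before and after the replace.
for tag in before:
    print(tag, repr(before[tag]), "->", repr(after[tag]))
```

After the replace, each value carries a trailing line break that was not in the original data. The parser may report it as a single line feed rather than CRLF, since XML parsers normalise line endings, but either way the content has changed.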
This may or may not have an impact on your software. I'm guessing that, as the original Delphi program does this, it is OK for you: the trailing CRLF may even have been used to delimit the data in the fixed-length record format. Who knows! But we must be aware that this is not necessarily a 'good' solution in all cases, and that should be stated for the archive.
Good luck!
Cheers
Larry Ducie