I have a file with over a million records.
The file has three fields that can contain various values (yes, any value
can appear in any of the three fields).
Example:
Record 1   Field 1 = ABC   Field 2 = ABC   Field 3 = 123
Record 2   Field 1 =       Field 2 = XYZ   Field 3 =
Record 3   Field 1 = 123   Field 2 = ABC   Field 3 =
Record 4   Field 1 = 456   Field 2 =       Field 3 =
I need to display a list of the unique values from the three fields
combined, such as this for the four-record example above:
Blank
ABC
XYZ
123
456
This file will have over a million records in it,
and the resulting list will usually have fewer than 500 unique values.
I am trying to determine the best way to get the unique values into a
list (in an interactive job) to be displayed to the user.
I am pretty sure that reading a million-plus records, looking up every
value in an array, and adding only the values that are not already there
will be VERY SLOW.
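
To make the single-pass idea concrete, here is a minimal sketch in Python
(not the language of the actual job). It swaps the array scan for a hashed
lookup, so each check takes roughly constant time on average and the cost
is dominated by reading the records. The record layout, the function name
unique_values and the sample data are assumptions made for illustration;
trailing blanks are stripped so that every empty field counts as the same
"Blank" value.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (field1, field2, field3), all character data.
Record = Tuple[str, str, str]


def unique_values(records: Iterable[Record]) -> List[str]:
    """Collect the distinct field values in first-seen order, in one pass."""
    seen = {}  # a dict keeps insertion order; its keys are the distinct values
    for record in records:
        for value in record:
            key = value.strip()  # all-blank fields collapse to one empty value
            if key not in seen:  # hashed lookup, not a scan of an array
                seen[key] = True
    return list(seen)


# The four example records from above.
sample = [
    ("ABC", "ABC", "123"),
    ("", "XYZ", ""),
    ("123", "ABC", ""),
    ("456", "", ""),
]
print(unique_values(sample))
```

With fewer than 500 distinct values the lookup table stays tiny, so this
sketch still has to touch every record but does very little work per value.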
I thought about keeping a secondary file containing a separate list of the
unique values as they are entered into the primary file. But then I have to
maintain this file by removing values that are removed from the primary
file, and determining when a value is no longer anywhere in the primary
file, so that it is time to remove it from the secondary file, would become
an issue in itself.
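
One possible way around the removal problem is to store a usage count next
to each value in the secondary file, so a value is dropped only when its
count reaches zero. The sketch below is in Python and uses an in-memory
Counter as a stand-in for the secondary file; the class UniqueValueIndex
and its method names are hypothetical, and it assumes every add, change and
delete of a primary record can be intercepted.

```python
from collections import Counter
from typing import Iterable, List


class UniqueValueIndex:
    """Hypothetical stand-in for the secondary file: value -> usage count."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def add_record(self, fields: Iterable[str]) -> None:
        """Call when a record is written to the primary file."""
        for value in fields:
            self._counts[value.strip()] += 1

    def remove_record(self, fields: Iterable[str]) -> None:
        """Call when a record is deleted from the primary file."""
        for value in fields:
            key = value.strip()
            self._counts[key] -= 1
            if self._counts[key] <= 0:
                # No field in the primary file uses this value any more.
                del self._counts[key]

    def values(self) -> List[str]:
        """Return the current unique values, sorted for display."""
        return sorted(self._counts)


index = UniqueValueIndex()
for rec in [("ABC", "ABC", "123"), ("", "XYZ", ""),
            ("123", "ABC", ""), ("456", "", "")]:
    index.add_record(rec)
index.remove_record(("456", "", ""))  # 456 disappears, blank is still in use
print(index.values())
```

A changed record would be handled as a remove of the old field values
followed by an add of the new ones.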
Anybody have any suggestions?
Thanks
John
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L@xxxxxxxxxxxx
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request@xxxxxxxxxxxx
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.
This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page.