QParse for MS-DOS
Download
QParse for DOS v.0.15 (67,496 bytes)
What's in the zip
- QParse.exe: The MS-DOS executable.
- QParse.Ini, EQLog.Ini, SrchEngs.Ini: The example *.ini files for QParse for DOS.
- This html file.
Updates
Currently no updates to this version of the program.
Description
QParse is a programmable text-based (ASCII) file
parser. It is used to filter data from a text file and put that data in
a text-based spreadsheet (in CSV format).
The main *.Ini file (QParse.Ini) contains two sections: [Options]
and [Routines]. The [Options]
section can contain settings to be used instead of the command line options.
NOTE: Currently, QParse.Ini takes precedence over the command line switches.
The [Routines] section lists
all the routine names, each along with the source file in which the
routine's code can be found.
Command Line Syntax
QPARSE src_file [dst_file] /F:func_name [/s:val] [/s:val]
| src_file |
The text-file to be processed. |
| Dst_file |
The optional output file
[Generally will be a *.CSV file] |
| /F:func_name |
The routine, as listed
in the *.ini file(s), to use to process the source file |
| /s:val |
Switches which may be
one of the following: |
| /A:0|1 |
Overwrite [0] or Append
[1] the specified output file |
| /C:0|1 |
Specifies whether or not
the searches should be Case-Sensitive [0::Ignore Case, 1::Case-Sensitive] |
| /E:0-0 |
Specifies the Encoding
function to use. Currently no Encoding functions are available, therefore the
only option is 0 |
| /H:0|1 |
Specifies whether or not
to print a header line. |
| /I:0|1 |
Specifies whether or not
to Ignore blank lines. |
| /L:num |
Specifies the size in
characters of Fixed-width format output fields. |
| /O:0-3 |
Specifies the Output Format.
[0::CSV, 1::Fixed, 2::Delimited, 3::Custom] |
| /Q:cstr |
Specifies the String to
use instead of the Double Quote character. The string can use the C Escape
sequence format. |
| /S:cstr |
Specifies the String to
use instead of the Comma for the field separator character. |
| /T:cstr |
Specifies the String to
use instead of "\r\n" for the Input Line Termination marker. |
| /W:cstr |
Specifies the String to
be ignored (deleted before processing). The default is "\r". |
| /X0:cstr |
Specifies the String to
be used for the False condition of the XIST command. The default is "0". |
| /X1:cstr |
Specifies the String to
be used for the True condition of the XIST command. The default is "1". |
src_file is any filespec, not including wildcards,
that QParse will search through to extract information.
dst_file is optional; if it is not specified, the
screen is used for output. It too cannot contain wildcards.
Example QParse.Ini file
[Options]
[Routines]
Functions=2
Func1=MerchantItemPrice,Test.Ini
Func2=HotBot,Test.Ini
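As a rough illustration of how the [Routines] section is laid out, the following Python sketch reads the Functions count and each FuncN=name,file entry into a dictionary. It is not part of QParse: the load_routines helper and the use of Python's configparser module are assumptions made only for this example, and QParse's own reader may treat edge cases differently.

```python
import configparser

SAMPLE_INI = """\
[Options]
[Routines]
Functions=2
Func1=MerchantItemPrice,Test.Ini
Func2=HotBot,Test.Ini
"""

def load_routines(ini_text):
    """Map each routine name to the *.Ini file that holds its code."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["Routines"]
    count = section.getint("Functions")
    routines = {}
    for i in range(1, count + 1):
        # Each entry has the form "RoutineName,SourceFile".
        name, source = section[f"Func{i}"].split(",", 1)
        routines[name.strip()] = source.strip()
    return routines

routines = load_routines(SAMPLE_INI)
```

With the sample above, routines["HotBot"] is "Test.Ini", which is the file the /F:HotBot switch would be resolved against.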
Example Test.Ini file
[MerchantItemPrice]
LAST, "You have entered ", "You have entered ", "."
EACH, "tells you, 'That'll be", "] ", " tells you, 'That'll be "
FROM, "tells you, 'That'll be ", " platinum"
UPTO, " gold"
UPTO, " silver"
UPTO, " copper"
APND, 2
FROM, "per ", "'."
FROM, "for the ", "'."
EEND, ""
REND, ""
[HotBot]
SKIP, "<DIV", 1
LINC
UNTL, ">next</a>"
FROM, "&rsource=INK>", "</a></b><br>"
UPTO, "<br><font size=1><i>"
UPTO, "</i> "
UPTO, "<br>"
UEND
REND
Description of the above *.Ini files
The above files are used just as examples; the
actual *.Ini files included in the *.zip file are much more complete, and
even the MerchantItemPrice function is more complex.
In the example routine MerchantItemPrice, we first
want to know what zone we are in, so we use the LAST
command to find that out. The following is such a line in the eqlog.txt
file:
[Sat Aug 12 00:13:33 2000] You have entered West Freeport.
Next, we want to know who the merchant is, so we use
the EACH command, since each
line with a merchant telling you how much an item costs contains the rest
of the data we need (merchant name, platinum pieces, gold pieces, silver
pieces, copper pieces, and the item name). The following is such a line
in the eqlog.txt file:
[Sat Aug 12 01:07:18 2000] Innkeep Juna tells you, 'That'll be 3 silver 1 copper per Ration'.
We find the merchant name using the EACH
command, looking for any line that contains "tells you, 'That'll be"
and extracting the characters after the "] " string, up to but
not including the " tells you, 'That'll be " string. Next, we need to find
out the cost of the item. Since the cost begins immediately following the
last part of the previous EACH
command, we use UPTO commands
to find the number of each coin. Finally, note that we use the
APND
command since items which are stackable are reported as price per
item,
whereas items which are not stackable are reported as price for
the item. The following is a line from the eqlog.txt file showing
the format used for a non-stackable item:
[Sat Aug 12 01:07:23 2000] Innkeep Juna tells you, 'That'll be 6 silver 3 copper for the Small Lantern'.
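The merchant-name step described above can be sketched in Python as follows. This is only an approximation of the documented EACH behaviour (start from the beginning of the line when the after match is missing, return an empty string when the up to match is missing). The each_extract function is hypothetical, and case-sensitive matching is assumed.

```python
def each_extract(line, contains, after, upto):
    """Return the extracted field, or None if the line does not qualify."""
    if contains not in line:
        return None  # line is not a trigger for this record
    start = line.find(after)
    # A missing "after match" means extraction starts at the line's beginning.
    start = 0 if start < 0 else start + len(after)
    end = line.find(upto, start)
    # A missing "up to match" yields a Null (empty) string.
    return "" if end < 0 else line[start:end]

stackable = ("[Sat Aug 12 01:07:18 2000] Innkeep Juna tells you, "
             "'That'll be 3 silver 1 copper per Ration'.")
zone_line = "[Sat Aug 12 00:13:33 2000] You have entered West Freeport."

merchant = each_extract(stackable, "tells you, 'That'll be",
                        "] ", " tells you, 'That'll be ")
```

Here merchant is "Innkeep Juna", while the same call on zone_line returns None because that line does not contain the trigger string.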
The example routine HotBot
converts an HTML page generated from a search at http://www.hotbot.com
into a spreadsheet of the relevant information for each record. Note that
because the first web page reported also contains a ">next</a>"
string, this routine does not include the first web page reported. Including
it can easily be accomplished by using a FIND
… FEND command pair prior to the UNTL
command. I may show how to do this in the next update of this document.
First, we start off by getting past all the header
information. This is done by SKIPping
to the "<DIV" string, which also contains the first reported
web page. To get around the ">next</a>" string that is found
on that line, we use the LINC
command to advance to the next one.
Since every reported web page is on its own line,
we use the UNTL command to process
every one of the following lines. The rest you can pretty much figure out by
looking at a web page generated by the hotbot engine.
QParse.Ini [Options]
| Option |
Command Line Switch |
OPTS Command switch |
| CASESENSITIVE=0|1 |
/C:0|1 |
"/C",0|1 |
| IGNOREBLANKLINES=0|1 |
/I:0|1 |
"/I",0|1 |
| OUTPUTFORMAT=0|1|2|3 |
/O:0|1|2|3 |
"/O",0|1|2|3 |
| ENCODEFORMAT=0 |
/E:0 |
"/E",0 |
| MAXFIELDSIZE=num |
/L:num |
"/L",num |
| LINETERMINATOR=c-string |
/T:c-string |
"/T",c-string |
| FIELDSEPARATOR=c-string |
/S:c-string |
"/S",c-string |
| QUOTEMARK=c-string |
/Q:c-string |
"/Q",c-string |
| WHITESPACE=c-string |
/W:c-string |
"/W",c-string |
| APPEND=0|1 |
/A:0|1 |
n/a |
Command Properties
Syntax: Cmnd, Parm1, Parm2, Parm3, Parm4

| Cmnd | Parm1 | Parm2 | Parm3 | Parm4 | Type |
|---|---|---|---|---|---|
| AFTR | after match | up to match | [heading] | | Output |
| APND | fields | [heading] | | | Output Control |
| BFOR | before match | back to match | [heading] | | Output |
| DATE | format | [heading] | | | Format |
| EACH | line contains | after match | up to match | [heading] | Output Conditional |
| EEND | | | | | Control |
| FALS | | | | | Output |
| FEND | | | | | Control |
| FIND | occurrences | match | [heading] | | Conditional |
| FROM | after match | up to match | [heading] | | Output |
| FRST | fields | [heading] | | | Output Control |
| HOME | | | | | Cursor |
| IGNR | fields | | | | Output Control |
| LAST | contains | after match | up to match | [heading] | Output |
| LDEC | | | | | Cursor |
| LEND | | | | | Cursor |
| LINC | | | | | Cursor |
| LINE | contains | [heading] | | | Output |
| LKUP | return field | lookup field | csv filespec | [heading] | Format |
| MSTR | start position | end position | [heading] | | Output |
| NEXT | lines ahead | after match | up to match | [heading] | Output |
| OPTS | option switch | option value | | | Control |
| ONGO | new line | | | | Control |
| ONEA | new line | | | | Control |
| ONFI | new line | | | | Control |
| ONST | [heading] | | | | Output |
| ONUN | new line | | | | Control |
| NOGO | new line | | | | Control |
| NOEA | new line | | | | Control |
| NOFI | new line | | | | Control |
| NOST | [heading] | | | | Output |
| NOUN | new line | | | | Control |
| PREV | lines behind | after match | up to match | [heading] | Output |
| REND | | | | | Control |
| SKIP | occurrences | match | | | Control |
| SLCT | start | non-match | string array | [heading] | Format |
| TRUE | [heading] | | | | Output |
| UEND | | | | | Control |
| UNTL | match | [heading] | | | Conditional Loop |
| UPTO | up to match | [heading] | | | Output |
| VALU | non-number | number type | [heading] | | Format |
| XIST | exists | start line | end line | [heading] | Output |
Commands in Detail
-
A F T R
-
Use the AFTR command to extract the string that starts after
the first occurrence of Parm1 (after match) and runs up to, but not
including, the following Parm2 (up to match). If the up to match
is not found, everything to the end of the line is returned. This
command returns a Null (or empty string) if Parm1 (after
match) is Not found.
-
A P N D
-
Use APND to merge multiple routine fields into a single
output field. This is useful when the source file has two or more different
formats for the same data. See the MerchantItemPrice routine in the
Test.ini example
above. NOTE: For APND to work correctly, you must include
even non-output fields in the number of commands to merge, Parm1 (fields).
-
B F O R
-
The BFOR command is used when the required data is presented
before the data's specifier. This command returns a Null (or empty string)
if Parm1 (before match) is Not found. If Parm2 (back to match)
is not found, then it goes back to the beginning of the line. This command
searches the line from the end of the line backwards for the first occurrence
of Parm1 (before match) and from the beginning of the found Parm1
(before match) until the end of the first occurrence of Parm2 (back
to match).
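The backwards search described for BFOR can be sketched in Python with str.rfind. This is a minimal illustration under the rules stated above, not QParse's actual code; the bfor function name is hypothetical and case-sensitive matching is assumed.

```python
def bfor(line, before, back_to):
    """Extract the text that sits just before `before`, going back to `back_to`."""
    end = line.rfind(before)  # search from the end of the line backwards
    if end < 0:
        return ""  # Parm1 (before match) not found: Null string
    hit = line.rfind(back_to, 0, end)
    # If Parm2 (back to match) is missing, go back to the start of the line.
    start = 0 if hit < 0 else hit + len(back_to)
    return line[start:end]

price = "3 silver 1 copper per Ration"
copper = bfor(price, " copper", " ")
```

In this example copper is "1", the number that appears immediately before the " copper" specifier.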
-
D A T E
-
Use the DATE command to reformat the following field into
a standard date formatted field of the form "Mn/Dy/Yr Hr:MN:SC". The structure
of the format field (Parm1) is a literal string which may contain the following
escape sequences to specify where the various components of the date can
be found:
-
D1 = day of month (no leading zero);
-
D2 = 2-digit day of month (leading zero);
-
D3 = 3-character day of week;
-
D4 = day of week (unabbreviated);
-
M1 = numeric month (no leading zero);
-
M2 = 2-digit numeric month (leading zero);
-
M3 = 3-character month abbreviation;
-
M4 = unabbreviated month name;
-
Y2 = 2-digit year (numbers below 80 are assumed to be 20##, while
numbers 80 or above are assumed to be 19##);
-
Y4 = 4-digit year;
-
H1 = twelve-hour date format (no leading zero);
-
H2 = 2-digit 24-hour format;
-
N1 = minutes (no leading zero);
-
N2 = 2-digit minutes (leading zero);
-
S1 = seconds (no leading zero);
-
S2 = 2-digit seconds (leading zero);
-
S3 = floating point value of seconds (no leading zero);
-
S4 = 2-digit mantissa floating point value of seconds (leading zero);
-
AP = am/pm without periods (i.e. AM, PM, am, pm);
-
AM = a.m./p.m. with periods (i.e. a.m., p.m., A.M., P.M.)
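The Y2 rule above (values below 80 are read as 20##, values of 80 or above as 19##) amounts to a simple pivot. A short Python sketch follows; the expand_two_digit_year name is made up for this illustration.

```python
def expand_two_digit_year(yy):
    """Apply the Y2 pivot: 00-79 -> 2000-2079, 80-99 -> 1980-1999."""
    if not 0 <= yy <= 99:
        raise ValueError("a Y2 field holds exactly two digits")
    return 2000 + yy if yy < 80 else 1900 + yy

years = [expand_two_digit_year(y) for y in (0, 79, 80, 99)]
```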
-
E A C H
-
The EACH command is an Output Conditional command. This
means that it is used both to extract data for output and to test
whether or not the record can be displayed. Whenever a line contains
the same string as Parm1 (line contains), an internal flag
is set, and the complete record is outputted whenever the next EEND,
FEND,
or REND is encountered. This command, along with FIND
and UNTL, is one of the main trigger commands that allow a record
to be outputted. If Parm2 (after match) is not found, then the extracted
data starts from the beginning of the line, while if Parm3 (up to match)
is not found, a Null, or empty string is returned.
-
E E N D
-
EEND marks the end of an EACH loop. Actually,
it marks the end of either an EACH or a FIND
loop. If a previous EACH or FIND command
had a match to Parm1 (line contains), then the complete record from
the beginning command to the current command is displayed, the flags are
reset and command execution continues if there are any commands that follow.
-
F A L S
-
FALS returns an Xist False String (see /X0:
command line switch). Currently both the TRUE and FALS
commands are fairly useless except for when used within an APND
or FRST command to actually have something other than a
Null string returned. Under all circumstances currently both the TRUE
and FALS commands return values. Note that the ON??
and NO?? commands do not have any control during the output
phase (refer to the ONGO and/or NOGO command).
-
F E N D
-
FEND marks the end of a FIND loop. Actually,
it marks the end of either an EACH or a FIND
loop. If a previous EACH or FIND command
had a match to Parm1 (line contains), then the complete record from
the beginning command to the current command is displayed, the flags are
reset and command execution continues if there are any commands that follow.
-
F I N D
-
The FIND command is a Conditional command. It is used primarily
as a flag to indicate if a record should be outputted or not. Secondly,
it is used to skip several occurrences of a string.
-
F R O M
-
Use the FROM command to scan an entire line and extract
the data between the two matching parameter strings, Parm1 (after match)
and Parm2 (up to match). This command is basically the same as the
EACH
command except it doesn't enable output and doesn't have a line contains
parameter. The only other difference is that
FROM returns
a Null string if either Parm1 (after match) or Parm2 (up to match)
is not found.
-
F R S T
-
Similar to APND, the FRST command converts
several fields into a single field. However, instead of simply appending
each field, FRST outputs only the first field that is not
empty. The exception is that if all the following fields are empty,
FRST
will return a Null string.
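The difference between APND and FRST can be shown with two small Python functions. These are sketches only: the function names are hypothetical, and plain concatenation with no separator is an assumption about how APND appends its fields.

```python
def apnd(fields):
    """Merge all fields into one (assumed: simple concatenation)."""
    return "".join(fields)

def frst(fields):
    """Return the first non-empty field, or a Null string if all are empty."""
    for field in fields:
        if field:
            return field
    return ""

# In MerchantItemPrice, only one of the two item-name FROM commands matches,
# so the other contributes an empty string.
per_item, for_the_item = "Ration", ""
merged = apnd([per_item, for_the_item])
first = frst(["", "Small Lantern", "ignored"])
```

Both merged ("Ration") and first ("Small Lantern") end up as a single output field, which is why either command suits data that appears in alternative formats.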
-
H O M E
-
Use the HOME command to move the cursor to the beginning
of the line. This command modifies the character position only.
-
I G N R
-
You can use the IGNR command to block the output of the Parm1
(fields)
following commands, which are used only as tests for the ON?? and NO??
commands.
-
L D E C
-
Use the LDEC command to go back to the previous line. This
command modifies the line position only.
-
L E N D
-
Use the LEND command to move the cursor to the end of the
line. This command modifies the character position only.
-
L I N C
-
Use the LINC command to go to the next line. This command
modifies the line position only.
-
L I N E
-
The LINE command is used to output the entire line as a
single record when the line contains Parm1 (line contains). This
is useful in a routine as the only Output command to extract specific lines.
This command does not modify either the line position or column position.
-
L K U P
-
Use the LKUP command to format the following field. This
is done by finding the first match of the following command's return value
in the Parm3 (csv filespec) specified *.csv file, under the Parm2
(lookup field) column, and returning the corresponding Parm1 (return
field) value. If no match can be found, then the LKUP
command returns a Null (empty) string.
-
L N L T
-
Use the LNLT command to move the cursor Parm1 (spaces)
character(s) to the left. This command modifies the character position
only.
-
L N R T
-
Use the LNRT command to move the cursor Parm1 (spaces)
character(s) to the right. This command modifies the character position
only.
-
M S T R
-
When working with fixed or preformatted data, such as a DOS directory listing,
use the MSTR command. Parm1 (start position) and
Parm2 (end position) are numeric values that represent character
positions from the beginning of the line when positive, or from the end
of the line when negative. This command does not modify either the line
position or column position.
-
N E X T
-
The NEXT command is just like the FROM
command except that it doesn't test the current line, but rather the line
Parm1 (lines ahead) lines ahead of it. If Parm1 (lines ahead) is negative,
the command still tests the line that many lines ahead, but in that case
it returns a Null string when Parm3 (up to match) is not
found. This command does not modify
either the line position or column position. If Parm2 (after match)
is not found, then it starts from the beginning of the line. If Parm3 (up
to match) is not found and Parm1 is not negative, then it extracts to the
end of the line.
-
O P T S
-
Use the OPTS command to automatically set or change the
program options while the function is running. The Parm1 (option switch)
parameter takes the same form as the command line switches without the
colon or value following the colon. For example, to set the option for
case-sensitive searching, use OPTS, "/C", 1.
-
There are a couple of differences in format of OPTS command
and the command line switches. The first of these is that the /H: print
header switch is ignored as an OPTS option switch. Furthermore,
the command line switches /T:, /S:, and /W: that normally take c-encoded
strings as parameters, under the OPTS command only take
numeric values representative of the ASCII code for the character(s). If
the value is between 0 and 255, inclusive, then it is considered to be
a single character string. Values less than 0, down to -32,768, and values
from 256 to 32,767 are considered a 2-character string, where the first
character's ASCII value is equal to ((Parm2/256) AND 255) and the second
character's ASCII value is equal to (Parm2 AND 255). For most cases where
a 2-character string is required you will probably want to write the parameter
value in BASIC hexadecimal format of &H????, where ? is a single hexadecimal
digit. Using this format, a string consisting of the return (&H0D) character followed
by the new line (&H0A) character would be represented as &H0D0A.
In other words, Parm2 (option value) must be an integer value between
-32,768 and 32,767, and for those option switch cases where a string
is expected, the value of Parm2 is interpreted as a single-character string
if and only if the value of Parm2 is between 0 and 255; otherwise it is interpreted
as a dual-character string with the high-byte being the first character.
-
A couple of switches are not available on the command line. The first is /END
which, when set, makes commands that normally return a Null string when
the up to or back to parameter is not found use the end
of the line as the match instead. Commands which this switch affects are: BFOR,
EACH,
FROM,
and UPTO.
-
Similarly, there is also the /START switch which, when set,
makes commands that normally return Null when the after parameter is not
found start from the beginning of the line instead. Commands affected
by this switch include: AFTR and FROM.
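The numeric encoding that OPTS uses for the /T:, /S:, and /W: values can be sketched in Python as follows. The opts_value_to_string name is hypothetical. Reading negative values as 16-bit two's complement (so that the high byte becomes the first character) is an assumption based on the "high-byte" wording, not confirmed behaviour.

```python
def opts_value_to_string(value):
    """Decode an OPTS Parm2 integer into a one- or two-character string."""
    if not -32768 <= value <= 32767:
        raise ValueError("Parm2 must be between -32,768 and 32,767")
    if 0 <= value <= 255:
        return chr(value)  # single-character string
    word = value & 0xFFFF  # assumption: negatives are two's complement
    high = (word >> 8) & 255  # first character
    low = word & 255          # second character
    return chr(high) + chr(low)

line_terminator = opts_value_to_string(0x0D0A)  # BASIC would write &H0D0A
field_separator = opts_value_to_string(9)       # a single tab character
```

Here line_terminator is the return character followed by the new line character, matching the &H0D0A example, and field_separator is a one-character string.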
-
O N G O
-
Use the ONGO command to move the current line by new
line lines when the previous Non-Control command in a function is
successful (has a Non-Null return string). New line is relative
to the current line, so that a negative new line will change the
current line to a previous line, while a positive new line will skip
lines forward.
-
O N E A
-
Use the ONEA command just like the ONGO
command. The only difference is that ONEA tests to see
if any previous EACH command was successful. That is, if
any previous EACH command has a Non-Null return value,
then ONEA will move the current line by the relatively specified
new
line.
-
O N F I
-
The ONFI command is just like the ONEA
command. The only difference is that ONFI tests all previous
FIND
commands.
-
O N S T
-
The ONST command is similar to the ONGO
command. The main difference is that ONST, instead of changing
the current line, sets its return value to Xist True whenever the previous
Non-Control command was successful (i.e. returns a Non-Null value); otherwise
it returns an Xist False value (refer to the /X0:
and /X1: command line switches for Xist values).
-
O N U N
-
The ONUN command is just like the ONEA
command. The only difference is that ONUN tests all previous
UNTL
commands.
-
N O G O
-
Use the NOGO command to move the current line by new
line lines when the previous Non-Control command in a function is
Not successful (has a Null return string). New line is relative
to the current line, so that a negative new line will change the
current line to a previous line, while a positive new line will skip
lines forward.
-
N O E A
-
Use the NOEA command just like the NOGO
command. The only difference is that NOEA tests to see
if any previous EACH command was Not successful. That is,
if any previous EACH command has a Null return value, then
NOEA
will move the current line by the relatively specified
new line.
-
N O F I
-
The NOFI command is just like the NOEA
command. The only difference is that NOFI tests all previous
FIND
commands.
-
N O S T
-
The NOST command is similar to the NOGO
command. The main difference is that NOST, instead of changing
the current line, sets its return value to Xist True whenever the previous
Non-Control command was Not successful (i.e. returns a Null value); otherwise
it returns an Xist False value (refer to the /X0:
and /X1: command line switches for Xist values).
-
N O U N
-
The NOUN command is just like the NOEA
command. The only difference is that NOUN tests all previous
UNTL
commands.
-
P R E V
-
The PREV command is just like the NEXT
command above, except that it tests the line Parm1 (lines behind) lines back.
-
R E N D
-
REND marks the end of the record and acts just like EEND
and/or FEND commands. Actually, execution may continue after
the REND command if there are any other commands which follow
it. The reason behind REND's operation, is that it allows
a file to contain multiple records, that have different but definable structures.
-
S K I P
-
SKIP behaves exactly like the FIND command except that
it doesn't set an 'Ok to output the record' flag. In order for a record to
be outputted, the routine must contain at least one of the following groups:
((EACH or FIND) and (EEND
or FEND or REND)) or (UNTL
and UEND)
-
S L C T
-
SLCT is used to translate the numeric value of the following
command's return value into something else. Parm1 (start) is the
starting value for Parm3 (string array). For most cases 0 or 1 should
be the value used for Parm1 (start), depending upon whether or not
the following command's return value begins at 0 or 1, respectively. If
the following command's return value is a Null string, then Parm2 (non-match)
is returned, otherwise a value from Parm3 (string array) is returned.
Parm3 (string array) is a semicolon-separated array of strings.
The index of the string that is returned is equal to the following command's
return value minus Parm1 (start) plus one. Thus if the following
command's return value is 0 and the SLCT command's Parm1
(start) value is 0, then the first sub-string of Parm3 (string
array) is returned.
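The index arithmetic described for SLCT can be sketched in Python as follows. The slct function is hypothetical. Returning the non-match value for a non-numeric or out-of-range input is an assumption, since the text above only defines the Null-string case.

```python
def slct(value, start, non_match, string_array):
    """Translate a numeric field into one entry of a semicolon-separated list."""
    if value == "":
        return non_match  # Null input: Parm2 (non-match) is returned
    choices = string_array.split(";")
    try:
        # One-based index = value - start + 1, so the zero-based index
        # used by Python is simply value - start.
        index = int(value) - start
    except ValueError:
        return non_match  # assumption: non-numeric input is a non-match
    if 0 <= index < len(choices):
        return choices[index]
    return non_match  # assumption: out-of-range input is a non-match

answer = slct("0", 0, "?", "No;Yes")
month = slct("2", 1, "?", "Jan;Feb;Mar")
```

With these inputs answer is "No" (value 0 with start 0 selects the first sub-string) and month is "Feb".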
-
T R U E
-
TRUE returns an Xist True String (see /X1:
command line switch). Currently both the TRUE and FALS
commands are fairly useless except for when used within an APND
or FRST command to actually have something other than a
Null string returned. Under all circumstances currently both the TRUE
and FALS commands return values. Note that the ON??
and NO?? commands do not have any control during the output
phase (refer to the ONGO and/or NOGO command).
-
U E N D
-
The UEND command checks to see if the previous UNTL
command test had been satisfied. If not, then it outputs the record data
between the first UNTL command and the current UEND
command, then reads the next line and restarts the routine execution from
the first UNTL command.
-
U N T L
-
The UNTL command is a Conditional Loop. Output is enabled
for all the commands between the first UNTL and the first
UEND,
until a line is found containing Parm1 (match).
-
U P T O
-
The UPTO command extracts all the data from the current
position to the start of a matching Parm1 (up to match). If a match
is not found then a Null string is returned.
-
V A L U
-
The VALU command converts the return value of the following
command to a standard csv text-based numeric value. This is useful when
extracting data that is in one of the following formats: BASIC formats:
(&H? for hexadecimal values; &O? for octal values; #.####E## for
floating point values); C formats: (0x? for hexadecimal values; 0? for
octal values). No other formats are currently supported; however, at the
minimum, ASM formats (i.e. 0?h for hexadecimal values; 0?o for octal values;
and 0?b for binary values) are planned. If you have ideas for other formats
to support, email me at NookieMonster@MailAndNews.com.
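The currently supported VALU input formats can be recognised with a few prefix checks. The Python sketch below is only an illustration: the valu function and its non_number argument are hypothetical stand-ins, and returning a Python number rather than formatted CSV text is a simplification.

```python
def valu(text, non_number=None):
    """Convert BASIC- or C-style numeric text to a number, else non_number."""
    s = text.strip()
    prefix = s[:2].upper()
    try:
        if prefix in ("&H", "0X"):   # BASIC or C hexadecimal
            return int(s[2:], 16)
        if prefix == "&O":           # BASIC octal
            return int(s[2:], 8)
        if len(s) > 1 and s[0] == "0" and s.isdigit():
            return int(s, 8)         # C octal (leading zero)
        try:
            return int(s)            # plain decimal integer
        except ValueError:
            return float(s)          # floating point such as 1.25E+02
    except ValueError:
        return non_number

samples = [valu(t) for t in ("&H1F", "&O17", "0x1F", "017", "1.25E+02")]
```

The five samples evaluate to 31, 15, 31, 15, and 125.0, while text such as "abc" falls through to the non_number value.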
-
X I S T
-
For Yes/No or True/False types of data, use the XIST command.
It searches, relative to the current line, all lines between Parm1 (start
line) and Parm2 (end line) in ascending order, and returns the
ExistTrue value (see /X1: switch) if any of the
lines contains Parm3 (exists), or the ExistFalse value (see /X0:
switch) otherwise.
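The XIST search can be sketched in Python as a scan over a window of lines around the current one. The xist function is hypothetical, and treating both offsets as inclusive and clamping them to the file's bounds are assumptions. The "1" and "0" defaults are the documented /X1: and /X0: defaults.

```python
def xist(lines, current, start_line, end_line, exists,
         true_str="1", false_str="0"):
    """Report whether `exists` occurs in the window around the current line."""
    first = max(0, current + start_line)             # offsets are relative
    last = min(len(lines) - 1, current + end_line)   # assumed inclusive
    for index in range(first, last + 1):             # ascending order
        if exists in lines[index]:
            return true_str
    return false_str

log = [
    "[Sat Aug 12 00:13:30 2000] Welcome.",
    "[Sat Aug 12 00:13:33 2000] You have entered West Freeport.",
    "[Sat Aug 12 01:07:18 2000] Innkeep Juna tells you, 'That'll be 3 silver'.",
]
entered_zone = xist(log, 0, 0, 2, "You have entered")
```

Here entered_zone is "1" because the second line of the window contains the search string; narrowing the window to the first line alone would give "0".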
Got comments, suggestions, bug reports, cool *.ini
functions you have come up with, or just something to say, then send them
to NookieMonster@MailAndNews.com
The original page is at http://thunder.prohosting.com/~nmonster/qparse/qparsed.html.