AS/400 Work Management Problem Solving

In this section, we will look at some common operations issues relating to Work Management.

1. Where the heck is my report?

It is easy to lose reports on the AS/400, or as we call them in AS/400-speak, "spooled files". One of the most frequent support calls I hear usually starts like this: "I ran a report and now I can't find it...". This is a common complaint, especially among Windows users who are used to hardcopy popping out of their local printer.

Here are a few tips to track down spooled files.

1. Sometimes there is no report to find. Is the user sure he submitted the request? Different commercial application software packages require different actions to submit a report. Some require F10 to confirm; others will print after simply hitting the <Enter> key. If the user left the screen with the F3 (exit) key, there is a good chance that he will be waiting forever! If he requested sales data for year 2020, the extraction program may have found no data and shut down without a single page in the spooled file. Good application software should explicitly tell the user that his report has been submitted, and give him at least one page indicating that no data was found for his request.

2. On the user's session, try "Work with All Spooled Files" (WRKSPLF *CURRENT) to determine if the user has any files in his output queue waiting to be printed. Most packaged applications have a menu option for this. To look at another user's spooled files, use WRKSPLF username (this may not be permitted in some security environments). As a last resort, try WRKSPLF *ALL, which will show you all spooled files, but be prepared to wait a while for the results of this command.

3. If you know the job name, then you can look at recent runs of that job. For example, if the job is called CRDNOTE, you can type WRKJOB CRDNOTE and see all recent runs of job CRDNOTE. Note that this will only show jobs that generated a spooled file or job log.

4. You can use WRKACTJOB to see if the job is still running (Status = RUN), or has a message waiting (Status = MSGW). A status of MSGW (message waiting) means the job has stopped and is waiting for someone to answer a message, which is a problem. See below for more information on MSGW.

5. Use WRKUSRJOB username to see if the job is still queued up to run. Look for Type = Batch and Status = JOBQ.

6. If the job is still on the Job Queue, then check WRKJOBQ to see if there are many jobs waiting, and check WRKACTJOB again to see what is "holding up traffic".

2. Level check errors

The AS/400 comes with an interesting built-in mechanism to detect file integrity problems. When AS/400 programs are compiled, the compiled program object encapsulates information about the files that it references. If the structure of a file changes after that, then the program will refuse to run. This can prevent messy file corruption. (Just ask a UNIX/Basic programmer about messy file corruption.) This feature can be disabled, but that is considered to be bad practice.

Sometimes it is necessary to change the structure of a file, that is, to add or change field names, field types and lengths, or the record name. For instance, Y2K projects in many companies involved widespread adding or expanding of fields. When the structure of a file has been changed, every program that touches it has to be re-compiled. If a program that has not been re-compiled is run, then the operating system will halt the program and issue a message such as -

 
Error message CPF4131 appeared during OPEN for file NEWFILE (C S D F).

 

This tells you the message ID and the file in question.  If you press F1 to get more details you may find the offending program and library:

 
Message . . . . : Error message CPF4131 appeared during OPEN for file NEWFILE (C S D F). 
Cause . . . . . : RPG procedure OLDPGM in program MYLIB/OLDPGM received 
the message CPF4131 while performing an implicit OPEN operation on file NEWFILE

 

Want to learn more about this error message? Just type - 

DSPMSGD CPF4131

You will learn that you are dealing with 

 

Message ID . . . . . . . . . : CPF4131 
Message file . . . . . . . . : QCPFMSG 
Library . . . . . . . . . .  : QSYS 
Message text . . . . . . . . : Level check on file &2 in library &3 with member &4.

This error typically occurs a day or so after your programmer goes on vacation and means that she was doing a bunch of quick fixes in the production system just before leaving.  When she comes back, she will likely recompile program OLDPGM in MYLIB library and everything will work fine next time.

Level check errors may also occur when a library list gets mixed up, combining production programs with test data (or the other way around).
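
To make the level check idea concrete, here is a minimal sketch in Python. It is not how the operating system actually computes its identifiers; the hash, the function names (format_level_id, compile_program, open_file) and the sample field list are all assumptions made up for illustration. The only point is the principle described above: the program remembers a fingerprint of the record format at compile time, and the open fails if the file's current fingerprint is different.

```python
import hashlib


class LevelCheckError(Exception):
    """Raised when a program's stored format fingerprint no longer matches the file."""


def format_level_id(record_name, fields):
    """Fingerprint a record format from its name and (field, type, length) tuples.

    Assumption: a plain SHA-256 hash stands in for the system's own identifier.
    """
    text = record_name + "|" + "|".join(f"{n}:{t}:{l}" for n, t, l in fields)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:13]


def compile_program(name, record_name, fields):
    """'Compile' a program: remember the fingerprint of the file as it looks today."""
    return {"name": name, "expected_level": format_level_id(record_name, fields)}


def open_file(program, record_name, fields):
    """'Open' the file: compare the file's current fingerprint with the stored one."""
    if format_level_id(record_name, fields) != program["expected_level"]:
        raise LevelCheckError(
            f"Level check on record {record_name} for program {program['name']}"
        )
    return True


# Hypothetical file layout at compile time.
old_layout = [("CUSTNO", "P", 5), ("NAME", "A", 30)]
oldpgm = compile_program("OLDPGM", "CUSTREC", old_layout)
print(open_file(oldpgm, "CUSTREC", old_layout))  # structure unchanged, open succeeds

# Somebody adds a field and does not re-compile OLDPGM.
new_layout = old_layout + [("PHONE", "A", 15)]
try:
    open_file(oldpgm, "CUSTREC", new_layout)
except LevelCheckError as err:
    print(err)
```

Re-compiling in this sketch simply means calling compile_program again with the new layout, which is why a recompile makes the error go away.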

3.  Decimal data error

The AS/400 operating system will become unhappy when a program tries to perform a numeric operation on non-numeric data, such as a letter or a blank space sitting in a numeric field. When this occurs, the program will halt with an error message. This usually occurs when there is garbage in the file or when the program is doing something silly.

The 'garbage in the file' situation is typically found in older application software migrated from the old System/36, where it was easier to write bad data to a file. New EDI applications and data feeds from external systems also have a tendency to push bad data into your system. Usually the culprit is a packed numeric field that was not initialized in every record, so it holds blanks instead of a valid number.

The error message for numeric type operations on a non-numeric character  will look something like this for an on-line application:

 

Decimal data error. Decimal-data error occurred (C G D F). 
? C 
Decimal-data error occurred (C G D F). 
? C 
Application error. MCH1202 unmonitored by AP9012 at statement 0000000045, 
instruction X'0000'. 

and like this on a batch job:

 

Additional Message Information 

Message ID . . . . . . : RNQ0907 Severity . . . . . . . : 99 
Message type . . . . . : Inquiry 
Date sent . . . . . . : 27/10/01 Time sent . . . . . . : 21:48:56 

Message . . . . : Decimal-data error occurred (C G D F). 
Cause . . . . . : RPG procedure AP9012 in program QPGMR/AP9012 found a 
decimal-data error at statement 45. A packed or zoned value does not contain 
valid numeric data. A digit and/or sign is not valid. 
Recovery . . . : Contact the person responsible for program maintenance to 
determine the cause of the problem. A response of 'D' will cause an RPG 
formatted dump which may be useful in determining which field has the 
decimal-data error. However, it may be an intermediate value that has the 
error, if the error occurred in an expression. 
Possible choices for replying to message . . . . . . . . . . . . . . . : 
D -- Obtain RPG formatted dump. 
More... 
Type reply below, then press Enter. 
Reply . . . . 


The error message will give you some indication of the problem. You can tell that the program that failed was called AP9012. You can even see that the problem occurred at statement 45. There is a good chance that you can go to the source code of AP9012 and check out what line 45 was trying to do. If AP9012 is an RPGLE type program, you may have to re-compile the program to QTEMP or a test library to determine what statement 45 is doing. This is because IBM, in their wisdom, decided that RPGLE program source line numbers and compiled program statement numbers did not need to be the same.

If the error is not obvious, you might check the file/field for bad data, or go to the program dump (spooled file: QPPGMDMP), if you have one, and check the values of the variables that you found in line 45.

Some programmers will overcome this problem by adding the header keyword "FIXNBR(*INPUTPACKED)" to their program, or by compiling with the IGNDECERR(*YES) option. These solutions are usually bad ideas because they cause the program to ignore the bad data and keep running. The bad data may include a critical field in a $100,000 EDI order from JC Penney. In such a case, 'Ignore' may not have been what the user had in mind when he came down yelling "Just fix it Now!"
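
If you want a feel for what "a digit and/or sign is not valid" means, here is a small Python sketch that checks raw packed-decimal bytes. It assumes the usual packed-decimal layout: two digits per byte, with the last half-byte holding the sign (hex A to F, where B and D mean negative). The function names is_valid_packed and unpack are my own inventions for illustration; this is not an IBM API.

```python
from decimal import Decimal


def is_valid_packed(raw: bytes) -> bool:
    """Return True if raw looks like a valid packed-decimal value.

    Assumed layout: every half-byte is a digit 0-9 except the last one,
    which is the sign and must be hex A-F.
    """
    if not raw:
        return False
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)    # high half-byte
        nibbles.append(byte & 0x0F)  # low half-byte
    *digits, sign = nibbles
    return all(d <= 9 for d in digits) and sign >= 0x0A


def unpack(raw: bytes, decimals: int = 0) -> Decimal:
    """Convert packed bytes to a Decimal, or raise ValueError on bad data."""
    if not is_valid_packed(raw):
        raise ValueError(f"Decimal-data error: {raw.hex().upper()}")
    nibbles = []
    for byte in raw:
        nibbles.extend((byte >> 4, byte & 0x0F))
    *digits, sign = nibbles
    value = Decimal(int("".join(str(d) for d in digits))).scaleb(-decimals)
    return -value if sign in (0x0B, 0x0D) else value


good = bytes([0x12, 0x34, 0x5F])    # 12345, positive
blanks = bytes([0x40, 0x40, 0x40])  # EBCDIC blanks in a "numeric" field

print(unpack(good, 2))
print(is_valid_packed(blanks))      # the last half-byte is 0, not a sign
```

The blanks case is the classic uninitialized field: a blank is hex 40 in EBCDIC, so the sign position holds a 0 and the value is rejected. A check like this, run over an extract of the file, finds the bad records so they can be corrected instead of ignored.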

4. Unable to allocate a record/Lock Waiting

The AS/400 operating system has built-in mandatory record locking. If a record is being held for update, the AS/400 operating system will prevent a second job from reading the same record for update. For example, if 'User A' is updating a customer's phone number, 'User B' cannot update the same customer's address at the same time (assuming this information is held within the same record).

This situation would generate the following messages:

 

Job 418816/USERNAME/QPADEV001M started on 09/16/02 at 18:08:07 in subsystem 
Message queue QUEUENAME is allocated to another job. 
Unable to allocate a record in file DMNU01 (R C G D F). 

 

A look at the job log will show something like this:

 

Record 10 in use by job 418777/USERNAME/QPADEV001J. 
? C 
Record 10 in use by job 418777/USERNAME/QPADEV001J. 
? C 
Unable to allocate a record in file DMNU01. 
Function check. RNX1218 unmonitored by MYPROGRM at statement 0000000008, 
instruction X'0000'. 
Unable to allocate a record in file DMNU01 (R C G D F). 
Unable to allocate a record in file DMNU01 (R C G D F).
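
The locking rule itself is simple enough to sketch in a few lines of Python. This is only a toy model under my own assumptions: the LockTable class and its method names are hypothetical, and the real system makes the second job wait for a while before giving up rather than failing immediately as it does here.

```python
class RecordInUse(Exception):
    """Raised when a record is already held for update by another job."""


class LockTable:
    """Toy model of update locks: one holder per record number."""

    def __init__(self):
        self._holders = {}  # record number -> name of the job holding it

    def read_for_update(self, record, job):
        """Lock the record for this job, or fail if another job holds it."""
        holder = self._holders.get(record)
        if holder is not None and holder != job:
            raise RecordInUse(f"Record {record} in use by job {holder}.")
        self._holders[record] = job

    def release(self, record, job):
        """Release the lock (the update finished or the job moved on)."""
        if self._holders.get(record) == job:
            del self._holders[record]


locks = LockTable()
locks.read_for_update(10, "QPADEV001J")      # User A reads record 10 for update
try:
    locks.read_for_update(10, "QPADEV001M")  # User B wants the same record
except RecordInUse as err:
    print(err)

locks.release(10, "QPADEV001J")              # User A finishes the update
locks.read_for_update(10, "QPADEV001M")      # now User B gets the record
```

Notice that the cure is on User A's side: until the first job updates or releases record 10, the second job cannot get it.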

 

 

5. File not found

6. No records found in file

7. Update without read

...more to come when I find the time...

I appreciate your feedback and comments.

 
