Preface
INFOPAK IMS allows compression of DL1 databases running under MVS. The DL1 user EXIT provides compression of HDAM, HISAM or HIDAM data bases.
This document is intended primarily for database administrators and system administrators who want to:
-
evaluate the space savings they can expect from the use of INFOPAK, and
-
use the INFOPAK compression system.
The installation tape included with this manual contains all the modules necessary to install INFOPAK. The modules are date protected on test tapes.
note
The installation of INFOPAK IMS is described in the "Installation manual" HP106.
An introduction to INFOPAK IMS
Compression overview
Within the DL1 Database System, a user exit is provided for segment compression during the transfer between the application program I/O area and the data base buffer pool.
When a user program wants to retrieve or store segments, it issues a request to DL1. DL1 examines the request from the program and, based on this request, goes out to the physical storage and finds or stores the information for the user program.
While examining the request from the user program, DL1 also checks to see if this segment is coded for compression. If compression is indicated, DL1 calls the compression routine and hands the segment over to this routine to compress or decompress accordingly. After processing, the routine hands the segment back to DL1 and DL1 continues.
The segment data as seen by the user includes the key of the segment. The key is also user data and can optionally be compressed. Use optional key compression with care: performance can suffer if DL1 has to decompress the key of every segment involved in a sequential search. The DBA should weigh this performance cost before deciding to compress keys.
Fixed length segments
If your segments are described as fixed length in your DBD, they will remain fixed length as seen by the user program. Compression is transparent to the user program. DL1 physically stores compressed segments in variable length format. In the compression exit, DL1 takes care of the transition from fixed to variable. If a fixed length segment is being compressed, DL1 hands the fixed length segment to the compression routine for compression. The routine performs the compression and places a 2 byte variable length field at the front of the segment and then hands the compressed segment back to DL1 for physical storage. Decompression works just the opposite way.
Variable length segments
If the segment is variable to begin with, the same process is used. The existing variable length field value is just changed for compression and restored during decompression. In the compression exit, if INFOPAK does not compress the segment for some reason, it adds a one byte indicator at the front of the segment data before it hands the segment to DL1 for storage. During decompression this one byte indicator is stripped off and the segment is handed back in its original form.
INFOPAK Overview
INFOPAK and the application programs
The operation of the compression routine is transparent to the application programs. After INFOPAK is installed and the data base is compressed, no further maintenance is necessary. INFOPAK does not use externally defined compression tables; it is a reentrant module that performs data compression through an algorithm independent of the type of data contained in each segment. INFOPAK does not require or use valuable CSA Virtual Storage Space.
Any physically defined segment of a data base may be specified by the DBDGEN process as being compressible. DL1 does NOT allow compression for INDEX, HSAM, SHSAM, or SHISAM data bases.
The routine to be used for segment compression is named by the operand COMPRTN= in the SEGM statement in the DBDGEN operation for the data base. Multiple specifications of the same compression routine result in its being loaded only once. INFOPAK is reusable and reentrant for online use.
Because INFOPAK uses the standard DL1 compression exit facility, all DL1 utilities will function without errors. During a normal UNLOAD cycle, the segments will be decompressed and then compressed during RELOAD.
A listing of the IMS Data Base Utilities and how they handle compressed data bases is included in the back of this manual.
Performance of INFOPAK
In most cases total CPU consumption is not affected, because the CPU overhead of compression is offset by reduced I/O processing, fewer EXCPs, etc. Some users have reported 10-15% better response times with their online systems.
During compression, INFOPAK uses high speed scanning to determine what type of data is in the segment before it compresses. Fields of the same type are grouped for global packing and unpacking of at least 4 characters at a time. Different data types are compressed in different ways. Up to 31 consecutive fill characters, such as blanks or decimal or binary zeros, can be encoded in a single byte. An empty packed decimal field is also encoded in a single byte. Internal compression tables are used to encode other data types such as numerics and alphanumerics.
INFOPAK always considers performance versus compression when it scans the segment. In some cases, fields within the segment may not be compressed, if the overhead involved does not justify the space savings. Small segments 10-20 bytes long may not be compressed at all if the space saving is not justified.
TESTPAK
TESTPAK is a utility program that runs under DL1, in pure batch mode or in a BMP IMS environment.
TESTPAK executes a total or partial sequential read of one or more databases, and compresses each of the segments read. TESTPAK prints reports showing the compression percentage by segment type and for total data base.
The TESTPAK report can then be used to determine the amount of DASD space to allocate for the compressed data base when you RELOAD using INFOPAK.
The installation procedure copies the TESTPAK module into the library referenced by the MVSLOAD ddname (See "PRODUCT INSTALLATION JCL" supplied with the tape).
Using TESTPAK
TESTPAK can run either in pure batch by executing the standard DL1 batch procedure (example: DLIBATCH), or from a BMP Region of IMS.
Two parameters are required:
-
MBR=TESTPAK where TESTPAK is the name of the program to load under control of DL1.
-
PSB=XXXXXXX where XXXXXXX is the name of a physical PSB referencing the data bases on which the efficiency will be tested. The PCBs used in this PSB must allow reading (PROCOPT=G or A). TESTPAK defaults to the first PCB (01) unless you direct it to another PCB with the PCBNUM DD statement described below.
The macro PSBGEN of the PSB must specify COBOL or ASSEMBLER (parameter LANG=COBOL or ASSEM) and can be written for pure batch or to be compatible with IMS DC (parameter CMPAT=NO or CMPAT=YES).
TESTPAK requires the presence of the following DD statements:
-
PRINT
TESTPAK writes its reports to the PRINT file. The PRINT DD statement must be supplied in the following format:
//PRINT DD SYSOUT=*
-
AUXDBD
Information about the segments is retrieved by TESTPAK from the DBD library referenced by the AUXDBD DD statement.
//AUXDBD DD DSN=IMSVS.DBDLIB,DISP=SHR
-
PCBNUM (optional)
PCBNUM is an optional DD statement that directs TESTPAK to use a PCB other than the first one (default=01). It also lets you supply one or more parameter cards defining the data bases and how many segments to read for each data base. The format of the parameter card is as follows:
----+----1----+----2----
//PCBNUM DD *
nn dddddddd ssssssssttt
where:
-
nn :
The number of a valid data base PCB within the PSB being used for this job.
If you are working in BMP, or with the parameter CMPAT=YES, nn must be at least 02.
-
Position : Column 1
-
Length : 2 digits
-
Note : Optional
-
dddddddd :
The name of the physical DBD in the DBDLIB, if the relative number nn has not been used.
-
Position : Column 4
-
Length : 8 characters
-
Note : Optional
-
ssssssss :
The number of segments to be read by TESTPAK for this data base. If this parameter is not present, TESTPAK reads all the segments in sequence and stops at the end of the data (status code GB).
-
Position : From column 12
-
Length : maximum 8 digits
-
Note : Required when CPU is specified
-
ttt :
The keyword "CPU", which requests the mean CPU time for segment compression and decompression.
-
Position : Start at least 8 columns after the first digit of the number of segments
-
Note : When CPU is specified, the number of segments must also be present as an 8-digit number.
Example :
This example shows the use of PCBNUM cards for four different data bases.
//PCBNUM DD *
03         50
   BASEXYZ
04         750
05         00002500 CPU
/*
For the above example:
-
On the first data base, of which the PCB is third in the PSB, TESTPAK will read 50 segments.
-
On the second data base, DBD named BASEXYZ, TESTPAK will read all the segments.
-
On the third data base, of which the PCB is in fourth relative position in the PSB, TESTPAK will read 750 segments.
-
On the fourth data base, of which the PCB is in fifth relative position in the PSB, TESTPAK will read 2500 segments and give the mean CPU time for compression and decompression of this segment.
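The column rules for PCBNUM cards can be checked mechanically. The following Python sketch is not part of INFOPAK: the function name pcbnum_card and its arguments are hypothetical. It assumes that the segment count is left-justified from column 12, that the CPU keyword goes in column 21 (one blank after the 8-digit count, which satisfies the "at least 8 columns" rule), and that the PCB number and the DBD name are alternatives.

```python
def pcbnum_card(pcb_number=None, dbd_name=None, segments=None, cpu=False):
    """Build one PCBNUM parameter card (hypothetical helper, not an INFOPAK API)."""
    if pcb_number is not None and dbd_name is not None:
        raise ValueError("give either the PCB number or the DBD name, not both")
    if cpu and segments is None:
        raise ValueError("the number of segments is required when CPU is specified")

    card = [" "] * 80  # one 80-column card image

    def put(column, text):
        # Columns are 1-based, as in the manual.
        card[column - 1:column - 1 + len(text)] = text

    if pcb_number is not None:
        if not 1 <= pcb_number <= 99:
            raise ValueError("PCB number must fit in 2 digits")
        put(1, f"{pcb_number:02d}")
    if dbd_name is not None:
        if not 1 <= len(dbd_name) <= 8:
            raise ValueError("DBD name must be 1 to 8 characters")
        put(4, dbd_name)
    if segments is not None:
        if not 1 <= segments <= 99_999_999:
            raise ValueError("segment count must fit in 8 digits")
        # With CPU, the count must be written as a full 8-digit number.
        put(12, f"{segments:08d}" if cpu else str(segments))
    if cpu:
        put(21, "CPU")  # assumption: one blank after the 8-digit count
    return "".join(card).rstrip()


# The four cards of the example above.
for card in (
    pcbnum_card(pcb_number=3, segments=50),
    pcbnum_card(dbd_name="BASEXYZ"),
    pcbnum_card(pcb_number=4, segments=750),
    pcbnum_card(pcb_number=5, segments=2500, cpu=True),
):
    print(card)
```

Generating the cards this way keeps each field in its documented column, which is easy to get wrong when typing card images by hand.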
A JCL example for executing TESTPAK is as follows:
//INFTSTPK JOB .................
//EXECIMS2 EXEC DLIBATCH,MBR=TESTPAK,PSB=name-of-PSB
//GO.DFSVSAMP DD DSN=INFO.PARM.DATA(VSAMP),DISP=SHR
//GO.PRINT DD SYSOUT=*
//GO.AUXDBD DD DSN=INFO.DBDLIB,DISP=SHR
//GO.PCBNUM DD *
02 5
/*
//GO.ddbase DD DSN=dsn-data-base,DISP=SHR
note
According to the mode of use (batch or BMP), the presence of data base DD statements might be required. However, if you use dynamic allocation of IMS while working in BMP, the presence of DD cards for the data bases is not required.
The file referenced by the ddname DFSVSAMP describes the size and the number of buffers used by DL1. Its contents must be suitable to the size of blocks and/or CI's for the referenced data bases.
TESTPAK results
For each data base, and for the number of segments requested, TESTPAK produces four reports showing the gains obtained with INFCMP00, INFCMPPE, INFCMPS0 and INFCMPZE. The reports have the same format; the name of the compression routine is printed in the heading to tell them apart.
note
INFCMPZE is provided by the INFOPAK hardware option.
The TESTPAK reports are as follows:
***************************************************************************+-+******************************************************
* INFOTEL I N F O P A K D L 1 MODULE INFCMPS0 |1| 02/25/88 : 14 H 57 PAGE : 1 *
***************************************************************************+-+*********************************************+-+******
* COMPRESSION STATISTICS DATA BASE NAME : IPOST2 |2| *
***************************************************************************************************************************+-+******
! SEGMENT ! SEGMENT LG ! NUMBER OF !NB OF BYTES IN! NUMBER OF BYTES OUT ! COMPRESSION GAINS !
! NAME ! MIN / MAX ! OCCURRENCES! ! W/O KEY / WITH KEY ! W/O KEY / WITH KEY !
!IPRIMP ! 4 / 16 ! 753 ! 12 048 ! 12 048 / 9 487 ! 0.00 % / 21.25 % !
!IPOSTE ! 4 / 46 ! 3 955 ! 181 930 ! 139 140 / 136 174 ! 23.52 % / 25.15 % !
!IPOSTT ! 4 / 50 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IPRIMT ! 4 / 47 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IDIVER ! 4 / 160 ! 292 ! 46 720 ! 8 255 / 6 723 ! 82.33 % / 85.61 % !
!ICARCO ! 4 / 71 ! 0 ! ! / ! 0.00 % / 0.00 % !
! ! ! ! / ! / !
!**** TOTAL BASE *** ! 5 000 ! 240 698 ! 157 272 / 155 635 ! 34.66 % / 35.34 % !
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
. |3| . |4| . |5| . |6| . |7| |8| . |9| |10| .
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
************************************************************************************************************************************
* INFOPAK PROPERTY OF INFOTEL *
************************************************************************************************************************************
***************************************************************************+-+******************************************************
* INFOTEL I N F O P A K D L 1 MODULE INFCMP00 |1| 02/25/88 : 14 H 57 PAGE : 1 *
***************************************************************************+-+*********************************************+-+******
* COMPRESSION STATISTICS DATA BASE NAME : IPOST2 |2| *
***************************************************************************************************************************+-+******
! SEGMENT ! SEGMENT LG ! NUMBER OF !NB OF BYTES IN! NUMBER OF BYTES OUT ! COMPRESSION GAINS !
! NAME ! MIN / MAX ! OCCURRENCES! ! W/O KEY / WITH KEY ! W/O KEY / WITH KEY !
!IPRIMP ! 4 / 16 ! 753 ! 12 048 ! 12 048 / 8 283 ! 0.00 % / 31.25 % !
!IPOSTE ! 4 / 46 ! 3 955 ! 181 930 ! 126 343 / 123 471 ! 30.56 % / 32.14 % !
!IPOSTT ! 4 / 50 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IPRIMT ! 4 / 47 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IDIVER ! 4 / 160 ! 292 ! 46 720 ! 5 852 / 4 388 ! 87.48 % / 90.61 % !
!ICARCO ! 4 / 71 ! 0 ! ! / ! 0.00 % / 0.00 % !
! ! ! ! / ! / !
!**** TOTAL BASE *** ! 5 000 ! 240 698 ! 144 243 / 136 142 ! 40.08 % / 43.44 % !
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
. |3| . |4| . |5| . |6| . |7| |8| . |9| |10| .
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
************************************************************************************************************************************
* INFOPAK PROPERTY OF INFOTEL *
************************************************************************************************************************************
***************************************************************************+-+******************************************************
* INFOTEL I N F O P A K D L 1 MODULE INFCMPPE |1| 02/25/88 : 14 H 57 PAGE : 1 *
***************************************************************************+-+*********************************************+-+******
* COMPRESSION STATISTICS DATA BASE NAME : IPOST2 |2| *
***************************************************************************************************************************+-+******
! SEGMENT ! SEGMENT LG ! NUMBER OF !NB OF BYTES IN! NUMBER OF BYTES OUT ! COMPRESSION GAINS !
! NAME ! MIN / MAX ! OCCURRENCES! ! W/O KEY / WITH KEY ! W/O KEY / WITH KEY !
!IPRIMP ! 4 / 16 ! 753 ! 12 048 ! 12 048 / 7 079 ! 0.00 % / 41.25 % !
!IPOSTE ! 4 / 46 ! 3 955 ! 181 930 ! 99 952 / 94 385 ! 45.06 % / 48.12 % !
!IPOSTT ! 4 / 50 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IPRIMT ! 4 / 47 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IDIVER ! 4 / 160 ! 292 ! 46 720 ! 5 457 / 3 462 ! 88.32 % / 92.59 % !
!ICARCO ! 4 / 71 ! 0 ! ! / ! 0.00 % / 0.00 % !
! ! ! ! / ! / !
!**** TOTAL BASE *** ! 5 000 ! 240 698 ! 117 457 / 104 926 ! 51.20 % / 56.41 % !
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
. |3| . |4| . |5| . |6| . |7| |8| . |9| |10| .
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
************************************************************************************************************************************
* INFOPAK PROPERTY OF INFOTEL *
************************************************************************************************************************************
***************************************************************************+-+******************************************************
* INFOTEL I N F O P A K D L 1 MODULE INFCMPZE |1| 02/25/88 : 14 H 57 PAGE : 1 *
***************************************************************************+-+*********************************************+-+******
* COMPRESSION STATISTICS DATA BASE NAME : IPOST2 |2| *
***************************************************************************************************************************+-+******
! SEGMENT ! SEGMENT LG ! NUMBER OF !NB OF BYTES IN! NUMBER OF BYTES OUT ! COMPRESSION GAINS !
! NAME ! MIN / MAX ! OCCURRENCES! ! W/O KEY / WITH KEY ! W/O KEY / WITH KEY !
!IPRIMP ! 4 / 16 ! 753 ! 12 048 ! 12 048 / 8 334 ! 0.00 % / 30.82 % !
!IPOSTE ! 4 / 46 ! 3 955 ! 181 930 ! 109 723 / 105 792 ! 39.68 % / 41.18 % !
!IPOSTT ! 4 / 50 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IPRIMT ! 4 / 47 ! 0 ! ! / ! 0.00 % / 0.00 % !
!IDIVER ! 4 / 160 ! 292 ! 46 720 ! 8 237 / 6 813 ! 82.36 % / 85.41 % !
!ICARCO ! 4 / 71 ! 0 ! ! / ! 0.00 % / 0.00 % !
! ! ! ! / ! / !
!**** TOTAL BASE *** ! 5 000 ! 240 698 ! 130 008 / 120 939 ! 45.98 % / 49.75 % !
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
. |3| . |4| . |5| . |6| . |7| |8| . |9| |10| .
. +-+ . +-+ . +-+ . +-+ . +-+ +-+ . +-+ +--+ .
************************************************************************************************************************************
* INFOPAK PROPERTY OF INFOTEL *
************************************************************************************************************************************
The numbered fields in the reports are:
[1] Name of the compression routine
[2] Name of DBD of the data base used (DBDNAME of DBDGEN)
[3] Name of each segment read (SEGM of DBD)
[4] Length of the segment:
-
Fixed length segments will have one number for segment length.
-
Variable segments will have two numbers, minimum and maximum lengths found.
[5] Number of occurrences of each segment type read by TESTPAK and the total number of segments in the data base.
[6] Number of bytes that each segment type uses. Variable segments were added up as TESTPAK read them.
[7] Number of bytes after data only compression (without key compression). This is the actual result of INFOPAK on segment data when keys are not compressed (default) and for the entire base.
[8] Number of bytes after full compression (including key compression). This is the actual result of INFOPAK when key compression is used and for the entire base.
[9] Data only compression gains: this is for non-key compression, the result of INFOPAK on segment data when keys are not compressed:
[9] = ([6] - [7]) / [6]
[10] Full compression gains: this is for key compression, the result of INFOPAK when key compression is used:
[10] = ([6] - [8]) / [6]
The last two items express the percentage of compression savings attained for each segment and for the entire base. They take into account only segment data (what can be compressed) and not pointers, anchor points, free space, etc.
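As a check on the two gain formulas, the following Python sketch recomputes the gains for the IDIVER row of the INFCMPS0 report shown above. The function name compression_gain is hypothetical; only the arithmetic comes from the formulas.

```python
def compression_gain(bytes_in, bytes_out):
    """Return the gain (bytes_in - bytes_out) / bytes_in as a percentage."""
    if bytes_in <= 0:
        raise ValueError("bytes_in must be positive")
    return 100.0 * (bytes_in - bytes_out) / bytes_in


# IDIVER row of the INFCMPS0 report: 46 720 bytes in,
# 8 255 bytes out without key compression, 6 723 with key compression.
data_only = compression_gain(46_720, 8_255)
full = compression_gain(46_720, 6_723)
print(f"{data_only:.2f} % / {full:.2f} %")  # prints "82.33 % / 85.61 %"
```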
The exact level of compression can be precisely measured by compressing an actual data base using the test version of the compression routine.
Error messages
TESTPAK can produce the following error messages:
-
TPKI101
DBD NAME NOT FOUND IN DBDLIB
If a DBD name is not found in the DBDLIB.
-
TPKI102
RETURN CODE DL1 --> ABNORMALLY ENDED
In case of erroneous return code on one of the data bases.
-
TPKI103
DBD CONTAINS UNAUTHORIZED LOGICAL RELATION
If the DBD is a logical DBD.
EXPIRATION DATE EXCEEDED, EXECUTION CANCELED
Contact INFOTEL.
INFOPAK for IMS
Using INFOPAK
To use INFOPAK on an existing data base you must do the following:
-
Modify the DBD to include the compression routine.
-
Unload - reload the data base.
note
The special case of compression in place is discussed under Compression in place for DEDB databases.
Modifying the DBD
The module INFCMPPE is used in the SEGM syntax below, and INFCMP00 in the DBD example. To change to another compression level, simply substitute the name of the module you want (INFCMP00, INFCMPS0 or INFCMPPE).
Modify the DBD to include the routine at the physical segment level. Different routines can be used within the same data base. Any physically defined segment of the data base may be specified for compression in the DBDGEN process. Modify the access method if necessary - segment compression is not supported by DL1 for INDEX, HSAM, SHSAM, or SHISAM data bases.
Specify the COMPRTN keyword parameter in each SEGM macro for each segment that you want to compress:
SEGM NAME=sssss,BYTES=xx,COMPRTN=(INFCMPPE,DATA or KEY),...
Where:
-
COMPRTN
Specifies the compression module.
The COMPRTN keyword must be used only for physical segments. It is not compatible with the use of the SOURCE keyword within the same SEGM macro.
-
INFCMPPE
The INFOPAK compression module.
-
DATA
Specifies that only the data fields and secondary keys are compressed. The key and the data fields before the key are not compressed. (Default value)
-
KEY
Specifies that all the fields are compressed. Use the KEY option with restraint: when it is used, the routine is invoked for partial decompression of every segment involved in a sequential search (such as direct access to a dependent segment). The key compression option is not allowed for the root segment of a HISAM database.
note
Fixed length segments :
If the compressed segments are defined as fixed length, DL1 will automatically store them as variable length in the data base. No modification to the DBD is necessary in this case and your application program will continue to see them as fixed length segments.
note
Variable length segments:
If any of the compressed segments are already defined as variable length, you must ensure that the maximum length of the segment (as seen by the application program) does not exceed the DBD-defined maximum length minus one. INFOPAK generates this one byte when it fails to compress the data; provide space for it by adding one byte to the maximum length value: BYTES=(MAX+1,MIN).
Examples :
Existing DBD
DBD NAME=DBD1,ACCESS=........
DATASET DEVICE=....,DD1=....,SIZE=....
SEGM NAME=SEGFIX,PARENT=....,BYTES=60,...
SEGM NAME=SEGVAR,PARENT=.....,BYTES=(99,10),....
DBD to compress the data base
DBD NAME=DBD1,....
DATASET DEVICE=....,DD1=....,SIZE=....
SEGM NAME=SEGFIX,PARENT=...,BYTES=60,COMPRTN=INFCMP00,...
SEGM NAME=SEGVAR,PARENT=....,BYTES=(100,10),
COMPRTN=(INFCMP00,KEY),...
Unloading - reloading the data base
Once the DBD has been modified to include INFOPAK, unload the data base with the old DBD using a standard UNLOAD Utility (DFSURGU0, for example).
Now reload the data base with the new DBD (using a new DEFINE Cluster, if necessary) with a PSB containing a PCB authorizing the initial load (PROCOPT=L or LS). The RELOAD Program can be the standard RELOAD Utility (DFSURGL0) or an application program.
The operation of the compression routine is transparent to the application programs. After INFOPAK is installed and the data base is compressed, no further maintenance is necessary. INFOPAK is DL1 release independent.
Because INFOPAK uses the standard DL1 compression exit facility, all DL1 utilities will function normally. During a normal UNLOAD cycle, the segments will be decompressed and then compressed during RELOAD. Some of the DL1 utility programs will copy the data base compressed. Reference the DL1 utilities section in the back of this manual for a listing of the utilities and how they handle compressed data bases.
The modules of INFOPAK
INFOPAK offers three levels of software compression depending on which module is used.
INFCMPPE is a high compression option that can be used with INFCMP00 and INFCMPS0 on the same DL1 data base to maximize compression and performance.
Employing Artificial Intelligence techniques, INFCMPPE uses high speed scanning to perform a 3-dimensional analysis of IMS data and then makes compression decisions based on the type of data in each segment. The scanning program may decide to use several different encoding methods within a segment for optimum performance/compression.
With INFCMPPE, most data is compressed using a modified Huffman Encoding Technique (i.e., high frequency characters are represented by short bit strings). Other data encoding methods are used to maximize compression for numeric data, repeating characters, packed decimals, etc.
Although INFCMPPE uses a modified Huffman Encoding Technique, it does not require externally defined compression tables to describe the data to compress. Internal character frequency sets are used for each native language: English, German, French, etc.
INFCMPS0 should be used only for segments that consist essentially of binary data and fillers.
INFCMP00 provides good compression with lower CPU use than INFCMPPE.
Choices
The choice of compression modules depends on:
-
The results of TESTPAK (Compression rates)
-
The number of modifications compared with the number of accesses on each segment.
-
The critical resource of the system (CPU or I/O).
Benchmark tests with INFCMPPE indicate up to 20% better compression than INFCMP00 with a corresponding increase in CPU overhead. In the same way, tests with INFCMPS0 indicate a lower compression rate than INFCMP00 and a corresponding decrease in CPU overhead. Choose the most suitable compression module for each segment and code it in the SEGM macro of the DBD.
Choose the maximum compression rate when the space gain is important and the CPU overhead is not a major concern.
Use INFCMPS0 for segments that consist essentially of binary data and fillers.
Compression results
Tests with a typical data base, (name and address file) yielded the following:
Routines                      INFCMPS0   INFCMP00   INFCMPPE
Compression percentage           40%        58%        70%
Machine instructions/byte
  compression                    6.2        8.7       11.0
  decompression                  2.5        3.5        6.8
Computing data base parameters
INFOPAK modifies physical segment size and therefore changes the length of the data base records. This may make some adjustments of the DBD parameters necessary. The following guidelines show how to compute these new parameters prior to reloading a compressed data base.
Computing the new DBR size
TESTPAK gives you an easy way to compute the average length of each segment: divide the total size after compression by the number of occurrences.
To compute the new DBR size proceed as follows:
-
Compute new average segment size according to which compression module (if any) is used.
-
Compute average physical length by adding to each segment the length of the prefix.
-
Use the number of occurrences read by TESTPAK to compute the typical length of the DBR. TESTPAK results enable you to compute precisely how many occurrences of each segment type there are per root, but you may need to make adjustments as the database evolves.
-
The DBD parameters are now a function of this typical DBR size and the data base organization.
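The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the segment names, byte counts and prefix lengths are hypothetical sample values, and the function name typical_dbr_length is not part of INFOPAK. Take the real occurrence and byte counts from your TESTPAK report and the prefix lengths from your own DBD.

```python
def typical_dbr_length(segments, root_name):
    """Estimate the typical data base record (DBR) length in bytes.

    segments maps a segment name to a dict with:
      occurrences - number of occurrences read by TESTPAK
      bytes_out   - total bytes after compression for that segment type
      prefix      - prefix length in bytes (hypothetical values here)
    """
    roots = segments[root_name]["occurrences"]
    if roots == 0:
        raise ValueError("no root segments were read")
    total = 0.0
    for seg in segments.values():
        if seg["occurrences"] == 0:
            continue  # segment type not present in the sample
        average = seg["bytes_out"] / seg["occurrences"]  # new average size
        physical = average + seg["prefix"]               # add the prefix
        per_root = seg["occurrences"] / roots            # occurrences per root
        total += physical * per_root
    return total


# Hypothetical sample: 10 roots, each with 3 dependents on average.
sample = {
    "ROOTSEG": {"occurrences": 10, "bytes_out": 200, "prefix": 10},
    "DEPSEG": {"occurrences": 30, "bytes_out": 300, "prefix": 6},
}
print(typical_dbr_length(sample, "ROOTSEG"))
```

In this sample a root averages 20 compressed bytes plus a 10-byte prefix, and each of the three dependents averages 10 bytes plus a 6-byte prefix, so the typical DBR is 30 + 3 x 16 = 78 bytes.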
HISAM organization
For HISAM, the best choice is to hold as many DBRs as possible in the primary record (HISAM or KSDS).
This computation should take into account the rate of update between reorganizations as deleted segments remain in the physical record.
warning
HISAM root only data bases or HISAM data bases which have only the root segment in the primary record will not benefit from compression since physical record sizes are fixed (see next figure).
HDAM organization
In the majority of cases the strategy is to have all the RAPs (Root Anchor Points) in the RAA (root addressable area) distributed evenly. The number of RAPs in the RAA will not change with compression, but more RAPs will fit in a physical block. The size of the RAA must be adjusted with a ratio equal to the compression ratio for the RAP.
Where:
RMNAME=(mod,#RAPS,#BLKS,#BYTES)
        |   |     |     |
        |   |     |     +--The maximum number of bytes of a database
        |   |     |        record to be put in the root addressable
        |   |     |        area when records are inserted consecutively.
        |   |     +--The number of blocks or CIs in the root
        |   |        addressable area.
        |   +--The number of Root Anchor Points in a block or CI.
        |
        +--The name of the randomizing module chosen.
and the new randomizing parameters are computed as follows:
#BYTES=average size of compressed Data Base Record from TESTPAK
#BLKS=(precompressed #blks)*(1-compression percent from TESTPAK)
#RAPS=(precompressed #RAPS)/(1-compression percent from TESTPAK)
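The three formulas can be applied as in the following Python sketch. The function name hdam_parameters and the sample values are hypothetical. The formulas do not say how to round; rounding #BLKS up and #RAPS to the nearest whole number (never below 1) is an assumption made here, not a rule from INFOPAK.

```python
import math
from fractions import Fraction


def hdam_parameters(old_raps, old_blks, gain_percent, avg_dbr_bytes):
    """Apply the #BYTES, #BLKS and #RAPS formulas for a compressed HDAM data base.

    gain_percent is the compression percentage reported by TESTPAK (e.g. 35.34).
    """
    if not 0 <= gain_percent < 100:
        raise ValueError("gain_percent must be in the range 0 to 100 (exclusive)")
    # Exact arithmetic avoids floating point surprises when rounding up.
    remaining = 1 - Fraction(str(gain_percent)) / 100
    new_blks = math.ceil(old_blks * remaining)        # assumption: round up
    new_raps = max(1, round(old_raps / remaining))    # assumption: nearest integer
    return {"#RAPS": new_raps, "#BLKS": new_blks, "#BYTES": avg_dbr_bytes}


# Hypothetical data base: 2 RAPs per block and 10 000 blocks before compression,
# a 35.34 % gain, and an average compressed DBR of 400 bytes.
print(hdam_parameters(old_raps=2, old_blks=10_000, gain_percent=35.34, avg_dbr_bytes=400))
```

With these sample values the sketch yields 6 466 blocks and 3 RAPs per block; review the rounded values against your own block or CI size before coding them in RMNAME.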
Performance is also improved if the following is true:
-
The number of RAPs in a block or CI is equal to the number of roots in the block or CI.
-
Root segments are stored in key sequence.
-
All frequently-used dependent segments are in the RAA and in the same block or CI as the root.
HIDAM organization
Initial loading of a HIDAM data base will take into account compression with no changes in the DBD. One may adjust the FRSPC parameter of the DATASET macro to reflect smaller average segment length.
Compression in place for DEDB databases
Overview of INFFPCPE
The module INFFPCPE employs the same algorithm as INFCMPPE, but can distinguish compressed segments from uncompressed segments. It is designed to provide:
-
a compress in place function for DEDB databases or for variable length segments in other DL1 databases.
-
a user exit function able to prevent compression under user-chosen circumstances.
The module INFFPCPE cannot be used with fixed length segments.
The compress in place function of INFFPCPE
With compress in place, you need not unload/reload the database. The compress in place module recognizes uncompressed segments and compresses them when an update is performed.
This design allows compression of large databases without the need to unload/reload. It can also be used to activate compression on large DEDB databases, an area at a time, utilizing existing planning for area reorganization.
If the module "INFFPCPE" does not invoke a user exit, then once all segments have been compressed (you must be positively certain that they all have been), you can replace the compress in place module, "INFFPCPE", with the standard high compression module, "INFCMPPE", without performing an unload/reload.
The user exit function for INFFPCPE
A user exit is provided for "INFFPCPE" in order to prevent compression of select segments according to criteria known to the user exit.
If you add a user exit to "INFFPCPE", you must never change to a standard INFOPAK compression module without an unload/reload, because the user exit may have left some segments uncompressed.
TESTPAK will not report the gains for this module, because they are the same as "INFCMPPE" as long as the user exit is not activated.
Using Compress in Place
Only one step is necessary:
modifying the DBD.
Change the SEGM macro instruction defining each variable length segment you want to compress in the following way:
SEGM NAME=sssss,BYTES=(xx+10,yy),COMPRTN=(INFFPCPE,DATA or KEY),....
-
The COMPRTN parameter specifies the compression module INFFPCPE.
-
The BYTES parameter must have 10 bytes added to the previous maximum length of the segment:
BYTES=(xx,yy) changes to BYTES=(xx+10,yy) in order to provide a place to put INFOPAK's control information.
Using INFFPCPE is forbidden for fixed length segments!
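The BYTES adjustments for variable length segments (one extra byte for the standard modules, ten extra bytes for INFFPCPE) can be summarized in a short Python sketch. The function name compressed_bytes_operand is hypothetical, and the sketch covers only the modules whose adjustment is stated in this manual.

```python
STANDARD_MODULES = {"INFCMP00", "INFCMPS0", "INFCMPPE"}


def compressed_bytes_operand(max_len, min_len, module):
    """Return the BYTES operand for a variable length segment after adding
    the space that the chosen compression module needs."""
    if module == "INFFPCPE":
        extra = 10  # room for the compress in place control information
    elif module in STANDARD_MODULES:
        extra = 1   # room for the indicator added when data is not compressed
    else:
        raise ValueError(f"no adjustment rule known for module {module}")
    return f"BYTES=({max_len + extra},{min_len})"


print(compressed_bytes_operand(99, 10, "INFCMP00"))  # prints "BYTES=(100,10)"
print(compressed_bytes_operand(99, 10, "INFFPCPE"))  # prints "BYTES=(109,10)"
```

The first line of output matches the SEGVAR example in the section on modifying the DBD, where BYTES=(99,10) becomes BYTES=(100,10).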
Changing to standard compression
The standard compression modules INFCMPPE, INFCMP00, and INFCMPS0 do not provide user exits. If you wish to change to a standard compression module, you cannot use the user exit. You can change to any standard INFOPAK module by unloading and reloading the database.
You can also change from INFFPCPE (with no user exit) to INFCMPPE (no other module can be used) without an unload as long as you are positively certain that all segments are compressed.
The advantage of changing to INFCMPPE from INFFPCPE is better compression due to the elimination of the compress in place tag on each segment.
The change is very easy: simply replace INFFPCPE with INFCMPPE in the COMPRTN parameter of the SEGM macro instruction of the DBD for each segment that has been 100% compressed by INFFPCPE or INFFPCZE.
If a user exit has been used with the compress in place module, you must unload and reload the database to change compression modules.
Adding a user exit to INFFPCPE
The compress in place module INFFPCPE can be link-edited to a user exit with an entry point called INFFPUSR. The resulting routine is a compress in place routine able to prevent compression of some segments according to criteria known to the user exit.
Specifically, the compress in place module INFFPCPE calls the entry point INFFPUSR before compression takes place. The return code from INFFPUSR is tested:
-
if the return code is zero, compression takes place,
-
if the return code is not zero, no compression occurs.
The user exit must obey the following rules:
-
the entry point name is INFFPUSR
-
register 15 contains the address of the entry point upon entry to the exit routine
-
the program must be reentrant
-
a non zero return code prevents compression
-
register 14 upon entry contains the return address; it must be kept
-
registers 0 through 13 contain the values passed to the compression module from IMS. They must be restored at the end of processing
-
register 13 points to the IMS save area, where the IMS registers are already saved; this area must not be overwritten; it contains values for registers 0 through 12 from which those registers can be restored.
If you add a user exit to the compress in place module INFFPCPE, you should never change to the standard module without an unload and reload of the database, since at any time a segment occurrence might be uncompressed.
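The return code convention described above can be modelled in a few lines of Python. This is only a sketch of the decision logic: a real exit is a reentrant assembler routine that receives the IMS registers, whereas here the exit simply receives the segment bytes. The 32-byte threshold, the return code value 8 and the `toy_compress` stand-in are all hypothetical.

```python
def inffpusr(segment: bytes) -> int:
    """Hypothetical exit criterion: do not compress segments shorter than 32 bytes."""
    return 0 if len(segment) >= 32 else 8  # zero allows compression, non zero prevents it


def toy_compress(segment: bytes) -> bytes:
    """Stand-in for the real compression (just drops trailing blanks)."""
    return segment.rstrip(b" ")


def compress_in_place(segment: bytes, user_exit, compress):
    """Call the exit first, then compress only when it returns zero."""
    if user_exit(segment) == 0:
        return compress(segment), True
    return segment, False  # segment is stored uncompressed


stored, was_compressed = compress_in_place(b"A" * 10 + b" " * 30, inffpusr, toy_compress)
```

Because the exit can leave any occurrence uncompressed, a data base handled this way may hold a mix of compressed and uncompressed segments, which is why the unload and reload is required before changing modules.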
The INFCMP macro - segment splitting
Once you have compressed your data base, if you experience performance problems after frequent updating because of segment splits, you may have to declare a minimum length to get free space in the segment to avoid severe segment splitting. Naturally, imposing a minimum length on compressed segments will affect your DASD savings.
Even though your segment is defined as fixed length in your DBD, IMS physically stores the compressed segment in a variable format.
Because these segments are defined as fixed length in the DBD, the BYTES parameter cannot carry a minimum length. The INFCMP macro, furnished with INFOPAK, lets you declare one. To install the macro, copy the module INFCMP from your installation tape to the IMS MACLIB used to generate the DBD.
Estimating minimum length
The value for MINLEN can be calculated from the TESTPAK report, which gives, for each segment type, the total number of bytes after compression. Dividing this total by the number of segments yields the average compressed segment size for that segment type.
You should try to use a MINLEN value slightly larger than average segment size, so that the majority of compressed segments will be of shorter length than that given by MINLEN. If your update programs are adding new information to previously blank fields, you may find that you will have to increase your MINLEN value even more. For example, INFOPAK will compress 31 blank bytes down to one byte. If your update program is adding 20 bytes of numeric characters to a previously blank field, the updated segment will be at least 10 bytes longer. Fine tuning for the ideal MINLEN depends on the nature of the file and what type of data changes your update programs are doing.
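As a rough illustration of this calculation, the following Python sketch derives a starting MINLEN from the two TESTPAK figures. The function name, the 10% safety margin and the sample numbers are assumptions made for the example, not values recommended by INFOPAK; the cap reflects the rule that MINLEN must stay below the fixed segment size.

```python
def estimate_minlen(total_compressed_bytes: int, segment_count: int,
                    fixed_length: int, margin_percent: int = 10) -> int:
    """Average compressed size plus a margin, kept below the fixed segment size."""
    if segment_count <= 0:
        raise ValueError("segment_count must be positive")
    # Integer ceiling of average * (1 + margin), avoiding floating point rounding.
    numerator = total_compressed_bytes * (100 + margin_percent)
    denominator = 100 * segment_count
    minlen = -(-numerator // denominator)
    # MINLEN must be smaller than the fixed length declared in the DBD.
    return min(minlen, fixed_length - 1)


# 10,000 occurrences of a 60-byte segment totalling 250,000 bytes after compression:
# average 25 bytes, so the suggested starting value is 28.
suggested = estimate_minlen(250_000, 10_000, fixed_length=60)
```

The result is only a starting point; update programs that fill previously blank fields may call for a larger margin.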
Using the INFCMP macro
Code the INFCMP macro for each fixed segment for which you wish to define a minimum length.
-
Code these macros in any order after the DBDGEN macro and before the END macro.
-
Never use this macro for segments declared in your DBD as variable or non-compressed.
-
Check to be sure that the minimum length is SMALLER than the fixed segment size specified in your DBD.
-
Use this macro only for PHYSICAL DBDs.
Coding of the INFCMP macro
INFCMP SEGM=segname,MINLEN=length
where:
-
segname
Name of the segment to which you want to assign a minimum length.
-
length
Minimum length for this segment (Pointers not included).
Examples
Assume, for example, a DBD as follows:
DBD NAME=DBD1,....
DATASET DEVICE=....,DD1=....,SIZE=...
SEGM NAME=SEG001,PARENT=...,BYTES=60,COMPRTN=INFCMP00,...
SEGM NAME=SEG002,PARENT=...,BYTES=80,COMPRTN=INFCMP00,...
DBDGEN
FINISH (optional from DL1 release 1.30)
END
If you want to assign a minimum length of 30 bytes to SEG001 and of 20 bytes to SEG002, enter:
DBD NAME=DBD1,....
DATASET DEVICE=....,DD1=....,SIZE=....
SEGM NAME=SEG001,PARENT=...,BYTES=60,COMPRTN=INFCMP00,...
SEGM NAME=SEG002,PARENT=...,BYTES=80,COMPRTN=INFCMP00,...
DBDGEN
FINISH (optional from DL1 release 1.30)
INFCMP SEGM=SEG001,MINLEN=30
INFCMP SEGM=SEG002,MINLEN=20
END
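The rules for coding INFCMP can also be expressed as a small Python sketch that generates the statements and rejects a minimum length that is not smaller than the fixed segment size. The function name and the dictionary layout are hypothetical; the segment values are those of the example above.

```python
def infcmp_statements(fixed_sizes: dict, min_lengths: dict) -> list:
    """Return one INFCMP statement per segment, checking MINLEN < fixed BYTES."""
    lines = []
    for segname, minlen in min_lengths.items():
        size = fixed_sizes[segname]  # BYTES value of the fixed length segment
        if not 0 < minlen < size:
            raise ValueError(f"MINLEN for {segname} must be smaller than {size}")
        lines.append(f"INFCMP SEGM={segname},MINLEN={minlen}")
    return lines


statements = infcmp_statements({"SEG001": 60, "SEG002": 80},
                               {"SEG001": 30, "SEG002": 20})
```

The generated lines are placed between DBDGEN and END, as in the example DBD.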
The INFUMSE macro
The standard utility for unloading sequential dependent segments of DEDB databases (DBFUMSC0) does not decompress data. To allow decompression, the INFUMSE macro generates a decompression exit to be used during the unload step. This macro is placed in the library with DDNAME IMSMAC during installation.
Using INFUMSE
-
Compile the INFUMSE macro by specifying the name of the compression routine that was used to compress the sequential dependent segment.
-
Create a load-module by link-editing the object file generated by the first step.
-
Declare this module as an exit when unloading the database.
Compile the INFUMSE macro
The assembler source must contain the two following lines:
csect-name INFUMSE COMPRTN=routine,NXTEXIT=next-exit
END
where:
-
'csect-name' indicates the name of the CSECT to be generated by the macro. This name is optional; in general, use the name of the load-module.
-
'routine' indicates the name of the compression routine used to compress the sequential dependent segment (declared in the DBD).
The default value for this parameter is INFCMP00.
-
'next-exit' names a user exit to be called by the generated exit, for example an exit other than compression that already existed on that database.
This parameter is optional.
Example
Compilation of the exit to decompress segments compressed by the routine INFCMPPE.
//SYSIN DD *
INFUCPPE INFUMSE COMPRTN=INFCMPPE
END
/*
Creating the exit load-module.
The load module of the decompression exit can be created as follows:
//LKED EXEC PGM=IEWL,PARM='RENT'
//SYSPRINT DD SYSOUT=*
//SYSLMOD DD DISP=SHR,DSN=Load-module-library
//OBJLIB DD DISP=SHR,DSN=Object-library
//SYSUT1 DD .....
//SYSLIN DD *
INCLUDE OBJLIB(compilation-result)
NAME INFOPAK-exit-name
If the INFOPAK exit has to call a user exit, that exit must be included in the SYSLIN of the link-edit step.
//SYSLIN DD *
INCLUDE OBJLIB(compilation-result)
INCLUDE MODLIB(user-exit-name)
NAME INFOPAK-exit-name
Unloading using the INFOPAK exit
The load-module created by the preceding step must be accessible by the unload JOB. It must be found in LINKLIST or in the STEPLIB containing DBFUMSC0.
The SYSIN of DBFUMSC0 must indicate the name of the INFOPAK exit to be used.
Example
//SCAN EXEC FPUTIL,RGN=500K,DBD=.....
//SCANCOPY DD DISP=(NEW,CATLG),DSN=....................
//SYSIN DD *
TYPE SCAN
EXIT infopak-decompression-exit-name
Using INFCMR00 and INFSMR00
The module INFCMRE0 provided by INFOPAK has an entry point INFCMR00 to perform the compression or decompression of any data area. The module is reentrant and can be used by COBOL or ASSEMBLER programs to compress or decompress data sets.
The module INFSMR00 provided by INFOPAK has an entry point INFSMR00 to perform the compression or decompression of any data area. This module consumes less CPU than INFCMR00 while still giving a good compression rate.
Calling INFCMR00 or INFSMR00
The call is made with five parameters:
-
Function code
This word (PIC S9(8) COMP in COBOL) indicates whether this is compression or decompression. This word must be initialized with zero for compression and 4 for decompression.
-
Input area
Name of the field which contains the input data to compress or decompress depending on the function code.
-
Length of the input area
This word (PIC S9(8) COMP in COBOL) contains the length of the input area.
-
Output area
Depending on the function code, this area contains the compressed or the decompressed data after the call.
-
Length of the output area
This word (PIC S9(8) COMP in COBOL) gives the length available in the output area. On return to the calling program, INFCMR00 or INFSMR00 places the actual length of the compressed or decompressed output into this field. If the length specified was insufficient (that is, the resulting compressed or decompressed data is longer than the output length specified), INFCMR00 or INFSMR00 fills the output area up to the specified length and issues a return code (RETURN-CODE in COBOL) of 4.
note
Each parameter is mandatory.
COBOL example with INFCMR00
*
* CALL PARAMETERS
*
01 INPLNGTH PIC S9(08) COMP.
01 INPAREA PIC X(100).
01 OUTLNGTH PIC S9(08) COMP.
01 OUTAREA PIC X(101).
01 FUNC PIC S9(08) COMP.
*
* 1) COMPRESSION
*
* FUNCTION CODE 0 = COMPRESSION
MOVE 0 TO FUNC.
* LENGTH OF DATA TO BE COMPRESSED
MOVE 100 TO INPLNGTH.
* SPACE AVAILABLE FOR COMPRESSED DATA
MOVE 101 TO OUTLNGTH.
CALL 'INFCMR00' USING FUNC INPAREA INPLNGTH OUTAREA OUTLNGTH.
*
* 2) DECOMPRESSION
*
* FUNCTION CODE 4 = DECOMPRESSION
MOVE 4 TO FUNC.
* INITIALIZE INPUT LENGTH AND INPUT AREA
MOVE OUTLNGTH TO INPLNGTH.
MOVE OUTAREA TO INPAREA.
* SPACE AVAILABLE FOR DECOMPRESSED DATA
MOVE 101 TO OUTLNGTH.
CALL 'INFCMR00' USING FUNC INPAREA INPLNGTH OUTAREA OUTLNGTH.
Assembler example with INFCMR00
*
* CALL PARAMETERS
*
INPLNGTH DS F
INPAREA DC CL100' '
OUTLNGTH DS F
OUTAREA DS CL101
FUNC DS F
*
* 1) COMPRESSION
*
LA 7,0
ST 7,FUNC FUNCTION CODE
LA 7,100
ST 7,INPLNGTH LENGTH OF DATA TO BE COMPRESSED
LA 7,101
ST 7,OUTLNGTH LENGTH OF COMPRESSED DATA
CALL INFCMR00,(FUNC,INPAREA,INPLNGTH,OUTAREA,OUTLNGTH),VL
*
* 2) DECOMPRESSION
*
LA 7,4
ST 7,FUNC FUNCTION CODE
L 7,OUTLNGTH
ST 7,INPLNGTH INITIALIZE INPUT LENGTH
BCTR 7,0
EX 7,EXMOVE AND INPUT AREA
LA 7,101
ST 7,OUTLNGTH LENGTH OF DECOMPRESSED DATA
CALL INFCMR00,(FUNC,INPAREA,INPLNGTH,OUTAREA,OUTLNGTH),VL
BR 14
*
EXMOVE MVC INPAREA(0),OUTAREA
note
The two preceding examples illustrate the compression of a data area followed by its decompression. When calling for decompression, the example reinitializes the length of the input area with the output length that was returned by INFCMR00 when it performed the compression, and reinitializes the input area with the compressed data from the output area. The length of the output area is reset to the same value as for compression.
note
Notice that for compression the length of the output area is one byte more than the input area. This guarantees, in compression, that the output length will be sufficient.
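The compress-then-decompress sequence of the two examples can be modelled in Python to make the role of the two length fields explicit. This is a sketch under stated assumptions: `call_codec` is a hypothetical stand-in, zlib replaces the INFOPAK algorithm (so the "one byte more" guarantee does not apply to it), a return code of 0 on success is assumed, and the length reported after truncation is assumed to be the number of bytes actually written.

```python
import zlib

COMPRESS, DECOMPRESS = 0, 4  # function codes described in the parameter list


def call_codec(func: int, in_area: bytes, in_len: int, out_len: int):
    """Stand-in for the five-parameter call; returns (return_code, out_area, actual_len)."""
    data = bytes(in_area[:in_len])
    if func == COMPRESS:
        result = zlib.compress(data)
    elif func == DECOMPRESS:
        result = zlib.decompress(data)
    else:
        raise ValueError("function code must be 0 or 4")
    if len(result) > out_len:
        # Output area too small: fill it up to out_len and signal return code 4.
        return 4, result[:out_len], out_len
    return 0, result, len(result)


original = b" " * 100
# 1) compression: 100 bytes in, 101 bytes available for output
rc, packed, packed_len = call_codec(COMPRESS, original, 100, 101)
# 2) decompression: the returned length becomes the new input length
rc2, unpacked, unpacked_len = call_codec(DECOMPRESS, packed, packed_len, 101)
```

A caller would normally test the return code after each call before using the output area.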
Some questions about INFOPAK
INFOPAK and MVS/XA, ESA
INFOPAK routines can operate in AMODE 31 and RMODE ANY. Because of the loading requirements of early versions of IMS they are link-edited with RMODE 24.
INFOPAK with HSSR
HSSR is an older software product allowing the reading of DL1 data bases without using the DL1 interface, thus allowing high speed sequential retrieval of data.
HSSR2 allows decompression of input segments and is therefore compatible with INFOPAK.
INFOPAK with FSU
FSU is an older software product allowing the reading of DL1 data bases without using the DL1 interface, thus allowing high speed sequential retrieval of data. Some caution must be exercised when using it with compressed data bases.
FSU2 provides an exit facility at the control card level. This exit may be used to decompress data. INFOTEL can supply an exit (INFFSX00) if you are using FSU2. The INFFSX00 exit requires that the output format for FSU be "UL".
Unload/Reload utilities
There are two types of such utilities:
-
physical level utilities (recovery utilities):
-
DFSUDMP0 (IMAGE COPY)
-
DFSUICP0 (ONLINE IMAGE COPY)
-
DFSURDB0 (DATABASE RECOVERY)
-
DFSBBO00 (BATCH BACKOUT RECOVERY)
-
DFDSS
-
logical level utilities (reorganization load)
-
DFSURUL0 (HISAM UNLOAD)
-
DFSURRL0 (HISAM RELOAD)
-
DFSURGU0 (HD UNLOAD)
-
DFSURGL0 (HD RELOAD)
Physical utilities work on a physical block or CI basis. Compressed data remains compressed. There is no CPU overhead associated with data compression and elapsed time is reduced in proportion to the compression ratio. They are meant to copy a data base for backup purposes.
note
Because INFOPAK does not use any external tables and all versions of a given module are guaranteed upward and downward compatible, you cannot lose data integrity by restoring a saved data base, even if you have installed a new version in the meantime.
Logical utilities work at the segment level. Compressed segments are decompressed on unload and compressed on reload. These utilities are used for the initial compression of the data base, for physical organization changes (for instance HISAM to HIDAM) or to change DBD parameters such as the number of RAA blocks.
REORG without the need to Decompress and Compress
This is a method of reorganizing an IMS data base using normal IMS utilities without the CPU overhead of decompression and compression.
MAXIMUM CARE must be used to validate its usefulness in your environment. The following rules must be observed:
-
Keys must not be compressed! The data base must NOT have secondary indexes.
-
The reorg must be done with a special DBD without compression specified matching the data base physically. The physical compressed data base is always variable length, even if the application programs use fixed length segments.
-
For both fixed length and variable length segments, add eight bytes to the maximum length. The fixed length segments will become variable length segments.
-
for example:
segm name=p010,parent=0,bytes=60,comprtn=INFCMP00
field name=(p010a,seq,u),start=1,bytes=6
segm name=p020,parent=p010,bytes=(40,10),comprtn=INFCMP00
field name=(p020a,seq,u),start=3,bytes=8
becomes
segm name=p010,parent=0,bytes=(68,4)
field name=(p010a,seq,u),start=3,bytes=6
segm name=p020,parent=p010,bytes=(48,10)
field name=(p020a,seq,u),start=3,bytes=8
-
FOR FIXED LENGTH SEGMENTS THE KEY POSITION MUST BE INCREASED by 2 BYTES
-
For a fixed length segment, the keyword BYTES=11 becomes BYTES=(19,4) because of the addition of the RDW length.
-
Tests to validate the new DBD must be done!
-
YOU MUST TEST to VALIDATE the NEW DBD!!
-
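The DBD changes listed above can be summarized in a short Python sketch. The function name is hypothetical, and the minimum length of 4 given to former fixed length segments is simply read off the examples above (BYTES=60 becoming (68,4), BYTES=11 becoming (19,4)); it is not stated as a general rule in the text.

```python
def reorg_dbd_values(bytes_spec, key_start: int):
    """Return (new BYTES pair, new key START) for the DBD without compression.

    bytes_spec is an int for a fixed length segment, or a (max, min) tuple
    for a variable length segment.
    """
    if isinstance(bytes_spec, int):
        # Fixed length: add 8 bytes, becomes variable, key moves 2 bytes to the right.
        return (bytes_spec + 8, 4), key_start + 2
    max_len, min_len = bytes_spec
    # Variable length: add 8 bytes to the maximum, key position unchanged.
    return (max_len + 8, min_len), key_start


p010 = reorg_dbd_values(60, 1)        # fixed length segment p010
p020 = reorg_dbd_values((40, 10), 3)  # variable length segment p020
```

The sketch only reproduces the arithmetic; the resulting DBD must still be tested against the physical data base before it is used.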
SMU and DBA
These utilities provide in depth information on the physical use of space within a data base. When a segment is compressed it appears to them as variable length. They are therefore compatible with INFOPAK and may assist you in obtaining statistics on the actual length of compressed segments.
INFOPAK and FAST PATH
FAST PATH is the high performance option of DL1. Available as a standard option for DL1 version 1.3 onwards, FAST PATH possesses its own data structures: MSDB and DEDB.
-
MSDB, data organization in memory, demands fixed length segments, eliminating the possibility of compression.
-
DEDB, data organization comparable to HDAM, has variable length segments, but does not authorize the COMPRTN parameter in the DBD unless you are on version 3.1 or above. With IMS version 3.1 and above you can compress the data with the standard exit. IMS 3.1 requires PTF 90489 to use the compression exit.
If you want to compress the data of a DEDB segment and you are using a release of IMS prior to version 3, use entry point INFCMR00 (see the chapter "Using INFCMR00 and INFSMR00"), calling it before each ISRT and after each GET.
Error messages and return codes
-
Abend U4010
This user abend occurs during decompression if INFOPAK finds a segment whose decompressed length is greater than allowed by the IMS definition in the current DBD. This is, for example, the result of an update to the compressed IMS data base by a program using a DBD that is not defined for use with compression. This abend is also produced when a non-compressed data base is accessed with a DBD defined as compressed. The segment name is in registers 2 and 3 at the time of failure.
-
Abend U4040
This user abend occurs if the CPU serial number is not valid for the production system.