Monday, July 13, 2009

A brief COBOL (COmmon Business Oriented Language) Introduction

History.
Developed in 1959 by a group called the COnference on DAta SYstems Languages (CODASYL). The first COBOL specification was completed by December 1959, and the first compilers followed in 1960.
First ANSI approved version – 1968
Modified ANSI approved version – 1974 (OS/VS COBOL)
Modified ANSI approved version – 1985 (VS COBOL 2).

Speciality.
First language developed for commercial application development; it can efficiently handle very large volumes of data (millions of records).
Procedure Oriented Language - Problem is segmented into several tasks. Each task is written as a Paragraph in Procedure Division and executed in a logical sequence as mentioned.
English Like language – Easy to learn, code and maintain.

Coding Sheet.
7 12 72 80
COL-A COLUMN-B
1-6 Page/line numbers – Optional (automatically assigned by compiler)
7 Continuation (-), Comment (*), Starting a new page (/), Debugging lines (D)
8-11 Column A – Division, Section, Paragraph, 01 and 77 declarations must begin here.
12-72 Column B – All the other declarations/statements begin here.
73-80 Identification field. It will be ignored by the compiler but visible in the source listing.

Divisions in COBOL.
There are four divisions in a COBOL program and Data division is optional.
1.Identification Division.
2.Environment Division.
3.Data Division.
4.Procedure Division.

Identification Division.
This is the first division and the program is identified here. The PROGRAM-ID paragraph followed by a user-defined name is mandatory. All other paragraphs are optional and used for documentation. In IBM COBOL the user-defined name can be at most eight characters long.

IDENTIFICATION DIVISION.
PROGRAM-ID. PROGRAM NAME.
AUTHOR. COMMENT ENTRY.
INSTALLATION. COMMENT ENTRY.
DATE-WRITTEN. COMMENT ENTRY.
DATE-COMPILED. COMMENT ENTRY.
SECURITY. COMMENT ENTRY.
Environment Division.
This is the only machine-dependent division of a COBOL program. It supplies information about the hardware or computer equipment the program uses. When your program moves from one computer to another, the only division that may need to be changed is the ENVIRONMENT division.

Configuration Section.
It supplies information concerning the computer on which the program will be compiled (SOURCE-COMPUTER) and executed (OBJECT-COMPUTER). It consists of three paragraphs – SOURCE-COMPUTER, OBJECT-COMPUTER and SPECIAL-NAMES.
This section is OPTIONAL from COBOL 85 onward.

SOURCE-COMPUTER. IBM-4381 (Computer and model # supplied by manufacturer)
WITH DEBUGGING MODE clause specifies that the debugging lines in the program (statements coded with ‘D’ in column 7) are compiled.

OBJECT-COMPUTER. IBM-4381 (Usually same as source computer)

SPECIAL-NAMES. This paragraph is used to relate hardware names to user-specified mnemonic names.
1. Substitute character for currency sign. (CURRENCY SIGN IS literal-1)
2. Comma can be used as decimal point. (DECIMAL-POINT IS COMMA)
3. Default collating sequence can be changed. It will be explained later.
4. New class can be defined using CLASS keyword. (CLASS DIGIT is “0” thru “9”)

Input-Output Section.
It contains information regarding the files to be used in the program and it consists of two paragraphs, FILE-CONTROL and I-O-CONTROL.
FILE-CONTROL. Files used in the program are identified in this paragraph.
I-O-CONTROL. It specifies when checkpoints are to be taken and which storage areas are shared by different files.

Data Division.
Data division is used to define the data that need to be accessed by the program. It has three sections.
FILE SECTION describes the record structure of the files.
WORKING-STORAGE SECTION is used to define intermediate variables.
LINKAGE SECTION is used to access the external data.
Ex: Data passed from other programs or from PARM of JCL.

Literals, Constants, Identifiers
1. Literal is a constant and it can be numeric or non-numeric.
2. Numeric literal can hold 18 digits and non-numeric literal can hold 160 characters in it. (COBOL74 supports 120 characters only)
3. A named memory location that holds data is called a variable or identifier.
4. Figurative Constant is a COBOL reserved word representing frequently used constants. They are ZERO/ZEROS/ZEROES, QUOTE/QUOTES, SPACE/SPACES, ALL, HIGH-VALUE/HIGH-VALUES, LOW-VALUE/LOW-VALUES.

Example: 01 WS-VAR1 PIC X(04) VALUE 'MUSA'.
'MUSA' is a non-numeric literal. WS-VAR1 is an identifier or variable.

Declaration of variable
Level# $ Variable $ Picture clause $ Value clause $ Usage Clause $ Sync clause.
Level# It specifies the hierarchy of data within a record. It can take a value from the set of integers between 01-49 or from one of the special level-numbers 66, 77, 88.
01 level. Specifies the record itself. It may be either a group item or an elementary item. It must begin in Area A.
02-49 levels. Specify group or elementary items within a record. Group level items must not have picture clause.
66 level. Identify the items that contain the RENAMES clause.
77 level. Identify independent data item.
88 level. Condition names.

Variable name and Qualifier
Variable name can have 1-30 characters with at least one letter in it.
Hyphen is the only allowed special character but it cannot be the first or last character of the name. Name should be unique within the record. If two variables have the same name, use the OF qualifier with the higher-level group name to refer to a variable uniquely.
Ex: MOVE balance OF record-1 TO balance OF record-2.

FILLER
When the program is not intended to use selected fields in a record structure, define them as FILLER. FILLER items cannot be referenced by name in the procedure division, and the INITIALIZE statement leaves them untouched.
PICTURE Clause
It describes the attributes of a variable.
Numeric: 9 (Digit), V (Implied decimal point), S (Sign)
Numeric Edited: + (Plus Sign), - (Minus Sign), CR DB (Credit Debit Sign), . (Period), B (Blank), ‘,’ (Comma), 0 (Zero), / (Slash),
BLANK WHEN ZERO (Insert blank when data value is 0), Z (ZERO suppression), * (ASTERISK), $ (Currency Sign)
Non Numeric: A (Alphabetic), B (Blank insertion character), X (Alphanumeric), G (DBCS)
Exclusive sets + - CR,DB,V ‘.’,$ + - Z * (But $ Can appear as first place and * as floating. $***.**)
DBCS (Double Byte Character Set) is used in the applications that support large character sets. 16 bits are used for one character. Ex: Japanese language applications.
Refreshing Basics
Nibble. 4 Bits is one nibble. In packed decimal, each nibble stores one digit.
Byte. 8 Bits is one byte. By default, every character is stored in one byte.
Half word. 16 bits or 2 bytes is one half word. (MVS)
Full word. 32 bits or 4 bytes is one full word. (MVS)
Double word. 64 bits or 8 bytes is one double word. (MVS)
Usage Clause
DISPLAY Default. Number of bytes required equals the size of the data item.
COMP Binary representation of data item.
PIC clause can contain S and 9 only.
S9(01) – S9(04) Half word.
S9(05) – S9(09) Full word.
S9(10) - S9(18) Double word.
Most significant bit is ON if the number is negative.
COMP-1 Single word floating point item. PIC Clause should not be specified.
COMP-2 Double word floating-point item. PIC Clause should not be specified.
COMP-3 Packed Decimal representation. Two digits are stored in each byte.
Last nibble is for sign. (F for unsigned positive, C for signed positive
and D for signed negative)
Formula for Bytes: Integer((n/2) + 1), where n is the number of 9s.
INDEX It is used to preserve the index value of an array. PIC Clause should not be specified.
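The storage rules above can be checked with a small program. This is only a sketch: the program name and field names are made up for illustration, and it assumes a compiler that supports the LENGTH OF special register (an IBM extension). The sizes in the comments follow the half word rule for COMP and the Integer((n/2) + 1) formula for COMP-3.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. USAGELEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical field names, chosen only for this sketch.
      * DISPLAY: one byte per digit, sign shares the last byte,
      * so this item needs 5 bytes.
       01  WS-DISPLAY-NUM    PIC S9(05).
      * COMP: S9(01) to S9(04) fits in a half word, 2 bytes.
       01  WS-BINARY-NUM     PIC S9(04) COMP.
      * COMP-3: Integer((5/2) + 1) = 3 bytes.
       01  WS-PACKED-NUM     PIC S9(05) COMP-3.
       PROCEDURE DIVISION.
           DISPLAY 'DISPLAY BYTES: ' LENGTH OF WS-DISPLAY-NUM
           DISPLAY 'COMP BYTES:    ' LENGTH OF WS-BINARY-NUM
           DISPLAY 'COMP-3 BYTES:  ' LENGTH OF WS-PACKED-NUM
           STOP RUN.
```

The same five digits therefore cost 5, 2 or 3 bytes depending on the usage, which is why packed and binary fields are common in large files.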
VALUE Clause
It is used for initializing data items in the working storage section. Value of item must not exceed picture size. It cannot be specified for the items whose size is variable.
Syntax:
VALUE IS literal.
VALUES ARE literal-1 THRU/THROUGH literal-2
VALUES ARE literal-1, literal-2
Literal can be numeric without quotes OR non-numeric within quotes OR figurative constant.

SIGN Clause
Syntax: SIGN IS LEADING/TRAILING (SEPARATE CHARACTER).
It is applicable when the picture string contains ‘S’. Default is TRAILING with no SEPARATE CHARACTER, so ‘S’ doesn’t take any space. It is stored along with the last digit.
REDEFINES
The REDEFINES clause allows you to use different data description entries to describe the same computer storage area. Redefining declaration should immediately follow the redefined item and should be done at the same level. Multiple redefinitions are possible. Size of redefined and redefining need not be the same.

Example:
01 WS-DATE PIC 9(06).
01 WS-REDEF-DATE REDEFINES WS-DATE.
05 WS-YEAR PIC 9(02).
05 WS-MON PIC 9(02).
05 WS-DAY PIC 9(02).

RENAMES
It is used for regrouping of elementary data items in a record. It should be declared at 66 level. It need not immediately follow the data item which is being renamed. But all RENAMES entries associated with one logical record must immediately follow that record's last data description entry. RENAMES cannot be done for a 01, 77, 88 or another 66 entry.
01 WS-RESPONSE.
05 WS-CHAR143 PIC X(03).
05 WS-CHAR4 PIC X(04).
66 ADD-RESPONSE RENAMES WS-CHAR143.

CONDITION name
It is identified with special level ‘88’. A condition name specifies the value that a field can contain and used as abbreviation in condition checking.
01 SEX PIC X.
88 MALE VALUE '1'.
88 FEMALE VALUE '2' '3'.
IF SEX = '1' can also be coded as IF MALE in Procedure division.
'SET FEMALE TO TRUE' moves value 2 to SEX. If multiple values are coded on VALUE clause, the first value will be moved when it is set to true.
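A complete program makes the behaviour of condition names easier to see. The sketch below uses made-up names (CONDNAME, WS-STATUS, STATUS-ACTIVE, STATUS-CLOSED) and is not taken from any real application; it tests a condition name and then sets one that has several values.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CONDNAME.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical status field with two condition names.
       01  WS-STATUS         PIC X VALUE '1'.
           88  STATUS-ACTIVE     VALUE '1'.
           88  STATUS-CLOSED     VALUE '2' '3'.
       PROCEDURE DIVISION.
      * Same test as IF WS-STATUS = '1'.
           IF STATUS-ACTIVE
               DISPLAY 'ACTIVE'
           END-IF
      * Moves '2', the first value listed for STATUS-CLOSED.
           SET STATUS-CLOSED TO TRUE
           DISPLAY 'WS-STATUS IS NOW ' WS-STATUS
           STOP RUN.
```

The program should display ACTIVE followed by WS-STATUS IS NOW 2, because SET uses the first value coded on the VALUE clause.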

JUSTIFIED RIGHT
This clause can be specified with alphanumeric and alphabetic items for right justification. It cannot be used with 66 and 88 level items.

OCCURS Clause
OCCURS Clause is used to allocate physically contiguous memory locations to store the table values and access them with subscript or index. A detailed explanation is given in the Table Handling section.

LINKAGE SECTION
It is used to access the data that are external to the program. JCL can send a maximum of 100 characters to a program through PARM. The linkage section MUST be coded with a half word binary length field ahead of the actual data field. If the length field is not coded, the first two bytes of the field coded in the linkage section will be filled with the length, so 2 bytes of the actual data are likely to be truncated.
01 LK-DATA.
05 LK-LENGTH PIC S9(04) COMP.
05 LK-VARIABLE PIC X(08).
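As a rough illustration of how a batch program might read its PARM, here is a minimal sketch. The program name PARMDEMO and the LK-PARM field names are assumptions made for this example, and it assumes the program is started by JCL with something like EXEC PGM=PARMDEMO,PARM='HELLO'.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PARMDEMO.
       DATA DIVISION.
       LINKAGE SECTION.
      * Hypothetical layout: half word length, then the text.
       01  LK-PARM.
           05  LK-PARM-LENGTH    PIC S9(04) COMP.
           05  LK-PARM-TEXT      PIC X(100).
       PROCEDURE DIVISION USING LK-PARM.
      * Only the first LK-PARM-LENGTH bytes of the text are valid.
           IF LK-PARM-LENGTH > 0
               DISPLAY 'PARM: ' LK-PARM-TEXT (1:LK-PARM-LENGTH)
           ELSE
               DISPLAY 'NO PARM PASSED'
           END-IF
           GOBACK.
```

Using the length field with reference modification avoids displaying whatever happens to sit in storage beyond the characters actually passed.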
Procedure Division.
This is the last division and business logic is coded here. It has user-defined sections and paragraphs. Section name should be unique within the program and paragraph name should be unique within the section.
Procedure division statements are broadly classified into following categories.
Statement Type - Meaning
Imperative - Direct the program to take a specific action. Ex: MOVE ADD EXIT GO TO
Conditional - Decide the truth or falsity of a relational condition and, based on it, execute different paths. Ex: IF, EVALUATE
Compiler Directive - Directs the compiler to take specific action during compilation. Ex: COPY SKIP EJECT
Explicit Scope terminator - Terminate the scope of conditional and imperative statements. Ex: END-ADD END-IF END-EVALUATE
Implicit Scope terminator - The period at the end of any sentence terminates the scope of all previous statements not yet terminated.
MOVE Statement
It is used to transfer data between internal storage areas defined in either file section or working storage section.

Syntax:
MOVE identifier1/literal1/figurative-constant TO identifier2 (identifier3)
Multiple move statements can be separated using comma, semicolons, blanks or the keyword THEN.
Numeric move rules:
A numeric or numeric-edited item receives data in such a way that the decimal point is aligned first and then filling of the receiving field takes place.
Unfilled positions are filled with zero. Zero suppression or insertion of editing symbols takes places according to the rules of editing pictures.
If the receiving field width is smaller than sending field then excess digits, to the left and/or to the right of the decimal point are truncated.
Alphanumeric Move Rules:
Alphabetic, alphanumeric or alphanumeric-edited data field receives the data from left to right. Any unfilled part of the receiving field is filled with spaces.
When the length of the receiving field is shorter than that of the sending field, the receiving field accepts characters from left to right until it is filled. The unaccommodated characters on the right of the sending field are truncated.
When an alphanumeric field is moved to a numeric or numeric-edited field, the item is moved as if it were in an unsigned numeric integer mode.
CORRESPONDING can be used to transfer data between items of the same names belonging to different group-items by specifying the names of group-items to which they belong.
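The numeric and alphanumeric move rules can be tried out with a short program. This is a sketch with made-up names (MOVEDEMO and the WS- fields are not from the original); the comments state what each receiving field should hold according to the rules above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MOVEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sending and receiving fields.
       01  WS-SRC-NUM        PIC 9(03)V99 VALUE 123.45.
       01  WS-DST-NUM        PIC 9(02)V9.
       01  WS-SRC-TXT        PIC X(08)    VALUE 'MAINFRAM'.
       01  WS-SHORT-TXT      PIC X(04).
       01  WS-LONG-TXT       PIC X(10).
       PROCEDURE DIVISION.
      * Decimal points are aligned, then excess digits are
      * dropped on both sides: 123.45 is stored as 23.4.
           MOVE WS-SRC-NUM TO WS-DST-NUM
           DISPLAY 'NUMERIC: ' WS-DST-NUM
      * Filled left to right; 'FRAM' does not fit and is lost.
           MOVE WS-SRC-TXT TO WS-SHORT-TXT
           DISPLAY 'SHORT:   [' WS-SHORT-TXT ']'
      * Filled left to right; the last two bytes become spaces.
           MOVE WS-SRC-TXT TO WS-LONG-TXT
           DISPLAY 'LONG:    [' WS-LONG-TXT ']'
           STOP RUN.
```

How the numeric field is shown by DISPLAY (with or without a visible decimal point) depends on the compiler, so the comments describe the stored value and not the exact output.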
ROUNDED option
With ROUNDED option, the computer will always round the result to the PICTURE clause specification of the receiving field. It is usually coded after the field to be rounded. In a DIVIDE operation, it is coded before the REMAINDER keyword.
Ex: ADD A B GIVING C ROUNDED.
ON SIZE ERROR
If A=20 (PIC 9(02)) and B=90 (PIC 9(02)), ADD A TO B will leave 10 in B where the expected value in B is 110. ON SIZE ERROR clause is coded to trap such size errors in arithmetic operations.
If this is coded with an arithmetic statement, any operation that ends with a SIZE error will not be carried out; instead the statement that follows ON SIZE ERROR will be executed.
ADD A TO B ON SIZE ERROR DISPLAY ‘ERROR!’.
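The same A=20, B=90 case can be written as a complete program. The sketch below uses hypothetical names (SIZEERR, WS-A, WS-B) and adds the NOT ON SIZE ERROR phrase and the END-ADD scope terminator to show both paths.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SIZEERR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Same values as the A and B example in the text.
       01  WS-A              PIC 9(02) VALUE 20.
       01  WS-B              PIC 9(02) VALUE 90.
       PROCEDURE DIVISION.
      * 110 does not fit in PIC 9(02), so the ADD is not
      * carried out and WS-B keeps its old value.
           ADD WS-A TO WS-B
               ON SIZE ERROR
                   DISPLAY 'SIZE ERROR, WS-B KEPT AS ' WS-B
               NOT ON SIZE ERROR
                   DISPLAY 'SUM IS ' WS-B
           END-ADD
           STOP RUN.
```

With these values the ON SIZE ERROR branch runs and WS-B still holds 90, so the truncated value 10 is never stored.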
COMPUTE
Complex arithmetic operations can be carried out using COMPUTE statement. We can use arithmetic symbols rather than keywords and so it is simple and easy to code.
+ For ADD, - for SUBTRACT, * for MULTIPLY, / for DIVIDE and ** for exponentiation.
Rule: Left to right – 1.Parentheses
2.Exponentiation
3.Multiplication and Division
4.Addition and Subtraction
Caution: When ROUNDED is coded with COMPUTE, some compiler will do rounding for every arithmetic operation and so the final result would not be precise.
77 A PIC 999 VALUE 10.
COMPUTE A ROUNDED = (A+2.95) *10.99
Result: ROUNDED(ROUNDED(12.95) * ROUNDED(10.99)) = ROUNDED(13 * 11) = 143 or
ROUNDED(142.3205) = 142
So the result can be 143 or 142. Be cautious when using ROUNDED keyword with COMPUTE statement.

All arithmetic operators have their own explicit scope terminators. (END-ADD, END-SUBTRACT, END-MULTIPLY, END-DIVIDE, END-COMPUTE). It is suggested to use them.
CORRESPONDING is available for ADD and SUBTRACT only.
INITIALIZE
VALUE clause is used to initialize the data items in the working storage section whereas INITIALIZE is used to initialize the data items in the procedure division.
INITIALIZE sets the alphabetic, alphanumeric and alphanumeric-edited items to SPACES and numeric and numeric-edited items to ZERO. This can be overridden by the REPLACING option of INITIALIZE. FILLER and OCCURS DEPENDING ON items are not affected.
Syntax: INITIALIZE identifier-1
REPLACING (ALPHABETIC/ALPHANUMERIC/ALPHANUMERIC-EDITED/
NUMERIC/NUMERIC-EDITED)
DATA BY (identifier-2/literal-2)
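Here is a small sketch of INITIALIZE with and without REPLACING. The record layout and names (INITDEMO, WS-CUSTOMER and its fields) are invented for illustration; the comments describe the contents expected from the rules above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INITDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical record with one FILLER byte in the middle.
       01  WS-CUSTOMER.
           05  WS-NAME       PIC X(10) VALUE 'SMITH'.
           05  FILLER        PIC X(01) VALUE '/'.
           05  WS-BALANCE    PIC 9(05) VALUE 12345.
       PROCEDURE DIVISION.
      * WS-NAME becomes spaces and WS-BALANCE becomes zeros;
      * the FILLER byte is not affected and keeps its '/'.
           INITIALIZE WS-CUSTOMER
           DISPLAY '[' WS-CUSTOMER ']'
      * REPLACING overrides the default fill values:
      * WS-NAME gets 'UNKNOWN' and WS-BALANCE gets 1.
           INITIALIZE WS-CUSTOMER
               REPLACING ALPHANUMERIC DATA BY 'UNKNOWN'
                         NUMERIC DATA BY 1
           DISPLAY '[' WS-CUSTOMER ']'
           STOP RUN.
```

Note that when REPLACING is coded, only items of the categories named in it are initialized.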
ACCEPT
ACCEPT can transfer data from an input device or from system information contained in reserved data items like DATE, TIME, DAY.
ACCEPT WS-VAR1 (FROM DATE/TIME/DAY/OTHER SYSTEM VARS).
If FROM Clause is not coded, then the data is read from terminal. At the time of execution, batch program will ABEND if there is no in-stream data from JCL and there is no FROM clause in the ACCEPT clause.

DATE returns the six digit current date in YYMMDD
DAY returns the 5 digit current date in YYDDD
TIME returns the 8 digit RUN TIME in HHMMSSTT
DAY-OF-WEEK returns a single digit whose value can be 1-7 (Monday-Sunday respectively)
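The four system values can be fetched as in the sketch below. The program and field names (ACCDEMO, WS-DATE and so on) are assumptions for this example; the field sizes match the digit counts listed above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ACCDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical receiving fields, sized per the list above.
       01  WS-DATE           PIC 9(06).
       01  WS-JULIAN         PIC 9(05).
       01  WS-TIME           PIC 9(08).
       01  WS-WEEKDAY        PIC 9(01).
       PROCEDURE DIVISION.
           ACCEPT WS-DATE    FROM DATE
           ACCEPT WS-JULIAN  FROM DAY
           ACCEPT WS-TIME    FROM TIME
           ACCEPT WS-WEEKDAY FROM DAY-OF-WEEK
           DISPLAY 'DATE (YYMMDD):   ' WS-DATE
           DISPLAY 'DAY (YYDDD):     ' WS-JULIAN
           DISPLAY 'TIME (HHMMSSTT): ' WS-TIME
           DISPLAY 'DAY OF WEEK:     ' WS-WEEKDAY
           STOP RUN.
```

Because every ACCEPT here has a FROM clause, the program does not need any in-stream data from JCL.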

DISPLAY
It is used to display data. By default display messages are routed to SYSOUT.
Syntax: DISPLAY identifier1 literal1 (UPON mnemonic name)

STOP RUN, EXIT PROGRAM & GOBACK
STOP RUN is the last executable statement of the main program. It returns control back to OS.
EXIT PROGRAM is the last executable statement of sub-program. It returns control back to main program.
GOBACK can be coded in main program as well as sub-program as the last statement. It just gives the control back from where it received the control.
PERFORM STATEMENTS
PERFORM will be useful when you want to execute a set of statements in multiple places of the program. Write all the statements in one paragraph and invoke it using PERFORM wherever needed. Once the paragraph is executed, the control comes back to the next statement following the PERFORM.
1.SIMPLE PERFORM.
PERFORM PARA-1.
DISPLAY ‘PARA-1 executed’
STOP RUN.
PARA-1.
Statement1
Statement2.
It executes all the instructions coded in PARA-1 and then transfers the control to the next instruction in sequence.

2.INLINE PERFORM.
When sets of statements are used only in one place then we can group all of them within PERFORM END-PERFORM structure. This is called INLINE PERFORM.
This is equal to DO..END structure of other languages.
PERFORM
ADD A TO B
MULTIPLY B BY C
DISPLAY 'VALUE OF (A+B)*C ' C
END-PERFORM

3. PERFORM PARA-1 THRU PARA-N.
All the paragraphs between PARA-1 and PARA-N are executed once.

4. PERFORM PARA-1 THRU PARA-N UNTIL condition(s).
The identifiers used in the UNTIL condition(s) must be altered within the paragraph(s) being performed; otherwise the paragraphs will be performed indefinitely. If the condition in the UNTIL clause is met at first time of execution, then named paragraph(s) will not be executed at all.

5. PERFORM PARA-1 THRU PARA-N N TIMES.
N can be a numeric item defined in working storage or a hard coded constant.

6. PERFORM PARA-1 THRU PARA-N VARYING identifier1
FROM identifier 2 BY identifier3 UNTIL condition(s)
Initialize identifier1 with identifier2 and test the condition(s). If the condition is false execute the statements in PARA-1 thru PARA-N and increment identifier1 BY identifier3 and check the condition(s) again. If the condition is again false, repeat this process till the condition is satisfied.

7.PERFORM PARA-1 WITH TEST BEFORE/AFTER UNTIL condition(s).
With TEST BEFORE, the condition is checked first and PARA-1 is executed only if it is found false; this is the default. (Functions like DO-WHILE)
With TEST AFTER, PARA-1 is executed once and then the condition is checked. (Functions like DO-UNTIL)
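The VARYING form and the difference between TEST BEFORE and TEST AFTER can be seen in one short program. This is a sketch with invented names (PERFDEMO, WS-I, WS-TOTAL and the paragraph names are not from the original).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PERFDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical loop counter and accumulator.
       01  WS-I              PIC 9(02) VALUE 0.
       01  WS-TOTAL          PIC 9(04) VALUE 0.
       PROCEDURE DIVISION.
       000-MAIN.
      * Adds 1 + 2 + 3 + 4 + 5 = 15; WS-I is 6 at loop end.
           PERFORM 100-ADD-PARA
               VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 5
           DISPLAY 'TOTAL: ' WS-TOTAL
      * Condition is already true, so the paragraph is skipped.
           PERFORM 200-SHOW-PARA WITH TEST BEFORE UNTIL WS-I > 5
      * TEST AFTER runs the paragraph once, then checks.
           PERFORM 200-SHOW-PARA WITH TEST AFTER UNTIL WS-I > 5
           STOP RUN.
       100-ADD-PARA.
           ADD WS-I TO WS-TOTAL.
       200-SHOW-PARA.
           DISPLAY '200-SHOW-PARA EXECUTED'.
```

The message from 200-SHOW-PARA should appear exactly once, coming from the TEST AFTER form.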
EXIT statement.
COBOL reserved word that performs NOTHING. It is used as a single statement in a paragraph that indicate the end of paragraph(s) execution.
EXIT must be the only statement in a paragraph in COBOL74 whereas it can be used with other statements in COBOL85.
GO TO Usage:
In structured top-down programming GO TO is not preferable. It offers permanent control transfer to another paragraph and the chances of logic errors are much greater with GO TO than with PERFORM. The readability of the program will also be badly affected.
But still GO TO can be used within the paragraphs being performed. i.e. When using the THRU option of PERFORM statement, branches or GO TO statements, are permitted as long as they are within the range of named paragraphs.
PERFORM 100-STEP-1 THRU 300-STEP-3
..
100-STEP-1.
ADD A TO B GIVING C.
IF D = ZERO DISPLAY 'MULTIPLICATION NOT DONE'
GO TO 300-STEP-3
END-IF.
200-STEP-2.
MULTIPLY C BY D.
300-STEP-3.
DISPLAY ‘VALUE OF C:’ C.
Here GO TO used within the range of PERFORM. This kind of Controlled GO TO is fine with structured programming also!
CALL statement (Sub-Programs)
When a specific functionality needs to be performed in more than one program, it is best to write it separately and call it from each program. Sub Programs can be written in any programming language. They are typically written in a language best suited to the specific task required and thus provide greater flexibility.

Main Program Changes:
CALL statement is used for executing the sub-program from the main program. A sample of CALL statement is given below:
CALL ‘PGM2’ USING BY REFERENCE WS-VAR1, BY CONTENT WS-VAR2.
PGM2 is called here. WS-VAR1 and WS-VAR2 are working storage items.
WS-VAR1 is passed by reference. WS-VAR2 is passed by Content. BY REFERENCE is default in COBOL and need not be coded. BY CONTENT LENGTH phrase permits the length of data item to be passed to a called program.

Sub-Program Changes:
WS-VAR1 and WS-VAR2 are working storage items of main program.
As we have already mentioned, the linkage section is used for accessing external elements. As these working storage items are owned by main program, to access them in the sub-program, we need to define them in the linkage section.

LINKAGE SECTION.
01 LK-VAR1 PIC 9(04).
01 LK-VAR2 PIC 9(04).

In addition to defining them in the linkage section, the procedure division should be coded with these data items for addressability.

PROCEDURE DIVISION USING LK-VAR1, LK-VAR2.

There is a one-to-one correspondence between passed elements and received elements (CALL USING, linkage section and PROCEDURE DIVISION USING) BY POSITION. This implies that the names of the identifiers in the called and calling program need not be the same (WS-VAR1 & LK-VAR1) but the number of elements and the picture clauses should be the same.

The last statement of your sub-program should be EXIT PROGRAM. This returns the control back to main program. GOBACK can also be coded instead of EXIT PROGRAM but not STOP RUN. EXIT PROGRAM should be the only statement in a paragraph in COBOL74 whereas it can be coded along with other statements in a paragraph in COBOL85.

PROGRAM-ID. program-name IS INITIAL PROGRAM.
If IS INITIAL PROGRAM is coded along with program-id of sub program, then the program will be in initial stage every time it is called (COBOL85 feature).
Alternatively CANCEL issued after CALL, will set the sub-program to initial state.
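If the sub program is modified then it needs to be recompiled. Whether the main program also needs attention is decided by the compiler option used for the main program. If the DYNAM compiler option is used, the sub program is loaded at run time, so there is no need to recompile or relink the main program; the modified subroutine will be in effect during the next run. NODYNAM is the default; the sub program is then link-edited into the main program's load module, so the main program must be link-edited again (typically as part of a recompile) to pick up the change.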
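Putting the main program and sub-program changes together, a minimal sketch could look like the following. The program names MAINPGM and SUBPGM are made up for this example, and it assumes both programs are compiled and linked so that the CALL can find SUBPGM. The linkage items are coded at the 01 level because they are named on PROCEDURE DIVISION USING.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MAINPGM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-VAR1           PIC 9(04) VALUE 10.
       01  WS-VAR2           PIC 9(04) VALUE 20.
       PROCEDURE DIVISION.
      * WS-VAR1 is shared with the sub-program; WS-VAR2 is
      * passed as a copy, so changes to it are not seen here.
           CALL 'SUBPGM' USING BY REFERENCE WS-VAR1
                               BY CONTENT   WS-VAR2
           DISPLAY 'WS-VAR1 AFTER CALL: ' WS-VAR1
           DISPLAY 'WS-VAR2 AFTER CALL: ' WS-VAR2
           STOP RUN.
       END PROGRAM MAINPGM.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBPGM.
       DATA DIVISION.
       LINKAGE SECTION.
      * Matched to the caller's items by position, not by name.
       01  LK-VAR1           PIC 9(04).
       01  LK-VAR2           PIC 9(04).
       PROCEDURE DIVISION USING LK-VAR1 LK-VAR2.
           ADD 1 TO LK-VAR1
           ADD 1 TO LK-VAR2
           GOBACK.
       END PROGRAM SUBPGM.
```

After the call, WS-VAR1 should hold 11 while WS-VAR2 still holds 20, which shows the difference between BY REFERENCE and BY CONTENT.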
COBOL COMPILATION
COMPILATION JCL:
//SMSXL86B JOB ,'COMPILATION JCL',MSGCLASS=Q,MSGLEVEL=(1,1),CLASS=C
//COMPILE1 EXEC PGM=IGYCRCTL,PARM='XREF,APOST,ADV,MAP,LIST',REGION=0M
//STEPLIB DD DSN=SYS1.COB2LIB,DISP=SHR
//SYSIN DD DSN=SMSXL86.TEST.COBOL(SAMPGM01),DISP=SHR
//SYSLIB DD DSN=SMSXL86.COPYLIB,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSLIN DD DSN=&&LOADSET,DCB=(RECFM=FB,LRECL=80,BLKSIZE=3200),
// DISP=(NEW,PASS),UNIT=SYSDA,SPACE=(CYL,(5,10),RLSE)
//SYSUT1 DD UNIT=SYSDA,SPACE=(CYL,(1,10))
//* CODE SYSUT2 THROUGH SYSUT7 IN THE SAME WAY
//LINKEDT1 EXEC PGM=IEWL,COND=(4,LT)
//SYSLIN DD DSN=&&LOADSET,DISP=(OLD,DELETE)
//SYSLMOD DD DSN=&&GOSET(SAMPGM01),DISP=(NEW,PASS),UNIT=SYSDA,
// SPACE=(CYL,(1,1,1))
//SYSLIB DD DSN=SMSXL86.LOADLIB,DISP=SHR
//SYSUT1 DD UNIT=SYSDA,SPACE=(CYL,(1,10))
//SYSPRINT DD SYSOUT=*

//*** EXECUTE THE PROGRAM ***
//EXECUTE1 EXEC PGM=*.LINKEDT1.SYSLMOD,COND=(4,LT),REGION=0M
//STEPLIB DD DSN=SMSXL86.LOADLIB,DISP=SHR
// DD DSN=SYS1.SCEERUN,DISP=SHR
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
Compiler Options
The default options that were set up when your compiler was installed are in effect for your program unless you override them with other options. To check the default compiler options of your installation, do a compile and check in the compilation listing.

Ways of overriding the default options

1.Compiler options can be passed to COBOL Compiler Program (IGYCRCTL) through the PARM in JCL.

2.PROCESS or CBL statement with compiler options, can be placed before the identification division.
3.If the organization uses any third party product or its own utility then these options can be coded in the pre-defined line of the utility panel.

Precedence of Compiler Options
(Highest precedence). Installation defaults, fixed by the installation.
Options coded on PROCESS /CBL statement
Options coded on JCL PARM parameters
(Lowest Precedence). Installation defaults, but not fixed.
Here I have posted the most important concepts that I feel a beginner should know. There is a lot to understand in COBOL. Please let me know the topic that you want to understand and I will post it soon. Thanks for reading.
