DB2

Sunday, September 5, 2010

Db2 isolation level in bind

isolation level

An attribute that defines the degree to which an application process is isolated from other concurrently executing application processes.

uncommitted read (UR)

An isolation level that allows an application to access uncommitted changes of other transactions. The application does not lock other applications out of the row it is reading, unless the other application attempts to drop or alter the table.

cursor stability (CS)

An isolation level that locks any row accessed by a transaction of an application while the cursor is positioned on the row. The lock remains in effect until the next row is fetched or the transaction is terminated. If any data is changed in a row, the lock is held until the change is committed to the database.

read stability (RS)

An isolation level that locks only the rows that an application retrieves within a transaction. Read stability ensures that any qualifying row that is read during a transaction is not changed by other application processes until the transaction is completed, and that any row changed by another application process is not read until the change is committed by that process. Read stability allows more concurrency than repeatable read, and less than cursor stability.

repeatable read (RR)

An isolation level that locks all the rows in an application that are referenced within a transaction. When a program uses repeatable read protection, rows referenced by the program cannot be changed by other programs until the program ends the current transaction.

phantom row

A table row that can be read by application processes that are executing with any isolation level except repeatable read. When an application process issues the same query multiple times within a single unit of work, additional rows can appear between queries because of the data being inserted and committed by application processes that are running concurrently.

The table below summarizes how the four isolation levels differ.

	UR	CS	RS	RR
Can the application see uncommitted changes made by other application processes?	Yes	No	No	No
Can the application update uncommitted changes made by other application processes?	No	No	No	No
*Can the re-execution of a statement be affected by other application processes? See phenomenon P3 (phantom) below.*	Yes	Yes	Yes	No
Can "updated" rows be updated by other application processes?	No	No	No	No
Can "updated" rows be read by other application processes that are running at an isolation level other than UR?	No	No	No	No
Can "updated" rows be read by other application processes that are running at the UR isolation level?	Yes	Yes	Yes	Yes
*Can "accessed" rows be updated by other application processes? See phenomenon P2 (nonrepeatable read) below.*	Yes	Yes	No	No
Can "accessed" rows be read by other application processes?	Yes	Yes	Yes	Yes
*Can "current" row be updated or deleted by other application processes? See phenomenon P1 (dirty-read) below.*	See Note below	See Note below	No	No

Examples of Phenomena:

P1 (Dirty Read). Unit of work UW1 modifies a row. Unit of work UW2 reads that row before UW1 performs a COMMIT. If UW1 then performs a ROLLBACK, UW2 has read a nonexistent row.

P2 (Nonrepeatable Read). Unit of work UW1 reads a row. Unit of work UW2 modifies that row and performs a COMMIT. If UW1 then re-reads the row, it might receive a modified value.

P3 (Phantom). Unit of work UW1 reads the set of n rows that satisfies some search condition. Unit of work UW2 then INSERTs one or more rows that satisfy the search condition. If UW1 then repeats the initial read with the same search condition, it obtains the original rows plus the inserted rows.
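
The table and the three phenomena can be condensed into a small lookup. The following is only a sketch in Python: the names `PHENOMENA`, `LEVELS` and `weakest_level_preventing` are made up for illustration, and the values are simply read off the table rows for uncommitted changes, "accessed" rows and statement re-execution.

```python
# Which read phenomena each isolation level allows, as read from the table
# above. True means the phenomenon can occur at that level.
PHENOMENA = {
    "UR": {"dirty_read": True,  "nonrepeatable_read": True,  "phantom": True},
    "CS": {"dirty_read": False, "nonrepeatable_read": True,  "phantom": True},
    "RS": {"dirty_read": False, "nonrepeatable_read": False, "phantom": True},
    "RR": {"dirty_read": False, "nonrepeatable_read": False, "phantom": False},
}

# Ordered from most concurrency to least concurrency.
LEVELS = ["UR", "CS", "RS", "RR"]


def weakest_level_preventing(*unwanted):
    """Return the most concurrent level at which none of `unwanted` can occur."""
    for level in LEVELS:
        if not any(PHENOMENA[level][name] for name in unwanted):
            return level
    raise ValueError("no isolation level prevents: " + ", ".join(unwanted))


if __name__ == "__main__":
    print(weakest_level_preventing("dirty_read"))          # CS
    print(weakest_level_preventing("nonrepeatable_read"))  # RS
    print(weakest_level_preventing("phantom"))             # RR
```

Read this way, the choice of ISOLATION at BIND time is a trade: each step from UR toward RR rules out one more phenomenon and gives up some concurrency.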

Monday, August 23, 2010

Precompiler+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

Collection Ideas

The previous post said that a `COLLECTION` is simply a way of grouping `PACKAGE`s into meaningful (for you) groups. You could use `COLLECTION`s to separate programs for different application areas, such as payroll and inventory. Another use might be to separate programs bound with `ISOLATION UR` from programs bound with `ISOLATION CS`. `COLLECTION`s are simply high-level grouping names to designate that this group of packages share something, anything, in common.

COLLECTIONs enable you to organize your PACKAGEs into like-kind groups. In DB2's younger days, with multiple DBRMs being bound into PLANs, all the DBRMs in a single PLAN had to be bound with the same BIND parameters. However, today you can BIND each PACKAGE into a COLLECTION that has a customized set of BIND parameters associated with it. An example would be to isolate all programs using REOPT(VARS) in one COLLECTION and all programs using OPTHINT in another. DEGREE(ANY) is another BIND parameter that you may want to be a bit more vigilant in monitoring. An easy way of keeping an eye on programs (or children) is to put them in a room together. In other words, BIND parameters are much more granular today than they were when DB2 was young.

With the advent of BIND PACKAGE and the one-to-one relationship of a program to a package, we were given the ability to name the high-level qualifier for the tables accessed by the program. Therefore, the DBRM for one program could be bound into two different COLLECTIONs. The DBRM for program ABC123 could be bound into a COLLECTION called colcorp01, using corp01 as the table high-level qualifier. The same DBRM could be bound into a COLLECTION called colcorp02, using corp02 as the high-level qualifier. Or, you could BIND the same DBRM into colstress to run it against stress test tables and BIND it into a second COLLECTION (say, coltest) to run against regular test tables. Or, you could BIND a DBRM into a COLLECTION called colur to use when you access read-only decision support tables using ISOLATION UR and into a COLLECTION called colcs when you use active production data. There are dozens of examples. Just use your imagination.

So, at runtime, whichever approach you chose, you now have a PACKAGE with the exact same tattoo/timestamp/consistency-token in two different COLLECTIONs. How do you tell DB2 in which collection to search for Danny? Normally, DB2 would search through all of the COLLECTIONs in the named PLAN. But, if you want to search only one COLLECTION, you simply tell DB2 in your program. You can specify which COLLECTION to search by using an SQL statement called SET CURRENT PACKAGESET. PACKAGESET is simply a synonym for COLLECTION. Therefore, if you set the current PACKAGESET to colcorp01, you will access corp01's tables. If you set the current PACKAGESET to colcorp02, you will access corp02's tables. And the beauty of this is that you only have to maintain one program.
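
The search described above can be pictured with a toy model. This is a sketch in Python, not DB2 internals: the function `find_package`, the dictionary layout and the token value `TOKEN-1` are all hypothetical, and only the collection names and qualifiers come from the example in this post.

```python
# Toy model of the package search. Keys are (collection, program); values are
# (consistency token, table high-level qualifier). Illustrative data only.
PACKAGES = {
    ("colcorp01", "ABC123"): ("TOKEN-1", "corp01"),
    ("colcorp02", "ABC123"): ("TOKEN-1", "corp02"),
}

# Collections named in the PLAN, in search order.
PLAN_COLLECTIONS = ["colcorp01", "colcorp02"]


def find_package(packages, plan_collections, program, token, packageset=None):
    """Return (collection, qualifier) of the first package that matches.

    Without a packageset, every collection in the plan is searched in order.
    With a packageset, only that one collection is searched.
    Returns None when nothing matches (DB2 would report SQLCODE -805).
    """
    search = [packageset] if packageset else plan_collections
    for collection in search:
        entry = packages.get((collection, program))
        if entry is not None and entry[0] == token:
            return collection, entry[1]
    return None


if __name__ == "__main__":
    # No PACKAGESET: the first collection in the plan wins.
    print(find_package(PACKAGES, PLAN_COLLECTIONS, "ABC123", "TOKEN-1"))
    # PACKAGESET set to colcorp02: the same program now runs against corp02.
    print(find_package(PACKAGES, PLAN_COLLECTIONS, "ABC123", "TOKEN-1",
                       packageset="colcorp02"))
```

In the real program the switch is the SQL statement SET CURRENT PACKAGESET; the `packageset` argument only stands in for that special register.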

Versions

Suppose program A is changed and moved back into production. Before the program was changed it ran in 10 minutes and never bothered anyone. After the change, all the other programs running at the same time are experiencing `-911` timeouts. How do you fall back gracefully and rapidly to the prior version of the program?

At precompile time you can specify a VERSION ID. If the VERSION ID is the same as the current version, BIND will overlay the PACKAGE in its COLLECTION. But, if the VERSION ID is different from the current VERSION ID, BINDing the DBRM will produce a new PACKAGE that won't overlay the prior PACKAGE for the program. You'll have two PACKAGEs for the same program in the same COLLECTION. If you also move the current LOAD module with its old tattoo timestamp into a different loadlib (COBOL.BACKUP), the compile will not overlay it. Then, when you compile the modified source code, you'll have a LOAD module with the new tattoo/timestamp in the current loadlib. If you execute the new LOAD module, you'll find the new PACKAGE. If the system suffers, you can cancel the job and move the old LOAD module back into production simply by pointing to COBOL.BACKUP.
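
The overlay rule for VERSION IDs can be sketched as follows. This is a simplified Python model under the assumption that a package is identified by collection, program name and VERSION ID; the function names, the collection `COLLPROD`, the version labels and the tokens are hypothetical.

```python
def bind_package(packages, collection, program, version, token):
    """Bind a DBRM into `packages` (a dict used as a stand-in for DB2).

    Returns "replaced" when a package with the same VERSION ID was overlaid,
    or "added" when the VERSION ID was new and a second package now exists.
    """
    key = (collection, program, version)
    outcome = "replaced" if key in packages else "added"
    packages[key] = token
    return outcome


def versions_of(packages, collection, program):
    """List the VERSION IDs that exist for one program in one collection."""
    return sorted(v for (c, p, v) in packages if c == collection and p == program)


if __name__ == "__main__":
    packages = {}
    print(bind_package(packages, "COLLPROD", "PGMA", "V1", "TOKEN-OLD"))   # added
    print(bind_package(packages, "COLLPROD", "PGMA", "V1", "TOKEN-OLD2"))  # replaced
    print(bind_package(packages, "COLLPROD", "PGMA", "V2", "TOKEN-NEW"))   # added
    print(versions_of(packages, "COLLPROD", "PGMA"))                       # ['V1', 'V2']
```

Because both versions stay bound, falling back is only a matter of running the LOAD module whose tattoo matches the older package.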

Copies

There are too many other nuances and possibilities to mention; however, one feature that may be useful is the ability to copy a package from one collection to another. If the statistics on your table vary greatly from daytime to evening or beginning of the month to end of the month, you can BIND a PACKAGE in a COLLECTION called colday or colbegin when the statistics in the CATALOG are representative of your daytime or beginning of the month table. You can then COPY that PACKAGE into another COLLECTION called colnight or colend when the statistics in the CATALOG are representative of your nighttime or end of the month table. COPY does a REBIND and uses the DBRM in the CATALOG as its input. Therefore, the tattoo/timestamp doesn't change. If you check the time of day or the day of month at the beginning of the program, you can SET CURRENT PACKAGESET to the appropriate COLLECTION for DB2 to search for Danny.
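
The time-of-day check mentioned above might look like the sketch below. It is written in Python for brevity rather than COBOL, and the cut-over hours (06:00 and 18:00) are an assumption made up for the example; only the collection names colday and colnight come from the text.

```python
from datetime import datetime


def choose_collection(now, day_start=6, day_end=18):
    """Pick the collection whose access paths match the current statistics.

    `day_start` and `day_end` are hypothetical hours; a real shop would use
    whatever boundary matches its own workload.
    """
    return "colday" if day_start <= now.hour < day_end else "colnight"


if __name__ == "__main__":
    # The program would pass the result to SET CURRENT PACKAGESET.
    print(choose_collection(datetime(2010, 8, 23, 9, 30)))   # colday
    print(choose_collection(datetime(2010, 8, 23, 23, 15)))  # colnight
```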

Thursday, August 19, 2010

Precompiler+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

Packages and Collections

In V2R3, the DB2 developers solved the seven problems I listed (and many more) by introducing another layer in the program preparation procedure. The precompile step still split the program into tattooed twins, and the treatment of the modified source code stayed the same. But the DBRM could now be bound either into a PLAN (the old way) or into a PACKAGE. Although the relationship between a DBRM and a PLAN was one-to-many, the relationship between a DBRM and a PACKAGE was always one-to-one. Most of the work of BIND PLAN was moved into BIND PACKAGE. Therefore, if PGMA changed, only PACKAGE A would have to be bound.
If only one DBRM could be bound into a PACKAGE, but PGMA could still CALL PGMB, then a structure was needed to gather all of the PACKAGEs into a searchable list. This structure became a packagelist, rather than a memberlist, bound into a PLAN.

This tiny change solved many, but not all, of the problems inherent in the "memberlist of DBRMs bound into a PLAN" technique. To solve a few more problems, IBM introduced the concept of COLLECTIONs. A collection is simply a way of grouping packages into meaningful (for you) groups. You could BIND all of the packages that must run against Corporation 1's tables into one COLLECTION and all of the packages that must run against Corporation 2's tables into another. Or, you could use COLLECTIONs to separate programs for different application areas, such as payroll and inventory.

Another use might be to separate programs bound with ISOLATION UR from programs bound with ISOLATION CS. COLLECTIONs are simply high-level grouping names to designate that this group of packages share something, anything, in common.

With the introduction of COLLECTIONs, our BIND PLAN process (think search chain) can include a packagelist of fully qualified package names such as COLLPAYROLL.PGMA, COLLPAYROLL.PGMB, and COLLCOMMON.PGMX. Or, if we choose, we can just substitute an asterisk for the program name and list (COLLPAYROLL.*, COLLCOMMON.*). Then any program bound into the COLLECTION will be accessible by the named PLAN.
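
How a packagelist entry with an asterisk behaves can be sketched in a few lines of Python. The function name `plan_can_reach` is hypothetical, and the matching rule is a simplification that covers only the two entry forms mentioned above (a fully qualified name, or an asterisk in place of the program name).

```python
def plan_can_reach(packagelist, collection, program):
    """Return True if the plan's packagelist covers collection.program.

    Each entry is either "COLLECTION.PROGRAM" or "COLLECTION.*".
    """
    for entry in packagelist:
        entry_collection, entry_program = entry.split(".", 1)
        if entry_collection == collection and entry_program in ("*", program):
            return True
    return False


if __name__ == "__main__":
    packagelist = ["COLLPAYROLL.*", "COLLCOMMON.PGMX"]
    print(plan_can_reach(packagelist, "COLLPAYROLL", "PGMA"))  # True
    print(plan_can_reach(packagelist, "COLLCOMMON", "PGMX"))   # True
    print(plan_can_reach(packagelist, "COLLCOMMON", "PGMY"))   # False
```

The asterisk form means a newly bound package in COLLPAYROLL becomes reachable without touching the PLAN at all.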

Where Are Packages and Plans?

When you BIND a DBRM into a PLAN, or BIND a PACKAGE into a COLLECTION and then the COLLECTION(s) into a PLAN, the information must be stored somewhere safe inside DB2 until it's needed at run time. These items (PLANs and PACKAGEs) aren't stored in the DB2 Catalog. Rather, they're stored in the DB2 Directory. In fact, you can think of the Directory as the DB2 loadlib for the Danny portion of your program, complete with tattoo.

Run Time

At run time, the load module starts up and eventually hits a paragraph containing a CALL to DB2. This CALL contains information such as a description of the tattoo, the content of your SQL host variables (now populated), the statement number, and so on. The CALL invokes the COBOL-DB2 interface program, which connects to DB2. If the run-time code necessary to execute your SQL isn't currently resident inside DB2 (in the EDMPOOL), we go to the buffer pool (BP0) assigned to the DB2 Directory and look there. If we don't find Danny there, we go through VSAM to disk to look in the COLLECTIONs named in the PLAN for the PACKAGE with the same name and the same tattoo, also known as the consistency token or timestamp.

And if you don't find the twin anywhere in DB2 (not that this has ever happened to anyone reading this column), you get a -805 error. If you're still using the older technique of binding DBRMs directly into PLANs via a memberlist, then an unsuccessful search for Danny will result in a -818 error code.

In my next column, I'll give you some ideas about the various ways of using the concept of the COLLECTION. I'll also write about VERSIONs and how they're used.

Saturday, August 14, 2010

Precompiler+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

The DBRM (Danny)

At BIND time, DB2 created run-time executable instructions for the SQL portion of the program. But where are those instructions, and what are they called now that the term DBRM no longer applies?

The truth is, you have a choice. You can BIND the instructions for the SQL that was in the DBRM into a PLAN (the old way), or you can BIND the instructions into a PACKAGE (the not-so-old-but-no-longer-new way). The reason for this choice is historical. Back when knights were bold, dragons walked the Earth, and DB2 and some of us were young, DBRMs were bound directly into PLANs. In today's DB2 (since V2R3), there are two ways of doing BINDs. You may continue to BIND DBRMs into PLANs, or you may BIND DBRMs into PACKAGEs. With the second option, you keep your PLAN but use it only as a search chain. This column explains why things changed in V2R3. A future column will explain how things have continued to change in v.7 and v.8.

In the early releases of DB2, the DBRM (SQL originally embedded in our COBOL program but separated at precompile time into a PDS member) was bound into a PLAN (PLANA). This method worked just fine as long as the program was a standalone program. You coded the JCL to execute program PGMA naming the PLAN "PLANA," and at run time the twins found each other. However, things got a bit complicated when PGMA needed to CALL PGMB. Because only one PLAN could be named in an execute statement, the PLAN had to contain run-time instructions for both PGMA and PGMB. This problem was solved by having the BIND instruction for PLANA name a memberlist; the DBRMs for both PGMA and PGMB were listed as members. And if PGMB called PGMC, then the three would be listed as members. And if C called D, which could call E, F, G, or H, which could call I, J, K, L, or M, which could .... Well, you get the idea.
Memberlists got longer and longer. What were (and still are, if you cling to the old technique) the drawbacks of having a very long list?

Remember that DB2 authorization and SQL syntax are checked at BIND time. Access path alternatives are also weighed, the least-cost path is chosen (based on current statistics and system resources), and run-time instructions are created for the chosen path. Well, if the PLAN contains one member, A, this process should be quick. But what if there are 20 or 50 or 500 DBRMs in the memberlist? The BIND could take hours.

So, the PLAN, which took more than a while to BIND, contains 500 members. What if one of the programs, PGMA, changes? When the source code changes, the program must be precompiled. And precompile changes the tattoo and creates a new DBRM. That new DBRM must be bound into the PLAN. When the PLAN is bound, all 500 DBRMs (not just the modified PGMA) will be rebound. It could take hours to BIND a PLAN, even though 499 of the 500 programs haven't changed.

Also, remember that BIND is an opportunity to reassess and change the access paths for not only the modified program, but also every single program in the memberlist. If one program changes, every program in the list will go through BIND.

What if you want to modify PGMB to call a new program, PGMZ? You must not only precompile modified B and new Z, you also must add new PGMZ to the memberlist and BIND the whole list, all 501 DBRMs.

You want to remove PGMT? Edit the memberlist and then BIND the PLAN again with the remaining 500 members in the list and wait impatiently while DBRMs that haven't changed go through the BIND process.

Okay, modified program A turns your processor over on its back, casters up. You want to fall back to the original version of PGMA. And exactly how would you do that quickly? If (and it's a big if) you have the old DBRM with its old tattoo, you could move it into the DBRMlib and BIND the entire 500-member list (even though only A had regressed) and replace the new loadlib member A with the prior loadlib member A. Or, if you have the old source code for A, you could precompile it to recreate both the modified source code and the DBRM, and then COMPILE, LINK, and BIND, which would BIND all 500 DBRMs in the member list.

Program Q has to run against two sets of tables, one for Corporation 1 and a second for Corporation 2. The sets of tables have identical names but different high-level qualifiers. You could use synonyms, but they're unwieldy; binding during a time when a synonym points to the wrong set of tables could cause disaster.
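
The cost of a long memberlist can be made concrete with a small count. The sketch below is a Python illustration, not a measurement: the program names are generated placeholders, and the second function shows, for contrast, the one-package-per-DBRM arrangement covered in the later posts of this series.

```python
def dbrms_bound_with_memberlist(memberlist, changed):
    """With DBRMs bound into a PLAN, any change rebinds the whole memberlist."""
    if any(program in memberlist for program in changed):
        return list(memberlist)
    return []


def dbrms_bound_with_packages(memberlist, changed):
    """With one PACKAGE per DBRM, only the changed programs are bound."""
    return [program for program in memberlist if program in changed]


if __name__ == "__main__":
    members = [f"PGM{n:03d}" for n in range(1, 501)]  # 500 placeholder names
    print(len(dbrms_bound_with_memberlist(members, {"PGM001"})))  # 500
    print(len(dbrms_bound_with_packages(members, {"PGM001"})))    # 1
```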

These (and other) quandaries faced many DBAs in the days before V2R3 and the advent of PACKAGEs. What is a PACKAGE? Check out the next post.

Thursday, August 12, 2010

Modified source+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

Until now, we have done a PRECOMPILE to separate our COBOL program into twins: Arnold (that's Governor Arnold now) as the modified source code (the SQL commented out and DCLGENs now INCLUDEd) and Danny as the DBRM containing the SQL that used to be in the source code. The COBOL source code (without any SQL) was compiled into run-time executable instructions for the COBOL portion of the program; DBRM Danny (all the SQL that used to be in the source module) went through BIND to generate run-time instructions for the SQL in our COBOL program.

But some questions remain, such as how Arnold, over in the COBOL loadlib, will find his long-lost twin, exactly what we were binding, and where we would put it when we finished.

So the study continues.

The Modified Source Code (Arnold)

At precompile time, when the SQL was stripped out of our program and moved into the DBRM (Danny) leaving only COBOL in the modified source code, you must have wondered how COBOL Arnold would ever find DB2 SQL Danny; in other words, how the COBOL would ever execute any SQL.

The explanation is simple. All of the executable SQL (and not all SQL is executable — a DECLARE CURSOR isn't, for example) in the COBOL program was replaced with a CALL statement (in our example, a COBOL CALL). The modified source code, complete with its calls to DB2 SQL Danny, was compiled and "linked" into a LOAD module. When this LOAD module is executed and hits a paragraph that once contained SQL (but now contains a CALL to Danny), there will be run-time executable code that knows how to link to a COBOL-DB2 interface module and connect to DB2, where it will find Danny and the run-time executable code for the SQL statement that previously had been in the paragraph.

Remember that we tattooed the twins? Well, this CALL contains the information DB2 needs to confirm not only that this LOAD MODULE is Arnold (complete with requisite tattoo), but also that it's the exact same Arnold that came out of the exact same precompile step as Danny. The CALL looks at the PLAN named in the execute statement of the job control language (JCL) and searches for the Danny out in DB2 with the same tattoo.

The next post is about the most important member of this process: the DBRM.

Precompiler+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

ALL ABOUT BIND

BIND connects to the DB2 in which the program's LOAD module will run, reads the DBRM serially, and then performs three tasks.
The first of the BIND tasks is an authorization check. DB2 must make sure that the programmer has the BIND authority and the SQL authority to perform the requested SQL task (for example, updating the payroll master). When using standard authorization procedures, DB2 won't let you BIND a DBRM if you don't have the authority to execute the SQL that's in the DBRM. This is why you may have the authorization to BIND in development (accessing development tables) but don't have authorization to BIND in production, where the SQL accesses production tables.

The second BIND task is a bit redundant. BIND, like precompile, must also check the syntax of the SQL, but the BIND check is more sophisticated. Instead of using the top, DECLARE TABLE portion of the DCLGEN, BIND uses the DB2 CATALOG table information to make sure that the column names are valid, that comparisons are numeric-to-numeric, and so on. This second check is needed because the precompiler's check relied only on the DCLGEN, and you could have a DCLGEN without having the DB2 table.
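
Why the second check matters can be shown with a tiny comparison. This Python sketch assumes a made-up table with columns EMPNO, SALARY and BONUS; the function `unknown_columns` is hypothetical and checks only column names, whereas the real BIND validates much more.

```python
def unknown_columns(sql_columns, known_columns):
    """Return the columns used by the SQL that the reference does not define."""
    return sorted(set(sql_columns) - set(known_columns))


if __name__ == "__main__":
    # Hypothetical data: the DCLGEN is out of date and still lists BONUS,
    # but the real table described in the CATALOG no longer has it.
    dclgen_columns = {"EMPNO", "SALARY", "BONUS"}
    catalog_columns = {"EMPNO", "SALARY"}
    sql_columns = ["EMPNO", "BONUS"]

    print(unknown_columns(sql_columns, dclgen_columns))   # [] (precompile is happy)
    print(unknown_columns(sql_columns, catalog_columns))  # ['BONUS'] (BIND objects)
```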

The third, and most important, BIND task is to come up with run-time instructions for the SQL in the DBRM. Each SQL statement is parsed and all of the possible (realistic) methods for retrieving the desired columns and rows from the table are weighed, measured, and evaluated based on possible estimated I/O, CPU, and SORT overhead. A ton of information is used as input to the BIND process, not just CATALOG information put there by running the RUNSTATS utility. BIND input includes, for example:

Indexes (what columns are in the indexes?)
Columns (how long is this column and how much room will it occupy in a SORT record?)
System resources (how big are the system resources, buffer pool, and RIDPOOL?)
Processors (how big are they and how many engines do they have?)
DB2 (what release is running?)
Parameters (what are the values of the BIND parameters?)

After all that input (and more) is weighed and compared, the cheapest, most cost-effective access path is chosen, and the run-time instructions for that one path are created. (Interestingly, DB2 BIND sometimes generates instructions for more than one path.) This process is called optimization, and it's repeated for each SQL statement in the DBRM until all access paths are decided and the run-time instructions are created for each. As the optimizer decides on each path, writes are done to DB2.
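
The idea of weighing candidate paths and keeping the cheapest can be sketched as below. This is a deliberately naive Python model: the real optimizer's cost formulas are not described in this post, so the equal weights, the `AccessPath` fields and all the numbers are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPath:
    name: str
    io: float    # estimated I/O cost (made-up units)
    cpu: float   # estimated CPU cost (made-up units)
    sort: float  # estimated SORT overhead (made-up units)


def estimated_cost(path, io_weight=1.0, cpu_weight=1.0, sort_weight=1.0):
    """Combine the three estimates into one number; weights are hypothetical."""
    return io_weight * path.io + cpu_weight * path.cpu + sort_weight * path.sort


def choose_path(candidates):
    """Return the candidate with the lowest estimated cost."""
    return min(candidates, key=estimated_cost)


if __name__ == "__main__":
    candidates = [
        AccessPath("tablespace scan", io=1000.0, cpu=50.0, sort=0.0),
        AccessPath("matching index scan", io=12.0, cpu=5.0, sort=0.0),
        AccessPath("index scan plus sort", io=12.0, cpu=5.0, sort=40.0),
    ]
    print(choose_path(candidates).name)  # matching index scan
```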

BIND checks to see if you bound with the parameter EXPLAIN(YES); if so, it writes documentary evidence about the chosen path to the PLAN_TABLE and to the DSN_STATEMNT_TABLE for your edification.
BIND also writes a lot of information to multiple CATALOG tables, documenting the fact that the BIND did occur. In fact, the tattooed DBRM, which is not used at run time, is moved into the CATALOG. Objects chosen by the optimizer are documented in the CATALOG in cross-reference tables. And BIND parameters are recorded in the CATALOG also.

WHERE ARE THE INSTRUCTIONS?

It's interesting that the actual instructions for the access path are not written to the CATALOG. You can't look at information in the DB2 CATALOG to figure out whether a query will do synchronous or asynchronous reads at run time. You can't tell if the query will match on three columns of an index or five. The actual run-time instructions aren't stored in the CATALOG. They're definitely not in the DBRM, which is input to the BIND. So, where are they stored?
This question is one of the many reasons that this column grew into two parts. Where in the heck are the run-time instructions? Should you ever use the PLAN_TABLE at run time? Are the run-time instructions in a package, a plan, or a version? All of the above? The infamous "it depends"? Where are the instructions stored while you wait for the LOAD module to run? Many such questions come to mind, so don't worry.
Stay tuned for answers to these and other questions.

Wednesday, August 11, 2010

Precompiler+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

No matter how long programmers have worked with DB2 for z/OS and OS/390, they still ask the difference between a plan and a package, and what in the heck a collection is. I plan to write posts on this topic, one by one, so that I remember it too.

THE PRECOMPILER

The DB2 Precompiler does not need DB2 to run. It carries out three primary tasks as it reads the program serially, top-to-bottom, looking for DB2 delimiters.
First, if the delimiters surround an INCLUDE statement, the Precompiler goes to the INCLUDE library named in the job control language data definition statement and pulls the included MEMBERNAME into the program. This function is the same as a COBOL COPY MEMBERNAME; the only difference is timing. COBOL COPYBOOKs get copied in at COMPILE time; DB2 INCLUDEs get copied in at precompile time.

The most common item INCLUDEd in a program was (and is) a DCLGEN. DCLGENs are structures that describe a table. One DCLGEN is usually included for each table that the program will access at run time. Each DCLGEN is a two-part structure consisting of a DECLARE TABLE statement, which describes the table in DB2 SQL language, and a COBOL structure that describes the table using an 01-Level COBOL working storage structure (much like a typical copybook for a VSAM file).

Second, if the delimiters surround an SQL statement, the precompiler does a very basic syntax check to make sure that the column and table names are valid (that they're spelled correctly and that the columns and the table exist). Many DBAs and programmers think that this validation is done by reading the DB2 CATALOG, but they're wrong. Remember, the precompiler doesn't need DB2 or its CATALOG. DB2 might not even be installed on the machine. The DB2 Precompiler uses the top part of the DCLGEN to validate the SQL syntax.

The third, and most important, task performed by the DB2 Precompiler is to split the program into two parts: a COBOL and a DB2 part. All of the SQL that the programmer carefully embedded is stripped out of the program and put into its own partitioned data set (PDS) member, called a DBRM. A single program containing two languages, COBOL and SQL, goes into the DB2 Precompiler and two pieces come out. Twins, but fraternal twins — much like Arnold Schwarzenegger and Danny DeVito. Arnold looks just like his COBOL mother, and Danny looks just like his DB2 father. COBOL Arnold, with all of the SQL commented out, goes down one path in life. SQL Danny, containing only SQL, goes down a different path in life.

The twins, separated at birth, have a tendency to lose each other. To help the twins find each other later in life (in other words, at run time), the precompiler engraves each with identical tattoos. The tattoo is carried forward with COBOL Arnold, through compile and link edit, into the LOAD module in the LOAD library. The tattoo is part of the run-time executable code of the LOAD module. The same tattoo is carried forward with SQL Danny through BIND. BIND is to SQL what COMPILE is to COBOL. The purpose of COBOL COMPILE is to come up with run-time code for the COBOL. The purpose of BIND is to come up with run-time executable code for the SQL.

Both sets of code bear identical tattoos (timestamps or consistency tokens).
So, the COBOL twin becomes a transportable load module in the COBOL LOADLIB and the SQL becomes a transportable DBRM in the DBRMLIB. Just as the COBOL twin had to be compiled, the DBRM twin has to go through BIND to create the run-time executable code for the DB2 portion of the COBOL program and put that executable code into the "right" DB2 subsystem.
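
The split into tattooed twins can be imitated with a short script. This is a rough Python sketch, not the real precompiler: it assumes the SQL is delimited by EXEC SQL and END-EXEC, it replaces every statement with a CALL (the real precompiler treats non-executable statements such as DECLARE CURSOR differently), and the interface name `DB2-INTERFACE`, the `STMT-n` operands and the token value are placeholders.

```python
import re

# Everything between the delimiters counts as one SQL statement.
SQL_BLOCK = re.compile(r"EXEC SQL(.*?)END-EXEC", re.DOTALL)


def precompile(source, token):
    """Split `source` into (arnold, danny), both stamped with the same token.

    arnold: the modified source, with each SQL statement replaced by a CALL.
    danny:  the DBRM stand-in, holding the extracted SQL statements.
    """
    statements = []

    def replace(match):
        # Normalize whitespace and remember the statement for the DBRM.
        statements.append(" ".join(match.group(1).split()))
        number = len(statements)
        return (f"*    SQL statement {number} moved to DBRM\n"
                f"     CALL 'DB2-INTERFACE' USING STMT-{number}")

    modified = SQL_BLOCK.sub(replace, source)
    arnold = {"token": token, "source": modified}
    danny = {"token": token, "sql": statements}
    return arnold, danny


if __name__ == "__main__":
    program = (
        "     MOVE 10 TO WS-ID.\n"
        "     EXEC SQL\n"
        "       SELECT NAME INTO :WS-NAME FROM EMP WHERE ID = :WS-ID\n"
        "     END-EXEC.\n"
        "     DISPLAY WS-NAME.\n"
    )
    arnold, danny = precompile(program, token="TATTOO-0001")
    print(arnold["source"])
    print(danny["sql"])
    print(arnold["token"] == danny["token"])  # True
```

The matching token is the whole point: it is what lets the LOAD module and the bound SQL recognize each other at run time.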