DB2: bind

Showing posts with label bind. Show all posts

Sunday, September 5, 2010

Db2 isolation level in bind

isolation level

An attribute that defines the degree to which an application process is isolated from other concurrently executing application processes.

uncommitted read (UR)

An isolation level that allows an application to access uncommitted changes of other transactions. The application does not lock other applications out of the row it is reading, unless the other application attempts to drop or alter the table.

cursor stability (CS)

An isolation level that locks any row accessed by a transaction of an application while the cursor is positioned on the row. The lock remains in effect until the next row is fetched or the transaction is terminated. If any data is changed in a row, the lock is held until the change is committed to the database.

read stability (RS)

An isolation level that locks only the rows that an application retrieves within a transaction. Read stability ensures that any qualifying row that is read during a transaction is not changed by other application processes until the transaction is completed, and that any row changed by another application process is not read until the change is committed by that process. Read stability allows more concurrency than repeatable read, and less than cursor stability.

repeatable read (RR)

An isolation level that locks all the rows in an application that are referenced within a transaction. When a program uses repeatable read protection, rows referenced by the program cannot be changed by other programs until the program ends the current transaction.

phantom row

A table row that can be read by application processes that are executing with any isolation level except repeatable read. When an application process issues the same query multiple times within a single unit of work, additional rows can appear between queries because of the data being inserted and committed by application processes that are running concurrently.

Can we better understand by the below table....

	UR	CS	RS	RR
Can the application see uncommitted changes made by other application processes?	Yes	No	No	No
Can the application update uncommitted changes made by other application processes?	No	No	No	No
*Can the re-execution of a statement be affected by other application processes? See phenomenon P3 (phantom) below.*	Yes	Yes	Yes	No
Can "updated" rows be updated by other application processes?	No	No	No	No
Can "updated" rows be read by other application processes that are running at an isolation level other than UR?	No	No	No	No
Can "updated" rows be read by other application processes that are running at the UR isolation level?	Yes	Yes	Yes	Yes
*Can "accessed" rows be updated by other application processes? See phenomenon P2 (nonrepeatable read) below.*	Yes	Yes	No	No
Can "accessed" rows be read by other application processes?	Yes	Yes	Yes	Yes
*Can "current" row be updated or deleted by other application processes? See phenomenon P1 (dirty-read) below.*	See Note below	See Note below	No	No

Examples of Phenomena:

Dirty Read. Unit of work UW1 modifies a row. Unit of work UW2 reads that row before UW1 performs a COMMIT. If UW1 then performs a ROLLBACK, UW2 has read a nonexistent row.

Nonrepeatable Read. Unit of work UW1 reads a row. Unit of work UW2 modifies that row and performs a COMMIT. If UW1 then re-reads the row, it might receive a modified value.

Phantom. Unit of work UW1 reads the set of n rows that satisfies some search condition. Unit of work UW2 then INSERTs one or more rows that satisfies the search condition. If UW1 then repeats the initial read with the same search condition, it obtains the original rows plus the inserted rows.

Monday, August 23, 2010

Precompiler+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

Collection Ideas

said that a `COLLECTION` is simply a way of grouping `PACKAGE`s into meaningful (for you) groups. You could use `COLLECTION`s to separate programs for different application areas, such as payroll and inventory. Another use might be to separate programs bound with `ISOLATION UR` from programs bound with `ISOLATION CS`. `COLLECTION`s are simply high-level grouping names to designate that this group of packages share something, anything, in common.

COLLECTIONs enable you to organize your PACKAGEs into like-kind groups. In DB2's younger days, with multiple DBRMs being bound into PLANs, all the DBRMs in a single PLAN had to be bound with the same BIND parameters. However, today you can BIND each PACKAGE into a COLLECTION that has a customized set of BIND parameters associated with it. An example would be to ISOLATE all programs using REOPT(VARS) in one COLLECTION and all programs using OPTHINT in another. DEGREE(ANY) is another BIND parameter that you may want to be a bit more vigilant in monitoring. An easy way of keeping an eye on programs (or children) is to put them in a room together. In other words, BIND parameters are much more granular today than they were when DB2 was young.

With the advent of BIND PACKAGE and the one-to-one relationship of a program to a package, we were given the ability to name the high-level qualifier for the tables accessed by the program. Therefore, the DBRM for one program could be bound into two different COLLECTIONs. The DBRM for program ABC123 could be bound into a COLLECTION called colcorp01, using corp01 as the table high-level qualifier. The same DBRM could be bound into a COLLECTION called colcorp02, using corp02 as the high-level qualifier. Or, you could BIND the same DBRM into colstress to run it against stress test tables and BIND it into colstress to run against regular test tables. Or, you could BIND a DBRM into a COLLECTION called colur to use when you access read-only decision support tables using ISOLATION UR and into a COLLECTION called colcs when you use active production data. There are dozens of examples. Just use your imagination.

So, at runtime, whichever approach you chose, you now have a PACKAGE with the exact same tattoo/timestamp/consistency-token in two different COLLECTIONs. How do you tell DB2 in which collection to search for Danny? Normally, DB2 would search through all of the COLLECTIONs in the named PLAN. But, if you want to search only one COLLECTION, you simply tell DB2 in your program. You can specify which COLLECTION to search by using an SQL statement called SET CURRENT PACKAGESET. PACKAGESET is simply a synonym for COLLECTION. Therefore, if you set the current PACKAGESET to colcorp01, you will access corp01's tables. If you set the current PACKAGESET to colcorp02, you will access corp02's tables. And the beauty of this is that you only have to maintain one program.

Versions

Suppose program A is changed and moved back into production. Before the program was changed it ran in 10 minutes and never bothered anyone. After the change, all the other programs running at the same time are experiencing `-911` timeouts. How do you fall back gracefully and rapidly to the prior version of the program?

At precompile time you can specify a VERSION ID. If the VERSION ID is the same as the current version, BIND will overlay the PACKAGE in its COLLECTION. But, if the VERSION ID is different from the current VERSION ID, BINDing the DBRM will produce a new PACKAGE that won't overlay the prior PACKAGE for the program. You'll have two PACKAGEs for the same program in the same COLLECTION. If you also move the current LOAD module with its old tattoo timestamp into a different loadlib (COBOL.BACKUP), the compile will not overlay it. Then, when you compile the modified source code, you'll have a LOAD module with the new tattoo/timestamp in the current loadlib. If you execute the new LOAD module, you'll find the new PACKAGE. If the system suffers, you can cancel the job and move the old LOAD module back into production simply by pointing to COBOL.BACKUP.

Copies

There are too many other nuances and possibilities to mention; however, one feature that may be useful is the ability to copy a package from one collection to another. If the statistics on your table vary greatly from daytime to evening or beginning of the month to end of the month, you can BIND a PACKAGE in a COLLECTION called colday or colbegin when the statistics in the CATALOG are representative of your daytime or beginning of the month table. You can then COPY that PACKAGE into another COLLECTION called colnight or colend when the statistics in the CATALOG are representative of your nighttime or end of the month table. COPY does a REBIND and uses the DBRM in the CATALOG as its input. Therefore, the tattoo/timestamp doesn't change. If you check the time of day or the day of month at the beginning of the program, you can SET CURRENT PACKAGESET to the appropriate COLLECTION for DB2 to search for Danny.

Thursday, August 12, 2010

Precompiler+Bind+Plans + DBRMS + Packages + Collections + Versions = Confusion

ALL ABOUT BIND

BIND connects to the DB2 in which the program's LOAD module will run, reads the DBRM serially, and then performs three tasks.
The first of the BIND tasks is an authorization check. DB2 must make sure that the programmer has the BIND authority and the SQL authority to perform the requested SQL task (for example, updating the payroll master). When using standard authorization procedures, DB2 won't let you BIND a DBRM if you don't have the authority to execute the SQL that's in the DBRM. This is why you may have the authorization to BIND in development (accessing development tables) but don't have authorization to BIND in production, where the SQL accesses production tables. The second BIND task is a bit redundant. BIND, like precompile, must also check the syntax of the SQL, but the BIND check is more sophisticated. Instead of using the top, DECLARE TABLE portion of the DCLGEN, BIND uses the DB2 CATALOG table information to make sure that the column names are valid, that comparisons are numeric-to-numeric, and so on. This second syntax check occurs because you can't trust the one done by the precompiler because the precompiler check used the DCLGEN. You could have a DCLGEN and not have the DB2 table.

The third, and most important, BIND task is to come up with run-time instructions for the SQL in the DBRM. Each SQL statement is parsed and all of the possible (realistic) methods for retrieving the desired columns and rows from the table are weighed, measured, and evaluated based on possible estimated I/O, CPU, and SORT overhead. A ton of information is used as input to the BIND process, not just CATALOG information put there by running the RUNSTATS utility. BIND input includes, for example:

Indexes (what columns are in the indexes?)
Columns (how long is this column and how much room will it occupy in a SORT record?)
System resources (how big are the system resources, buffer pool, and RIDPOOL?)
Processors (how big are they and how many engines do they have?)
DB2 (what release is running?)
Parameters (what are the values of the BIND parameters?)

After all that input (and more) is weighed and compared, the cheapest, most costeffective access path is chosen, and the runtime instructions for that one path are created. (Interestingly, DB2 BIND sometimes generates instructions for more than one path.) This process is called optimization, and it's repeated for each SQL statement in the DBRM until all access paths are decided and the run-time instructions are created for each. As the optimizer decides on each path, writes are done to DB2.

BIND checks to see if you bound with the parameter EXPLAIN(YES); if so, it writes documentary evidence about the chosen path to the PLAN_TABLE and to the DSN_STATEMNT_TABLE for your edification.
BIND also writes a lot of information to multiple CATALOG tables, documenting the fact that the BIND did occur. In fact, the tattooed DBRM, which is not used at run time, is moved into the CATALOG. Objects chosen by the optimizer are documented in the CATALOG in cross-reference tables. And BIND parameters are recorded in the CATALOG also.

WHERE ARE THE INSTRUCTIONS?

It's interesting that the actual instructions for the access path are not written to the CATALOG. You can't look at information in the DB2 CATALOG to figure out whether a query will do synchronous or asynchronous reads at run time. You can't tell if the query will match on three columns of an index or five. The actual run-time instructions aren't stored in the CATALOG. They're definitely not in the DBRM, which is input to the BIND. So, where are they stored?
This question is one of the many reasons that this column grew into two parts. Where in the heck are the run-time instructions? Should you ever use the PLAN_TABLE at run time? Are the run-time instructions in a package, a plan, or a version? All of the above? The infamous "it depends"? Where are the instructions stored while you wait for the LOAD module to run? Like many question come in mind.... so don't worry ....and
Stay tuned for answers to these and other questions..........................