Exadata: 'direct path read', 'object checkpoint' & consistency reads

Yury Velikanov
Posted Jul 20, 2009 2:38 PM
user 9156087
Sydney, AU
Post #: 4
Last Friday we had a very good presentation & discussion on the Exadata solution.
As always big thanks to Alex & Martin for organizing and David & Tim for presenting.

That discussion raised a question: how is read consistency ensured during Smart Scan offload processing? The problem is that while a full scan is being offloaded to the storage level, some of the table’s blocks could be changed by other sessions and written to disk before Smart Scan reads them. To ensure read consistency for such a block, Oracle would either have to prevent anybody from modifying the table during the scan (we know that readers don’t block writers in the Oracle world, so this is unlikely) or do some post-processing on the blocks returned. Alex suggested that this is probably handled the same way as for direct path reads: Oracle issues an object checkpoint to ensure read consistency. That was something I couldn’t understand, so I spent some time researching the topic.

Reading through some posts available on the internet, it seems that direct path reads work as follows:

--- Step A. Issue "object checkpoint"
Just before starting direct path processing, the Oracle session issues an "object checkpoint" to protect itself from reading out-of-date blocks whose on-disk SCN is lower than the full scan’s SCN.
Let me briefly explain the reason for the "object checkpoint". Imagine that the full scan’s (the SQL’s) SCN is 1050. If Oracle doesn’t issue an "object checkpoint" at the beginning of query processing, there might be a dirty block in memory (SGA) with SCN 1040 while the copy of the same block on disk has SCN 1030 (the block was modified shortly before the full scan started and DBWR hasn’t written it to disk yet). Because the direct read doesn’t look in the SGA but reads blocks straight from disk, it would read the block and wrongly consider the data consistent (block’s SCN 1030 < SQL’s SCN 1050).
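
To make the SCN example above concrete, here is a minimal toy model in Python. It is only a sketch of the reasoning in this post: the names Block, object_checkpoint and direct_path_read are hypothetical and do not correspond to Oracle internals, and the "checkpoint" is reduced to copying newer cached images to disk.

```python
# Toy model only: SCN values come from the example in the post; all names
# here are hypothetical and are not Oracle APIs or internal structures.
from dataclasses import dataclass


@dataclass
class Block:
    scn: int     # SCN of the last change recorded in this block image
    value: str   # stand-in for the block's row data


def object_checkpoint(cache: dict, disk: dict) -> None:
    """Write every cached block image that is newer than its disk copy."""
    for block_id, cached in cache.items():
        if cached.scn > disk[block_id].scn:
            disk[block_id] = Block(cached.scn, cached.value)


def direct_path_read(disk: dict, block_id: int) -> Block:
    """Read straight from disk, bypassing the cache (the SGA in the post)."""
    return disk[block_id]


QUERY_SCN = 1050


def scenario(with_checkpoint: bool) -> Block:
    disk = {1: Block(1030, "old row")}
    cache = {1: Block(1040, "new row")}   # dirty block, not yet written out
    if with_checkpoint:
        object_checkpoint(cache, disk)
    return direct_path_read(disk, 1)


stale = scenario(with_checkpoint=False)
fresh = scenario(with_checkpoint=True)
print(stale)   # image read when no checkpoint was issued
print(fresh)   # image read after the checkpoint
```

In both runs the block's SCN is below QUERY_SCN, so an SCN comparison alone cannot tell the stale image from the current one. In this model, that is why the checkpoint has to happen before the read.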

--- Step B. Reads data blocks from the disk directly
This step is straightforward: the session issues read requests that bring the data from disk directly into the PGA.

--- Step C. Reconstruct a consistent version of data blocks if necessary
The session reads each block’s SCN and compares it with the SQL’s SCN. If the block’s SCN is higher than the SQL’s SCN, it looks into the block’s ITL (Interested Transaction List) and reads UNDO data from either the SGA or disk to reconstruct a previous version of the block.
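
Step C can be sketched the same way. This is a simplified illustration under my own assumptions, not Oracle's algorithm: the undo information is modelled as a plain list of older block images (newest first), and BlockImage and consistent_read are hypothetical names.

```python
# Simplified illustration of Step C; the undo chain is modelled as a list of
# older block images, which is an assumption made for this sketch.
from dataclasses import dataclass


@dataclass
class BlockImage:
    scn: int
    value: str


def consistent_read(block: BlockImage, undo_chain: list, query_scn: int) -> BlockImage:
    """Roll the block back until its SCN is not higher than the query SCN."""
    image = block
    for older in undo_chain:          # newest undo image first
        if image.scn <= query_scn:
            break
        image = older
    if image.scn > query_scn:
        # No undo image is old enough; similar in spirit to "snapshot too old".
        raise LookupError("cannot reconstruct a consistent image")
    return image


on_disk = BlockImage(1060, "v3")                         # changed after the query started
undo = [BlockImage(1045, "v2"), BlockImage(1020, "v1")]
result = consistent_read(on_disk, undo, query_scn=1050)
print(result)
```

A block whose SCN is already at or below the query SCN is returned unchanged, so the extra work is needed only for blocks modified after the query started.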

Yury

References:
http://www.freelists....
http://oracledoug.com...
http://groups.google....
Yury Velikanov
Posted Jul 20, 2009 2:57 PM
user 9156087
Sydney, AU
Post #: 5
Question: how are consistent reads ensured in the Exadata case?

It seems clear how consistent reads are ensured for direct reads: the consistent image of the blocks has to be built on the DB server side (where we have access to the UNDO information).

However, if Exadata returns only the relevant rows and columns to the DB server, the question remains open: how is the consistent image of the data built?

My guess is:
- Either Oracle returns to the DB server the whole block from the Exadata cell whenever it has at least one row that satisfies the condition. In that case I don’t see how Exadata could eliminate unnecessary columns from what the storage cell returns.
- Or Oracle sends the block headers plus the relevant rows & columns in a format that makes it possible to reconstruct a consistent image of the data on the DB server side.
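
The difference between the two guesses above can be shown with a small sketch. It is purely illustrative and says nothing about how Exadata actually ships data: the Block layout, the itl field and both function names are hypothetical, and rows are plain dictionaries.

```python
# Purely illustrative: models the two guesses from the post, not Exadata.
from dataclasses import dataclass


@dataclass
class Block:
    scn: int
    itl: list    # hypothetical stand-in for the block's ITL entries
    rows: list   # each row is a dict of column name -> value


def return_whole_blocks(blocks, predicate):
    """Guess 1: ship every block holding at least one matching row, unchanged."""
    return [b for b in blocks if any(predicate(r) for r in b.rows)]


def return_headers_and_projection(blocks, predicate, wanted_cols):
    """Guess 2: ship header info plus only matching rows and wanted columns."""
    shipped = []
    for b in blocks:
        rows = [{c: r[c] for c in wanted_cols} for r in b.rows if predicate(r)]
        if rows:
            shipped.append({"scn": b.scn, "itl": b.itl, "rows": rows})
    return shipped


blocks = [
    Block(scn=1030, itl=["txn-a"], rows=[
        {"id": 1, "region": "AU", "amount": 10},
        {"id": 2, "region": "NZ", "amount": 20},
    ]),
    Block(scn=1060, itl=["txn-b"], rows=[
        {"id": 3, "region": "NZ", "amount": 30},
    ]),
]


def is_au(row):
    return row["region"] == "AU"


guess1 = return_whole_blocks(blocks, is_au)
guess2 = return_headers_and_projection(blocks, is_au, ["id", "amount"])
print(guess1)
print(guess2)
```

In this model, guess 1 still carries the non-matching row and the unneeded column, while guess 2 carries less data but keeps the SCN and ITL information that the DB server would need for any rollback.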

Any input is welcome,
Yury
Evgeny Platonov
Posted Jul 20, 2009 4:37 PM
user 9369654
Sydney, AU
Post #: 1
Yury,

Thanks for the research! It makes things clearer. But are you sure that Exadata cell returns only particular rows and columns rather than whole blocks?

Evgeny
Evgeny Platonov
Posted Jul 20, 2009 4:46 PM
user 9369654
Sydney, AU
Post #: 2
By the way, if every SQL initiates an "object checkpoint" on Exadata, that would be a very heavy load on the system, given the number of SQL statements Exadata can process.
Yury Velikanov
Posted Jul 20, 2009 8:31 PM
user 9156087
Sydney, AU
Post #: 6
Quoting Evgeny: "But are you sure that Exadata cell returns only particular rows and columns rather than whole blocks?"

This is what the white papers suggest, e.g.:
http://www.oracle.com...

I am sure that David will have more to comment on that.
Yury Velikanov
Posted Jul 20, 2009 8:37 PM
user 9156087
Sydney, AU
Post #: 7
Quoting Evgeny: "By the way, if every SQL initiates an 'object checkpoint' on Exadata, that would be a very heavy load on the system, given the number of SQL statements Exadata can process."

This is true. On the other hand, one of the characteristics of a DWH is fewer SQL statements but larger resource consumption per statement.
At least this is one logical explanation of how Exadata can ensure data consistency.
I would be glad to hear explanations from others.

:)

Evgeny - thank you for participation.

Yura
David Centellas
Posted Jul 21, 2009 1:43 PM
user 9635782
Sydney, AU
Post #: 1
Protection Against Data Corruption

Exadata Cell is compliant with the Oracle Hardware Assisted Resilient Data (HARD) initiative, a joint initiative between Oracle and hardware vendors to prevent data corruptions from being written out to disks. Data corruptions, while rare, can have a catastrophic effect on a database, and therefore on a business. Exadata Cell takes data protection to the next level by protecting business data, not just the physical bits.

The key approach to detecting and preventing corrupted data is block checking where the storage subsystem validates the Oracle block contents. Oracle Database validates and adds protection information to the database blocks, while Exadata Cell detects corruptions introduced into the I/O path between the database and storage. It stops corrupted data from being written to disk, and validates data when reading the disk. This eliminates a large class of failures that the database industry has previously been unable to prevent.

Exadata Cell implements all the HARD checks, and because of its tight integration with Oracle Database, additional checks are implemented that are specific to Exadata Cell. Unlike other implementations of HARD checking, HARD checks with Exadata Cell operate completely transparently. No parameters need to be set at the database or storage tier. The HARD checks transparently handle all cases, including ASM disk rebalance operations and disk failures.

Hope this helps