Day 18: Transaction Management and Concurrency Control Techniques

Hello Dear Students,
  Hope you all are doing good...

Aaj hum transaction management and concurrency control techniques wala full topic cover karenge.

Let's get started...

TRANSACTION MANAGEMENT - Transaction means set of operations on the database from real world events. Means, jab bhi real world mein kuch event occur hote hain to usse database mein update kiya jata hai to wo ek transaction hoti hai. Example, ek bank mein agar kisi account mein kuch bhi changes ya updation hote hain to use hum database mein store karte hain to wo ek transaction hoti hai. Jaise ki transaction of deposit money, transaction of withdraw of money, etc. Transaction perform karne ke liye UPDATE statement use ki jati hai. For example, person A has account in the bank, and transactions can be-
  • Adding amount of Rs.1,000 to bank account of A using UPDATE command.
  • Subtracting amount of Rs.500 from bank account of A using UPDATE command. 
READ/WRITE OPERATIONS - Jab ek transaction perform hoti hai to bahut sare operations bhi perform hote hain. Like, Database secondary storage device mein store hota hai, jab transaction perform karni hoti hai to ye data secondary storage device se fetch kiya jata hai and then update hone ke baad dubara use secondary storage device mein hi store kar diya jata hai. 
Jab secondary storage device se database ko fetch kiya jata hai to wo Read operation hota hai, and jab data ko database mein update karke dubara secondary storage device mein store kar diya jata hai to wo Write operation hota hai.

Ab jab multiple transactions ko perform karna hoga to ye baar baar disk transfers problem create karte hain and isko reduce karne ke liye hum data ko secondary storage device ke ek small portion (buffer) mein transfer kar dete hain. Ab saare read/write is buffer par consider hoge. Agar data ko fetch karna ho and wo buffer mein present na ho to buffer manager Read operation ko perform karega and data ko store kar dega, and agar data already buffer mein huya to use fetch karke update kar dega to wo Write operation hoga. 

PROPERTIES OF TRANSACTION - Jab multiple operations same time par perform honge to usse transactions overlap ho jayengi and inconsistent state mein aa jayengi. So, isse data mein accuracy nahi rahegi, thus, is problem ko solve karne ke liye transaction ki ACID properties hain jo ki safely data ko transfer karti hai. ACID means ye 4 properties hain jinke 1st letter se ACID bana hai-
  1. Atomicity
  2. Consistency
  3. Isolation
  4. Durability
ATOMICITY - Atomicity basically means "all or nothing". Atomicity property ye ensure karti hai ki ya to transaction fully complete hogi ya fir bilkul bhi nahi hogi. Koi bhi transaction half complete nahi hogi. Ya to saare operations perform honge ya fir ek bhi operation perform nahi hoga. 

CONSISTENCY - Consistency basically means sirf valid data hi database mein store kiya jayega, invalid or incorrect data store nahi kiya jayega. If invalid data store hota hai database mein, to wo inconsistent hoga and rollback ho kar dubara database pehla vali state mein aa jayega.

ISOLATION - Isolation means lonely basically. So, isolation property means ek transaction ek time par akele hi perform ki jayega and agar kisi aur transaction ko wo transaction ka result chahiye hoga to wo transaction complete hone ke baad hi fetch kar payega. Ek transaction jab tak complete nahi ho jati tab tak wo transaction kisi aur ko visible nahi hogi, means, other transactions cannot see the changes unless it gets completed.

DURABILITY - Durability means jab ek baar transaction commit ho gyi then wo lost nahi ho sakti and na hi undone ho sakti hai, wo forever database mein store rahegi. If system crashes or event failure, then recover ho jayega.

CONCURRENCY CONTROL - 'Concurrency' word Concurrent se aya hai which basically means at the same time. Jab hum transactions ko same time par access karte hain to usme bahut saari problems aa jati hai. Wo problems ko hum concurrency kehte hain. For example, airline reservation ek bahut bada database hai and isme multiple users same time par data ko access karte hain and update karte hain, to aise large databases mein concurrency problems aati hai. 

Ek possible solution hai ki hum data ko serially access karein means one by one, but wo bahut time lega and mostle transactions ko bahut time wait karna hoga, and large systems mein it is not a good practice. 

So pehle hum study karenge ki concurrency ki problems konsi hai and then us problems ko control karne ki techniques ko understand karenge. 

CONCURRENCY PROBLEMS - Basically 3 problems-

  • LOST UPDATE PROBLEM - Jab multiple transactions same time par run karti hain to ek transaction update hone se pehle hi doosri transaction usse overwrite kar deti hai, to wo lost update problem hoti hai. This is WW conflict means Write-Write conflict because dono transactions write operation perform kar rahi hain. Jab 2 transactions update kar rahi hi ek transaction update ko doosri transaction overwrite kar deti hai, wo lost update problem hai.
  • DIRTY READ PROBLEM - It is also known as Temporary update problem. Dirty read problem tab aati hai jab transactions ko intermediate results dekhne ki authority di jati hai. For example, agar ek transaction update kar rahi hai but due to some reasons wo transaction abort ho jati hai and rollback ho jati hai, but another transaction wo intermediate result par updation kar deti hai, to wo dirty read problem hoti hai. This is WR conflict means Write-Read conflict because ek transaction write karti hai and doosri transaction usse read karti hai.
  • UNREPEATABLE READ - It is also known as Inconsistent Analysis Problem. Above both problems are due to write operation but this problem is due to read problem. Agar hum ek transaction ko multiple times read karte hain and wo jab update ho jati hai and dubara jab hum use read karte hain to wo alag result deti hai so this is unrepeatable read. It is RW conflict means Read-Write conflict because pehle hum read operation perform karte hain and then write. For example,
T1: Read A
T2: Update A 
T1: Read A

T1 reads A and then T2 update A, then again when T1 reads A then the value of A is changes, this is called unrepeatable read problem.

CONCURRENCY CONTROLS - Concurrency problems ko conrol karne ke liye various techniques/methods hain which are as follows-
  1. With Locking Methods
  2. 2-Phase Locking
  3. Based on Time-Stamp Ordering
  4. Based on Optimistic Scheduling
  5. Multiversion Control Techniques

1. CONCURRENCY CONTROL WITH LOCKING METHODS - Is technique mein various locks use kiye jate hain concurrency ko control karne ke liye. Various locks are-
  • Binary Locks - Locks basically use hote hain data/transaction ko lock karne ke lye so that other transaction use process na kar sake. Binary lock mein basically 2 values hoti hai- Locked and Unlocked which are represented as 1 and 0. Jab ek transaction already use ho rahi hai to wo lock ho jayega so that other transactions use process na kar sake, when the transaction is free then it unlocked so that it can be processed by another transaction. Transactions request karengi and if locked then wait karegi unlocked hone tak. Jab transaction process honi shuru hogi to wo apni transaction ko lock kar degi and then jab transaction complete ho jayegi to wo apni transaction ko unlock kar degi.
  • Shared/Exclusive Locks - Shared and exclusive locks alag alag 2 locks hain. Shared lock is also known as read lock. Shared lock mein multiple transactions ek hi time par lock ho jayegi only read karne ke liye, ye transactions modify nahi kar sakti only read hi kar sakti hai.  Exclusive lock is also known as write lock. Isme single transaction lock hoti hai write operation perform karne ke liye and jab write operation perform ho jata hai to ye unlock ho jati hai and then is transaction ko read kiya ja sakta hai.

2. TWO-PHASE LOCKING (2PL) - Two-Phase Locking mein bhi locking and unlocking hoti hai but ye 2 phases mein hoti hai. Phases-
  1. Growing Phase - It is also known as expanding phase because Growing phase mein only locking hoti hai and unlocking nahi hoti hai. Means locks lagaye jate hain(shared, exclusive) read/write but jin par already locks hain wo unlock nahi kiye jate only new transactions par hi locking ki jati hai.
  2. Shrinking Phase - It is also known as contracting phase because Shrinking phase mein only unlocking ki jati but new locking nahi ki jati. Jo locks hain unhe unlock kar diya jata hai. 
Growing phase mein number of locks are increasing and Shrinking phase mein number of locks only decreasing. In 2PL agar ek transaction unlock ho jati hai then wo transaction ko dubara lock nahi kiya ja sakta hai.

3. CONCURRENCY CONTROL BASED ON TIME-STAMPING ORDERING - Time-stamping ki help se hum concurrency problems ko control kar sakte hain. Time-stamping basically means hum uniquely idntify kar sakte hain ki konsi transaction older hai and konsi transaction new/young hai. Like, hum har ek transaction ka time-stamp consider karenge transaction ki duration dekhne ke liye. Time-stamp 2 tareeke se kiya ja sakta hai-
  1. System clock- Means system clock ke time se hume pata chal jayega ki konsi transaction kab execute huyi thi.
  2. Increementor - Means hum increement laga denge, jo transaction jitni baar execute hogi uska increement hota jayega.
Isme older transaction wo hai jo bahut time se execute na huyi ho and younger transaction wo hai jo recently use huyi ho.

 Read_timestamping means jo transaction last time read operation perform ki ho wo transaction. 
Write_timestamping means jo transaction last time write operation perform ki ho wo transaction.

Time-stamping basically priority ke liye use ki jati hai. Older transaction ki priority is more than younger transactions. If there are multiple transactions to access same data item, then the older transaction time-stamping will be accessed first as it has more priority. 

4. CONCURRENCY CONTROL BASED ON OPTIMISTIC SCHEDULING - Optimistic scheduling also known as Validation technique. The word 'optimistic' means being positive and confident. So, in this technique we will assume that all the operations being performed are accurate. Optimistic scheduling mein 3 phases hote hain-
  1. READ PHASE - Read phase mein hum multiple transactions ko same time par read and write kar sakte hain. Agar hum write operation perform karte hain to wo local copies mein store hota hai and actual database mein store nahi hota. 
  2. VALIDATION PHASE - Validation phase mein ye validate kiya jata hai ki jo bhi updations read phase mein huyi hain wo correct hai ya nahi. If that updations are inconsistent and inaccurate then it is discarded, and if it is correct then it is go to next phase.
  3. WRITE PHASE - Write phase mein saari updations ko commit kiya jata hai means permanently save kiya jata hai database mein. Isme transactions ko time-stamp kiya jata hai so that transaction overlapping na ho.

5. MULTIVERSION CONCURRENCY CONTROL TECHNIQUES - Multiversion technique means old data ko bhi retain kiya jaye and new data ko bhi. Jab transaction update hoti hai to new transaction hi save ki jati hai and old values ko usse overwrite kar diya jata hai. But multiversion mein hum old and new dono versions ko retain karte hain. Multiversion techniques-
  1. Multiversion time-stamping ordering
  2. Multiversion two phase locking

Best of Luck Students, 
  Do share, subscribe and comments if you like our efforts..