gplivna.eu

The curse of gapless sequences

By Gints Plivna

 

The curse. 1

The road to the curse. 1

The cure (unfortunately just more like a phantom than real one) 2

Oracle sequences. 2

Separate ID table. 2

Enhancing the solution with RETURNING clause of UPDATE. 2

Small tables (1 row!!!) do not need indexes? Wrong! 3

Reducing serialization. 4

Summary. 5

About the author 5

Licenses. 5

 

The curse

Gapless sequences usually are one of the best examples of bad requirements gathering process. Gapless sequences means:

1)      single serialization point for inserts (in one table usually) and

2)      thus seriously reducing scalability and

3)      thus resulting in bad performance for multi-user multi-insert environments.

Every other user has to wait for the current user while he either COMMITs his insert and so uses next gapless sequence number or ROLLBACKs the insert so allowing another user choose the same next number.

This topic recently had quite much attention in semi closed (archives available only to subscribers) ODTUG-SQLPLUS-L list.

The road to the curse

As I've described in my blog post roots of the problems related to gapless sequences must be sought much before coding and tuning. Usually the main problem is bad requirements gathering process not thinking about how the requirement will be implemented and how it will affect application. So as soon as one encounters such requirement the very first reaction should be NOOOOOOO!!! Explain customer above mentioned problems. Try to enforce customer not to ask for such requirement. Explain the alternatives:

-         if he needs it just for reports, generate the numbers in query time,

-         if he needs it just because the old system does so, explain that this is the best moment to change old rules,

-         if he needs it just because it seems more convenient for him, explain that weaknesses will be much bigger than strengths,

-         if he needs it just because to make some counting, offer him appropriate report,

-         if he needs it just because anything, tell him something that disproves that.

However there is one case when usually no one can do anything - legislation. Usually it is too hard to change the law (-s) enforcing such a stupid requirement. This is the time when you have to do all the best to reduce the damage and of course don't forget to document your fight so that you can sleep well after your decision.

The cure (unfortunately just more like a phantom than real one)

So what are the possibilities to implement gapless sequences in Oracle?

Oracle sequences

Forget about sequences - I mean oracle supplied objects - sequences. They aren't gapless! Never! Neither parameter combination ensures continuous numbers. NOCACHE won't ensure that, ORDER won't ensure that. NONE. Why?

Because as soon as one gets the sequence number it is lost forever. The most common reasons are simple rollback and closed session because of "alter system kill session" or simply lost contact between client and server which ultimately results in rollback. As soon as you are using SQL Merge statement and sequences together for When Not Matched Then Insert clause you are destined to sequence gaps as I've shown in my blog entry.

To tell the truth Oracle never guaranteed gapless sequences, Oracle guaranteed only unique values out of sequences.

So Oracle objects sequences are not suitable for gapless sequences.

Separate ID table

Most common scenario is creating separate ID table containing one row for sequence value. When you need new id:

1)      select new id value from ID table and lock it;

2)      do your work ;

3)      update ID table with next sequence value;

4)      commit or rollback all your work along with actions with ID table in one transaction.

Example follows.

CREATE TABLE id (id_row NUMBER NOT NULL);

INSERT INTO id VALUES (1);

DECLARE

  v_new_id NUMBER;

  CURSOR c IS SELECT id_row FROM id FOR UPDATE;

BEGIN

  OPEN c;

  FETCH c INTO v_new_id;

  -- do your work;

  UPDATE id SET id_row = id_row + 1 WHERE CURRENT OF c;

  CLOSE c;

  COMMIT;

END;

/

So what other people will get while you are doing your work i.e. thinking whether to commit or rollback?

All other sessions will simply wait forever while you make your decision. And here comes the most critical point - you have to make your decision very fast to at least a bit relief other people's pain.

Enhancing the solution with RETURNING clause of UPDATE

OK. Now let's assume you have done everything to minimize the think time of people as much as you can. Can we do something with our procedure to make it also a bit faster?

The answer is yes.

Firstly let's remember the wonderful RETURNING clause. In the script above we are touching table two times once with SELECT and second time with UPDATE. We can do all that at once!

Let’s look at example:

DECLARE

  v_new_id NUMBER;

BEGIN

  UPDATE id SET id_row = id_row + 1

  RETURNING id_row INTO v_new_id;

  -- do your work;

  COMMIT;

END;

/

As soon as your row will be updated it will be locked and unlocked only after you'll commit or rollback.

I've done comparisons of 10K rows updated with both methods and compared the timings and latches using Tom Kyte's runstats.

Comparison and full result list you can find here but the summary is as follows:

SELECT FOR UPDATE + UPDATE needed 183 centiseconds but

UPDATE…RETURNING just 94 centiseconds – almost two times less.

Sum of latches was 434 K for SELECT FOR UPDATE + UPDATE but 303 K for UPDATE…RETURNING, so the gain was ~30%.

Small tables (1 row!!!) do not need indexes? Wrong!

So we have one row in our table. Does really adding index will do something good for this 1 row table?

We’ll create another table with a key holding value for our gapless sequence value and adding everywhere where clauses to select just one appropriate row with key = 1.

CREATE TABLE id1 (

  key NUMBER NOT NULL,

  id_row NUMBER NOT NULL,

  CONSTRAINT id1_pk PRIMARY KEY (key));

INSERT INTO id1 VALUES (1, 1);

DECLARE

  v_new_id NUMBER;

  CURSOR c IS SELECT id_row

  FROM id1 WHERE key = 1 FOR UPDATE;

BEGIN

  OPEN c;

  FETCH c INTO v_new_id;

  -- do your work;

  UPDATE id1 SET id_row = id_row + 1 WHERE CURRENT OF c;

  CLOSE c;

  COMMIT;

END;

/

And enhanced variant with UPDATE…RETURNING is as follows:

DECLARE

  v_new_id NUMBER;

BEGIN

  UPDATE id1 SET id_row = id_row + 1

  WHERE key = 1

  RETURNING id_row INTO v_new_id;

  -- do your work;

  COMMIT;

END;

/

So what runstats have shown us this time?

Comparison and full result list you can find here but the summary is as follows:

SELECT FOR UPDATE + UPDATE needed 125 centiseconds but

UPDATE…RETURNING just 58 centiseconds – more than two times less.

Sum of latches was 196 K for SELECT FOR UPDATE + UPDATE but 122 K for UPDATE…RETURNING, so the gain was ~38%.

But the most beautiful thing is that all these numbers are much smaller compared to the table without index.

Reducing serialization

Assume your thousand user multi-insert application runs on this terrible requirement – one continuous gapless sequence. Usually this is some kind of old paperwork habit – giving out paper sheets with continuous numbers. However even in the old times it was not so bad for a real thousand multi-user company running its business. I cannot imagine a process when thousand people in old days came to one distributor and he gave the next paper sheet to thousands of people especially if these people were located in different cities. So why do we have to do that today? Why do we have to use only one distributor i.e. one row? We have already created a structure that can hold more than one value and what’s more important to get exactly that value. So we can insert more than one row into our table of IDs with just different key values. For example give a sequence window of allowable values for each of our five branches all around the country.

 

INSERT INTO id1 VALUES (2, 100);

INSERT INTO id1 VALUES (3, 200);

INSERT INTO id1 VALUES (4, 300);

INSERT INTO id1 VALUES (5, 400);

 

Of course we need to enforce that in case the window is exhausted values do not overlap with the next window. Check constraints may be used for that:

ALTER TABLE id1 ADD CONSTRAINT id1_win1_ck

CHECK (key = 1 AND id_row BETWEEN 1 AND 99 OR key <> 1);

ALTER TABLE id1 ADD CONSTRAINT id1_win2_ck

CHECK (key = 2 AND id_row BETWEEN 100 AND 199 OR key <> 2);

ALTER TABLE id1 ADD CONSTRAINT id1_win3_ck

CHECK (key = 3 AND id_row BETWEEN 200 AND 299 OR key <> 3);

ALTER TABLE id1 ADD CONSTRAINT id1_win4_ck

CHECK (key = 4 AND id_row BETWEEN 300 AND 399 OR key <> 4);

ALTER TABLE id1 ADD CONSTRAINT id1_win5_ck

CHECK (key = 5 AND id_row BETWEEN 400 AND 499 OR key <> 5);

Probably that’s not the best case if you need to issue a new window each day, but for relatively rare cases this may be used.

In case you have contention in database block level i.e. updating different rows located in the same database block, you can try to use two techniques:

1)      either increasing INITRANS parameter for table or

2)      use old trick with minimize records_per_block. I.e. creating table, putting in only one row, then issue ALTER TABLE <table_name> MINIMIZE RECORDS_PER_BLOCK, and then add all other rows. With such technique one can get 2 rows per block (not one BTW!).

However both non-default INITRANS and especially MINIMIZE RECORDS_PER_BLOCK should be used with caution and tested before you try to use them in production.

Summary

Don’t be fooled by the quite long “The cure” section! All methods explained in it aren’t worth a single penny as soon as user starts to think whether to commit or rollback. Even if he thinks one second each time, and using techniques above you decreased all other components to almost zero (remember also to add the time for business insert!) the maximum inserted record count per single gapless sequence won’t be greater than 60*60 = 3600 per hour. With normal Oracle sequences you can achieve much much better results and at least sequences won’t be your bottleneck. So think about that in requirements gathering process!
If you have any comments about this article you can discuss them in my blog here.

About the author

 

Gints Plivna gints.plivna@gmail.com is system analyst in Rix Technologies Ltd. (www.rixtech.lv). He has experience in working with Oracle since 1997 and his interests mostly have been connected with analyzing system requirements, design, development and SQL tuning.

 

Contacts:

e-mail - gints.plivna@gmail.com

website - http://www.gplivna.eu/

Licenses

 

This work is licensed under the Creative Commons Attribution-ShareAlike 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

 

First edition: 2007-04-11

Added link to Merge and sequence blog entry: 2008-05-22