
Oracle Data Pump vs Update

Here is a simple scenario.

First, I am going to create a simple table with two columns and populate it with 10,000 records:
SQL> create table demo.test ( a number, b number);

Table created.

SQL> begin
for i in 1..10000 loop
insert into demo.test values (i,1);
end loop;
end;
/

PL/SQL procedure successfully completed.

SQL> select count(*) from demo.test;

COUNT(*)
----------
10000

SQL> commit;

Commit complete.

SQL> select * from demo.test where rownum < 10;

A          B
---------- ----------
1          1
2          1
3          1
4          1
5          1
6          1
7          1
8          1
9          1

9 rows selected.

Next, I am going to alter the table and add two new columns. I also need to insert some values into these newly created columns:
SQL> ALTER TABLE DEMO.TEST ADD (C NUMBER);

Table altered.

SQL> ALTER TABLE DEMO.TEST ADD (D NUMBER);

Table altered.

SQL> desc test
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
A                                                  NUMBER
B                                                  NUMBER
C                                                  NUMBER
D                                                  NUMBER

SQL> select * from test where rownum < 10;   -- quick check

A          B          C          D
---------- ---------- ---------- ----------
1          1
2          1
3          1
4          1
5          1
6          1
7          1
8          1
9          1

9 rows selected.
The C and D columns are empty since I did not specify a default or virtual value.

e.g.

ALTER TABLE DEMO.TEST  ADD (c  NUMBER  DEFAULT 0);
ALTER TABLE DEMO.TEST  ADD (d  NUMBER  DEFAULT 1);
When I use a default or virtual value for a column, I run into issues with materialized views. If you do not have materialized views, then this approach will work.

The other option is to run an update statement to populate these columns:

e.g.

SQL> update demo.test test set c=0 , d=1;

10000 rows updated.

SQL> commit;

Commit complete.

SQL> select * from demo.test where rownum < 10;

A          B          C          D
---------- ---------- ---------- ----------
1          1          0          1
2          1          0          1
3          1          0          1
4          1          0          1
5          1          0          1
6          1          0          1
7          1          0          1
8          1          0          1
9          1          0          1

9 rows selected.
I realized that this can take a very long time on a table with a huge number of rows, and it generates a lot of archive and flashback logs.
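If the update route is unavoidable, one common mitigation is to batch the update and commit periodically so that the undo footprint stays bounded. A minimal sketch (the 10,000-row batch size, and relying on C being NULL to find unprocessed rows, are my assumptions; tune them for your system):

```sql
-- Hypothetical batched update: relies on C being NULL for unprocessed rows.
-- Commits every 10,000 rows to keep undo usage small.
BEGIN
  LOOP
    UPDATE demo.test
       SET c = 0, d = 1
     WHERE c IS NULL
       AND ROWNUM <= 10000;
    EXIT WHEN SQL%ROWCOUNT = 0;
    COMMIT;
  END LOOP;
END;
/
```

Note that this does not reduce the total redo generated; it only avoids one huge transaction and lets the application keep working between batches.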
My third approach is to use Oracle Data Pump, specifically its data remapping (masking) feature.

So, before I can run the Data Pump export, I will create a data masking package to remap the new columns. In my case the functions simply return 0 and 1:

CREATE OR REPLACE PACKAGE sys.datapump_masking_pkg
AS
  FUNCTION ZERO_C (p_in NUMBER) RETURN NUMBER;
  FUNCTION ONE_D (p_in NUMBER) RETURN NUMBER;
END;
/

CREATE OR REPLACE PACKAGE BODY sys.datapump_masking_pkg
AS
  FUNCTION ZERO_C (p_in NUMBER) RETURN NUMBER
  IS
  BEGIN
    RETURN 0;  -- constant value for new column C
  END;

  FUNCTION ONE_D (p_in NUMBER) RETURN NUMBER
  IS
  BEGIN
    RETURN 1;  -- constant value for new column D
  END;
END;
/
So, after altering the table (adding the new columns), I would run the Data Pump export:

expdp dumpfile=demo.dmp tables=demo.test \
remap_data=demo.test.C:sys.datapump_masking_pkg.ZERO_C \
remap_data=demo.test.D:sys.datapump_masking_pkg.ONE_D \
LOGFILE=test.log

Note: if you need to find when the last DML was run on a table, use ORA_ROWSCN:
select max(ora_rowscn), scn_to_timestamp(max(ora_rowscn)) from  demo.test;

Now, if the application allows it, I can drop the table (I am using flashback, so it is easy to restore if needed), or you can load the data into a new table and rename it later.
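The load-into-a-new-table variant can be sketched with the impdp REMAP_TABLE parameter (available in 11g and later; the TEST_NEW name is just an example):

```
impdp dumpfile=demo.dmp tables=demo.test \
  remap_table=demo.test:test_new logfile=demo_new.log

SQL> ALTER TABLE demo.test RENAME TO test_old;
SQL> ALTER TABLE demo.test_new RENAME TO test;
```

Keep in mind that objects referencing the table by name (views, synonyms) will need attention after the rename.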

DROP TABLE demo.test;

and finally import the data:

impdp dumpfile=demo.dmp tables=demo.test logfile=demo.log
and a quick check:
SQL> select * from demo.test where rownum < 10;

A          B          C          D
---------- ---------- ---------- ----------
1825          1          0          1
1826          1          0          1
1827          1          0          1
1828          1          0          1
1829          1          0          1
1830          1          0          1
1831          1          0          1
1832          1          0          1
1833          1          0          1

9 rows selected.
My columns are there and populated with expected values.

Pros: faster than the update statement.

Cons: it will still generate a lot of archive logs. I did not have a chance to compare flashback log generation between update and impdp.
Do not forget to purge the DBA recycle bin at the end if you have dropped the table:

PURGE DBA_RECYCLEBIN;

Categories: Tools
  1. Ankur
    October 17, 2010 at 10:10 pm

    Hi,

    Thanks for showing a new way of adding a column with a default value to a table. But in this case we need some downtime for the table while importing, which might take a long time if the table is really big (say 40 GB).
    In the case of Update, at least my application will not stop; it will keep working, since the update takes a lock on each row and we can commit every 1000 or 2000 rows.

    Thanks

