Some time ago, I posted a note on performing the initial data load for a GoldenGate capture and replication environment.
http://gavinsoorma.com/2010/02/oracle-goldengate-tutorial-4-performing-initial-data-load/
There are several ways of performing the initial data load, both with GoldenGate and from outside GoldenGate using, say, the Oracle export/import utility or the Data Pump facility.
In this case we shall look at using the Oracle database utility SQL*Loader, as well as the BULKLOAD Replicat parameter, to perform the GG initial data load.
Note that in a production environment, where we would not have the luxury of extended downtime for the initial load, we would configure additional change synchronization Extract and Replicat processes and start change capture before performing the initial data load. That way, any changes made while the initial load is running are captured as well, and the Replicat process can apply them to the target database after the initial data load has completed.
In this method the initial load Extract process reads the records directly from the source table (not from the redo logs or archived redo logs) and writes them to local or remote trail files in ASCII format.
These external ASCII files are then read by the Oracle SQL*Loader utility to load records into the target database. Besides SQL*Loader, ASCII files of this kind can also be read by the BCP, DTS and SSIS SQL Server utilities.
On the target, the initial load Replicat process will create (and can also run) the control files used by the SQL Loader utility.
Loading data with a database utility
Let us now look at a test case. We have a source table called LOAD_DATA with 50614 rows. The same table exists in the target database but at the moment does not have any records. We now want to load the 50614 rows from the source table into the target table.
Source database
SQL> select count(*) from load_data;

  COUNT(*)
----------
     50614
Target database
SQL> select count(*) from load_data;

  COUNT(*)
----------
         0
Initial Load Extract parameter file
extract load3
SETENV (NLS_LANG = "AMERICAN_AMERICA.AL32UTF8")
userid ggs_owner, password ggs_owner
FORMATASCII, SQLLOADER
rmthost demora061rh, MGRPORT 7809
rmtfile ./dirdat/load_data.dat PURGE
TABLE SH.LOAD_DATA;
We now run the Extract process directly from the command line as follows:
[oracle@pdemora062rhv goldengate]$ ./extract paramfile /u01/app/goldengate/dirprm/load3.prm reportfile load3.rpt
If we view the report for the load3 extract process we can see that records have been written in ASCII format to the external file load_data.dat.
2012-06-16 06:54:27 INFO OGG-01478 Output file ./dirdat/load_data.dat is using format ASCII.
2012-06-16 06:54:33 INFO OGG-01226 Socket buffer size set to 27985 (flush size 27985).
Processing table SH.LOAD_DATA
***********************************************************************
* ** Run Time Statistics ** *
***********************************************************************
Report at 2012-06-16 06:54:35 (activity since 2012-06-16 06:54:24)
Output to ./dirdat/load_data.dat:
From Table SH.LOAD_DATA:
# inserts: 50614
# updates: 0
# deletes: 0
# discards: 0
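The check we just did by eye can also be scripted. The following is only a minimal sketch, not part of GoldenGate: the function name parse_run_time_stats and the sample text are made up for illustration, and it assumes the report covers a single table, as it does in this test case.

```python
import re

def parse_run_time_stats(report_text):
    """Return the operation counts found in an Extract report's statistics block."""
    stats = {}
    # Lines look like "# inserts: 50614"; tolerate variable spacing.
    pattern = r"#\s*(inserts|updates|deletes|discards):\s*(\d+)"
    for name, value in re.findall(pattern, report_text):
        stats[name] = int(value)
    return stats

# Excerpt of the load3 report shown above.
sample_report = """\
Output to ./dirdat/load_data.dat:
From Table SH.LOAD_DATA:
# inserts: 50614
# updates: 0
# deletes: 0
# discards: 0
"""

stats = parse_run_time_stats(sample_report)
expected_rows = 50614  # row count of the source table in this test case
# The extract looks complete if every source row was written and none discarded.
load_ok = stats.get("inserts") == expected_rows and stats.get("discards") == 0
print(stats, load_ok)
```

Comparing the insert count with the source row count before moving on to the target side catches an incomplete extract early.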
Target Initial Load Replicat parameter file
GENLOADFILES sqlldr.tpl
userid ggs_owner, password ggs_owner
extfile ./dirdat/load_data.dat
assumetargetdefs
map sh.load_data,target sh.load_data;
The GENLOADFILES parameter specifies the name of the template file used to generate the control and run files, which in this case are going to be used by SQL*Loader.
The template file for SQL*Loader is sqlldr.tpl, and it can be found in the root folder of the GoldenGate software installation.
More information about the GENLOADFILES parameter can be found in the Oracle GoldenGate Windows and UNIX Reference Guide (Pages 223-226).
We now run the Replicat process directly from the command line as follows:
[oracle@pdemora061rhv goldengate]$ ./replicat paramfile /u01/app/goldengate/dirprm/load4.prm reportfile load4.rpt
If we view the report for the initial load Replicat process load4, we can see that the SQL Loader control file has been created.
…
….
File created for loader initiation: LOAD_DATA.run
File created for loader control: LOAD_DATA.ctl
Load files generated successfully.
If we look at the contents of the LOAD_DATA.run file, we find that it has all the required commands we need to load data using the SQL Loader utility.
[oracle@pdemora061rhv goldengate]$ cat LOAD_DATA.run
sqlldr userid=ggs_owner/ggs_owner control=LOAD_DATA log=LOAD_DATA direct=true
Let us check the contents of the SQL Loader control file which has been created.
[oracle@pdemora061rhv goldengate]$ cat LOAD_DATA.ctl
unrecoverable load data
infile load_data.dat
truncate
into table LOAD_DATA
(
OWNER position(4:33) defaultif (3)='Y'
, OBJECT_NAME position(35:64) defaultif (34)='Y'
)
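To make the position and defaultif clauses in the control file above a little more concrete, here is a rough Python sketch of how one fixed-width record would be interpreted. It is an illustration only, not how SQL*Loader is implemented. It assumes single-byte ASCII data, treats a 'Y' indicator as "no value", and the sample record is invented (the source does not show what the first two bytes of each record contain, so placeholders are used).

```python
def parse_field(record, start, end, null_flag_pos):
    """Extract one field using 1-based, inclusive positions as in the control file."""
    # defaultif (n)='Y': the byte at position n marks the column as having no value.
    if record[null_flag_pos - 1:null_flag_pos] == "Y":
        return None
    # Fixed-position character data is blank-padded; strip the padding.
    return record[start - 1:end].rstrip()

def parse_record(record):
    """Apply the two field definitions from LOAD_DATA.ctl to one record."""
    return {
        "OWNER": parse_field(record, 4, 33, 3),
        "OBJECT_NAME": parse_field(record, 35, 64, 34),
    }

# Hypothetical record: "??" stands in for the first two bytes, whose content
# is not shown in the source. Each column is an indicator byte plus 30 bytes.
sample = "??" + "N" + "SH".ljust(30) + "N" + "LOAD_DATA".ljust(30)
print(parse_record(sample))  # {'OWNER': 'SH', 'OBJECT_NAME': 'LOAD_DATA'}
```

Each column therefore occupies 31 bytes in the file: one indicator byte followed by 30 bytes of data.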
I have edited the control file to insert the full path of the SQL*Loader .dat file, and I have also qualified the table name with the schema name.
[oracle@pdemora061rhv goldengate]$ cat LOAD_DATA.ctl
unrecoverable load data
infile '/u01/app/goldengate/dirdat/load_data.dat'
truncate
into table SH.LOAD_DATA
(
OWNER position(4:33) defaultif (3)='Y'
, OBJECT_NAME position(35:64) defaultif (34)='Y'
)
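If many tables are involved, the two manual edits to the control file could be scripted. The sketch below is one possible way to do it in Python. The function name qualify_control_file is hypothetical, and it assumes the generated control file contains a single unquoted infile clause and a single unqualified "into table" clause, as in this example.

```python
import re

def qualify_control_file(ctl_text, data_dir, schema):
    """Return ctl_text with a full INFILE path and a schema-qualified table name."""
    # "infile load_data.dat" -> "infile '<data_dir>/load_data.dat'"
    # The lookahead skips a file name that is already quoted.
    ctl_text = re.sub(
        r"(\binfile\s+)(?!['\"])(\S+)",
        lambda m: f"{m.group(1)}'{data_dir.rstrip('/')}/{m.group(2)}'",
        ctl_text,
        count=1,
        flags=re.IGNORECASE,
    )
    # "into table LOAD_DATA" -> "into table <schema>.LOAD_DATA"
    # The lookahead skips a table name that is already qualified.
    ctl_text = re.sub(
        r"(\binto\s+table\s+)(\w+)\b(?!\.)",
        lambda m: f"{m.group(1)}{schema}.{m.group(2)}",
        ctl_text,
        count=1,
        flags=re.IGNORECASE,
    )
    return ctl_text

generated = """unrecoverable load data
infile load_data.dat
truncate
into table LOAD_DATA
(
OWNER position(4:33) defaultif (3)='Y'
, OBJECT_NAME position(35:64) defaultif (34)='Y'
)
"""

edited = qualify_control_file(generated, "/u01/app/goldengate/dirdat", "SH")
print(edited)
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to apply to a control file that has already been edited.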
We now execute the LOAD_DATA.run file.
[oracle@pdemora061rhv goldengate]$ ./LOAD_DATA.run
SQL*Loader: Release 11.2.0.1.0 - Production on Sat Jun 16 07:05:01 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 50614.
Let us confirm this…
SQL> select count(*) from load_data;

  COUNT(*)
----------
     50614
Loading data using the BULKLOAD parameter
BULKLOAD directs the Replicat initial load process to communicate directly with the Oracle SQL*Loader interface and load data as a direct path bulk load operation.
The limitation of this method is that BULKLOAD is specific to the Oracle SQL*Loader utility and cannot be used for other databases. Also, if the table has columns with LOB or LONG data, then BULKLOAD cannot be used.
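Before choosing BULKLOAD for a table, the column data types can be checked against this restriction. The following Python sketch shows the idea. The function name and the list of type names are assumptions made for illustration, and the column types in the usage lines are hypothetical. In practice the name-to-type mapping could come from a data dictionary view such as ALL_TAB_COLUMNS.

```python
# Data types treated as LOB or LONG in this sketch (an assumed list).
LOB_TYPES = {"CLOB", "NCLOB", "BLOB"}
LONG_PREFIX = "LONG"  # covers LONG and LONG RAW

def bulkload_blockers(columns):
    """Return the names of columns whose data type rules out BULKLOAD.

    columns maps column name -> data type name, e.g. {"OWNER": "VARCHAR2"}.
    """
    blockers = []
    for name, data_type in columns.items():
        normalized = data_type.strip().upper()
        if normalized in LOB_TYPES or normalized.startswith(LONG_PREFIX):
            blockers.append(name)
    return blockers

# Hypothetical column types; the source does not list the types of LOAD_DATA.
print(bulkload_blockers({"OWNER": "VARCHAR2", "OBJECT_NAME": "VARCHAR2"}))
print(bulkload_blockers({"ID": "NUMBER", "DOC": "CLOB", "NOTE": "LONG RAW"}))
```

An empty list means none of the columns falls under the LOB or LONG restriction.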
Let us test this out using the same source and target tables which we used in the previous example.
Target database
SQL> truncate table load_data;

Table truncated.

SQL> select count(*) from load_data;

  COUNT(*)
----------
         0
On the Source GoldenGate environment these are the contents of the extract parameter file:
[oracle@pdemora062rhv dirprm]$ cat load1.prm
EXTRACT load1
USERID ggs_owner, PASSWORD ggs_owner
RMTHOST pdemora061rhv, MGRPORT 7809
RMTTASK replicat, GROUP load2
TABLE sh.load_data;
On the target GoldenGate environment these are the contents of the replicat parameter file:
[oracle@pdemora061rhv dirprm]$ cat load2.prm
REPLICAT load2
USERID ggs_owner, PASSWORD ggs_owner
BULKLOAD
ASSUMETARGETDEFS
MAP sh.load_data, TARGET sh.load_data;
On the source we now start the initial load extract process.
GGSCI (pdemora062rhv.asgdemo.asggroup.com.au) 1> start extract load1

Sending START request to MANAGER ...
EXTRACT LOAD1 starting

GGSCI (pdemora062rhv.asgdemo.asggroup.com.au) 2> info extract load1

EXTRACT    LOAD1     Last Started 2012-06-15 06:21   Status STOPPED
Checkpoint Lag       Not Available
Log Read Checkpoint  Table SH.LOAD_DATA
                     2012-06-15 06:22:13  Record 50614
Task                 SOURCEISTABLE
On the target database we see that the rows from the source table have been inserted into the LOAD_DATA table.
SQL> select count(*) from load_data;

  COUNT(*)
----------
     50614