Saturday, July 3, 2010

Oracle GoldenGate Best Practices and Tips

Lately I've been working, once again, with GoldenGate (now Oracle GoldenGate) data integration software. GoldenGate offers tremendously useful capabilities, including CDC (Change Data Capture), data warehouse ETL, efficient low-impact data replication from diverse database management systems, real-time standby database maintenance (for high availability, upgrades, and patches), feeding Oracle Data Integrator (ODI), and data distribution. So, I thought I'd offer some GoldenGate best practices and tips that I've learned largely by making mistakes:

I. Best Practices

PARALLEL PROCESSING

Ensure the system has enough memory. GoldenGate Extract and Replicat run as operating system processes, and each one can consume 25-50 MB or more of system memory. That memory is no longer available to the Oracle DBMS, especially the SGA, so size the host accordingly.

Use parallel Replicat groups on the target system to reduce latency through parallelism. Consider parallel Extract groups for tables that are fetch-intensive (e.g., those that trigger SQL procedures).

Group tables that have referential integrity (RI) with each other into the same Extract-Replicat pair.

Pair each Replicat with its own trail and corresponding Extract process.

When using parallel Replicats, configure each one to process a different portion of the overall data, for example by hash-splitting rows with the @RANGE function (sketched below).
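Here is a minimal sketch of that split, assuming a table hr.orders with key column order_id and two Replicat groups rep1 and rep2 (all names here are my own illustrations, not from any particular environment):

-- rep1.prm
REPLICAT rep1
USERID ggs_admin, PASSWORD ggs_pw
ASSUMETARGETDEFS
MAP hr.orders, TARGET hr.orders, FILTER (@RANGE(1, 2, order_id));

-- rep2.prm: identical except it takes the second bucket
REPLICAT rep2
USERID ggs_admin, PASSWORD ggs_pw
ASSUMETARGETDEFS
MAP hr.orders, TARGET hr.orders, FILTER (@RANGE(2, 2, order_id));

@RANGE hashes the key column into the stated number of buckets, so the two groups apply disjoint halves of the rows and can run concurrently without colliding.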

PASSTHRU PARAMETER

Consider using this parameter when no filtering, conversion, or mapping is required and you're using a data pump.

In pass-through mode, the Extract process does not look up table definitions, either from the database or from a data-definitions file. Pass-through mode increases the throughput of the data pump because all of the functionality that looks up object definitions is bypassed, saving database fetches and improving performance.
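For example, a pass-through pump parameter file can be as small as this (the host, port, trail path, and schema are assumptions for illustration):

-- pump1.prm
EXTRACT pump1
PASSTHRU
RMTHOST tgthost, MGRPORT 7809
RMTTRAIL /ggs/dirdat/rt
TABLE hr.*;

Note that with PASSTHRU the pump needs no database login for definition lookups, and its TABLE statements may not perform filtering or column mapping.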

INSERTAPPEND

A new GoldenGate 10.4 feature.

Use it for large transactions.

It causes Replicat to apply inserts with an APPEND hint, placing records at the end of the table rather than doing a more costly insert into other areas of the table.
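A hedged sketch of how it sits in a Replicat parameter file (group and table names are assumed); NOINSERTAPPEND switches the behavior back off for subsequent MAP statements:

-- repbulk.prm
REPLICAT repbulk
USERID ggs_admin, PASSWORD ggs_pw
ASSUMETARGETDEFS
INSERTAPPEND
MAP sales.big_fact, TARGET sales.big_fact;
NOINSERTAPPEND
MAP sales.small_dim, TARGET sales.small_dim;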


DATAPUMP (not the Oracle DB utility)

1. The primary Extract group writes to a trail on the source system.

2. The data pump reads this trail and sends the data across the network to a remote trail on the target.

3. A pump adds storage flexibility and also serves to isolate the primary Extract process from TCP/IP activity.

4. It can be configured for online or batch processing.

5. It can perform data filtering, mapping, and conversion, or it can be configured in pass-through mode, where data is passively transferred as-is, without manipulation.

6. Use it to perform filtering, thereby removing that processing overhead from the primary Extract group.

7. Use one or more pumps for each source and each target for parallelism (a GGSCI sketch follows this list).
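Creating a pump in GGSCI takes roughly this shape; the group name and trail paths here are assumed placeholders, not values from the guide:

ADD EXTRACT pump1, EXTTRAILSOURCE /ggs/dirdat/lt
ADD RMTTRAIL /ggs/dirdat/rt, EXTRACT pump1

The first command attaches the pump to the local trail written by the primary Extract; the second registers the remote trail on the target that the pump will write to.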

In most business cases, it is best practice to use a data pump. Some reasons for using a data pump include the following:

● Protection against network and target failures:
In a basic GoldenGate configuration, with only a trail on the target system, there is nowhere on the source system to store data that the Extract process continuously extracts into memory. If the network or the target system becomes unavailable, the primary Extract could run out of memory and abend. However, with a trail and data pump on the source system, captured data can be moved to disk, preventing the abend. When connectivity is restored, the data pump extracts the data from the source trail and sends it to the target system(s).

● You are implementing several phases of data filtering or transformation. When using complex filtering or data transformation configurations, you can configure a data pump to perform the first transformation either on the source system or on the target system,
and then use another data pump or the Replicat group to perform the second transformation.

● Consolidating data from many sources to a central target. When synchronizing multiple source databases with a central target database, you can store extracted data on each source system and use data pumps on each of those systems to send the data to a trail
on the target system. Dividing the storage load between the source and target systems reduces the need for massive amounts of space on the target system to accommodate data arriving from multiple sources.

● Synchronizing one source with multiple targets. When sending data to multiple target systems, you can configure data pumps on the source system for each target. If network connectivity to any of the targets fails, data can still be sent to the other targets (see the sketch below).
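As an illustration, a one-source, two-target layout could use one pump per target (hosts, ports, and trail paths are my own assumptions):

-- pump_a.prm
EXTRACT pump_a
RMTHOST targeta, MGRPORT 7809
RMTTRAIL /ggs/dirdat/ta
TABLE hr.*;

-- pump_b.prm
EXTRACT pump_b
RMTHOST targetb, MGRPORT 7809
RMTTRAIL /ggs/dirdat/tb
TABLE hr.*;

If targetb becomes unreachable, only pump_b stops; pump_a keeps delivering, and pump_b resumes from the source trail once the connection is restored.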


STEP-BY-STEP DATA PUMP CONFIGURATION

ON THE SOURCE SYSTEM

To configure the Manager process (Reference: Oracle GoldenGate Administration Guide, Version 10.4):

1. On the source, configure the Manager process according to the instructions in Chapter 2.

2. In the Manager parameter file, use the PURGEOLDEXTRACTS parameter to control the purging of files from the local trail.
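A minimal Manager parameter file along those lines might read as follows (the port, trail prefix, and retention window are assumptions for illustration):

-- mgr.prm
PORT 7809
PURGEOLDEXTRACTS /ggs/dirdat/lt*, USECHECKPOINTS, MINKEEPDAYS 3

USECHECKPOINTS ensures a trail file is purged only after every process reading it has checkpointed past it, so the pump or Replicat never loses data it still needs.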

To configure the primary Extract group:

3. On the source, use the ADD EXTRACT command to create a primary Extract group. For documentation purposes, this group is called ext.

ADD EXTRACT ext, TRANLOG, BEGIN NOW
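TRANLOG tells Extract to capture from the database transaction log, and BEGIN NOW starts capture from the current time (a timestamp can be given instead). The corresponding parameter file for ext might then look like this minimal sketch (the login, trail path, and table list are my own assumed examples, not from the guide):

-- ext.prm
EXTRACT ext
USERID ggs_admin, PASSWORD ggs_pw
EXTTRAIL /ggs/dirdat/lt
TABLE hr.*;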