Configuring
Function Operation Instructions
1. Parameter Meanings
When configuring a MongoDB data synchronization task, here is the detailed meaning of each parameter:
- workName:
  - Meaning: Task name
  - Description: Identifies the data synchronization task. If not provided, it defaults to "workNameDefault".
 
- sourceDsUrl:
  - Meaning: Source MongoDB connection URL
  - Description: Specifies the connection URL of the source MongoDB database, which can be a single node, a replica set, or a sharded cluster.
 
- targetDsUrl:
  - Meaning: Target MongoDB connection URL
  - Description: Specifies the connection URL of the target MongoDB database, which can be a single node, a replica set, or a sharded cluster.
 
- syncMode:
  - Meaning: Synchronization mode
  - Description: Specifies the data synchronization mode, which can be one of the following options:
    - "all": Full mode. Syncs all tables; operations applied to the source tables during the synchronization are not included.
    - "allAndRealTime": Full plus real-time mode. Performs a full sync first, then starts real-time sync.
    - "allAndIncrement": Full plus incremental mode. Performs a full sync first, then syncs only the operations applied to the source tables during the full sync.
    - "realTime": Real-time mode. Syncs according to the configured start and end times.
 
- realTimeType:
  - Meaning: Real-time task type
  - Description: Selects the type of real-time task, which can be "oplog" or "changestream".
  - Additional Information:
    - "oplog": Uses MongoDB's oplog for real-time synchronization. Suitable when the source is a replica set, supports DDL operations, and is faster.
    - "changestream": Uses MongoDB's change streams for real-time synchronization. Suitable when the source is a replica set or mongos, does not support DDL operations, and runs at moderate speed.
 
- fullType:
  - Meaning: Full task type
  - Description: Selects the type of full task, which can be "sync" or "reactive".
  - Additional Information:
    - "sync": Uses a stable transmission method for full synchronization.
    - "reactive": Uses a faster transmission method for full synchronization.
 
- dbTableWhite:
  - Meaning: Tables to synchronize
  - Description: Specifies the tables to synchronize using regular expressions. For example, `mongodb\..+` syncs all tables under the mongodb database. The default is to sync all tables.
 
- ddlFilterSet:
  - Meaning: DDL operations to synchronize
  - Description: Specifies the DDL operations to synchronize, separated by commas. The default is `*`, which syncs all DDL operations.
 
- sourceThreadNum:
  - Meaning: Number of source task threads (full mode)
  - Description: Specifies the number of threads that read from the source during full synchronization.
 
- targetThreadNum:
  - Meaning: Number of target task threads (full mode)
  - Description: Specifies the number of threads that write to the target during full synchronization.
 
... (Continues with the rest of the parameter explanations)
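
As a rough illustration of how a dbTableWhite pattern selects tables, the Python sketch below matches a regular expression against `database.table` namespaces. It assumes the whole namespace string must match the pattern; the document does not say whether the tool matches in full or in part, so treat that as an assumption. The helper `select_namespaces` and the sample namespaces are hypothetical.

```python
import re


def select_namespaces(pattern, namespaces):
    """Return the namespaces that the pattern matches in full.

    Assumption: the entire "database.table" string must match.
    The sync tool's actual matching rule may differ.
    """
    compiled = re.compile(pattern)
    return [ns for ns in namespaces if compiled.fullmatch(ns)]


# Hypothetical namespaces on a source instance.
namespaces = ["mongodb.users", "mongodb.orders", "test.users", "mongodbx.logs"]

# Pattern from the dbTableWhite description: every table in the "mongodb" database.
selected = select_namespaces(r"mongodb\..+", namespaces)
print(selected)
```

Because the dot after `mongodb` is escaped, a database named `mongodbx` is not selected; an unescaped dot would match any character there and could pull in unintended databases.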
2. Parameter Usage Scope
| Parameter           | Real-Time Task | Full Task | Full + Increment Task | Full + Real-Time Task |
|--------------------|--------------|----------|----------------------|-----------------------|
| workName           | ✔️           | ✔️       | ✔️                   | ✔️                    |
| sourceDsUrl        | ✔️           | ✔️       | ✔️                   | ✔️                    |
| targetDsUrl        | ✔️           | ✔️       | ✔️                   | ✔️                    |
| syncMode           | ✔️           | ✔️       | ✔️                   | ✔️                    |
| realTimeType       | ✔️           |          | ✔️                   | ✔️                    |
| fullType           |              | ✔️       | ✔️                   | ✔️                    |
| dbTableWhite       | ✔️           | ✔️       | ✔️                   | ✔️                    |
| ddlFilterSet       | ✔️           |          | ✔️                   | ✔️                    |
| batchSize          | ✔️           | ✔️       | ✔️                   | ✔️                    |
| bucketNum          | ✔️           | ✔️       | ✔️                   | ✔️                    |
| bucketSize         | ✔️           | ✔️       | ✔️                   | ✔️                    |
| startOplogTime     | ✔️           |          |                      |                       |
| endOplogTime       | ✔️           |          | ✔️                   | ✔️                    |
| delayTime          | ✔️           |          |                      |                       |
| nsBucketThreadNum  | ✔️           |          |                      |                       |
| writeThreadNum     | ✔️           |          |                      |                       |
| ddlWait            | ✔️           | ✔️       | ✔️                   | ✔️                    |
| clusterInfoSet     | ✔️           | ✔️       | ✔️                   | ✔️                    |
| bind_ip            | ✔️           | ✔️       | ✔️                   | ✔️                    |
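
The table above can be turned into a simple pre-flight check. The following Python sketch encodes the table as a dictionary and reports parameters in a configuration that do not apply to the chosen syncMode. It assumes the four table columns correspond to the syncMode values "realTime", "all", "allAndIncrement", and "allAndRealTime", in that order. The function `unused_parameters` and the sample values are hypothetical and are not part of the tool.

```python
ALL_MODES = frozenset({"realTime", "all", "allAndIncrement", "allAndRealTime"})
REAL_TIME_ONLY = frozenset({"realTime"})
NOT_FULL_ONLY = ALL_MODES - {"all"}        # every mode except plain full sync
NOT_REAL_TIME_ONLY = ALL_MODES - {"realTime"}  # every mode that runs a full sync

# One entry per row of the usage-scope table.
USAGE_SCOPE = {
    "workName": ALL_MODES,
    "sourceDsUrl": ALL_MODES,
    "targetDsUrl": ALL_MODES,
    "syncMode": ALL_MODES,
    "realTimeType": NOT_FULL_ONLY,
    "fullType": NOT_REAL_TIME_ONLY,
    "dbTableWhite": ALL_MODES,
    "ddlFilterSet": NOT_FULL_ONLY,
    "batchSize": ALL_MODES,
    "bucketNum": ALL_MODES,
    "bucketSize": ALL_MODES,
    "startOplogTime": REAL_TIME_ONLY,
    "endOplogTime": NOT_FULL_ONLY,
    "delayTime": REAL_TIME_ONLY,
    "nsBucketThreadNum": REAL_TIME_ONLY,
    "writeThreadNum": REAL_TIME_ONLY,
    "ddlWait": ALL_MODES,
    "clusterInfoSet": ALL_MODES,
    "bind_ip": ALL_MODES,
}


def unused_parameters(config):
    """List configured parameters that the table marks as unused for this syncMode."""
    mode = config["syncMode"]
    if mode not in ALL_MODES:
        raise ValueError(f"unknown syncMode: {mode!r}")
    # Parameters missing from the table are skipped rather than flagged.
    return sorted(
        name
        for name in config
        if name in USAGE_SCOPE and mode not in USAGE_SCOPE[name]
    )


# Placeholder values for illustration only.
config = {
    "workName": "demo",
    "syncMode": "all",
    "fullType": "sync",
    "realTimeType": "oplog",
    "delayTime": "10",
}
print(unused_parameters(config))
```

Parameters that do not appear in the table, such as sourceThreadNum and targetThreadNum, are ignored by this check.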
3. Data Validation
# Data validation script
# 0: Multi-threaded validation: list validation methods 1-8 after 0 to run them concurrently
# 1: Estimated count validation for databases and tables (may be inaccurate)
# 2: Exact count validation for databases and tables
# 3: dbHash validation for databases and tables (locks the database, use with caution)
# 4: Random-sample validation: the source side randomly selects 100 documents from the databases and tables and checks that they exist on the target side
# 5: Per-type validation: for each _id data type, extracts 100 documents (the first 50 and the last 50) and checks that they exist on the target side
# 6: Check databases and tables for missing index information
# 7: Check databases and tables for missing index information and create the missing indexes
# 8: Database-level dbHash validation (locks the database, use with caution)
# 9: Output detailed validation logs. When not specified, the log records only abnormal validation results
# Options can be combined, e.g., 123456, 123457, 1237. If not specified, the default combination is 16
checkData=12456
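
To make a checkData value easier to read, the Python sketch below expands the digit string into the validation steps listed in the comments above. It is only an illustration of the encoding: the function `decode_check_data` is hypothetical, and the short labels are paraphrases of the comments, not output from the tool.

```python
# Short labels paraphrasing the validation options documented above.
CHECKS = {
    "0": "run the selected checks concurrently",
    "1": "estimated count",
    "2": "exact count",
    "3": "dbHash per database and table (locks the database)",
    "4": "100 random documents exist on target",
    "5": "100 documents per _id data type exist on target",
    "6": "report missing indexes",
    "7": "report and create missing indexes",
    "8": "dbHash per database (locks the database)",
    "9": "detailed validation log",
}

DEFAULT_CHECK_DATA = "16"  # documented default when checkData is not specified


def decode_check_data(value=""):
    """Expand a checkData digit string into (digit, label) pairs."""
    value = value.strip() or DEFAULT_CHECK_DATA
    unknown = [digit for digit in value if digit not in CHECKS]
    if unknown:
        raise ValueError(f"unsupported checkData digits: {unknown}")
    # dict.fromkeys keeps the first occurrence of each digit, in order.
    return [(digit, CHECKS[digit]) for digit in dict.fromkeys(value)]


for digit, label in decode_check_data("12456"):
    print(digit, label)
```

Calling `decode_check_data()` with no argument falls back to the documented default combination 16.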