1. Deduplication Guide & Core Concepts
The Remove Duplicates node scans incoming data arrays and strips out redundant elements. Run it before batch insertions into databases like MySQL or before dispatching mass email campaigns, so the same record is not written or emailed twice.
Configuring Compare Modes
- All Fields (Full Object Match): Compares every key and value of the JSON structure. If two objects are identical across all keys and values, the redundant instance is removed.
- Specific Fields (Targeted Match): Instructs the engine to only verify specific keys (e.g., email or phone_number). The system catches duplicates even if their metadata (like timestamps) differs.
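The two compare modes can be sketched in a few lines of Python. This is an illustration of the idea only, not nLink's implementation: the function name dedupe, the fields parameter, and the sample rows are all hypothetical.

```python
import json

def dedupe(items, fields=None):
    """Remove duplicates, keeping the first occurrence of each item.

    fields=None  -> compare whole objects (All Fields).
    fields=[...] -> compare only the listed keys (Specific Fields).
    """
    seen = set()
    unique = []
    for item in items:
        if fields is None:
            # Canonical JSON, so key order does not affect equality.
            key = json.dumps(item, sort_keys=True)
        else:
            # Only the selected keys take part in the comparison.
            key = json.dumps([item.get(f) for f in fields])
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique

# Hypothetical sample data: same email, different timestamps.
rows = [
    {"email": "a@example.com", "ts": 1},
    {"email": "a@example.com", "ts": 2},
    {"email": "a@example.com", "ts": 1},
]
all_fields = dedupe(rows)                  # only the exact copy is dropped
by_email = dedupe(rows, fields=["email"])  # differing timestamps are ignored
```

With All Fields, the second row survives because its timestamp differs; with Specific Fields on email, only the first row survives.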
2. Advanced Deduplication Features
Unlike standard automation platforms, nLink allows you to dictate exactly which item survives: Keep First (retains the first occurrence) or Keep Last (retains the last occurrence, typically the most recently updated item). Surviving items keep their original relative order.
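As a rough sketch of the difference between Keep First and Keep Last, assuming survivors stay at the position where they appeared in the input (the function dedupe_keep and its parameters are hypothetical, not part of nLink):

```python
def dedupe_keep(items, key_field, keep="first"):
    """Deduplicate on one field, keeping the first or last occurrence."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")

    if keep == "first":
        seen = set()
        out = []
        for item in items:
            k = item[key_field]
            if k not in seen:
                seen.add(k)
                out.append(item)
        return out

    # keep == "last": remember the index of the final occurrence of each key,
    # then emit only those items, still in input order.
    last_index = {item[key_field]: i for i, item in enumerate(items)}
    return [item for i, item in enumerate(items)
            if last_index[item[key_field]] == i]

# Hypothetical sample data: id 1 appears twice.
items = [
    {"id": 1, "v": "a"},
    {"id": 2, "v": "b"},
    {"id": 1, "v": "c"},
]
kept_first = dedupe_keep(items, "id", keep="first")
kept_last = dedupe_keep(items, "id", keep="last")
```

Here kept_first contains the "a" and "b" items, while kept_last contains the "b" and "c" items, in that order.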
- Ignore Case Sensitivity: When activated, the engine normalizes strings like Admin@gmail.com and admin@gmail.com to the same form, so they are identified as the same user.
- Dot Notation Support: Reach deep into nested, multi-layered data using dot syntax (e.g., data.customer.id).
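Both options amount to transforming the value that gets compared. A minimal sketch, assuming nested items are plain dictionaries; the helper names get_path and compare_key and the data.customer.email path are made up for illustration:

```python
def get_path(item, path):
    """Follow a dot-separated path such as 'data.customer.id' through nested dicts.

    Returns None if any part of the path is missing.
    """
    current = item
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

def compare_key(item, path, ignore_case=False):
    """Build the value used for duplicate comparison."""
    value = get_path(item, path)
    if ignore_case and isinstance(value, str):
        # casefold() is a more aggressive lower() intended for caseless matching.
        value = value.casefold()
    return value

# Hypothetical sample data: the same address with different casing.
records = [
    {"data": {"customer": {"email": "Admin@gmail.com"}}},
    {"data": {"customer": {"email": "admin@gmail.com"}}},
]
keys_sensitive = {compare_key(r, "data.customer.email") for r in records}
keys_insensitive = {compare_key(r, "data.customer.email", ignore_case=True)
                    for r in records}
```

With case sensitivity on, the two records produce two distinct keys; with it ignored, they collapse to one.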
3. Frequently Asked Questions (FAQ)
Will this node cause Out-Of-Memory (OOM) crashes with millions of rows?
It is designed not to. The Remove Duplicates engine tracks items using 16-byte MD5 hashes and O(N log N) memory-index tracking, and consumes up to 99% less RAM than traditional caching systems.
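The general idea of hash-based tracking can be sketched as follows: store a fixed-size 16-byte digest per item instead of the item itself, so memory for the index grows with the number of distinct items rather than their size. This is an assumption about the technique, not nLink's actual code; the names fingerprint and dedupe_stream are hypothetical, and the sketch uses a Python set (a hash table) rather than the index structure the engine uses.

```python
import hashlib
import json

def fingerprint(item):
    """Return a 16-byte MD5 digest of the item's canonical JSON form."""
    canonical = json.dumps(item, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).digest()

def dedupe_stream(items):
    """Yield each distinct item once, holding only digests in memory."""
    seen = set()
    for item in items:
        fp = fingerprint(item)
        if fp not in seen:
            seen.add(fp)
            yield item

# Hypothetical sample data: the first two items differ only in key order.
unique = list(dedupe_stream([
    {"a": 1, "b": 2},
    {"b": 2, "a": 1},
    {"a": 2},
]))
```

Because dedupe_stream is a generator, items can be processed one at a time without loading the whole input alongside a second full copy.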
Can I route the deleted duplicate items to another node?
By design, this node acts as a destructive filter to ensure main-thread purity. If you need to route duplicates into a separate warning system (like Slack), we recommend evaluating array counts using the If / Else Node.
