Domino Code Fragment

Code Name*
Deleting Duplicate Documents from a Database
Date*
04/28/2024
Source (or email address if you prefer)*
Rlatulippe@romac.com
Description*
Problem: It is often necessary to know whether a database contains duplicate documents. A macro that can identify, or even delete, those duplicates would be useful.

Solution: By duplicate documents we mean documents with the same ID, where ID is whatever unique identifier you choose: CallNumber, CustomerName, SocialSecurityNumber, and so on. (This should not be confused with Replication Conflicts, which are multiple instances of the same document.) Given this definition, there are two principal cases to consider:
1. Duplicate documents exist within a single database.
2. New versions of some documents are about to be imported, and the user wants to avoid creating duplicates, so the old documents must be deleted before their new counterparts are imported.
Type*
Formula
Categories*
(Misc)
Implementation:
Required Client:
Server:
Limitations:
Comments:
Files/Graphics attachments (if applicable):
Code:
Here are the solutions:

1. The following macro deletes all but the most current document in each set of duplicates, where "most current" is determined by a field called Date. The unique identifier here is a field called CallNumber.
list := @DbLookup("Notes"; ""; "Dup"; CallNumber; "Date");
biggest := @Subset(list; -1) - [01/01/01 01:01:01 AM];
current := Date - [01/01/01 01:01:01 AM];
@If(current = biggest; ""; @DeleteDocument);
SELECT @All

The View called Dup has just two columns. The first column is sorted on CallNumber.
The second column is sorted on Date (ascending order).
Note that this only works for documents whose Date is later than January 1st, 1901.
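
A possible variant of formula #1 is sketched below. It skips the document when the lookup itself fails (for example, if the Dup view is missing) instead of relying on how an error value compares with a date. It assumes that your Notes release compares time-date values directly with =, so the subtraction of the reference date is dropped. This is a sketch under that assumption, not the original author's macro.

```
REM "Sketch only: variant of formula #1 with a guard for lookup errors.";
list := @DbLookup("Notes"; ""; "Dup"; CallNumber; "Date");
REM "Lookup failed: do nothing. Date matches the last (latest) entry: keep. Otherwise: delete.";
@If(@IsError(list); "";
    Date = @Subset(list; -1); "";
    @DeleteDocument);
SELECT @All
```

As with the original formula, documents that tie for the latest Date are all kept.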

2. The following macro deletes every document in the current database that has a duplicate elsewhere. Here, duplicate docs are defined as those that share a CallNumber with a document in another database, called Import.nsf.
@If(@IsError(@DbLookup("Notes"; "ServerName" : "Import.nsf"; "Dup"; CallNumber; "CallNumber")); ""; @DeleteDocument);
SELECT @All
The View called Dup need only have one column, sorted on CallNumber.

In both #1 and #2, the macro can be run as either a Filter Macro or a Background Macro. If run as a Filter Macro, it will only put Deletion Marks next to the duplicate documents; in a Background Macro, it will actually delete them.
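
To identify duplicates without deleting anything, a similar lookup could record how many documents share each CallNumber. The sketch below rests on assumptions: DupCount is a hypothetical field name that does not appear in the original macros, and the lookup uses the Dup view described for formula #1.

```
REM "Sketch only: count the documents that share this CallNumber.";
list := @DbLookup("Notes"; ""; "Dup"; CallNumber; "Date");
REM "DupCount is a hypothetical field; 0 means the lookup failed.";
FIELD DupCount := @If(@IsError(list); 0; @Elements(list));
SELECT @All
```

Any document whose DupCount is greater than 1 has at least one duplicate, so a view could select on that field to list the duplicates for review.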

Explanation of formula #1:

list := @DbLookup("Notes"; ""; "Dup"; CallNumber; "Date");
This generates a list of the Dates from all documents that share the current document's CallNumber.
biggest := @Subset(list; -1) - [01/01/01 01:01:01 AM];
This takes the last date in the list, which is the greatest because the View is sorted on Date in ascending order. Subtracting the fixed reference date turns it into a number (the difference from that reference date).
current := Date - [01/01/01 01:01:01 AM];
This turns the current document's Date into a number in the same way, so the two values can be compared.
@If(current = biggest; ""; @DeleteDocument);
If current = biggest, then do nothing; otherwise, delete the current document.
SELECT @All
Run on all documents.

Explanation of formula #2:

@If(@IsError(@DbLookup("Notes"; "ServerName" : "Import.nsf"; "Dup"; CallNumber; "CallNumber")); ""; @DeleteDocument);
If the macro cannot find a document in the Import.nsf database with the same CallNumber as the current document, @DbLookup returns an error. In that case, do nothing; otherwise, delete the current document.
SELECT @All
Run on all documents.