|
dedupeIT breaks the matching process into two steps:
-
The Import process loads the data into a dedupeIT database, and generates
phonetic and other "match keys". The phonetic keys allow it to match e.g.
Deighton and Dayton, Phillips and Philips.
-
The Find Matches process takes the data file created by Import, and looks for
matches (i.e. potential duplicates) in it. It does this by using match keys to
highlight pairs of records that might be matches. It takes each pair of records
that may match and compares the name, address and Zip/postcode fields in both
records. Each of these items that matches contributes a score to a total match
score for the pair. The more closely the pair of records matches overall, the
higher the total match score. There are many specific techniques applied to the
different types of data, for example:
-
When comparing name fields, dedupeIT compares them as a block, even if they are
separated into e.g. title, initials and last name. dedupeIT allows for
phonetic, reading and typing errors in the data. It also allows for nicknames
and missing and reversed elements in names e.g. Robert Graham, Bob Graham and
Mr R J Graham all get a sure match score; Robert Graham and Mr Graham get a
likely match score; Robert Graham and Graham Robert get a possible match score.
-
When comparing address lines, dedupeIT also compares them as a block, to allow
for cases where e.g. one address has a house or building name and the other
address does not.
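As a rough illustration of the two steps above, the sketch below builds a crude phonetic key and uses it to pick out candidate pairs. It is not dedupeIT's actual algorithm: `crude_phonetic_key` and `candidate_pairs` are hypothetical names, and the key rules (silent GH, PH as F, doubled letters collapsed, vowels dropped) are assumptions chosen only so that the example names above share a key.

```python
import re
from collections import defaultdict
from itertools import combinations


def crude_phonetic_key(name: str) -> str:
    """Hypothetical phonetic key; NOT the key dedupeIT generates."""
    s = re.sub(r"[^A-Z]", "", name.upper())   # keep letters only
    s = s.replace("GH", "")                   # assume GH is silent (Deighton)
    s = s.replace("PH", "F")                  # assume PH sounds like F
    s = re.sub(r"(.)\1+", r"\1", s)           # collapse doubled letters
    if not s:
        return ""
    # Keep the first letter, drop vowels and weak consonants from the rest.
    return s[0] + re.sub(r"[AEIOUYHW]", "", s[1:])


def candidate_pairs(last_names):
    """Group record indexes by key and return the pairs worth comparing."""
    groups = defaultdict(list)
    for index, name in enumerate(last_names):
        groups[crude_phonetic_key(name)].append(index)
    pairs = []
    for members in groups.values():
        pairs.extend(combinations(members, 2))
    return sorted(pairs)


names = ["Deighton", "Dayton", "Phillips", "Philips", "Graham"]
print(candidate_pairs(names))
```

Only records that share a key are paired, so the more detailed name, address and Zip/postcode comparison runs on a small subset of all possible pairs.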
If none of the matches shown are potential duplicates, it is
likely that something has gone wrong with the setup of the file by the Import
Wizard. You can check this by looking at the names of the fields shown in the
Inspect Matches window. If any field name does not correctly describe the
data within that field, repeat the job, paying particular attention to the
naming of all the fields to ensure that all the name and address fields are
correctly labelled. Even if the field names are correct, try repeating the job.
If the dupes are still all false, .
dedupeIT shows the lower scoring (least likely) matches first,
so as you go through the list of matches, they should start to include more
true matches and fewer false matches. If this is the case and the matches at the
lower scores are all false, you can increase the Minimum score to report in
dedupeIT Options to a value that excludes these false matches. If the false
matches are distributed evenly through the file, refer to the answer to the
previous question.
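The effect of raising the Minimum score to report can be pictured with a small sketch. The function name `report_matches`, the tuple layout and the scores are all hypothetical; the sketch only shows that pairs below the minimum are dropped and the rest are listed lowest score first.

```python
def report_matches(scored_pairs, minimum_score):
    """Keep pairs at or above the minimum, lowest score first.

    scored_pairs holds (record_a, record_b, score) tuples; the layout and
    the score values are illustrative assumptions.
    """
    kept = [pair for pair in scored_pairs if pair[2] >= minimum_score]
    return sorted(kept, key=lambda pair: pair[2])


scored = [(1, 2, 92), (3, 4, 61), (5, 6, 78), (7, 8, 64)]

# Suppose review shows that every pair scoring below 70 is a false match.
for pair in report_matches(scored, minimum_score=70):
    print(pair)
```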
Refer to the answer to the previous question. If the matches are
all true matches above a particular score, you can use Delete Matches when
prompted (or from the Results menu) to automatically remove all the dupes that reach that score.
Check that the field names are correct by looking at the names
of the fields shown in the Inspect Matches window. If any field name does not
correctly describe the data within that field, repeat the job, paying
particular attention to the naming of all the fields to ensure that all the
name and address fields are correctly labelled. If the field names are correct,
try reducing the Minimum score to report in dedupeIT Options by 10 or 15. If
the duplicates are still not all being picked up, .
This happens when more than two records match each other. If records 1, 2 and
3 all match, then record 1 matches record 2, record 1 matches record 3, and
record 2 matches record 3. You will therefore see each of these records twice
when reviewing the matches, but not necessarily consecutively, because the
different pairs may have received different scores. For this reason,
dedupeIT allocates a unique reference to each record, so that you can see
whether a record has been repeated in another pair, or is actually a different
record.
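The pair counting described above can be checked with a few lines of Python. The record references 1, 2 and 3 are the ones used in the answer; the variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations

group = [1, 2, 3]                      # three records that all match each other
pairs = list(combinations(group, 2))   # every pair that can be reported
appearances = Counter(ref for pair in pairs for ref in pair)

print(pairs)         # [(1, 2), (1, 3), (2, 3)]
print(appearances)   # each reference appears in two pairs
```

In general a group of n mutually matching records can produce n(n-1)/2 pairs, and each record appears in n-1 of them.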
Yes. You can overtype data or use the normal Edit functions of
Cut (Ctrl+X), Copy (Ctrl+C) and Paste (Ctrl+V). In addition, you can copy the
contents of any field to the corresponding field in the other record by
right-clicking on the field.
Not in dedupeIT. You can buy a more functional product from the
helpIT systems' range, matchIT, to do this. For more information, visit
www.helpit.com or .
Yes. To see what each shortcut is, hover the mouse over the corresponding button.
This skips over the displayed match pair and all others that
have the same match score as the displayed pair. It will display the first
matching pair that has a higher match score.
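The skip behaviour can be sketched as a search over the list of match scores, assuming the pairs are held in ascending score order as they are displayed. The function name `next_higher_score_index` is hypothetical and the scores are made up.

```python
from bisect import bisect_right


def next_higher_score_index(scores, current_index):
    """Return the index of the first pair scoring higher than the current one.

    scores must be sorted in ascending order. Returns None when no pair
    has a higher score than the one currently displayed.
    """
    position = bisect_right(scores, scores[current_index])
    return position if position < len(scores) else None


scores = [60, 60, 60, 72, 72, 85]
print(next_higher_score_index(scores, 0))   # 3: first pair scoring 72
print(next_higher_score_index(scores, 4))   # 5: the pair scoring 85
print(next_higher_score_index(scores, 5))   # None: nothing scores higher
```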
Refer to the answer to the question "When I delete
the dupes in Inspect Matches, I often find records that have already been
deleted or have already been shown".
The most common cause of this is that one or both of the names lack a forename
or initial, for example Mr J Brown and Mr Brown. In this case, dedupeIT doesn't
have enough information to be sure that these are actually the same person,
e.g. they could be father and son. dedupeIT gives this match a lower score than
an exact match to reflect this possible difference.
The fewer data items that are the same, the lower the match
score - empty data items score less than non-empty identical items. Refer also
to the answer to the previous question. Other common reasons for this happening are:
- neither record has a zip/postcode
- the person's name does not seem to be a valid name e.g. Managing Director, Part Sales
- company name is empty, when deduping to business level.
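The sketch below shows, under invented weights, why two records with no zip/postcode score lower than two records with the same zip/postcode. The constants `FULL_MATCH` and `BOTH_EMPTY`, the field names and the functions are hypothetical and are not dedupeIT's real scoring rules; exact comparison also stands in for the fuzzier matching dedupeIT applies.

```python
FULL_MATCH = 10   # hypothetical score for identical, non-empty values
BOTH_EMPTY = 3    # hypothetical, smaller score when both values are empty


def field_score(a: str, b: str) -> int:
    """Score one data item: empty items count for less than identical ones."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a and not b:
        return BOTH_EMPTY
    return FULL_MATCH if a == b else 0


def total_score(rec_a, rec_b, fields=("name", "address", "postcode")):
    """Add up the contribution of each data item for a pair of records."""
    return sum(field_score(rec_a.get(f, ""), rec_b.get(f, "")) for f in fields)


with_zip = {"name": "Bob Graham", "address": "1 High St", "postcode": "AB1 2CD"}
without_zip = {"name": "Bob Graham", "address": "1 High St", "postcode": ""}

print(total_score(with_zip, with_zip))         # all three items identical
print(total_score(without_zip, without_zip))   # postcode empty in both records
```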
Switch on the dedupeIT option Must have match on gender to avoid
seeing these matches. This setting is the default for matching to Person level.
To restore the default settings, just reselect Person (contact) Matching Level.
From dedupeIT Options, just reselect the appropriate Matching Level.
These are explained fully in the on-line Help in the topics Import Summary and Matching Summary.
|