#1
Sorry to ask what seems like such a simple question, but... I was just given a database that badly needs to be redesigned... looks like someone for whom Excel got too big...

Basic structure:
Client Info (Name, Address, Phone...)
Repair Info (Item, Service Date Info, some notes)

The problem is that the Client Info has duplicates and "semi" duplicates. If you look at it, you can tell the addresses are the same, but you can't use SQL to identify them with complete consistency. (The differences are slight and in inconsistent places.)

What's the easiest way to go about fixing this? Creating a list of unique Customers is easy. Creating a list of the latest addresses is easy. How would I go about identifying likely duplicates? (I have about 15,000 records in the database, so doing it manually would take a LONG time!) I'd just like to get the bulk of the work done using SQL. I know I'll have to do some of it manually...

Any pointers? (I'd loan ya mine, but he's out hunting...)

Thanks!
Pieter
#3
I recently had a similar need, and found the Ratcliff/Obershelp algorithm (a string-similarity measure) to be most useful. I'm sure you can find the code on the web, like I did.
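For anyone wanting a starting point, here is a rough sketch of that idea in Python. It is not the code I used. Python's standard `difflib.SequenceMatcher` uses a matching approach closely related to Ratcliff/Obershelp, so it stands in for it here. The `(client_id, address)` record layout, the `normalize` helper, and the 0.85 threshold are all assumptions for illustration; you would tune them against your own data.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(address: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in address.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(records, threshold=0.85):
    """Yield (id_a, id_b, score) for each pair scoring at or above threshold.

    records: sequence of (client_id, address) tuples (hypothetical layout).
    threshold: assumed cut-off; pick one by eyeballing real matches.
    """
    for (id_a, addr_a), (id_b, addr_b) in combinations(records, 2):
        score = similarity(addr_a, addr_b)
        if score >= threshold:
            yield id_a, id_b, score


if __name__ == "__main__":
    # Made-up sample rows, not taken from any real database.
    sample = [
        (1, "123 Main St., Apt 4"),
        (2, "123 Main St Apt. 4"),
        (3, "77 Ocean Avenue"),
    ]
    for id_a, id_b, score in likely_duplicates(sample):
        print(id_a, id_b, round(score, 2))
```

Comparing every pair among 15,000 records is roughly 112 million comparisons, so it would probably pay to compare only within groups that already agree on something cheap (postal code or surname, say) and then review the flagged pairs by hand.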