dbTalk Databases Forums  

Need help serializing duplicate records

comp.databases.filemaker comp.databases.filemaker


Discuss Need help serializing duplicate records in the comp.databases.filemaker forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Scott
 
Posts: n/a

Default Need help serializing duplicate records - 07-29-2003 , 03:52 PM






Hi everyone...hope you can help!

We have a single database containing contact info for individuals and
their companies. There are about 8000 records total, with about half of
those records containing duplicated company info.

In other words, contact John Smith works for ABC Company and so does
Sally Jones. There are two records with different contact info but the
same company info.

I need to (1) locate the duplicate records and (2) assign each unique
company with a serial number (company_id). Then I want to (3) split out
all of the contacts and relate them back by the company_id. The
contacts have already been serialized with a contact_id.

Essentially we want to make the database company driven rather than
individual driven.

I am drawing a complete blank on how to accomplish this. Any
suggestions would be REALLY helpful to get me going.

Scott
scottcoatsNOSPAM (AT) nasfm (DOT) org


Please remove the anti-spam stuff from my email if replying.


Reply With Quote
  #2  
Old   
Bridget Eley
 
Posts: n/a

Default Re: Need help serializing duplicate records - 07-29-2003 , 06:21 PM






Hi Scott

Please, PLEASE back up your database before you test this.

Both scripts assumes that company names have been entered correctly and
consistently and that they are unique. If company names are misspelled,
they will be treated as new companies. You can minimise this risk in future
by formatting this field as a pop-up list that references the field itself.
If a company already exists, it should be selected from the list rather than
entered manually. If two companies have the same name, they will be treated
as one company. If this is going to be an issue for you, you may need to
find all the records that fall into this category manually, create another
field into which to enter something that would make the company name unique
then concatenate the two fields in a calc field then substitute the calc
field for the company field in both scripts.

The first script is only meant to be used once - as soon as the existing
records are processed, delete it and begin using the second script below to
enter Company No in new records.

Update existing records:

Show All Records
Sort [ Sort Order: Company (Ascending) ] [ Restore sort order, No dialog ]
Go to Record/Request/Page [ First ]
Set Field [ Current Company No, 1 ]
Set Field [ Current Company, Company ]
Loop
If [ Company = Current Company ]
Set Field [ Company No, Current Company No ]
Else
Set Field [ Current Company, Company ]
Set Field [ Current Company No, Current Company No + 1 ]
Set Field [ Company No, Current Company No ]
End If
Go to Record/Request/Page [ Next, Exit after last ]
End Loop

(where Current Company is a global text field and Current Company No is a
global number field)

Enter Company No for new records:

If [IsEmpty(Company No)]
Exit Record/Request
If [ Count(Company::Company) = 1 ]
Set Field [ Company No, Max(Constant::Company No) + 1 ]
Else
Set Field [ Company No, Company::Company No ]
End If
Else
Beep
End If

(where the constant self join rel is based on calc field = 1 on both sides
and where company self join rel is based on company field on both sides)

Take the Company No field out of the tab order, format it to be
non-enterable and as a button that runs this script.

Bridget Eley


in article 290720031652021375%noone (AT) nowhere (DOT) com, Scott at noone (AT) nowhere (DOT) com
wrote on 30/7/03 6:52 AM:

Quote:
Hi everyone...hope you can help!

We have a single database containing contact info for individuals and
their companies. There are about 8000 records total, with about half of
those records containing duplicated company info.

In other words, contact John Smith works for ABC Company and so does
Sally Jones. There are two records with different contact info but the
same company info.

I need to (1) locate the duplicate records and (2) assign each unique
company with a serial number (company_id). Then I want to (3) split out
all of the contacts and relate them back by the company_id. The
contacts have already been serialized with a contact_id.

Essentially we want to make the database company driven rather than
individual driven.

I am drawing a complete blank on how to accomplish this. Any
suggestions would be REALLY helpful to get me going.

Scott
scottcoatsNOSPAM (AT) nasfm (DOT) org


Please remove the anti-spam stuff from my email if replying.



Reply With Quote
  #3  
Old   
Scott
 
Posts: n/a

Default Re: Need help serializing duplicate records - 07-30-2003 , 01:24 PM



Hi Bridget,

First, thanks for your elegant and thoughful reply. You wrote it in
such a way that even a FM neophyte like myself could understand it.

The first script went perfectly. I haven't implemented the second, but
it looks like the right solution.

Please post you contact info if you do contract work. The folks I'm
helping may have paid opportunities in the future.

Thanks again,

Scott


In article <BB4D3EB6.2324%bridgeteley (AT) ihug (DOT) com.au>, Bridget Eley
<bridgeteley (AT) ihug (DOT) com.au> wrote:

Quote:
Hi Scott

Please, PLEASE back up your database before you test this.

Both scripts assumes that company names have been entered correctly and
consistently and that they are unique. If company names are misspelled,
they will be treated as new companies. You can minimise this risk in future
by formatting this field as a pop-up list that references the field itself.
If a company already exists, it should be selected from the list rather than
entered manually. If two companies have the same name, they will be treated
as one company. If this is going to be an issue for you, you may need to
find all the records that fall into this category manually, create another
field into which to enter something that would make the company name unique
then concatenate the two fields in a calc field then substitute the calc
field for the company field in both scripts.

The first script is only meant to be used once - as soon as the existing
records are processed, delete it and begin using the second script below to
enter Company No in new records.

Update existing records:

Show All Records
Sort [ Sort Order: Company (Ascending) ] [ Restore sort order, No dialog ]
Go to Record/Request/Page [ First ]
Set Field [ Current Company No, 1 ]
Set Field [ Current Company, Company ]
Loop
If [ Company = Current Company ]
Set Field [ Company No, Current Company No ]
Else
Set Field [ Current Company, Company ]
Set Field [ Current Company No, Current Company No + 1 ]
Set Field [ Company No, Current Company No ]
End If
Go to Record/Request/Page [ Next, Exit after last ]
End Loop

(where Current Company is a global text field and Current Company No is a
global number field)

Enter Company No for new records:

If [IsEmpty(Company No)]
Exit Record/Request
If [ Count(Company::Company) = 1 ]
Set Field [ Company No, Max(Constant::Company No) + 1 ]
Else
Set Field [ Company No, Company::Company No ]
End If
Else
Beep
End If

(where the constant self join rel is based on calc field = 1 on both sides
and where company self join rel is based on company field on both sides)

Take the Company No field out of the tab order, format it to be
non-enterable and as a button that runs this script.

Bridget Eley


in article 290720031652021375%noone (AT) nowhere (DOT) com, Scott at noone (AT) nowhere (DOT) com
wrote on 30/7/03 6:52 AM:

Hi everyone...hope you can help!

We have a single database containing contact info for individuals and
their companies. There are about 8000 records total, with about half of
those records containing duplicated company info.

In other words, contact John Smith works for ABC Company and so does
Sally Jones. There are two records with different contact info but the
same company info.

I need to (1) locate the duplicate records and (2) assign each unique
company with a serial number (company_id). Then I want to (3) split out
all of the contacts and relate them back by the company_id. The
contacts have already been serialized with a contact_id.

Essentially we want to make the database company driven rather than
individual driven.

I am drawing a complete blank on how to accomplish this. Any
suggestions would be REALLY helpful to get me going.

Scott
scottcoatsNOSPAM (AT) nasfm (DOT) org


Please remove the anti-spam stuff from my email if replying.




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.