Class: CLEANING UP YOUR DATABASE
©2007 by Donald R. Snow
Return to Don's Class Listings page or to the home page of Utah Valley PAF Users Group. This page was last updated 9 Aug 2007.
INTRODUCTION
- Instructors are Elder and Sister Donald R. and Diane
M. Snow of the England London Mission, Hyde Park
Family History Centre (snowd@math.byu.edu, dms34@juno.com)
- These notes with active Internet links are
posted on the
Utah Valley PAF Users Group
website http://uvpafug.org under Class Outlines, Don's
Listings.
- These notes are written for PAF (Personal Ancestral File), but the ideas are useful for any genealogy program, e.g. Ancestral Quest, RootsMagic, Legacy, Family Tree Maker
- Any database being worked with needs cleaning up periodically -- for example, our Early LDS online database at http://earlylds.com has 54,000 names of early LDS Church members; we have corrected 1000's of errors, but there are still many more
CLEANING UP YOUR PAF DATABASE -- WHY
- Reasons you need complete, uniform, and
accurate names, dates, and places
- So you can upload your additional data to New
FamilySearch when it is released "soon"
- To make reports and lists complete and correct for
library visits and/or genealogy field trips
- To be able to find what you are looking
for when filtering
- To eliminate duplicates, i.e. the same person entered twice with different spellings or even the same spelling
- For temple work
- To be able to find already completed
ordinances in TempleReady
- To be able to send names to the temple
with accurate data for the IGI/IIGI
- So TempleReady recognizes countries and states
spelled in English with no abbreviations and also some countries spelled
in their language (The list in "TempleReady Reference Guide", pages B-6
1-3, available at any FHC, is now out-of-date.)
- If you have the same person entered twice, you might inadvertently duplicate the temple work yourself.
- With incorrect places TempleReady won't find ordinances already completed and they will be duplicated; also ordinance data will go into the wrong IGI region, e.g. WORLD MISC, and when people find the correct data later the temple work will probably be redone.
- When you are ready to submit names to the temple, take a backup of
your complete and corrected PAF database to a FHC and restore it on a
TempleReady-for-Windows computer to do the name selection. That way
all corrections, additions, and temple data that you find in TRWin will be
put directly into your PAF database while forming the submission.
HOW DATA SHOULD BE ENTERED
- NAMES -- names (in English and most languages) should
be entered in PAF as spoken and the PAF Preferences can
be set to put the slashes around the surname,
e.g. [Given Names] /[Surname]/ [Title, e.g. Jr., Sr., III]
- See Tools/Preferences in PAF 5 for options to automatically enter slashes // around surnames depending on the way the surname occurs in that language -- in English you enter the name as [Given Names Surname Jr.] and PAF enters the slashes and knows that Jr. is a suffix and not part of the surname
- Names and places should be entered in upper and
lower case letters, e.g. Richard Daniel Smith, not Richard Daniel SMITH, nor
RICHARD DANIEL SMITH
- For ease of reading click
Tools/Preferences/Names and check the box for Surnames in Caps, so they show in
caps, but don't enter them that way so you have the choice of showing them
in caps or not
- PLACES -- places should be entered completely from smallest to
largest jurisdiction, e.g. "City, County, State, Country", separated by
comma-space (the spaces are for uniformity and easier reading)
- PAF 5 will allow you to use
more than 4 fields and some locations require more than 4 fields
- Names and places should be complete and uniform
- No abbreviations, except "USA" -- spell everything else out to avoid confusion; e.g., does "CO" mean Colorado or County? Does "CA" mean California or Central America? Does "DE" mean Delaware or Germany (Deutschland)?
- Include country to avoid problems -- e.g., Does
"Georgia" mean the state or the country? Use "Georgia, USA" to avoid
confusion.
- Places should normally be entered as they were at the time of the event -- see PAF 5.2 Users Guide, p 45, by clicking on PAF Help/Users Guide
- Most people don't follow this rule
completely, e.g. they write Woburn, Middlesex, Massachusetts, USA,
for a location in the 1600's even though there was no USA before 1789
- The crucial thing is that someone can find the location and its records -- you can include information on county changes in the notes
- Get exact places, where possible
- If you don't have proof of the place
yet, but you know there is some connection, put "of, [Town],
[County], [State], [Country]", or "of, , [County], [State], [Country]"
- Separate the "of" by a comma so the
place alphabetizes near where it belongs instead of with the
O's -- also TempleReady might not recognize it
- Dates and places that are inside angle brackets, < , , , >, have been computer estimated, as in the Ancestral File -- it helps to change them to "Abt xxxx" and "of, ..." until you find the exact data.
- Until you find
the exact date for
birth and marriage it helps to distinguish people by putting "Abt
[year]" in birth and marriage fields
- For rules
to estimate dates for birth and marriage
(if you know they were married), see A Member's
Guide to Temple and FH Work (white booklet), pages 11-12
- DON'T use these "Abt" dates for temple work, unless you can't find the exact dates with reasonable effort
- For temple work
include all dates and information you can find for the person
(birth,
christening, marriage, death, and burial dates) since these are
included and indexed in the IIGI, so someone searching the IIGI
on any of these will find the completed ordinances.
- TEMPLE CODES
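The place-entry rules above lend themselves to a mechanical check. The following is a minimal Python sketch, not anything built into PAF: the function name `check_place` and the rule that any all-caps field of three letters or fewer (other than "USA") is a possible abbreviation are assumptions made for illustration.

```python
import re

def check_place(place):
    """Return a list of possible problems in one place string (illustrative rules only)."""
    problems = []
    text = place.strip()
    # Angle brackets mark computer-estimated data, as in the Ancestral File
    if text.startswith("<") and text.endswith(">"):
        problems.append("computer-estimated place in angle brackets")
        text = text[1:-1].strip()
    # Every comma should be followed by exactly one space
    if re.search(r",(?! )|, {2,}", text):
        problems.append("fields not separated by comma-space")
    # "of" should be its own field: "of, Town, ..." rather than "of Town, ..."
    if text.lower().startswith("of "):
        problems.append("'of' not separated by a comma")
    fields = [f.strip() for f in text.split(",")]
    if fields and fields[0].lower() == "of":
        fields = fields[1:]
    for field in fields:
        if field == "USA":
            continue  # the one abbreviation the notes allow
        if field.isupper() and len(field) <= 3:
            problems.append(f"possible abbreviation: {field}")
        elif field.isupper():
            problems.append(f"entered in all caps: {field}")
    return problems

for sample in ["Nauvoo, Hancock, Illinois, USA", "Denver,CO", "of, , Hancock, Illinois, USA"]:
    print(sample, "->", check_place(sample))
```

A check like this only points at entries to look at; deciding what the correct place is still takes the tools described later in these notes.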
CORRECTING FILE STRUCTURE ERRORS
- File structure errors -- computer science type errors
-- You have no control over these and they creep in from various
operations.
- Run File/Check-Repair/CHECK (Note: It's Check-Repair/CHECK, NOT Check-Repair/CHECK-REPAIR at
first) and look at the report to see if your database has any structure
errors
- DON'T run
the File/Check-Repair/CHECK-REPAIR option until you know there
are no structure
errors involving notes for any individual since there
is a bug in PAF 5.2
- If
file structure errors exist, but none involve notes for any individual, then you are safe to run
the CHECK-REPAIR option to try to repair it
- If file structure errors
exist, and any involve notes for individuals, don't run the
CHECK-REPAIR option since there is a bug in PAF 5.2 and it may
disconnect the notes from some individuals -- correct your database
with other methods, e.g. PAF Insight ( http://www.ohanasoftware.com ) or try
GEDCOM'ing everything out and into a new database
- PAF Insight will correct some file structure errors that PAF won't, and vice versa -- PAF Insight doesn't have the notes problem
- Always check
your file for structure errors before exporting any data
or clearing names for temple or working on data errors
FINDING DATA ERRORS
- KINDS OF ERRORS
- Data errors -- duplicate names, sources, or
repositories, typos, logical data errors, "loops", incomplete places, places
not separated by commas, county not shown, abbreviations
- Many location errors can be corrected without
much research since they are misspellings, wrong jurisdictions, etc.
- Errors in dates usually take more research to correct
- A good long-range goal is to
verify and get copies and enter sources for each piece of data in your file
- For large databases there are additional complications -- see my notes Working With Large Databases on Don's Class Listings page on http://uvpafug.org
- FINDING AND CORRECTING DUPLICATES
- Several ways to find and correct duplications -- PAF's Tools/Match-Merge, PAF
Insight -- http://www.ohanasoftware.com , GenMerge -- http://www.genmerge.com/
- Merging is a "dangerous" operation and can't be undone without going to a backup -- always make a backup before any merging -- see my Basic PAF class notes on Don's Class Listings
- Finding duplicates in a large database may be difficult since names
may be spelled slightly differently and not alphabetize near
each other at all
- http://www.namethesaurus.com/ -- a website that
shows surname and given name variant spellings, plus Soundex and Metaphone
codes
- Merging duplicate sources in PAF -- run
Tools/Match-Merge Duplicate Sources and Citations
- Sources and repositories have to be spelled exactly the same before they will merge -- you may have to edit them to match first
- Other programs may help in merging sources, e.g. http://www.rootsmagic.com
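Soundex codes, mentioned above, are one way to bring variant spellings of a surname together even when they don't alphabetize near each other. The sketch below implements the standard American Soundex rules in Python and groups surnames that share a code. It is only an illustration: the function names are made up here, and other programs or websites may code names somewhat differently.

```python
from collections import defaultdict

# Standard American Soundex letter codes; vowels, H, W, and Y have no code
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the 4-character Soundex code of a name ("" if it has no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (result + "000")[:4]

def possible_duplicates(surnames):
    """Group surnames by Soundex code and keep the codes shared by more than one entry."""
    groups = defaultdict(list)
    for surname in surnames:
        groups[soundex(surname)].append(surname)
    return {code: names for code, names in groups.items() if len(names) > 1}

print(possible_duplicates(["Snow", "Smith", "Snowe", "Jones", "Smyth"]))
```

Names that share a code are only candidates for a closer look; compare dates, places, and relatives before merging anything.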
- FINDING LOGICAL DATA ERRORS
- POSSIBLE PROBLEMS REPORT in PAF --
Print/Lists/Possible Problems/
- Do a Preview of this list (Print/Lists/Possible Problems/Preview) and begin correcting errors, e.g. "Children in a family are in wrong birth order" or "Child was born before its mother", etc.
- Can save Problems list as a text file so you
can have it opened in WordPad while you are making corrections
in PAF
- Click Print/Lists/Possible Problems/Print-to-File (lower right side of screen), then Print -- save it as a "problems.rtf" file (a rich text file)
- Remake this list periodically, clear WordPad, and
load the new list showing the changes you have already made
- Can change some options of what PAF shows as
errors -- click on Options button on Print Menu before you run Possible
Problems list
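To show what kind of test lies behind two of the messages quoted above, here is a small Python sketch. It is not PAF's own logic: the `Person` and `Family` classes and the function name are hypothetical, and only birth years are compared.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Person:
    name: str
    birth_year: Optional[int] = None  # None when the birth year is unknown

@dataclass
class Family:
    mother: Person
    children: List[Person] = field(default_factory=list)

def possible_problems(family):
    """Return problem messages for one family, comparing known birth years only."""
    problems = []
    known = [c for c in family.children if c.birth_year is not None]
    years = [c.birth_year for c in known]
    # Children should be listed oldest first
    if years != sorted(years):
        problems.append("Children in a family are in wrong birth order")
    if family.mother.birth_year is not None:
        for child in known:
            if child.birth_year < family.mother.birth_year:
                problems.append(f"{child.name} was born before its mother")
    return problems

family = Family(Person("Mary", 1800),
                [Person("Ann", 1825), Person("John", 1822), Person("Eli", 1790)])
for message in possible_problems(family):
    print(message)
```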
- LOOPS in the data, e.g. someone who is linked as
their own ancestor
- PAF 5.2 won't indicate loops, but a few other
programs will
- PAF Companion 5.2 (bundled with PAF 5.2 CD)
-- do a Pedigree Chart preview and it will stop on RIN's that are
in loops
- GenMerge finds loops in preparing file for merging -- http://www.genmerge.com
- Can check Possible Problems report to see if a
child was born when parents were not appropriate ages -- sometimes very
difficult to find loops in your data
- To cut a loop,
unlink the child from the parents where the first "repeat" occurs
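Finding a loop amounts to following parent links upward and noticing when a RIN comes up a second time on the same line of ascent. The Python sketch below illustrates the idea under the assumption that the parent links are available as a dictionary mapping each RIN to the RINs of its parents; the function name and the data layout are made up here and are not how PAF Companion or GenMerge work internally.

```python
def find_loop(parents, start):
    """Return the RINs of a loop found among the ancestors of start, or None.

    parents maps a RIN to a list of parent RINs. The returned list begins and
    ends with the repeated RIN, e.g. [1, 2, 4, 1].
    """
    path = []          # current line of ascent, from start upward
    on_path = set()    # the same RINs, for fast lookup
    finished = set()   # RINs whose ancestors have been fully checked

    def visit(rin):
        if rin in on_path:
            # rin is its own ancestor: report the loop from its first occurrence
            return path[path.index(rin):] + [rin]
        if rin in finished:
            return None
        path.append(rin)
        on_path.add(rin)
        for parent in parents.get(rin, ()):
            loop = visit(parent)
            if loop:
                return loop
        path.pop()
        on_path.discard(rin)
        finished.add(rin)
        return None

    return visit(start)

# RIN 4 is wrongly linked as a child of RIN 1, who is 4's own grandchild
print(find_loop({1: [2, 3], 2: [4], 4: [1]}, 1))
```

In the returned list the last two RINs are the child and the parent at the point where the repeat occurs, which is the link to look at when cutting the loop.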
- FINDING PLACE NAME ERRORS
- SETTING COLUMNS IN INDIVIDUAL VIEW to find problems
- In
Individual View you can set the columns to show data you want -- may see
some obvious spelling errors or no county shown, etc.
- To select the columns to view in
Individual View, right-click any column heading to open menu,
then left-click Add or Modify Columns, and select the columns you want
- Move items between the left and right panes by
the arrows ">" and "<"
- Move items up or down in the right pane to get the order you want the columns in; then click OK
- Can control width of columns by dragging the
column title separators right or left
- Can also move entire columns by dragging and
dropping their headings right or left
- Can sort the data by RIN's or alphabetically by
clicking in title bars of those two columns -- second click there reverses
the sort order
- Limitations of Individual View
- Can't sort the data by any column except
RIN's or alphabetically
- Can't show marriage places or dates
- Other programs will show these, e.g.
GENViewer (see below in Tools paragraph)
- Use Individual View to see if names or places are
entered in mixed case
- Go to Tools/Preferences/Names and uncheck Show
Surnames in Caps
- If surnames or places still show as capitalized, except for temple codes, they were entered in caps -- to change all, see Changing to Mixed Case in paragraph below
- PLACE SORTED LIST in PAF to find problems with
places
- To find problems in places, can generate a Places
Sorted Alphabetically list
- Preview the Places Sorted Alphabetically list by
Print/Lists/Places Sorted Alphabetically/Preview
- Can save this Places list as a text file by
checking Print-to-File (lower right hand side of screen), then Print --
save it as "places.rtf" file (a rich text file) and it opens in WordPad
- Use it to find corrections needed, do some corrections, remake the list, then clear WordPad and load the new list
- If the Places Sorted Alphabetically
list doesn't show the RIN's, go to Tools/Preferences/Names and tell PAF
you want RIN's appended to names -- then remake the Places Sorted
Alphabetically list
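A similar alphabetical list of places can be produced outside PAF from a GEDCOM export, since GEDCOM stores each place on a line tagged PLAC. The Python sketch below assumes the database has been exported to GEDCOM and that its lines have been read into a list. The function name is made up, and unlike PAF's list this one does not show RIN's.

```python
from collections import Counter

def places_sorted(gedcom_lines):
    """Return (place, count) pairs for every PLAC line, sorted alphabetically."""
    counts = Counter()
    for line in gedcom_lines:
        # A GEDCOM line is "level TAG value", e.g. "2 PLAC Nauvoo, Hancock, Illinois, USA"
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[1] == "PLAC":
            counts[parts[2].strip()] += 1
    return sorted(counts.items(), key=lambda item: item[0].lower())

sample = [
    "0 @I1@ INDI",
    "1 BIRT",
    "2 DATE 1 JAN 1820",
    "2 PLAC Nauvoo, Hancock, Illinois, USA",
    "1 DEAT",
    "2 PLAC Nauvoo, Hancock, IL",
    "0 @I2@ INDI",
    "1 BIRT",
    "2 PLAC Nauvoo, Hancock, Illinois, USA",
]
for place, count in places_sorted(sample):
    print(count, place)
```

Variant spellings of the same place end up next to each other in the output, which makes abbreviations and missing counties easier to spot.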
- OTHER PROGRAMS TO FIND PLACE ERRORS (see
details in later paragraphs in these notes)
- PAF INSIGHT -- free for FHC's -- very helpful viewer and editor -- http://www.ohanasoftware.com
- Shows alphabetical listing of
all locations in your PAF database and you can correct them to
be uniform
- Limitation is that PAF Insight doesn't find
place errors -- you have to find them some other way
- U.S. CITIES GALORE2 -- free for FHC's -- http://www.uscitiesgalore.com/ -- viewer and editor, but only works on GEDCOM's
- FAMILY ATLAS -- free for FHC's -- viewer, not an editor -- http://www.familyatlas.com
- MAP MY FAMILY TREE (used to be WORLD PLACE ADVISOR) ($40) -- http://www.progenygenealogy.com/
- Finds location errors in your PAF, Legacy,
FamilyTreeMaker, or other type file
- Checks spelling, counties, provinces, ambiguities
- Has 3 million towns and cities world-wide in its database and you can add others
- Viewer, not an editor, so it finds the location
errors, but you use your genealogy program to fix them
- GENVIEWER -- free version for FHC's -- http://www.mudcreeksoftware.com/ -- viewer, not an editor
- Very helpful program to find errors in
data, e.g. you can set columns to show any data and sort
on any column to find problems
- Runs on PAF
files, GEDCOM's, Legacy, Family Tree Maker, and other
genealogy programs
- Can download and try out the full version free
for 15 days
- Also does Internet searches on selected data
- See my notes on Working With Large Databases on Don's Class Listings page on http://uvpafug.org for many uses to find data errors
CORRECTING DATA ERRORS
TOOLS TO FIND CORRECT PLACE NAMES -- Finding correct spelling of places, county that a city is in, etc.
- INTERNET SEARCH ENGINES -- http://www.google.com, http://www.yahoo.com/, http://www.msn.com/, http://www.dogpile.com
- If you type a misspelled location into many search engines, they ask whether you meant the correct place, e.g. do a search for "Illinoin" and you get "Did you mean Illinois?"
- Also helpful to find the county, e.g. type in "Nauvoo Illinois county" (without the quotes) and you get Hancock County
- Also Google maps -- type Sandbach England into
maps.google.co.uk and you get Sandbach, Cheshire
- FHLC -- FAMILY HISTORY LIBRARY CATALOG
- GETTY MUSEUM THESAURUS from Getty Art Museum in Los
Angeles, California -- very helpful web site to find info about
places world-wide
- http://www.getty.edu/research/tools/vocabulary/tgn/
- To find this web site without having to remember this URL, do
a Google search -- http://www.google.com -- for "Getty Thesaurus" and
it comes up at or near the top
- Getty Thesaurus gives all places, rivers,
mountains, etc., containing that name world-wide
- Shows a hierarchy of the places from smallest
jurisdiction to largest; historical names of the location and geographical
coordinates
- Can copy and paste this data into your PAF notes,
if you want
- Black entries at bottom of note are old names of
current locations
- FUZZY GAZETTEER -- http://tomcat-dmaweb1.jrc.it/fuzzyg/query/?q
- Shows places with spellings near what you have
- NORTH AMERICA -- maps and gazetteers
- BRITISH ISLES -- maps and gazetteers
UPDATING TEMPLE CODES
- List of temple codes, both old and new,
on http://www.geocities.com/rgpassey/temple/abclist.htm
- Three ways to determine which incorrect temple codes
you have in your PAF database
- Use GENViewer with the columns set to show the
ordinance temples and sort on these columns -- incorrect codes show up
alphabetically between correct ones
- Make a GEDCOM of entire file and import it into a
new temporary database -- listing file for the import will show the
incorrect temple codes, but not in order and will repeat the incorrect
ones each time they occur
- Make a custom report -- select all individuals with any ordinance done and have the report show RIN, Name, Sex, Baptism Temple, Endowment Temple, Sealing to Parents Temple, and Sealing to Spouse Temple
- Sort the custom report on Baptism Temple, save it to a text file, and open it in WordPad -- incorrect Baptism Temple codes show up alphabetically between the correct ones -- make the corrections by Global Search and Replace on Temple Codes
- After correcting the codes in the Baptism
Temple list, remake the custom report and sort it on the Endowment Temple
to find more incorrect ones; correct these, and repeat this for the
Sealing to Parents Temple -- each correction made carries over to the
other fields
- Note: The
custom report in PAF 5.2 will not sort correctly on the Sealing to
Spouse Temple (bug in PAF 5.2), so you have to look over that list
without it being alphabetized, but most of the incorrect temple codes
will have already been found and corrected from the earlier sortings
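The sort-and-scan approach above can also be done as a direct comparison against the list of valid codes. The Python sketch below assumes the temple columns of the custom report have been read into rows of (RIN, list of codes). The function name is made up, and the codes in the example are invented placeholders; take the real valid codes from the published list.

```python
def unknown_temple_codes(rows, valid_codes):
    """Return {code: [RINs]} for every temple code that is not in valid_codes.

    rows is a list of (rin, codes) pairs, where codes holds the temple code
    from each ordinance field and "" stands for a blank field.
    """
    found = {}
    for rin, codes in rows:
        for code in codes:
            if code and code not in valid_codes:
                found.setdefault(code, []).append(rin)
    # Alphabetical order, as in the sorted report
    return dict(sorted(found.items()))

# Placeholder data for illustration only
valid = {"SLAKE", "LOGAN"}
rows = [(1, ["SLAKE", "SLAKE"]), (2, ["SL", ""]), (3, ["LOGAN", "LG"]), (4, ["SL"])]
print(unknown_temple_codes(rows, valid))
```

Because all the ordinance fields are checked at once, this approach does not depend on sorting each temple column separately.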
CONCLUSION
- Cleaning up your database is a never-ending process
and needs repeating regularly.
- There are many other procedures, e.g. how to find duplicates in large databases, how to change sources from Notes into real Sources -- see more ideas and procedures in my notes on Working With Large Databases on Don's Class Listings page on http://uvpafug.org
ASSIGNMENT
- Form a Places Sorted Alphabetically list from
your PAF database and look through it to recognize some errors you have.
- Use the Family History Library Catalog to find the county for some parish in the U.K. or town in the U.S.
- Go to a FHC and use PAF Insight and/or US Cities Galore 2 on your database to see and correct some place errors.
- Check the temple codes in your database and correct the ones needed.