[Editor’s note: This post originally appeared in Vita Brevis on 17 July 2014. Since the time of that posting, we have made enhancements to our search functionality on AmericanAncestors.org that return broader results without using wildcards. The wildcard strategy still works as advertised, however.]
When we were deciding how our AmericanAncestors.org database search would work, one of the key considerations was that we didn’t want to return search results that contained a lot of ‘noise.’ On other websites, the database architects allowed for a certain (sometimes significant) number of irrelevant search results. This was undoubtedly intended to be helpful, but it is actually quite frustrating. So we decided to do ‘exact’ searches with a couple of twists. The goal was to give results that were exactly what you searched for. We spent quite a lot of time tuning our search algorithm, trying different approaches and analyzing the results. We’re pretty happy with our final approach, but it’s definitely helpful to understand how it works and what the twists are.
Actually, I said that our searches are ‘exact’, but the ‘exact’ portion of the search applies to the surname, year range, record type, location, and any specific database or database type specified.
Twist #1 is that first names (which can include middle names and initials as well as maiden names) are searched with an ‘any match’ algorithm. So you will get a search result ‘hit’ by searching for given name ‘jon’ where ‘jon’ makes up any separate part of the given/middle/maiden name. For instance, searching our American Canadian Genealogical Society Index of Baptisms, Marriages, and Burials, 1840-2000, with ‘jon’ as a first name (and no other search fields filled in) will return hits for not only ‘Jon’ but also ‘Michael Jon,’ ‘Jon Alfred,’ ‘Jon Robert Jr,’ etc. Why did we do this? Well, often, a middle name or initial or suffix is not known by the searcher, and we feel that requiring an exact match in this case is too limiting. Any of those ‘Jons’ may be the one you’re looking for.
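For readers who like to see the idea spelled out, here is a minimal sketch of what an ‘any match’ test on given names could look like. It is not our actual search code: the function name and the sample list are made up for illustration, and it assumes that a ‘separate part’ means a whole word set off by spaces, compared without regard to case (so ‘Jonathan’, which is not one of the examples above, would not count as a match for ‘jon’).

```python
def given_name_matches(query: str, indexed_given_name: str) -> bool:
    """Return True if the query equals any whole, space-separated part
    of the indexed given name (assumption: comparison ignores case)."""
    parts = indexed_given_name.lower().split()
    return query.strip().lower() in parts


# Hypothetical sample of indexed given names, for illustration only.
sample_names = ["Jon", "Michael Jon", "Jon Alfred", "Jon Robert Jr", "Jonathan"]

# Keep every name in which 'jon' appears as a separate part.
jon_hits = [name for name in sample_names if given_name_matches("jon", name)]
print(jon_hits)
```

Under those assumptions, the first four names are kept and ‘Jonathan’ is left out, because ‘jon’ is only the beginning of that name rather than a separate part of it.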
But what if you don’t want to do an exact search for last names? It is certainly the case that some names have common spelling variations. And, in many cases, last names appear with the ‘as written’ spelling, often phonetically, as written by a town clerk or census enumerator, who often just wrote ‘what they heard.’ Sometimes when these phonetic surnames are indexed, we add spelling variations to make it possible to find the surname without having to resort to Twist #2. In our original database of Yarmouth, Massachusetts Vital Records to 1850,* we indexed last names as written. Those last names often contained many spelling variations, with ‘Eldridge’ being a prime example. In the Yarmouth records, ‘Eldridge’ was occasionally spelled as such and a search for ‘eldridge’ in the original Yarmouth to 1850 VR database results in 271 hits for that spelling. However, the surname variations of ‘Eldredg,’ ‘Eldred,’ ‘Eldreg,’ ‘Eldredge’ and others appear in those Yarmouth records.
Twist #2 involves the use of ‘wildcards.’ A search wildcard is a special character that represents any single character (a question mark, or ‘?’) or a sequence of any characters (an asterisk, or ‘*’). In the Eldridge example, with the original Yarmouth Vital Records to 1850 database, if you search for ‘eld*’ as a last name, you’ll get more hits than you would for the exact spelling alone. The asterisk allows all the various spellings of the name that start with ‘eld’ to be found.*
My own last name ‘Sturgis’ is frequently spelled as ‘Sturges.’ Some branches of the family traditionally used the ‘e’ spelling, but mine usually used the ‘i’ spelling. And sometimes the same individual’s name was spelled with both variations. When I’m researching my own family, I always use the ‘?’ wildcard as part of the last name: ‘sturg?s’. This gets me both spelling variations.
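If it helps to see the wildcard rules in action, the sketch below translates a wildcard pattern into a regular expression using Python’s standard `re` module. Again, this is only an illustration and not how AmericanAncestors.org is implemented: the function names and the sample surname list are hypothetical, and it assumes that ‘*’ can stand for zero or more characters, that ‘?’ stands for exactly one, and that matching ignores case.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a '?' / '*' wildcard pattern into a compiled regex.
    Assumptions: '*' means zero or more characters, '?' means exactly one."""
    pieces = []
    for ch in pattern:
        if ch == "*":
            pieces.append(".*")
        elif ch == "?":
            pieces.append(".")
        else:
            pieces.append(re.escape(ch))  # treat everything else literally
    return re.compile("".join(pieces), re.IGNORECASE)


def surname_matches(pattern: str, surname: str) -> bool:
    """Return True if the whole surname fits the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(surname) is not None


# Hypothetical sample of indexed surnames, for illustration only.
surnames = ["Eldridge", "Eldredg", "Eldred", "Eldreg", "Eldredge",
            "Sturgis", "Sturges", "Sturgess"]

eld_hits = [s for s in surnames if surname_matches("eld*", s)]
sturg_hits = [s for s in surnames if surname_matches("sturg?s", s)]
print(eld_hits)
print(sturg_hits)
```

With this sample, ‘eld*’ picks up all five Eldridge spellings, while ‘sturg?s’ picks up ‘Sturgis’ and ‘Sturges’ but not the made-up ‘Sturgess’, since ‘?’ stands in for only one character.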
The ‘?’ and ‘*’ wildcards can be used in any of the text fields, including First Name, Last Name, and keyword. If you haven’t tried using wildcards in searches, now’s the time to try them!
*Our original Yarmouth, Massachusetts Vital Records to 1850 database has been revised with the addition of first names and last name spelling variations and added to our Massachusetts Vital Records to 1850 database. A large number of previously ‘stand-alone’ databases have been given the same treatment.