Saturday 29 February 2020

MyHeritage releases 545 million records from 25,000 historical US City Directories

MyHeritage has released a new collection of 545 million records from 25,000 historical US City Directories.

Cities in the United States have been producing and distributing directories since the 1700s as an up-to-date resource to help residents find local individuals and businesses. City directories typically list residents' names (often with a spouse's name), addresses, occupations and workplaces. Some include additional information. They can be extremely useful to genealogists for placing ancestors at a particular location at a specific time, especially between censuses.

With so many publishers involved in producing so many city directories across the United States, there was no standard or agreed publishing format. As a result, the project had to handle a huge amount of varied data, and MyHeritage developed special technology to process the directories.

After all the information was parsed, MyHeritage identified records thought to describe the same individual who lived at one particular address over several years, as published in multiple editions of the city directories. The next step was to consolidate all these entries into one aggregated record that covers a span of years.

This reduced “search engine pollution,” where a search for a person would have returned multiple, very similar entries from successive years, obscuring other records. The aggregation makes it easier to spot career changes, approximate marriage dates, re-marriages, and plausible death dates.
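To make the idea of consolidation concrete, here is a minimal sketch in Python. It is not MyHeritage's algorithm, which the company describes in its own blog post; it simply assumes that entries sharing the same name and address belong to one person, and it uses hypothetical field names (`name`, `address`, `year`, `spouse`, `occupation`). The first year a spouse is listed is reported only as a rough hint towards a marriage date.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One line from one directory edition (hypothetical fields)."""
    name: str
    address: str
    year: int
    spouse: Optional[str] = None
    occupation: Optional[str] = None


def consolidate(entries):
    """Merge entries with the same name and address into one record each.

    Assumption: an exact (case-insensitive) match on name and address
    identifies the same person. Real data would need fuzzier matching.
    """
    groups = defaultdict(list)
    for e in entries:
        key = (e.name.strip().lower(), e.address.strip().lower())
        groups[key].append(e)

    merged = []
    for group in groups.values():
        group.sort(key=lambda e: e.year)

        # Keep each occupation once, in the order it first replaces the last.
        occupations = []
        for e in group:
            if e.occupation and (not occupations or occupations[-1] != e.occupation):
                occupations.append(e.occupation)

        years_with_spouse = [e.year for e in group if e.spouse]
        merged.append({
            "name": group[0].name,
            "address": group[0].address,
            "first_year": group[0].year,
            "last_year": group[-1].year,
            "source_count": len(group),
            "occupations": occupations,
            # Only a hint: the spouse may simply not have been listed earlier.
            "spouse_first_listed": years_with_spouse[0] if years_with_spouse else None,
        })
    return merged
```

In this sketch, a change in the `occupations` list points to a possible career change, and the `last_year` of a long run of entries is the sort of clue that might suggest a plausible death date, though a move to another address would look the same.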

As far as the company is aware, the algorithmic deduction of marriage and death events from city directories is unique to MyHeritage. For this one collection, 1.3 billion records, many of which included similar entries for the same individual, have been aggregated to 545 million records.

In one example, MyHeritage consolidated 31 records from the years 1912–1959 into a single record. Based on the information collected over the years, it is likely that Alfred and Mary Albert married circa 1914, and Alfred died circa 1959.

MyHeritage has published a detailed blog post on the methods used to create this collection.