OCRing the 1950 census

The greatest achievement of the release of the 1950 Census is not the records themselves, but the technology used to index the records. On April 1, 2022, the National Archives and Records Administration released the census on a dedicated website using unique optical character recognition (OCR) software designed to translate the handwritten names into text that can be searched online. This made 6.4 million digitized pages of the 1950 Census immediately available. Think about that – immediately available…?! It seemed too good to be true.

I had the opportunity to attend a roundtable with the National Archives and Records Administration before the release of the census. Members of the archives team demonstrated the new technology on the 1940 Census (since they couldn’t use the 1950 Census, owing to privacy laws) and each of their examples worked. It actually worked. I remained skeptical about how it would perform when the time came to search for my own ancestors, but for the time being, I was pretty impressed.

On April 1, 2022, at midnight, I started searching. I didn’t do it the old way: find the person’s likely address in 1950, determine the Enumeration District, browse the microfilm to the address, mess it up, start over again, etc. Instead, I just searched. And like all new technology, the OCR and Artificial Intelligence (AI) did not work perfectly, but it was pretty darn close. Of my four grandparents, I was able to find three of them using the searchable index. And they all lived in populous cities (New York and Lowell)! That’s remarkable. Really remarkable.

Now, what does that mean for the future? In theory, it means that we will be able to examine traditionally difficult records using OCR and AI. Deeds, probate records, pensions, personal papers, federal land grants, immigration records, town records (New England researchers rejoice!), and more could be more easily searched using this new technology. It won’t be perfect, but what searchable index is? I’ve had to manually examine census records, vital records, and naturalizations over the years. This advance doesn’t mean that genealogy will be easier – records will just become easier to find. The skill of putting everything together, understanding complex relationships, and broadening your scope will all still be part of the search.

Look for this technology to be the standard in the future. I am very excited to see where it takes us.

*

To learn more about the census, including the 1950 Census, please visit our dedicated page of census resources here: https://www.americanancestors.org/census-resources

Lindsay Fulton

About Lindsay Fulton

Lindsay Fulton joined the Society in 2012, first as a member of the Research Services team, and then as a Genealogist in the Library. She has been the Director of Research Services since 2016. In addition to helping constituents with their research, Lindsay has also authored Portable Genealogists on the topics of Applying to Lineage Societies, the United States Federal Census, 1790-1840, and the United States Federal Census, 1850-1940. She is a frequent contributor to the NEHGS blog, Vita-Brevis, and has appeared as a guest on the Extreme Genes radio program. Before NEHGS, Lindsay worked at the National Archives and Records Administration in Waltham, Massachusetts, where she designed and implemented an original curriculum program exploring the Chinese Exclusion Era for elementary school students. She holds a B.A. from Merrimack College and an M.A. from the University of Massachusetts-Boston.