OCRing the 1950 census

The greatest achievement of the release of the 1950 Census is not the records themselves, but the technology used to index the records. On April 1, 2022, the National Archives and Records Administration released the census on a dedicated website using optical character recognition (OCR) software designed to translate the handwritten names into text that can be searched online. This made 6.4 million digitized pages of the 1950 Census immediately available. Think about that – immediately available…?! It seemed too good to be true.

I had the opportunity to attend a roundtable with the National Archives and Records Administration before the release of the census. Members of the archives team demonstrated the new technology on the 1940 Census (since they couldn’t use the 1950 Census, owing to privacy laws) and each of their examples worked. It actually worked. I was still skeptical about how it would fare when the time came to search for my own ancestors, but for the time being, I was pretty impressed.

On April 1, 2022, at midnight, I started searching. I didn’t do it the old way: find the person’s likely address in 1950, determine the Enumeration District, browse the microfilm to the address, mess it up, start over again, etc. Instead, I just searched. And like all new technology, the OCR and Artificial Intelligence (AI) did not work perfectly, but it was pretty darn close. Of my four grandparents, I was able to find three of them using the searchable index. And they all lived in populous cities (New York and Lowell)! That’s remarkable. Really remarkable.

Now, what does that mean for the future? In theory, it means that we will be able to examine traditionally difficult records using OCR and AI. That means deeds, probate records, pensions, personal papers, federal land grants, immigration records, town records (New England researchers rejoice!), etc., could be more easily searched using this new technology. It won’t be perfect, but what searchable index is? I’ve had to manually examine census records, vital records, and naturalizations over the years. This advance doesn’t mean that genealogy will be easier – records will just become easier to find. The skill of putting everything together, understanding complex relationships, and broadening your scope will all still be part of the search.

Look for this technology to be the standard in the future. I am very excited to see where it takes us.

*

To learn more about the census, including the 1950 Census, please visit our dedicated page of census resources here: https://www.americanancestors.org/census-resources

Lindsay Fulton

About Lindsay Fulton

Lindsay Fulton is a nationally recognized professional genealogist and lecturer who joined American Ancestors in 2012. She leads the Research and Library Services team as Chief Research Officer, as well as the research team working on 10 Million Names. In addition to helping constituents with their research, Lindsay has authored Portable Genealogists on the topics of Applying to Lineage Societies and the United States Federal Census (1790-1950), and is a frequent contributor to the American Ancestors blog, Vita-Brevis. She was featured in the Emmy-winning program “Finding Your Roots: The Seedlings,” a web series inspired by the PBS series “Finding Your Roots with Henry Louis Gates, Jr.,” as well as another popular PBS series, “Samantha Brown’s Places to Love.” Before American Ancestors, Lindsay worked at the National Archives and Records Administration in Waltham, Massachusetts, where she designed and implemented an original curriculum program exploring the Chinese Exclusion Era for elementary school students. She holds a B.A. from Merrimack College and an M.A. from the University of Massachusetts-Boston. Areas of expertise: State and Federal Censuses, New England, Ireland, and New York research, with a focus on research methodology and organization.