Transforming historical bank records into data sets for research and teaching
The blog post and recording of our New York Public Library “Doc Chat” on the Brown Brothers Collection and New Orleans ledgers are now available! View them here.
The Textile Roots of Wall Street
Now housed in the iconic skyscraper at 140 Broadway, just blocks from the New York Stock Exchange in Manhattan’s Financial District, Brown Brothers Harriman is one of the oldest and largest private banking and wealth management institutions in the United States.
Brown Brothers Harriman, the product of a 1931 merger, traces its origins to Brown Brothers, founded in Philadelphia in 1818 by George and John Brown. In 1825, their brother James opened an affiliate company on Pine Street in New York City; eight years later, James established the headquarters of the now-consolidated Brown Brothers & Co. on Wall Street.
And, in many ways, the story of Brown Brothers & Co. is the story of Wall Street and of the rise of the financial industry in New York City.
Sons of an Irish immigrant who settled in Baltimore in 1800, the Brown brothers helped build a thriving textile operation. They began as importers of linen and soon moved on to cotton, purchasing vessels and establishing branches in Liverpool and the American South, including New Orleans. After its inception, Brown Brothers & Co. became a major lender and asset manager to the textile, commodities, and transportation industries in the U.S. and abroad.
Digitizing the Collection
Today, the New York Public Library holds over 110,000 pages of Brown Brothers & Co.’s business records. These materials include journals, ledgers, records of sale, and other documents dated between 1825 and 1880, among them financial documents from the bank’s New Orleans and Havana offices.
Because of its historical importance, size, and the condition of its contents, the Brown Brothers Collection was a great candidate for digitization. With the help of a Council on Library and Information Resources (CLIR) “Hidden Collections” grant — which funds the digitization of rare and unique materials underrepresented in digital collections — the entire collection was photographed and processed for addition to the New York Public Library Digital Collections.
From Photo to Data: NewYorkScapes and the Brown Brothers Collection Project
Even though the Brown Brothers Collection has been digitized, our knowledge of what it contains is still limited. The tens of thousands of available images aren’t searchable, or even fully documented. Because the materials are handwritten in 19th-century script, it’s nearly impossible to convert them to digital text automatically using existing tools.
In order to investigate the Brown Brothers Collection’s potential, NewYorkScapes embarked on a pilot project, generously funded by the NYU Center for the Humanities.
The project has three central goals:
- to conduct further historical research on the Brown Brothers Collection, including the important figures and events represented in the collection, the research that’s been done using these materials, and the collection’s place in the history of New York City banking
- to produce teaching materials and public programs focused on the information history of American capitalism, the plantation economy, and financial data
- to develop a data set and machine-readable transcriptions of the Brown Brothers Collection records, and to develop potential uses of the data by scholars, teachers, and the general public as an open-access resource
We’ve assembled a team of data scientists and humanities researchers to design models, explore methods of text and column detection, and ultimately extract machine-readable text from the digitized images programmatically. With a machine-readable data set, we can make parts of the collection searchable, as well as analyze the data in various ways to learn more about the stories represented in the collection.
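To give a sense of what column detection can involve, here is a minimal sketch of one classic technique, the vertical projection profile: count the ink pixels in each pixel column of a binarized page image and treat wide runs of near-empty columns as the gaps between ledger columns. This is an illustration only, not a description of the project's actual models. The function names, the `max_ink` and `min_gap_width` parameters, and the representation of a page as a list of rows of 0s and 1s are all assumptions made for the example.

```python
from typing import List, Tuple


def find_column_gaps(
    binary_page: List[List[int]],
    max_ink: int = 0,
    min_gap_width: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) x-ranges, end exclusive, of vertical whitespace gaps.

    binary_page is a hypothetical representation: a list of pixel rows,
    where 1 means ink and 0 means blank paper.
    """
    if not binary_page:
        return []
    width = len(binary_page[0])
    # Vertical projection profile: number of ink pixels at each x position.
    profile = [sum(row[x] for row in binary_page) for x in range(width)]

    gaps = []
    start = None  # x position where the current blank run began, if any
    for x, ink in enumerate(profile):
        if ink <= max_ink:
            if start is None:
                start = x
        else:
            # An inked position ends the current blank run.
            if start is not None and x - start >= min_gap_width:
                gaps.append((start, x))
            start = None
    # Close a blank run that reaches the right edge of the page.
    if start is not None and width - start >= min_gap_width:
        gaps.append((start, width))
    return gaps


def columns_from_gaps(
    width: int, gaps: List[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Return the (start, end) x-ranges of the inked columns between gaps."""
    columns = []
    cursor = 0
    for gap_start, gap_end in gaps:
        if gap_start > cursor:
            columns.append((cursor, gap_start))
        cursor = gap_end
    if cursor < width:
        columns.append((cursor, width))
    return columns
```

A simple profile like this tends to struggle with skewed pages, ruled lines, and handwriting that strays across column boundaries, which is one reason a project of this kind might explore learned models instead.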
Getting Started with Transcription
As we experiment with ways to turn the digital photographs in NYPL Digital Collections into machine-readable text (like text documents or spreadsheets), we have to “teach” our models to read the 19th-century handwriting in the documents.
To do that, we’re creating “training” data sets for our algorithms. To ensure that our training data is accurate and as complete as possible, we’re transcribing a subset of a ledger from the collection by hand.
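As a rough illustration of how hand transcriptions could be organized into a training set, the sketch below pairs each transcribed line with the image it came from and holds out a portion of the lines for checking a model's accuracy. It is a hypothetical layout, not the project's actual data format: the `TranscribedLine` fields, the CSV column names, and the 20 percent validation fraction are assumptions chosen for the example.

```python
import csv
import io
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TranscribedLine:
    """One hand-transcribed line of a ledger page (hypothetical schema)."""
    image_id: str      # identifier of the digitized page image
    line_number: int   # position of the line on the page
    text: str          # the human transcription of that line


def load_lines(csv_text: str) -> List[TranscribedLine]:
    """Parse CSV text with image_id, line_number, and text columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        TranscribedLine(row["image_id"], int(row["line_number"]), row["text"])
        for row in reader
    ]


def split_train_validation(
    lines: List[TranscribedLine],
    validation_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[TranscribedLine], List[TranscribedLine]]:
    """Shuffle reproducibly, then hold out a fraction for validation."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]
```

Keeping the held-out lines separate from the lines used for training is what lets a team measure how well a model reads handwriting it has not seen before.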
Manual document transcription can be time-consuming and requires careful attention to detail. For that reason, we welcome guest transcription and other forms of collaboration on this project. To get started as a transcriber, get project updates, or simply get in touch, please enter your information in the form below!