CEP/STICERD Applications Seminars
Deep learning methods to curate economic data at scale
Melissa Dell (Harvard University)
Monday 13 June 2022 16:00 - 17:30
Many of our seminars and public events this year will continue as in-person or hybrid (online and in-person) events. Please check our website listings and Twitter feed @STICERD_LSE for updates.
Unless otherwise specified, in-person seminars are open to the public.
Those unable to join the seminars in person are welcome to participate via Zoom if the event is hybrid.
About this event
Vast amounts of data are trapped in non-computable formats, such as document image scans and text. Deep learning has the potential to greatly expand the questions that economists can study by providing rigorous methods for converting non-computable information into structured, computable data. Combined with advances in GPU compute and inexpensive cloud compute, this makes it feasible to process data on a massive scale. This talk will provide an overview of our work to develop deep learning methods and tools for creating computable social science data, with an aim of making structured digital data more representative of documentary history. This work emphasizes lower-resource contexts - for which there are few incentives for commercial technology - and encompasses novel approaches and tools for document layout analysis, OCR, and NLP pipelines.