6 April 2017, 4-5pm CET

Robots Reading Vogue: How text and data mining (TDM) cast new light on a large historical archive

The ever-increasing amount of digitized cultural heritage material online has researchers and students alike curious about new approaches to making sense of large-scale digital archives. Text and data mining is one approach to meeting this challenge -- either by allowing researchers to search across vast amounts of data in order to answer specific questions, or by surfacing latent patterns within Big Data. 

In this webinar, Peter Leonard and Lindsay King from Yale University Library explain how text and data mining techniques cast new light on a large historical archive: every page of every issue of American Vogue magazine from 1892 to 2013. They will start with simple techniques, such as n-gram visualizations, and move on to more sophisticated approaches such as topic modeling and word embedding. Finally, they will consider some emerging ways of applying data mining techniques to photography and illustrations. 

Prior to Peter and Lindsay’s presentation, Evangelos Theodoridis (PhD) will present key concepts of TDM, along with indicative use cases and applications, and will highlight various challenges in this area.

For editorial questions

Timon Oefelein

Team Lead Account Development Europe

+49 (30) 827875274



Register for this webinar

Learning Outcomes

After this webinar you should have a better understanding of:

  • The basic principles of text and data mining

  • The adoption of text and data mining to meet the challenges of searching large-scale digitized material

  • N-gram visualization, topic modeling, and word embedding techniques 

  • The application of data mining techniques to photography and illustrations

Webinar details

Thursday, 6th April 2017

4pm CET / 3pm BST / 10am EDT

Subject level:
Beginner to intermediate

1 hour incl. Q&A

About the presenters

Peter Leonard

Director of the Digital Humanities Lab

Yale University Library

Peter Leonard (@pleonard) is the Director of the Digital Humanities Lab at Yale University Library, where he helps humanities researchers use quantitative and algorithmic techniques. He came to Yale in 2013 as the first Librarian for Digital Humanities Research. Prior to coming to Yale, Peter was responsible for humanities research computing at the University of Chicago and served as a postdoctoral researcher in text-mining at UCLA, supported by a Google Digital Humanities Research Award.

Lindsay King

Associate Director for Access and Research Services, Haas Arts Library

Yale University

Lindsay King (@mslindsayking) is Associate Director for Access and Research Services in the Haas Arts Library at Yale University. She oversees public services--including reference, instruction, outreach, and digital services--supporting students and faculty in art, history of art, architecture, drama, theater studies and dance. Her research interests include art patronage, fashion history, and applications of digital humanities methods in the visual and performing arts.


Evangelos Theodoridis (PhD)

Senior Data Engineer

Springer Nature

Evangelos Theodoridis (PhD) is a senior data engineer at Springer Nature, working in the Knowledge Graph team. His research interests span database technologies, data mining, indexing and information retrieval, and algorithm engineering. Before joining Springer Nature he was a research scientist at Intel Labs Europe and a lecturer in databases and data structures.

About the host

Matt Peck

Account Development Manager

Springer Nature

Matt has worked in scientific publishing for over 7 years. After early career experience in journals product marketing, he spent the past couple of years working in Library Communications, and in April 2016 he was appointed Account Development Manager for the UK & Ireland markets at Springer Nature.

Unable to attend? This session will be recorded and made available for downloading.


For further questions please get in touch with Matt Peck
