

The system is trained on pseudo-summaries generated from the content structure of the WikiHow website, where people often rework popular instructional videos into a flatter, text-based multimedia form, frequently using short clips and animated GIFs taken from the source instructional videos.

Discussing the project’s use of WikiHow summaries as a source of ground-truth data for the system, the authors state:

‘Each article on the WikiHow Videos website consists of a main instructional video demonstrating a task that often includes promotional content, clips of the instructor speaking to the camera with no visual information of the task, and steps that are not crucial for performing the task.

‘Viewers who want an overview of the task would prefer a shorter video without all of the aforementioned irrelevant information. The WikiHow articles (e.g., see How to Make Sushi Rice) contain exactly this: corresponding text that contains all the important steps in the video listed with accompanying images/clips illustrating the various steps in the task.’

The resulting database from this web-scraping is called WikiHow Summaries. It consists of 2,106 input videos and their related summaries.
