Published June 28, 2021 | Version v1
Other Open

Project Production Files for Collection 21

Description

This is an archive of the WE1S project folder from which Collection 21 is derived. WE1S "projects" are folders that contain all notebooks, publicly available data, topic models and other analyses, and visualizations associated with the project. WE1S "collections" contain just the data and visualizations produced in project folders and may contain code updated for public presentation. This project folder contains a snapshot of the work that had been done on the project at the time of archiving. Some code may not be the latest code, and there may be examples of test code that was run after the main workflow.

WE1S makes available only "non-consumptive use" word frequency, topic model, and other datasets along with their visualizations. Datasets cannot be used to access, read, or reconstruct the original texts. Where such content cannot be published because of intellectual property restrictions, it has been deleted from project data files. However, tables of features (tokens and annotations) are retained.

A record of our published data and visualizations for Collection 21 is available at https://zenodo.org/record/4927745.

Notes

The archive file can be extracted with a variety of third-party tools. For convenience, we also provide a Python script that performs the extraction. To use it, change (cd) to the folder where you downloaded the script and run: python extract.py path/to/archive/file. For further options, run: python extract.py --help
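For readers who prefer not to use the provided script, extraction with Python's standard library might look like the sketch below. This is not the project's extract.py, whose contents are not shown in this record. It assumes the archive is in a format that shutil.unpack_archive can recognize from the file extension (for example .zip or .tar.gz), which the record does not state. The function name extract_archive and any paths passed to it are hypothetical.

```python
import shutil
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack an archive into dest_dir and return the top-level entry names.

    Assumption: the archive format is one that shutil.unpack_archive
    can infer from the file extension. This is a generic stand-in,
    not the extract.py script distributed with the record.
    """
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive_path), str(dest))
    # List what landed at the top level of the destination folder.
    return sorted(p.name for p in dest.iterdir())
```

Calling extract_archive("path/to/archive/file", "extracted") would unpack into a folder named extracted, provided the format assumption holds. Only unpack archives from a source you trust, since archive members can carry unexpected paths.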

Files

Files (1.6 GB)

md5:d8b1918e0dc0e48d72a1dcd16fb9938f (1.6 GB)
md5:97b9fbf27b919bbccce35681e8aa7a17 (1.4 kB)
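The md5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's hashlib. The function names md5_of_file and matches_md5 are hypothetical, and pairing the first checksum with the archive is an assumption based on its 1.6 GB size, since the file names do not appear in this listing.

```python
import hashlib

# Checksum listed for the 1.6 GB file, assumed here to be the archive.
ARCHIVE_MD5 = "d8b1918e0dc0e48d72a1dcd16fb9938f"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a multi-gigabyte archive is never held in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected):
    """Compare a file's MD5 digest with an expected value.

    Accepts the value with or without the "md5:" prefix used in the
    file listing, in any letter case.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of_file(path) == expected
```

For example, matches_md5("path/to/archive/file", ARCHIVE_MD5) should return True for an intact download if the pairing assumption is right. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.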

Additional details

Related works

Is source of
10.5281/zenodo.4927745 (DOI)