Notes on the Archive

Welcome to the Library Data Archive

Robert E. Molyneux

This is an archive of data and reports about libraries. Before the broader introduction, I'd like to provide short links to the data available now, to data that are planned, and to documentation and related publications in digital form. A broader discussion of this project follows this section.

The current focus of the work of the Archive is to finish a project I started over 20 years ago: to collect all US government publications dealing with libraries. So, for now, I will start with this particular project. The general discussion of the Archive begins with a statement of the goals of the Archive.

Librarians collect and organize publications from just about every field but, oddly, do not collect, preserve, organize, or, perhaps, even value their own data. Episodically, the US government steps in and surveys, collects, and publishes library data. And when the government's reason for doing that changes, it stops. This pattern is not obvious unless one is inclined to look for old library data. There are exceptions that show librarians are capable of maintaining their data; those exceptions are few but notable.

In any case, I realized that the day would come when all these data, and the publications about libraries based on analysis of the data collected by government agencies, would be lost or at least imperiled. So, when I was at the U.S. National Commission on Libraries and Information Science (NCLIS), I started what has turned out to be this Archive. And it appears that day is now.

The government agencies involved in this collection currently are the National Center for Education Statistics and the Institute of Museum and Library Services, although a number of other agencies have been involved over the years. I have collected any such document by any such agency published in any year. What I have is here. The NCES-IMLS portion of the collection is, say, fairly systematic but the rest is opportunistic. I did not go looking for the early publications. I had many other things to do.

For more on those two agencies, see the Introduction to the NCES/IMLS Data.

The reader will note that there are also data on other types of libraries and data collected by sources other than US government agencies. I have more, so I will add them as I have time.

Comments and suggestions are always welcome. I have done this without an editor and everyone needs an editor. For now, write me at drdata@librarydataarchive.com.

Links to library datasets, documentation, and reports

Data available now (all from US Government sources)
Source of data	Type of library	Link to publications	Years	Notes
NCES-IMLS	A collection of publications summarizing and reporting on characteristics of libraries from results of the various surveys published by NCES-IMLS about the US Public Library Data. There are also publications about the history of these data series.	Reports	FY 1983-FY2022	This is a large collection of publications. Note that currently, the earliest publication is about the 1977-78 US public library data collection. There are many such early publications. Collecting them and converting them to a digital format will be a formidable task. Open source.
NCES-IMLS	US Public Library Data Survey Documentation	Documentation	FY 1987-FY2022	The annual publication of these data comprises three series: the State Summary/State Characteristics file [State library data], the Administrative Entity file [public library data], and the Outlet file [branch libraries]. The documentation is for all three of those files. There is no longitudinal file for the outlet data. The documentation for the two longitudinal files, PLDF3 and PUSUM, is at the links below. These files used the NCES-IMLS data but rearranged them, so they are not, strictly speaking, government publications. Open source.
NCES-IMLS	US public library data--raw data files	Annual data	FY 1973-FY2022	These data go back to FY1973, so they also predate the FSCS era. This earlier series was the LIBGIS (Library General Information Survey). Open source.
NCES-IMLS	State Library Agencies Survey/State Library Administrative Agency Survey	Main page	1994-2022	Open source.
NCES	Academic Library Statistics	ALS	1970-1971 through FY 2012	Early years are from the Higher Education General Information Survey (HEGIS). Open source.
NCES	School Library/Media Center publications	SLMC	1974 through 2013	Early years are from the Library General Information Survey (LIBGIS). Open source.
NCES	Federal Libraries and Information Centers	FLIC	1994	Open source.

Datasets derived from NCES/IMLS data
Source of data	Type of library	Link to publications	Years	Notes
NCES-IMLS	US Public Library Data	PLDF3	FY 1987-FY2020	This is a longitudinal recompilation of the annual Administrative Entity US public library files from NCES-IMLS. This is a large file. Open source.
NCES-IMLS	US State Summary Public Library Data	PUSUM	FY 1992-FY2020	This is a longitudinal recompilation of the annual State Summary/State Characteristics public library files from NCES-IMLS. Open source.
NCES-IMLS	Benchmarking Tables	Benchmarking Tables are under construction	FY 1992-FY2020	There seem to be 27 tables for each of the years 2018-2022. The original source is the IMLS site. The files are in .csv format but organized differently from the way they were published by IMLS. Open source.

These data are library data or analysis collected and published by either the National Center for Education Statistics or the Institute of Museum and Library Services. For more on those two agencies, see: NCES-IMLS introduction
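
For readers unfamiliar with the term, a rough sketch of what a "longitudinal recompilation" involves may help: separate annual files are stacked into one series, with each record tagged by its year. The Python below is only an illustration under stated assumptions. It is not how PLDF3 or PUSUM were actually built, and the column names (LIBRARY_ID, VISITS, FISCAL_YEAR), the function name, and the sample values are all hypothetical.

```python
import csv
import io


def stack_annual_files(annual_files):
    """Combine per-year CSV texts into one longitudinal list of rows.

    annual_files maps a fiscal year (int) to the CSV text for that year.
    Each output row is a dict of the original columns plus a
    FISCAL_YEAR column (a hypothetical name chosen for this sketch).
    """
    rows = []
    for year in sorted(annual_files):  # oldest year first
        reader = csv.DictReader(io.StringIO(annual_files[year]))
        for record in reader:
            row = dict(record)
            row["FISCAL_YEAR"] = year  # tag each record with its year
            rows.append(row)
    return rows


# Hypothetical sample data: two tiny "annual files" held in memory.
sample = {
    2020: "LIBRARY_ID,VISITS\nAA0001,1200\nAA0002,800\n",
    2019: "LIBRARY_ID,VISITS\nAA0001,1500\n",
}

longitudinal = stack_annual_files(sample)
for row in longitudinal:
    print(row["FISCAL_YEAR"], row["LIBRARY_ID"], row["VISITS"])
```

With real annual files, the same library would need a stable identifier across years for the stacked series to be useful, and column definitions that change from year to year would have to be reconciled by hand against the documentation.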

These two sets of data are from non-US government sources and are available at the links below.
Source of data	Type of library	Filename and link	Years	Notes
Princeton Compilation	Data from the Princeton Compilation	Princeton	[Academic years] 1919/20-1943/44	I keyed these data when I was working on The Gerould Statistics. Open source.
Purdue Data	Academic library data series of 58 academic libraries	Purdue	from 1951	Are these the actual Purdue data? I believe so but it is a tangled web discussed at the link. Open source.

The following are not open source because I did them for the agencies listed. They are “works for hire.” I would need permission of those agencies to distribute the data. These were based on the infrastructure of the Stubbs-Buxton Cumulated ARL University Library Statistics.

Works for hire
Source of data	Who owns the data?	Type of library	Filename and link to ARL Infrastructure	Years	Notes
Gerould Statistics	ARL	Academic library data begun in 1907/08	Gerould background. [ARL infrastructure]	[Academic years] 1919/20-1943/44	I keyed these data when I was working on The Gerould Statistics for the Association of Research Libraries. ARL has produced derivative products and owns these data.
Survey/Compilation	ACRL	Academic: Historically Black Colleges and Universities	HBCU [ARL]	[Academic year] 1988-89	Not a longitudinal series but one of the collections following the ARL structure.
ACRL members	ACRL	ACRL libraries not in ARL	ACRL [ARL]	1978/79-1987/88	Also followed the ARL structure and used the ARL form. Essentially, ARL surveyed (roughly) the largest 100 academic libraries and ACRL surveyed the (roughly) second 100 libraries.
Gerould/ARL	ARL	Research Libraries	Research Library Statistics [ARL]	[Academic years] 1907/08-1987/88	This was a compilation issued in digital formats, with a guide. It was the first time the Gerould and ARL data were joined in one series.

March 26, 2025