Major Data Sources with Restricted Versions

A number of major publicly available social science data sources also have restricted-access, confidential versions that may be obtained under conditions that are generally well codified. Examples include:

American National Election Studies (ANES)

The mission of the American National Election Studies (ANES) is to inform explanations of election outcomes. An ANES study interviews a sample of about 6,000 eligible U.S. voters during one or more interview periods. Most often, the sample is designed to be representative at the national level only, and is not selected to represent states or smaller areas. ANES does, however, produce restricted-access files that provide geographic, occupation, or industry detail.

Bureau of Labor Statistics (BLS)

The Bureau of Labor Statistics (BLS) has an onsite researcher program that gives eligible researchers access to confidential microdata for approved statistical analyses. The restricted geocode files from the National Longitudinal Surveys are also available under other terms outlined below.

Census Bureau Research Data Centers (RDCs)

The Census Bureau's Center for Economic Studies maintains and enhances microdata for research use in approved projects. All researcher access to restricted-use data occurs at the secure Census RDCs (including the one at UC Berkeley), which provide access to a variety of confidential data, including economic data for business establishments and firms (Economic Census, Longitudinal Business Database (LBD), Annual Survey of Manufactures (ASM)); demographic data for individuals and households (Decennial Census, American Community Survey (ACS), Current Population Survey (CPS)); Longitudinal Employer-Household Dynamics (LEHD) data (Employment History Files (EHF), Employer Characteristics File (ECF)); and health data from partnering agencies (Agency for Healthcare Research and Quality (AHRQ) and the National Center for Health Statistics (NCHS)). As of November 2014, the Social Science Data Service partners with the California Census Research Data Center at UC Berkeley to allow UC Davis researchers to use the Berkeley facility free of standard user or laboratory fees. Contact the Data Service staff for additional information.

Centers for Medicare & Medicaid Services (CMS)

The Centers for Medicare & Medicaid Services (CMS) provide a variety of data on health and healthcare utilization. CMS works with ResDAC to provide researchers with access to its data. There are two levels of restricted-access data: Limited Data Sets (LDS) and Research Identifiable Files. Limited Data Set files have been stripped of data elements that might permit identification of beneficiaries. These files contain beneficiary-level health information but exclude specified direct identifiers as outlined in the HIPAA Privacy Rule.

Displaced New Orleans Residents Survey (DNORS)

The Displaced New Orleans Residents Survey (DNORS) examines the current location, well-being, and plans of people who lived in the City of New Orleans when Hurricane Katrina struck on 29 August 2005.  DNORS provides both public use and restricted data.

Fragile Families and Child Wellbeing Study

The Fragile Families and Child Wellbeing Study is following a cohort of nearly 5,000 children born in large U.S. cities between 1998 and 2000 (roughly three-quarters of whom were born to unmarried parents). Unmarried parents and their children are referred to as “fragile families” to underscore that they are families and that they are at greater risk of breaking up and living in poverty than more traditional families. The study releases geographic identifiers and additional data to researchers via restricted-use data agreements. Add-on files available include 1) a geographic file with variables for the focal child's birth city, mother's and father's state of residence at each interview, and stratum and PSU (note: replicate weights are available on the public file in lieu of these), 2) a set of contextual characteristics of the census tract at each wave, 3) medical records data for mothers and children from the birth hospitalization record, and 4) a school characteristics file (for the focal child's school at the Year 9 follow-up wave) based on National Center for Education Statistics data.

General Social Survey (GSS)

The GSS contains a standard 'core' of demographic, behavioral, and attitudinal questions, plus topics of special interest. Many of the core questions have remained unchanged since 1972 to facilitate time-trend studies as well as replication of earlier findings. The GSS takes the pulse of America, and is a unique and valuable resource. It has tracked the opinions of Americans over the last four decades. GSS public use files are described here. The GSS geographic identification code files are made available to researchers under special contract with NORC.

Health and Retirement Study (HRS)

The Health and Retirement Study is a longitudinal panel study that surveys a representative sample of more than 26,000 Americans over the age of 50 every two years. The HRS explores the changes in labor force participation and the health transitions that individuals undergo toward the end of their work lives and in the years that follow. Since its launch in 1992, the study has collected information about income, work, assets, pension plans, health insurance, disability, physical health and functioning, cognitive functioning, and health care expenditures. To receive files for use at UC Davis, researchers must currently be receiving federal grant funds (among other conditions). The Michigan Center on the Demography of Aging (MiCDA) Data Enclave is available to prospective users of restricted data files who do not meet the requirements for restricted data contractual agreements, or to researchers who have special data analysis needs that cannot be met under the terms of a standard restricted data agreement.

Inter-university Consortium for Political and Social Research (ICPSR)

The Inter-university Consortium for Political and Social Research provides access to a number of restricted datasets through the ICPSR Data Access Request System (IDARS).

Los Angeles Family and Neighborhood Survey (L.A.FANS)

The Los Angeles Family and Neighborhood Survey (L.A.FANS) is a study of adults, teens, children, and neighborhoods in Los Angeles County. Its goal is to understand how neighborhoods affect a variety of outcomes, including children’s development and well-being, and stress and health among children and adults. In 2000-2001, 3,000 families in 65 neighborhoods were interviewed as part of Wave 1 of L.A.FANS. For Wave 2, which was fielded in 2006-2008, these families were reinterviewed, along with new families who had moved into these neighborhoods.

National Center for Education Statistics (NCES)

The National Center for Education Statistics provides restricted data licenses for a number of its studies through the Institute of Education Sciences (IES).

National Center for Health Statistics (NCHS)

A number of NCHS datasets contain restricted information that could compromise the confidentiality of survey respondents or institutions, or that is sensitive by nature. There are several modes of access to these data, including use at a Census Bureau RDC or the CDC RDC, as well as staff-assisted or remote access. These modes and their access requirements are described here.

National Death Index (NDI)

The Centers for Disease Control and Prevention (CDC) produce the National Death Index (NDI), a centralized database of death record information on file in state vital statistics offices.

National Longitudinal Study of Adolescent Health (Add Health)

The National Longitudinal Study of Adolescent Health (Add Health) is a longitudinal study of a nationally representative sample of adolescents in grades 7-12 in the United States during the 1994-1995 school year. The Add Health study gathered data from a sample of 80 high schools and 52 middle schools representative of US schools with respect to region of country, urbanicity, school size, school type, and ethnicity.  The Add Health cohort has been followed into young adulthood with four in-home interviews, the most recent in 2008, when the sample was aged 24-32. Add Health combines longitudinal survey data on respondents' social, economic, psychological and physical well-being with contextual data on the family, neighborhood, community, school, friendships, peer groups, and romantic relationships, providing unique opportunities to study how social environments and behaviors in adolescence are linked to health and achievement outcomes in young adulthood.

National Longitudinal Surveys (NLS) Geocode

The National Longitudinal Surveys are a set of surveys designed to gather information at multiple points in time on the labor market activities and other significant life events of several groups of men and women. For more than four decades, NLS data have served as an important tool for economists, sociologists, and other researchers.

National Science Foundation/National Center for Science and Engineering Statistics (NSF/NCSES)

NCSES collects data on research and development, the science and engineering workforce, U.S. competitiveness in science, engineering, technology, and R&D, and the condition and progress of STEM education in the United States. Under certain conditions, restricted-access microdata files may be obtained under a license agreement for the following NSF/NCSES surveys: National Survey of College Graduates, National Survey of Recent College Graduates, Survey of Doctorate Recipients, Survey of Earned Doctorates, and SESTAT Integrated File.

National Survey of Child and Adolescent Well-Being (NSCAW)

The Administration on Children, Youth, and Families and the Office of the Assistant Secretary for Planning and Evaluation have undertaken the National Survey of Child and Adolescent Well-Being (NSCAW). NSCAW makes available, for the first time, nationally representative longitudinal data drawn from first-hand reports of children and families or other caregivers who have had contact with the child welfare system. The target population for the NSCAW includes all children and families that enter the child welfare system. Two samples were drawn from the population in 92 participating county child welfare agencies throughout the nation. The NSCAW data at both levels of release have been extensively analyzed to determine which items of information, used alone or in conjunction with other variables, have significant disclosure potential. Variables that were found to pose significant risk of reidentification were suppressed or altered to remove or reduce such risks. The restricted release data are more complete and have been only minimally altered through suppression and recoding.

New Immigrant Survey (NIS)

The New Immigrant Survey (NIS) is a nationally representative, multi-cohort longitudinal study of new legal immigrants to the United States and their children. It is based on samples of the administrative records, compiled by the U.S. Immigration and Naturalization Service (INS), pertaining to immigrants newly admitted to permanent residence. Restricted-use datasets are available at two levels of data.

Panel Study of Income Dynamics (PSID)

The Panel Study of Income Dynamics is the longest-running longitudinal household survey in the world. The study began in 1968 with a nationally representative sample of over 18,000 individuals living in 5,000 families in the United States. Information on these individuals and their descendants has been collected continuously, including data covering employment, income, wealth, expenditures, health, marriage, childbearing, child development, philanthropy, education, and numerous other topics. The original sample has also been augmented since 1968 to enhance representativeness.