reference material
This page has additional information about:
Administrative Data Research Facility (ADRF)
The training program will be taught on the Administrative Data Research Facility (ADRF), a secure computing environment developed by NYU. The ADRF aims to encourage collaboration while preserving data confidentiality, and its design follows the Five Safes framework. For ADRF documentation, please see the ADRF ReadTheDocs.
cheatsheets & tutorials
The cheatsheets and tutorials below were compiled during the 2017 training programs.
R Cheatsheets
Other Resources
Git 1-pager and web tutorial
Python 1-pager from DataCamp and longer version of general Python notes
Pandas:
Drive folder with Python, Pandas, and Matplotlib notes
Python's requests & BeautifulSoup libraries (for web scraping & APIs): link
PostGIS: extensive function cheatsheet, and full function reference
Boundless Intro to PostGIS (great intro exercises)
Spatial analysis in Python:
scikit-learn's algorithm selection cheatsheet
readings
Schnepel, Kevin T. "Good jobs and recidivism." The Economic Journal (2016)
From College To Jobs: Making Sense Of Labor Market Returns To Higher Education
Transforming U.S. Workforce Development Policies for the 21st Century
Temporary Help Agencies and the Advancement Prospects of Low Earners
Record linkage readings
Disclosure limitation & Confidentiality
The modernization of statistical disclosure limitation at the U.S. Census Bureau
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
Implications of Differential Privacy for Census Bureau Data and Research
How Modern Disclosure Avoidance Methods Could Change the Way Statistical Agencies Operate
Text Analysis
Probabilistic Topic Models by Blei
Topic Modeling Workshop by Blei
Text as Data by Gentzkow, Kelly, and Taddy
Spatial analytics readings recommended by Julia Koschinsky
Varied statistics, econometrics, and programming readings recommended by Tim Savage