Program Content

The program will provide a hands-on introduction to data analytics topics ranging from basic SQL and Python coding to building and interpreting machine learning models to tackle policy problems. During the program, we will use a mix of confidential datasets in the course materials.

The training will take place at the main Hyde Park campus of the University of Chicago. 

Communications

We primarily use three media to distribute information and communicate amongst the group: the website, e-mail, and Slack. In general, the instructor team will respond quickly to both e-mail and Slack messages; however, we prefer Slack for technical issues and for sharing snippets of information.

E-mail addresses

Slack

We use Slack extensively, and it is often the best way to get in touch with us. The team is ada-course.slack.com, and by now you should all have received invitations to join (if you have not, please let us know). If you are unfamiliar with Slack, it has different "channels" to help organize conversations. Previous classes have used Slack in various ways (e.g., a channel for Python-specific questions), but the two primary channels we expect to use are (i) the "class-5-uchicago" channel for general class discussion and sharing documents and (ii) the "adrf-tech-support" channel for any technical support in accessing the ADRF.


Pre-course material

Our collaborators at the University of Maryland and the University of Mannheim have prepared some introductory material for you to work through prior to the program. The material is presented in a four-week, short course format; two weeks for SQL and two for Python. We expect the material for each week to take a maximum of 4 hours total to work through. 

Data documentation

Project team list: PDF 

WEEK 1: July 18-20 & 23-24 (all times Central)

July 18 - Program introduction, Kent Laboratory 107, 1020 E 58th Street, Chicago, IL 60637

  • 9:00-10:30 Welcome & Introductions (slides)

  • 10:30-11:00 Break

  • 11:00-12:00 NYU's Administrative Data Research Facility (ADRF) - Security training, data agreements, and brief demo

  • 12:00-1:00 Lunch

  • 1:00-1:30 Hands-on introduction to ADRF

  • 1:30-4:00 Projects and scoping

July 19 - Data exploration & Visualization, 9am-4pm Central at Kent Laboratory 107, 1020 E 58th Street, Chicago, IL 60637

  • 9:00-10:30 Introduction to Databases (slides)

  • 10:30-10:45 Break

  • 10:45-12:00 Hands-on exploration of datasets

  • 12:00-1:00 Lunch

  • 1:00-2:30 Data visualization (lecture)

  • 2:30-4:00 Data visualization (exercises)

July 20 - Record Linkage, 9am-4pm Central at Kent Laboratory 107, 1020 E 58th Street, Chicago, IL 60637

  • 9:00-9:30 Measurement and Description (slides)

  • 9:30-10:45 Record Linkage (lecture)

  • 10:45-11:00 Break

  • 11:00-12:00 Record Linkage (exercises)

  • 12:00-1:00 Lunch

  • 1:00-4:00 Hands-on data exploration in Python

July 23 - Introduction to Machine Learning, 10am-5pm Central at Harris School of Public Policy 289A, 1155 E 60th St, Chicago, IL 60637

  • 10:00-11:30 Introduction to Machine Learning (slides | recording)

  • 11:30-12:00 Break

  • 12:00-1:00 Machine Learning (cont.)

  • 1:00-2:00 Lunch

  • 2:00-4:00 Machine Learning model evaluation

  • 4:00-5:00 Project discussion and data exploration

July 24 - Text Analysis and Network Analysis, 10am-5pm Central at Harris School of Public Policy 289A, 1155 E 60th St, Chicago, IL 60637

  • 10:00-11:30 Introduction to Network Analysis (lecture | recording)

  • 11:30-11:45 Break

  • 11:45-1:00 Network Analysis (exercises)

  • 1:00-2:00 Lunch

  • 2:00-3:45 Text Analysis (lecture | recording)

  • 3:45-5:00 Text Analysis (exercises)

WEEK 2: Sept 5-7 & 10-11 (10am-5pm Central)

Location for week 2: Polsky North - 2nd Floor, 1452 E 53rd St, Chicago, IL 60615 (Washington Park Room)

Sept 5 - Machine Learning Deep Dive (WebEx recording)

  • 10:00 - 10:30: Welcome back, Recap of Week 1, and Goals for this week

  • 10:30 - 11:00: Machine Learning Recap (what we covered in week 1)

    • What is Machine Learning and what can it be used for

    • Types of Machine Learning Methods

    • How to evaluate Machine Learning Methods (methodology and metrics)

  • 11:00 - 1:00: Machine Learning Lecture (Deeper dive into methods)

    • Unsupervised Learning Methods

    • Supervised Learning Methods

  • 1:00 - 2:00: Lunch

  • 2:00 - 3:00: Machine Learning notebook

  • 3:00 - 5:00: Project work

Sept 6 - Inference (WebEx recording)

  • 10:00 - 11:30 Inference lecture (slides)

  • 11:30 - 12:00 Break

  • 12:00 - 1:00 Inference notebook

  • 1:00 - 2:00 Lunch

  • 2:00 - 5:00 Project work

Sept 7 - Machine Learning in Practice

  • 10:00 - 11:00 Group discussion of team projects

  • 11:00 - 11:15 Break

  • 11:15 - 1:00 Machine Learning in Practice

    • FAQs

    • Common issues/mistakes

    • Life after building models

  • 1:00 - 2:00 Lunch

  • 2:00 - 5:00 Project work

Sept 10 - Privacy and Confidentiality (WebEx recording)

  • 10:00 - 11:30 Privacy and confidentiality lecture (slides)

  • 11:30 - 11:45 Break

  • 11:45 - 1:00 ADRF Disclosure review & Export requests

  • 1:00 - 2:00 Lunch

  • 2:00 - 5:00 Project work

  • Happy hour

Sept 11 - Ethics & Interim presentations

  • 10:00 - 11:30 Ethics, Bias, and Fairness in Machine Learning Systems (slides)

  • 11:30 - 11:45 Break

  • 11:45 - 1:00 Project work

  • 1:00 - 2:00 Lunch

  • 2:00 - 3:00 Interim presentations and instructor comments/feedback (~10 minutes total per team)

  • 3:00 - 5:00 Project work

Presentations

Final project presentations will be held remotely over WebEx on Friday, September 28, between 11am and 2pm Central time. Each team will have 20 minutes to present, followed by 10 minutes of Q&A.

Presentation schedule (WebEx recording)

  • 11:00 - 11:30 Team 1: WIA Youth Employment Outcomes (presentation, report)

  • 11:30 - 12:00 Team 2: Firm Survivability in Illinois (presentation, report)

  • 12:00 - 12:30 Team 6: Employer survivability (report)

  • 12:30 - 1:00 Team 4: 1-year survivability of small businesses in Illinois (presentation, report)

  • 1:00 - 1:30 Team 3: Predicting which firms will have high-turnover of low-wage workers (presentation)

  • 1:30 - 2:00 Team 5: Firm survivability across IL Economic Development Regions (presentation, report)

WebEx information

Program participants should have all received an invite from “messenger@webex.com” to be a Panelist during the WebEx Event on Friday, Sep 28. Please feel free to invite your colleagues to watch your presentation via the Attendee link.