Q Guide and Gov 1005

I occasionally claim that Gov 1005 has “among the highest Q scores for any lecture course at Harvard.” True or false?


The Q Guide is the official source of student evaluations for Harvard classes. These evaluations, often termed “Q scores,” are available to members of the Harvard community. Almost every year, two or three students from CS 50 will scrape these scores and create a public project to help students choose classes. Last year, Benny Chang ’22 and Blake Young ’22 created QGuide+. Check it out! Benny kindly shared the underlying data, which covers academic year 2018/2019. This data is a bit of a mess, which is not Benny’s fault! Scraping is hard, the Q website is buggy and the underlying information may not be accurate. But we work with what we have!


After cleaning the data, removing Graduate courses and keeping only courses listed as “Lecture,” we have 802 courses, which seems plausible. Here are the 10 with the largest enrollment:

| name | course | department | overall | workload | number |
|---|---|---|---|---|---|
| Introduction to Computer Science | COMPSCI 50 | Computer Science | 3.7 | 11.2 | 633 |
| The Ancient Greek Hero | GENED 1074 | General Education | 4.2 | 2.6 | 550 |
| Introduction to Probability | STAT 110 | Statistics | 4.2 | 10.2 | 454 |
| Multivariable Calculus | MATH 21A | Mathematics | 3.7 | 8.2 | 317 |
| An Integrated Introduction to the Life Sciences: Chemistry, Molecular Biology, and Cell Biology | LIFESCI 1A | Molecular & Cellular Biology | 3.5 | 7.6 | 310 |
| Introduction to Quantitative Methods for Economics | STAT 104 | Statistics | 3.3 | 6.6 | 277 |
| Intermediate Macroeconomics | ECON 1010B | Economics | 3.7 | 4.9 | 257 |
| Medical Ethics and History | GENED 1116 | General Education | 4.0 | 4.5 | 244 |
| Principles of Organic Chemistry | CHEM 17 | Chemistry & Chemical Biology | 3.0 | 9.8 | 211 |
| Intermediate Microeconomics | ECON 1010A | Economics | 3.4 | 7.1 | 209 |

number is the number of students submitting evaluations, which is, on average, about 80% of the actual enrollment. overall is the average overall student evaluation of the course, on a 1 to 5 scale. workload is the average reported hours per week, generally understood not to include class time.
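As a rough back-of-the-envelope sketch, the 80% figure can be turned around to estimate actual enrollment from number. The helper name `estimate_enrollment` and the assumption that every course sits at the average response rate are mine, not part of the original analysis; real response rates vary by course.

```python
# Sketch: back out approximate enrollment from the number of evaluations.
# ASSUMPTION: each course has the ~80% average response rate quoted in the post.
RESPONSE_RATE = 0.80


def estimate_enrollment(evaluations: int, response_rate: float = RESPONSE_RATE) -> int:
    """Return a rough enrollment estimate, rounded to the nearest student."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return round(evaluations / response_rate)


# Evaluation counts (the `number` column) taken from the table above.
evaluations = {"COMPSCI 50": 633, "MATH 21A": 317, "STAT 104": 277}
estimates = {course: estimate_enrollment(n) for course, n in evaluations.items()}
print(estimates)
```

These are estimates only, and they do nothing to address the missing-course questions raised below.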

This is not obviously wrong, but Harvard students would immediately note some problems: Where is EC 10, always one of the largest courses? Are these just fall courses? How are big classes with many sections, like Expos, handled? Ignore those issues for now, although they are all real enough.

Lecture Course

What exactly is a “lecture course”? Good question! Although we have restricted our sample to courses officially labelled as “Lecture,” many of these are small discussion groups, including language classes. Let’s remove any language class, any class which “really” meets almost only in small sections (like Expos) and all classes with fewer than 25 students. Ordered by evaluation, this gives us:

| name | course | department | overall | workload | number |
|---|---|---|---|---|---|
| Conflict Resolution in a Divided World | GENED 1033 | General Education | 4.9 | 3.7 | 40 |
| Performing Latinidad | SPANSH 126 | Romance Languages & Lit | 4.8 | 4.0 | 32 |
| Privacy and Technology | COMPSCI 105 | Computer Science | 4.8 | 5.0 | 31 |
| Privacy and Technology | COMPSCI 105 | Computer Science | 4.8 | 5.0 | 31 |
| Systems Security | COMPSCI 263 | Computer Science | 4.8 | 10.6 | 26 |
| Data | GOV 1005 | Government | 4.7 | 11.2 | 35 |
| High and Low in Postwar America | ENGLISH 170A | English | 4.7 | 4.5 | 44 |
| Studies in Algebra and Group Theory | MATH 55A | Mathematics | 4.6 | 20.2 | 38 |
| Government and Politics of Modern Japan | GOV 1270 | Government | 4.6 | 3.3 | 34 |
| Topics in Machine Learning: Batch Reinforcement Learning | COMPSCI 282R | Computer Science | 4.6 | 11.2 | 26 |
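For readers who want to try something similar, here is a minimal Python sketch of the three filters described above (no language classes, no section-only classes, at least 25 students) followed by a sort on overall. It is not the code used for this post. The flags `is_language` and `section_based` are hypothetical, since the post does not say how those courses were identified, and the three `EXAMPLE` rows are made-up placeholders included only to show the filters firing.

```python
from typing import TypedDict


class Course(TypedDict):
    course: str
    overall: float
    workload: float
    number: int           # evaluations submitted, used here as a proxy for class size
    is_language: bool     # hypothetical flag, not a column in the scraped data
    section_based: bool   # hypothetical flag, not a column in the scraped data


MIN_STUDENTS = 25


def lecture_courses(rows: list[Course], min_students: int = MIN_STUDENTS) -> list[Course]:
    """Drop language, section-only and small classes, then sort by overall score."""
    kept = [
        r for r in rows
        if not r["is_language"]
        and not r["section_based"]
        and r["number"] >= min_students
    ]
    # sorted() is stable, so ties on `overall` keep their input order.
    return sorted(kept, key=lambda r: r["overall"], reverse=True)


def row(course, overall, workload, number, is_language=False, section_based=False) -> Course:
    return Course(course=course, overall=overall, workload=workload, number=number,
                  is_language=is_language, section_based=section_based)


rows = [
    # Real rows, copied from the tables in this post.
    row("COMPSCI 50", 3.7, 11.2, 633),
    row("GOV 1005", 4.7, 11.2, 35),
    row("GENED 1033", 4.9, 3.7, 40),
    row("MATH 55A", 4.6, 20.2, 38),
    # Made-up placeholder rows, NOT real courses or real scores.
    row("EXAMPLE 10", 5.0, 5.0, 30, is_language=True),
    row("EXAMPLE 20", 5.0, 5.0, 300, section_based=True),
    row("EXAMPLE 30", 5.0, 5.0, 12),
]

result = lecture_courses(rows)
print([r["course"] for r in result])
```

Note that the size cutoff in this sketch is applied to number, the count of submitted evaluations, so a course with 25 or more enrolled students but a low response rate could still be dropped.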


  1. I think that this table justifies my claim that Gov 1005 has “among the highest Q scores for any lecture course at Harvard.”

  2. Conflict Resolution in a Divided World is an impressive course. If you are a Harvard student, check out the written Q comments. They are stunning. Alas, there is a lottery to get in.

  3. I am not sure if COMPSCI 263 and COMPSCI 282R belong in this list. Aren’t they mostly for graduate students?

  4. I sometimes claim that, adjusted for workload, Gov 1005 has the best student evaluations at Harvard. GENED 1033 and SPANSH 126 have better scores, but they only require about 1/3 as much work as I do. However, MATH 55A (!!!) puts me to shame on that dimension . . .

  5. I am not sure why COMPSCI 105 is listed twice. Again, this is a messy data set which I have not cleaned thoroughly.
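The workload comparison in point 4 and the duplicate in point 5 are both easy to check by hand. The sketch below is one possible way to do it in Python and is not the code behind this post: it drops exact duplicate rows, then expresses each higher-rated course's workload as a fraction of Gov 1005's. The variable and function names are illustrative.

```python
# Rows copied from the table above: (course, overall, workload).
rows = [
    ("GENED 1033", 4.9, 3.7),
    ("SPANSH 126", 4.8, 4.0),
    ("COMPSCI 105", 4.8, 5.0),
    ("COMPSCI 105", 4.8, 5.0),  # duplicate row, exactly as in the table
    ("COMPSCI 263", 4.8, 10.6),
    ("GOV 1005", 4.7, 11.2),
    ("ENGLISH 170A", 4.7, 4.5),
    ("MATH 55A", 4.6, 20.2),
    ("GOV 1270", 4.6, 3.3),
    ("COMPSCI 282R", 4.6, 11.2),
]

# dict.fromkeys keeps the first occurrence of each row and preserves order.
unique_rows = list(dict.fromkeys(rows))

baseline = next(r for r in unique_rows if r[0] == "GOV 1005")


def relative_workload(row, base=baseline):
    """Workload as a fraction of the baseline course's workload."""
    return round(row[2] / base[2], 2)


# Courses rated strictly higher than Gov 1005, with their relative workload.
higher_rated = {r[0]: relative_workload(r) for r in unique_rows if r[1] > baseline[1]}
print(higher_rated)

math55 = next(r for r in unique_rows if r[0] == "MATH 55A")
print("MATH 55A:", relative_workload(math55))
```

On these numbers GENED 1033 and SPANSH 126 come out at roughly a third of Gov 1005's workload, matching point 4, while MATH 55A is close to double. COMPSCI 263 is also rated higher at nearly the same workload, which makes the question in point 3 relevant to the workload-adjusted claim.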

Note that I have not included all the code in this post because I assign similar questions in a problem set.

David Kane
Data Scientist