My lecture notes on R

and some rants about teaching the course
R
lecture notes
teaching
Author

Kaan Öztürk

Published

December 12, 2024

Between 2017 and 2022, I was teaching two mass courses on introductory programming at Boğaziçi University. One of them was CMPE140, R programming for economics and management students. I converted my lecture notes for this course to a book-like HTML format using Quarto. They are now available on my website.

When I took it up, I did not know R, even though I had been teaching C and Python for many years. Of course, the basics of all structured programming languages are alike, so you can pick them up pretty easily: variable declarations, decisions, loops, and subroutines take you most of the way. For a course designed to be an introduction to programming, these are the main teaching objectives anyway.

The course teaches only base R, without any add-on packages. This is by design. Modern R users rely on data science packages such as dplyr, ggplot2, and the rest of the tidyverse. But the course’s scope was limited to teaching algorithmic thinking and basic programming, so these tools were left out. However, I added some R-specific topics such as data visualization and even linear regression. I wanted the students to be able to use their R skills in later courses, and even improve them.

Roughly half of the notes are on general programming concepts, presented in R, and the rest on R-specific topics. You might find the order of lectures a bit odd (e.g., plotting before functions, functions before if statements). Sometimes we shifted some topics up, so that we could design programming questions about them in exams and assignments.

Before the course was passed to me, grading was based on multiple-choice exams. However, I believed that programming skills should be tested by having students write actual programs. Luckily, our department had a very secure lab system, which was used extensively for both teaching and testing. I adopted that system for CMPE140, too.

Our quizzes and other exams were actual programming problems. Students were free to use RStudio to write and execute their programs, and check the results against test cases we provided. Once they were confident, they would submit their solutions. There was no internet access, but they could access the inline help system.

The upside of this method is that students could use trial and error to get to the right result, guided by the computer. The downside is that the time constraint would cause stress for some students. If I were to give the course again, I would diversify grading by including some multiple-choice quizzes.

Grading was automated, using scripts that ran the code and compared the output with the expected answer. So we had a binary score for every sub-problem. This method unfortunately prevented partial credit for “going down the right path” without reaching the correct result. But with hundreds of students, manual grading was infeasible. We tried to subdivide tasks as much as possible, in order to give as much credit as possible.
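The run-and-compare idea can be sketched in a few lines. This is an illustrative Python sketch, not the scripts actually used in the course, which are not described here. The names `run_submission`, `normalize`, and `grade`, the 10-second timeout, and the whitespace-tolerant comparison are all assumptions.

```python
import subprocess
import sys


def run_submission(command, stdin_text="", timeout=10):
    """Run one submission; return its standard output, or None if it fails."""
    try:
        result = subprocess.run(
            command,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout,  # assumed limit, guards against infinite loops
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    if result.returncode != 0:
        return None
    return result.stdout


def normalize(text):
    """Drop trailing whitespace so cosmetic differences do not cost points."""
    return [line.rstrip() for line in text.strip().splitlines()]


def grade(command, cases):
    """Return one binary score (1 or 0) per test case, i.e. per sub-problem."""
    scores = []
    for stdin_text, expected in cases:
        output = run_submission(command, stdin_text)
        passed = output is not None and normalize(output) == normalize(expected)
        scores.append(int(passed))
    return scores


if __name__ == "__main__":
    # Stand-in submission so the sketch runs without R installed. A real run
    # would use something like ["Rscript", "solution.R"] (hypothetical file name).
    demo = [sys.executable, "-c", "print(int(input()) * 2)"]
    print(grade(demo, [("3\n", "6"), ("5\n", "11")]))
```

Each case yields exactly 1 or 0, which is why finely subdivided tasks are the only way to approximate partial credit under this scheme.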

Covid-19 forced us to change our ways. I could not possibly administer exams in a cramped computer lab. Grades became based on take-home assignments only. These were more challenging than exam problems, but I gave the students a full weekend to work on them.

To check for cheating, we used the Moss system. I was pleased to see that copying, though present, was not rampant. We took action in the cases we detected.

My teaching ended in Spring of 2022, when the appointed rector started to wreak havoc at Boğaziçi University. By that time, I had switched to part-time teaching. He did not renew my contract, against the wishes of the department, mere days before the start of the semester. Another lecturer had to take over the class with a very short preparation period.

If I had continued, I certainly would have had to change the delivery of the course, due to the wide availability of LLMs. I gave some of my assignments to ChatGPT, and its answers were almost exactly the same as mine. Obviously I could no longer rely on take-home assignments, even with similarity checks. I would have returned to in-class programming exams. I would also have integrated LLMs into the course itself, discussing things like their strong and weak points, or how to check the correctness of an LLM’s answer.

Some words on my tech stack: I had prepared the notes as Jupyter notebooks, which allowed me to integrate verbal descriptions with actual R code cells. During class, I would run the cells one by one, discuss the outcome, change the cell content to illustrate some point, run again, discuss again, etc.

With the RISE extension, I could present the notebook as a slide deck, displaying one or two cells at a time. It was not a static slideshow; I could run the cells in real time.

I uploaded my set of lecture notebooks to a GitHub repository, which also housed lab and problem-solving documents. We configured the repo to use the Binder service. With Binder, students could run the notebooks in the cloud, without having to download them or set up a Jupyter environment (we were endorsing RStudio instead).

Having all the notes in Jupyter notebook form had another benefit: The Quarto publishing system can directly work with them. I had already prepared this blog with Quarto, without having any HTML or web design knowledge.

I converted all ipynb files to qmd (Quarto markdown) files with quarto convert, because I had some trouble re-establishing an R kernel with Jupyter notebook in the same virtual environment. The Quarto render engine would then run the R code blocks automatically, embedding the output in the document.
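For a whole folder of notebooks, the conversion step could be batched roughly as follows. This is a hedged Python sketch rather than the procedure actually used. The helper names `convert_commands` and `convert_all`, the `dry_run` flag, and the folder name `lectures` are hypothetical. It assumes the `quarto` CLI is on the PATH and relies on `quarto convert <file>.ipynb`, which by default writes a `.qmd` file next to the notebook.

```python
import subprocess
from pathlib import Path


def convert_commands(folder):
    """Build one `quarto convert` command per notebook found under `folder`."""
    notebooks = sorted(Path(folder).rglob("*.ipynb"))
    # Skip Jupyter's autosave copies kept in .ipynb_checkpoints directories.
    notebooks = [nb for nb in notebooks if ".ipynb_checkpoints" not in nb.parts]
    return [["quarto", "convert", str(nb)] for nb in notebooks]


def convert_all(folder, dry_run=True):
    """Print the commands (dry run) or execute them; return the command list."""
    commands = convert_commands(folder)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first notebook that fails to convert.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # "lectures" is a hypothetical folder name; the dry run only lists commands.
    convert_all("lectures", dry_run=True)
```

Running it with `dry_run=True` first makes it easy to confirm which notebooks would be touched before any `.qmd` files are written.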

I did some editing on the notes, but it was either minimal formatting for Quarto or adding more explanations about the lecture (I gave the explanations verbally in class, so the written notes had some gaps).

Citation

For attribution, please cite this work as:
Öztürk, Kaan. 2024. “My Lecture Notes on R.” December 12, 2024. https://mkozturk.com/posts/en/2024/r-lecture-notes/.