Technical Advice

Follow this advice.

R

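  • When using download.file(), set mode = "wb". This writes the file in binary mode, which keeps binary files (zip archives, Excel workbooks, .rds files) from being corrupted on Windows, so the download will work on all platforms.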

  • When loading libraries at the start of an Rmd file, load tidyverse last. This decreases the chance of confusing name conflicts, like getting count() from the plyr package rather than from the dplyr package, which is almost certainly the version you want. When two loaded packages export a function with the same name, the package loaded later wins. dplyr is part of the tidyverse so, by loading the tidyverse last, you ensure that dplyr takes precedence over any other package with identically named functions.

Rmd

  • End a line with two spaces to force a line break.

  • You can add tabs to your RMarkdown document (html output only) by adding {.tabset} to a header. Each sub-header under that header then appears in its own tab rather than as a separate section.

  • An empty code chunk with the header {r ref.label=knitr::all_labels(), echo=TRUE, eval=FALSE} prints all the code from the Rmd file in which it lives to the pdf/html output. I use this for the problem set and exam solutions.

  • code_download: true added below html_document: in the YAML header will produce a Code button for downloading the underlying Rmd file.

Git

  • If you have git problems, your first stop is Happy Git and GitHub for the useR by Jenny Bryan.

  • If you are using Windows 10, it may be helpful to install GitHub Desktop.

  • Never check in your .Rproj file.

  • Always check in your .gitignore file. I always add *.Rproj to my .gitignore file so that Git doesn’t keep bothering me about the .Rproj file.

  • Read your git error messages. They often tell you what to do!

  • You often need to “pull” before you can “push” your code. Pulling and then pushing is a good default workflow.

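  • If you commit something by mistake, you can undo the commit, as long as you have not pushed it yet, by typing from the command line: git reset HEAD~. This removes the most recent commit from the history but leaves your files as they are, so no work is lost.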
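
Here is a minimal sketch of two of the tips above, ignoring *.Rproj and undoing a mistaken commit, run in a throwaway repository. The directory, file names and commit messages are made up for illustration, and everything happens inside a temporary directory so your real projects are untouched.

```shell
#!/bin/sh
# Throwaway demo: all work happens in a fresh temporary directory.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q project
cd project

# Local identity so commits work even without a global git config.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Ignore RStudio project files, and check in the .gitignore itself.
touch project.Rproj analysis.Rmd
echo "*.Rproj" > .gitignore
git add .gitignore analysis.Rmd
git commit -q -m "Add analysis and .gitignore"
git status --short            # prints nothing: project.Rproj is ignored

# Commit something by mistake ...
echo "scratch" > notes.txt
git add notes.txt
git commit -q -m "Oops"

# ... and undo the commit before pushing. The file stays on disk.
git reset -q HEAD~
git --no-pager log --oneline  # only the first commit remains
git status --short            # shows notes.txt as untracked (??)
```

Note that git reset HEAD~ only rewrites your local history, which is why it is safe before a push and a bad idea after one.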

GitHub

  • Keep your collection of repos clean. You can delete repos that you are no longer using or no longer want to keep. For a specific repo, look under “Settings” for instructions.

  • These steps will often solve a weird GitHub problem, especially in the case of a problem set or exam:

    • Clone the repository from GitHub.com into a new local directory (same way we always do it, but just save it somewhere other than where you are currently working)
    • You now have whatever the latest version is of your project on GitHub. Copy over any local changes since your last successful push — from the old local directory you were working in — to the new local directory you just created.
    • Add, commit, push. Use this new local directory from now on. Delete the bad directory from your computer to avoid confusion.
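
The three steps above can be sketched as a small shell function. This is only an illustration: the function name, its arguments and the commit message are hypothetical, and it assumes you know which files you changed since your last successful push. It deliberately does not delete the old directory; do that by hand once you are sure the push worked.

```shell
#!/bin/sh
# Sketch of the fresh-clone recovery. Hypothetical helper, not a git command.
#   $1      repository URL, as shown on GitHub.com
#   $2      old (bad) local directory
#   $3      new local directory to create
#   $4 ...  files you changed, as paths relative to the old directory
fresh_clone_recover() {
  repo_url=$1
  old_dir=$2
  new_dir=$3
  shift 3

  # 1. Clone the latest version from GitHub into a new local directory.
  git clone -q "$repo_url" "$new_dir" || return 1

  # 2. Copy each changed file from the old directory into the new clone.
  for f in "$@"; do
    mkdir -p "$new_dir/$(dirname "$f")" || return 1
    cp "$old_dir/$f" "$new_dir/$f" || return 1
  done

  # 3. Add, commit, push from the new directory (in a subshell, so the
  #    caller's working directory does not change).
  (
    cd "$new_dir" &&
    git add -A &&
    git commit -q -m "Recover local changes from old directory" &&
    git push -q
  )
}

# Example call (placeholder URL and paths):
#   fresh_clone_recover https://github.com/USER/REPO.git ~/old/REPO ~/new/REPO ps_1.Rmd
```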

RStudio

  • Under Tools -> Global Options -> General, set the “Save workspace to .RData on exit:” to “Never”.

  • Reflow comment lines with Shift+CMD+/

  • Under Tools -> Global Options -> Code -> Saving, set the “Default text encoding:” to “UTF-8”. This is especially important for Windows users from non-English locales.

  • Under Tools -> Global Options -> Code -> Saving, check the box for Ensure Source Files end with a Newline.

  • Reformat code with Shift+CMD+A

DataCamp

  • If you have trouble finding a course directly from DataCamp, go to the syllabus and click on the link that we provide for each course. This is often the easiest way of starting.

  • Clicking “Start Course for Free” may not work, but “Continue Chapter” almost always will.

  • If DataCamp is behaving strangely, then restart your browser. This solves most problems. If that does not work, restart your computer. If that does not work, use a different browser. Chrome seems to work best.

  • It is fine to use several “Take Hint” and a few “Show Answer” options each chapter. We never want you to get stuck.

  • If weird things start happening — especially a failure of DataCamp to credit the right solution — try restarting your browser. You won’t lose any work. Sometimes, DataCamp just needs to reset itself.

  • If, despite choosing “Show Answer”, DataCamp refuses to give you credit, don’t worry! Just skip that question and do everything else. DataCamp will still think you have not completed the course — because of that one question — but that is OK. Just e-mail a teaching fellow when the assignment is due and tell them about the difficulty. They will give you full credit and not charge you a late day.

David Kane
Data Scientist