Canterbury Statistics Portal

Statistics events of interest to the Canterbury region.

Modern Regression Techniques in R Workshop

John Maindonald and Maheswaran Rohan (ANU and AUT)

Starts: 09:00 AM, Nov 23, 2015
Ends: 05:00 PM, Nov 23, 2015


“Modern regression” is a name for methodologies that allow the automated fitting of curves and surfaces. “Statistical learning” is another commonly used name. Where the conditions for their use are close enough to being satisfied, these methods can be highly effective. The implications of failure of the independence assumption, as with time series and other correlated data, will be noted. The methodology can provide a useful check on theoretically based or other specific assumed forms of regression response. The methodology is also modern in its use of re-sampling methods: cross-validation, repeated simulation, and (although this may be beyond the scope of a one-day workshop) bootstrap sampling. These tools have an important role in avoiding over-fitting, and in avoiding the over-optimistic assessments of accuracy that can result from variable and/or model selection. Hybrid methods, for example combining random forests with regression smoothing, will be noted.
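
As a rough illustration of how cross-validation guards against over-fitting, the following base R sketch compares polynomial fits of increasing degree on simulated data. It is not taken from the workshop material: the simulated data, the function name cv_mse, and the use of polynomial lm() fits as a stand-in for a smoother are all assumptions made for this example.

```r
set.seed(1)                                   # make the simulated data reproducible
n <- 200
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, sd = 0.3)              # curved signal plus noise (assumed)
dat <- data.frame(x = x, y = y)

# k-fold cross-validated mean squared error for a polynomial of a given degree.
# (cv_mse is a hypothetical helper written for this sketch.)
cv_mse <- function(data, degree, k = 10) {
  # Randomly assign each row to one of k folds
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  sq_err <- numeric(nrow(data))
  for (i in seq_len(k)) {
    held_out <- fold == i
    # Fit on the other k - 1 folds, predict on the held-out fold
    fit <- lm(y ~ poly(x, degree), data = data[!held_out, ])
    pred <- predict(fit, newdata = data[held_out, ])
    sq_err[held_out] <- (data$y[held_out] - pred)^2
  }
  mean(sq_err)
}

degrees <- 1:8
cv <- sapply(degrees, function(d) cv_mse(dat, d))
print(data.frame(degree = degrees, cv_mse = cv))
```

Because every prediction is made for observations that were left out of the fit, the cross-validated error does not automatically decrease as the model becomes more flexible, unlike the error measured on the data used for fitting.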

The workshop will start by reviewing “traditional” regression methodology. Techniques will be demonstrated for gaining insightful two- or three-dimensional views of data where there may be many variables. Graphs will be used extensively to display and give insight into results.

Modern regression enhances and supplements, rather than replaces, methodology from a more conventional tradition of data analysis. The workshop will aim to give a sense of where the new tools fit in this larger context. It should also help participants avoid repeating mistakes made in some recent papers whose claims have attracted media attention. (One such recent claim has been that US Atlantic hurricanes with female names, because they are taken less seriously, have been more dangerous to human life than those with male names!)

The course will use a “learning by doing” style of teaching. Most use of R will be from the command line, using the attractive RStudio “interactive display environment” to manage, organize and record work. Wherever possible, graphical displays will be used to help in the interpretation of results. A side theme will be the use of RStudio’s abilities for keeping a log of calculations and, optionally, output, as a first step in the use of RStudio’s abilities for reproducible reporting.

The presenters

John Maindonald has had wide experience as a quantitative problem solver, working with researchers in diverse areas of science and industry. In 1996 he moved from NZ to Australia, and in 1998 he took up a position at The Australian National University (ANU). He is the author of a book on Statistical Computation, and the senior author of “Data Analysis and Graphics Using R” (3rd edition, CUP 2010). Now in semi-retirement, he does occasional consulting, fronts workshops on the R system, and continues to write. He has recently moved back to NZ, to live in Wellington.

Dr Maheswaran Rohan (Rohan) joined the academic staff of the Department of Biostatistics and Epidemiology, Auckland University of Technology (AUT), in 2014. Prior to joining AUT, Rohan was a statistician with the Department of Conservation in Hamilton, New Zealand. He developed a unified framework for computing robust statistics for the parameters of various statistical models. In addition, his applied research involves medical and environmental research and statistics education, especially how to teach statistics to non-statisticians.

Information for Attendees

It is strongly recommended that you bring your own laptop to the workshop. This allows you to take away all the code, data and worked examples from the workshop. Wi-Fi facilities will be available. If you are not bringing a laptop, you need to inform the NZSA conference organisers no later than November 16 so that they can organise a user ID for you.

The laptop you bring should have R version 3.2.2 (or later) and RStudio installed. The R packages DAAG, ggplot2, knitr, gamclass, rgl, scatterplot3d and gamlss should be installed, plus dependencies. For further details, see the Links below; the linked material notes further packages that it may be useful to install, and gives instructions on preparation for the course. If you have any problems with this, we will have people available from 8:30 to 9:00 AM on the day of the workshop to assist.
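
One way to check a laptop against the requirements above is sketched below in R. The package names and the minimum R version come from this announcement; the helper name missing_packages is made up for this example, and the sketch assumes a CRAN mirror is reachable when the install step is run.

```r
# Packages listed in the workshop announcement
workshop_packages <- c("DAAG", "ggplot2", "knitr", "gamclass", "rgl",
                       "scatterplot3d", "gamlss")

# Warn if the running version of R is older than the requested 3.2.2
if (getRversion() < "3.2.2") {
  warning("R 3.2.2 or later is requested; this is R ",
          as.character(getRversion()))
}

# Return the required packages that are not yet installed.
# (missing_packages is a hypothetical helper written for this sketch.)
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

to_install <- missing_packages(workshop_packages)
print(to_install)

# Uncomment to install whatever is missing. By default install.packages()
# also installs the packages that these ones depend on.
# if (length(to_install) > 0) install.packages(to_install)
```

The install line is left commented out so that the check can be run first without changing anything on the machine.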

It is important that anyone with limited or no previous exposure to R work through the preparatory notes and exercises. The first hour of the workshop, 9am-10am, will quickly review Chapters 1-4 from the preparatory notes and briefly cover the exercises.

This workshop is being provided as part of the NZSA conference in Christchurch, but you do not need to attend the conference to go to this workshop.

Morning tea, lunch and afternoon tea will be provided.

Workshop Fees

Regular

Attending NZSA conference $150
If not attending NZSA conference $250

Student

Attending NZSA conference $50
If not attending NZSA conference $100

If you are not attending the NZSA conference, register for this workshop by filling in the form at https://form.jotform.co/52780952912865.

You do not need to register for the NZSA conference to attend this workshop, but you will get a discount if you do. To register for both the NZSA conference and this workshop, go to the NZSA conference website, https://secure.orsnz.org.nz/conf49/index.php. If you are a student presenting at the NZSA conference, we will refund your fee after you have given your presentation and attended one of the conference workshops, but you still need to register and pay for both.

For further information, email NZSA.2015.conference@gmail.com

Location

University of Canterbury

Basement of Erskine Building, Erskine 035/036
Engineering Road
University of Canterbury
