About MATH1710

Organisation of MATH1710

This module is MATH1710 Probability and Statistics I. (A small number of second-year scientists are taking this module as the first half of MATH2700 Probability and Statistics for Scientists.)

This module lasts for 11 weeks from 2 October to 15 December 2023. The exam will take place between 15 and 26 January 2024.

The module leader, the lecturer, and the main author of these notes is Dr Matthew Aldridge (you can call me “Matt”, “Matthew”, or “Dr Aldridge”, pronounced “old-ridge”).

Lectures

The main way you will learn new material for this module is by attending lectures. There are two lectures per week. Because this is a very large class each lecture will be delivered twice:

Check your timetable to see which lecture you are assigned to each day (or click this link for a map of how to get to Chemistry West LT F.

I recommend taking your own notes during the lecture. I will put brief summary notes from the lectures on this website, but will not reflect all the details I say and write during the lectures. Lectures will go through material quite quickly and the material may be quite difficult, so it’s likely you’ll want to spend time reading through your notes after the lecture.

You are probably reading the web version of the notes. If you want a PDF copy (to read offline or to print out), it can be downloaded via the top ribbon of the page. (Warning: I have not made as much effort to make the PDF as neat and tidy as I have the web version, and there may be formatting errors.) I am very keen to hear about errors in the notes, mathematical, typographical or otherwise. Please email me if think you may have found any.

Attendance at lectures is compulsory.

Problem Sheets

There will be 5 problem sheets. Each problem sheet has a number of short and long questions, for you to work on in your own time to help you learn the material, and two assessed questions, which you should submit for marking. The assessed questions on each problem sheet make up 3% of your mark on this module, for a total of 15%. Deadlines are 2pm on Mondays, although I’d recommend completing and submitting the work in the previous week.

Problem Sheet Lectures covered Deadline for assessed work
1 1 and 2 Monday 16 October (Week 3)
2 3–6 Monday 30 October (Week 5)
3 7–10 Monday 13 November (Week 7)
4 11–14 Monday 27 November (Week 9)
5 15–18 Monday 11 December (Week 11)

An informal Problem Sheet 6 covering material from Lectures 19 and 20 will be available. Lectures 21 and 22 are revision lectures with no new material.

Assessed questions should be submitted in online through the Gradescope platform. Most students choose to hand-write their solutions on paper and then scan and submit on their phone using the Gradescope app. Further Gradescope details to follow nearer the first deadline.

Tutorials

Tutorials are small groups of about a dozen students. You have been assigned to one of 34 tutorial groups, each with a member of staff as the tutor. Your tutorial group will meet five times, in Weeks 2, 4, 6, 8, and 10; you should check your timetable to see when and where your tutorial group meets.

The tutorials are an interactive session, where the main goal will be to go over your answers to the non-assessed questions on the problems sheets, which you will have worked on in advance of the tutorial. In this smaller group, you will be able to ask detailed questions of your tutor, and have the chance to discuss your answers to the problem sheet. Your tutor may ask you to present some of your work to your fellow students, or may give you the opportunity to work together with others during the tutorial. Your tutor may be willing to give you a hint on the assessed questions if you’ve made a first attempt but have got stuck. Because of the much smaller groups, the tutorials are the most valuable type of teaching on the module; you should make sure you attend, and you should be well prepared to ensure you make the most of the opportunity.

My recommended approach to problem sheets and tutorials is the following:

  • Work through the problem sheet before the tutorial, spending plenty of time on it, and making multiple efforts at questions you get stuck on. I recommend spending at least 4 hours per problem sheet. This is a long time, but you shouldn’t expect to be able to answer the hardest questions on a problem sheet without making multiple attempts. You don’t have to wait until all lectures in a section are complete until starting to work on some of the questions. Collaboration is encouraged when working through the non-assessed problems, but I recommend writing up your work on your own; answers to assessed questions must be solely your own work.
  • Take advantage of the small group setting of the tutorial to ask for help or clarification on questions you weren’t able to complete.
  • After the tutorial, attempt again the questions you were previously stuck on.
  • If you’re still unable to complete a question after this second round of attempts, then consult the solutions.

Your tutor will also be the marker of your answers to the assessed questions on the problem sheets.

Attendance at tutorials is compulsory.

R Worksheets and Practicals

R is a programming language that is particularly good at working with probability and statistics. Learning to use R is an important part of this module, and is used in many other modules in the University, including MATH1712 Probability and Statistics II. R is used by statisticians throughout academia and increasingly in industry too. Learning to program is a valuable skill for all students, and learning to use R is particularly valuable for students interested in statistics and related topics like actuarial science.

You will learn R by working through one R worksheet each week in your own time, starting from Week 2. Even-numbered worksheets will also contain a few questions for assessment, which will be due by 2pm Monday the following week (except the last one). Each of these is worth 3% of your mark for a total of 15%. You will submit your answers through a Microsoft Form (details to follow later). I recommend spending one hour per week on the week’s R worksheet, plus one extra hour if there are assessed questions that week.

Week Worksheet Deadline for assessed work
2 1: R basics
3 2: Vectors Monday 23 October (Week 4)
4 3: Data in R
5 4: Plots I – Making plots Monday 6 November (Week 6)
6 5: Plots II – Making plots better
7 6: Discrete distributions Monday 20 November (Week 8)
8 7: Discrete random variables
9 8: Normal distribution Monday 4 December (Week 10)
10 9: Law of large numbers
11 10: Recap Thursday 14 December (Week 11)

 

R Practical sessions: You will be introduced you to R in your first Practical session, in Week 2. You will first see how to use R on University computers (these sessions will take place in computer “clusters”). There will then be an opportunity to install R on your own device – if you have a laptop on which you want to install R, bring it along to the practical session. A second practical, in Week 3, will allow you to get help on the R Worksheet 2, which is the first worksheet with assessed questions.

There are 11 R practical session groups – check your timetable for Weeks 2 and 3 to see when and where your group meets.

Attendance at the first R practical session (Week 2) is compulsory.

“Office hours” drop-in sessions

If you there is something in the module you wish to discuss one-on-one with the module leader, the place for the is the optional weekly “office hours”, which will operate as drop-in sessions. These sessions are an optional opportunity for you to ask questions you have to me; these are particularly useful if there’s something on the module that you are stuck on or confused about, but I’m happy to discuss any statistics-related issues or questions you have.

I currently plan two “office hours” drop-in sessions per week:

I may change arrangements as term continues – if attendance levels are low, I will move office hours to be actual office.

If neither time is possible, you may email me to arrange an alternative time to talk to me.

Attendance at “office hours” sessions is optional.

Time management

It is, of course, up to you how you choose to spend your time on this module. But my recommendations for your work would be something like this:

  • Lectures: 2 hours per week, plus 1 hour per week reading through notes.
  • Problem sheets: 4 hours per problem sheet, plus 1 extra hour for writing up and submitting answers to assessed questions.
  • R worksheets: 1 hour per week, plus 1 extra hour if there are assessed questions.
  • Tutorials: 1 hour every other week.
  • Revision: 16 hours total at the end of the module.
  • Exam: 2 hours.

That makes about 100 hours in total. (MATH1710 is a 10-credit module, so is supposed to represent 100 hours work. MATH2700 students are expected to be able to use their greater experience to get through the material in just 75 hours, so should scale these recommendations accordingly.)

Exam

There will be an exam in January, between 15 and 26 January 2024, which makes up the remaining 70% of your mark. The exam will consist of 20 short and 2 long questions, and will be time-limited to 2 hours. We’ll talk more about the exam format near the end of the module.

Who should I ask about…?

There are over 440 students registered for this module. If each student emails me once a week, and if each email takes me 10 minutes to read and respond, that will take more than 15 hours of my time every day! Generally, it’s much better to come to speak to me at the “office hours” drop-in session or, if it will be very quick, before or after a lecture.

  • I don’t understand something in the notes or on a problem sheet: Come to office hours, or ask your tutor in your next tutorial.
  • I’m having difficulties with R: In Weeks 2 or 3, you should ask at your R practical session; at other times, come to office hours.
  • I have an admin question about arrangements for the module: Come to office hours or talk to me before/after lectures.
  • I have an admin question about arrangements for my tutorial: Contact your tutor.
  • I have an admin question about general arrangements for my programme as a whole: Contact the Student Information Service or speak to your personal tutor.
  • I have a question about the marking of my assessed work on the Problem Sheets: First, check your feedback on Gradescope; if you still have questions, contact your tutor.
  • I have a question about the marking of my assessed work on the R Worksheets: You can email me about this.
  • Due to truly exceptional and unforeseeable personal circumstances I require an extension on or exemption from assessed work: You can apply by filling in the mitigating circumstances form at this link. Neither I nor your tutor can unilaterally offer an extension or exemption, so please don’t ask. (Extensions of up to 4 days are available for Problem Sheets. Only exemptions are available for R Worksheets.)

Content of MATH1710

Prerequisites

The formal prerequisite for MATH1710 is “Grade B in A-level Mathematics or equivalent”. I’ll assume you have some basic school-level maths knowledge, but I won’t assume you’ve studied probability or statistics in detail before (although I recognise that many of you will have). If you have studied probability and/or statistics at A-level (or post-16 equivalent) level, you’ll recognise some of the material in this module; however you should find that we go deeper in many areas, and that we treat the material through with a greater deal of mathematical formality and rigour. “Rigour” here means precisely stating our assumptions, and carefully proving how other statements follow from those assumptions.

Syllabus

The module has three parts: a short first part on “exploratory data analysis”, a long middle part on probability theory, and a short final part on a statistical framework called “Bayesian statistics”. There’s also the weekly R worksheets, which you could count as a fourth part running in parallel, but which will connect with the other parts too.

An outline plan of the topics covered is the following.

  • Exploratory data analysis [2 lectures]: Summary statistics, data visualisation
  • Probability [16 lectures]:
    • Probability with events: Probability spaces, probability axioms, examples and properties of probability, “classical probability” of equally likely events, independence, conditional probability, Bayes’ theorem [6 lectures]
    • Probability with random variables: Discrete random variables, expectation and variance, binomial distribution, geometric distribution, Poisson distribution, multiple random variables, law of large numbers, continuous random variables, exponential distribution, normal distribution, central limit theorem [10 lectures]
  • Bayesian statistics [2 lectures]: Bayesian framework, Beta prior, normal–normal model
  • Summary and revision [2 lectures]

You’ll notice that this module is heavier on the “Probability” than the “Statistics” of its title. MATH1712 Probability and Statistics II, on the other hand, which many students on this module will take next semester, is almost entirely “Statistics”, but uses probabilistic techniques developed here.

Books

You can do well on this module by attending the lectures and tutorials, and working on the problem sheets and R worksheets, without needing to do any further reading beyond this. However, students can benefit from optional pre-reading in advance, extra background reading, or an alternative view on the material, especially in the parts of the module on probability. These books are also a good place to look if you want extra exercises and problems for revision.

For exploratory data analysis, you can stick to Wikipedia, but if you really want a book, I’d recommend:

  • GM Clarke and D Cooke, A Basic Course in Statistics, 5th edition, Edward Arnold, 2004.

For the probability section, any book with a title like “Introduction to Probability” would do. Some of my favourites are:

  • JK Blitzstein and J Hwang, Introduction to Probability, 2nd edition, CRC Press, 2019.
  • G Grimmett and D Welsh, Probability: An Introduction, 2nd edition, Oxford University Press, 2014. (The library has online access.)
  • SM Ross, A First Course in Probability, 10th edition, Pearson, 2020.
  • RL Scheaffer and LJ Young, Introduction to Probability and Its Applications, 3rd edition, Cengage, 2010.
  • D Stirzaker, Elementary Probability, 2nd edition, Cambridge University Press, 2003. (The library has online access.)

I also found lecture notes by Prof Oliver Johnson (University of Bristol) and Prof Richard Weber (University of Cambridge) to be useful.

Some students have asked about getting more practice questions. In addition to the textbooks above, I also recommend the following “exercise books” full of questions. Both are available from the library.

  • Gr Grimmett and D Stirzaker, Probability and Random Processes: Problems and Solutions, Oxford University Press, 1992.
  • YM Suhov and M Kelbert, Probability and Statistics by Example – 1: Basic Probability and Statistics, 2nd edition, Cambridge University Press, 2014.

On Bayesian statistics, we will only taste a brief introduction, but if you want a book, I recommend:

  • JV Stone, Bayes’ Rule: A Tutorial Introduction to Bayesian Analysis, Sebtel Press, 2013.

For R, there are many excellent resources online.

(For all these books I’ve listed the newest editions, but older editions are usually fine too.)

About these notes

These notes were written by Matthew Aldridge in 2021, and were edited and updated a lot in 2022 and a little bit in 2023. They are based in part on previous notes by Dr Robert G Aykroyd and Prof Wally Gilks. Dr Jason Susanna Anquandah and Dr Aykroyd advised on the R worksheets. Dr Aykroyd’s help and advice on many aspects of the module was particularly valuable.

These notes (in the web format) should be accessible by screenreaders. If you have accessibility difficulties with these notes, contact me.