Science and Statistics Guide 2024 PDF

Copyright 2024, School of Psychology, The University of Sydney and Dr Caleb Owens. Reproducing the content, structure or form of these lesson notes and lesson content for profit or personal gain on trading sites is strictly forbidden.

Contents
How to prepare for and study for the Science and Statistics Lectures
What is here?
How to study for Science and Statistics
Mathematics
Recommended Resources for these lectures
Science and Statistics Lesson 1 Outline
Science and Statistics Lesson 2 Outline
Science and Statistics Lesson 3 Outline
Science and Statistics Lesson 4A/4B Outline
Science and Statistics Lesson 5 Outline
Science and Statistics Lesson 6 Outline
How to prepare for and study for the Science and Statistics Lectures
Dr Caleb Owens

What is here?
• Learning outcomes for each lesson
• Suggested readings for each lesson
• Brief notes on the core content
• Selected slides and images which may be helpful

On the reading list on the Canvas site you will find a book chapter titled: Research Methods in Psychology + supplement, by Lorelle Burton. You may wish to read the entire chapter, or use it as a reference while learning the content.

How to study for Science and Statistics
The volume of content in this lecture series is low; however, the learning outcomes involve a thorough understanding of each concept and require you to demonstrate you can apply what you have learnt. While examples per se are not assessed, the ease with which you grasp each example during the lessons should give you a good idea of how well you understand the material. For example – you need to understand precisely what a “true experiment” is and what it is not. In the exam a study you are completely unfamiliar with may be described in great detail, and you are simply asked “What kind of study is this?” You need to understand the definitions of each research design with enough confidence to be able to apply that understanding to a completely novel situation.

With this in mind your study for this lecture stream should allow you to master each concept by learning definitions, details and procedures, and then allow you to apply each concept. The way each lesson outline / lesson is structured is a guide to what you should be doing yourself – start with the concept, attempt to apply it, reflect back on the concept, attempt to apply it again.

Mathematics
This is NOT a mathematics course in statistics. Try these questions:

Which of these numbers is smaller than 0.05?
a) 0.450
b) 0.045
c) 0.055
d) 0.005

What is the square root of 16?
a) 2
b) 4
c) 8
d) 32
e) 256

What is the most common height in the data graphed below?
[Figure: graph of height data]

If these three questions were completely effortless, and you didn’t need to use a calculator, then there’s no need to worry at all about maths. The most difficult part of this lecture series is understanding concepts which might stretch your ability to apply mathematics, but not the mathematics itself. Still, numbers themselves can cause anxiety, so if you have any concerns seek assistance from the maths learning hub now: http://sydney.edu.au/stuserv/maths_learning_centre/. If you are majoring in Psychology, and proceeding to second year statistics, you could take a bridging course at the maths learning hub at the end of this year (to be ready for 2000 level statistics courses) to bolster your confidence going into second year.

Know that calculators are not allowed in the exam. This means that complex calculations are not assessed; but you can be assessed on your understanding of the origins of statistics, for example: know how to calculate a standard deviation by hand. Also, questions assessing an understanding of how basic mathematics applies to research can be asked, for example:
– Is a p-value of 0.13 statistically significant with a decision criterion of .05? And what does that mean?
– If I add up the squared deviation scores of a set of data, divide by the number of participants, and get 9, what’s my standard deviation?
– Money and happiness are correlated +0.27. What does that mean?

Several of the links in the modules and some of the recommended references contain lots of mathematical formulae and symbols, but learning of these is not required.
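The standard deviation and p-value questions above can be checked with a few lines of Python. This is only a minimal sketch for self-study: the function names `population_sd` and `is_significant` and the four scores are made up for illustration, and the calculation assumes (following the wording of the question) that the summed squared deviations are divided by the number of participants rather than by one less than that number.

```python
import math

def population_sd(scores):
    """Standard deviation 'by hand', dividing by the number of scores (an assumption
    taken from the wording 'divide by the number of participants')."""
    n = len(scores)
    mean = sum(scores) / n
    squared_deviations = [(x - mean) ** 2 for x in scores]  # deviation of each score from the mean, squared
    variance = sum(squared_deviations) / n                  # average squared deviation
    return math.sqrt(variance)                              # standard deviation is the square root of that

def is_significant(p_value, criterion=0.05):
    """A result counts as statistically significant only if p is smaller than the criterion."""
    return p_value < criterion

# Made-up scores: the mean is 5, every squared deviation is 9, so the average is 9.
illustrative_scores = [2, 8, 2, 8]
sd = population_sd(illustrative_scores)

print(sd)                    # the square root of 9
print(is_significant(0.13))  # 0.13 is larger than .05
```

The same steps can be done without a calculator: an average squared deviation of 9 has a square root of 3, and 0.13 is not smaller than .05, so that result is not statistically significant. Python's standard library also offers `statistics.pstdev`, which computes this population version directly.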
Focus on understanding the logic behind what is happening: the questions within the modules and questions within the quizzes associated with the research methods and statistics tutorials are a good guide to the level of understanding required.

Recommended Resources for these lectures
These references (and all references referred to throughout lectures and in the outlines) are highly recommended as they illustrate the concepts taught and allow you to practice applying your knowledge. They are not assessable in and of themselves.

Reading list: Book chapter Research Methods in Psychology + supplement OR any research methods/statistics chapter or appendix in any introductory psychology textbook
Reading list: Book chapter Power and Measure of effect size. NB: Power is a hard concept to understand, so this chapter may be very useful. You do not need to know how to calculate power, but understanding what it is and the factors which affect it is important.
All tutorial materials from these tutorials: Statistical Reasoning, Research Designs
Jacob Bronowski: “The Ascent of Man” (BBC TV, 1973), Episode 11: Knowledge and Certainty (for lecture 6)
Robert T. Carroll – The Skeptic’s Dictionary (look up logical fallacies). Online: http://www.skepdic.com/

Science and Statistics Lesson 1 Outline
Professional integrity and science

Learning Outcomes
By the end of this lecture you should be able to:
a. Understand that critical thinking skills should be used with care and respect
b. Understand the concept of “Professional Integrity” and be able to apply it to the practice of science and psychology
c. Distinguish between people in science (authorities), theories in science, and evidence in science, and be able to explain how they interact
d. Understand the importance of transparency and scepticism in science and how it allows for the continuous improvement of understanding
e. Distinguish between science and pseudoscience and recognise the logical fallacies “argument from authority”, “ad hominem” and “appeal to antiquity”
f. Understand the replication crisis and be able to explain several factors contributing to it

In this first lesson I express a desire to empower everyone with a knowledge of science which is powerful and useful, but also respectful. Psychology is taught within the Faculty of Science, so whether or not Science is your home Faculty, you are expected to be open to the immense personal challenge science faces us all with. One might argue that we are a civilisation based on science: our population could simply not exist without the advances science has made in health and agriculture. With that in mind, surely all citizens of the planet should understand science well and respect its findings? However, the reality is that ignorance of science abounds – so the first point I want to make is that you cannot expect your friends, families, and others to be as receptive as you are when you are discussing scientific views.

If you know people who believe in something rather odd, which perhaps might not have any evidence supporting it, then an empowering course like this makes it tempting for you to ‘educate’ them. You might find yourself in the situation that Tim Minchin found himself in (see: https://www.youtube.com/watch?v=HhGuXCuDb1U), surrounded by people ‘trapped in the fog of their inability to google’. However, I would strongly urge you not to give in to that temptation.
Science is effortful for everyone, and while there are some ‘easy to mock’ pseudosciences out there, almost everyone is a sucker for something. While these lessons will give you the tools to rush out and spot errors and issues everywhere, a deeper understanding of science will reveal how it is a system founded on probabilities. Science succeeds better than any other system of thinking ever created, but it does not result in ultimate truths – the last lesson makes this very clear. So my initial advice is, if your best friend or a family member believes something a bit odd, a bit anti-science, just ask:
• How much is it costing them, compared to the comfort they are receiving?
• Is it placing them in any danger?
• Is it placing others in danger?

Allow your fellow humans their eccentricities, and carefully prioritise the battles you can and should fight.

If you plan to enter any profession which uses science or the outcomes of science to support it (e.g. Psychology) then your professional integrity depends on your understanding of the relevant science. If you have strong personal beliefs which are profoundly different to those supported by science, you either need to set them aside while practising (which may be difficult or impossible), change them, or leave the profession. One of the University of Sydney’s Graduate Qualities concerns understanding this relationship.

Integrated professional, ethical, and personal identity: An integrated professional, ethical and personal identity is understanding the interaction between one’s personal and professional selves in an ethical context.
Imagine you become a well-known Psychologist practising therapy X, but after hours you tweet that therapy X is actually dangerous and therapy Y (for which there is no scientific support) is far better. What would that kind of behaviour do to both your professional and personal integrity? Can you think of examples where a professional might threaten their integrity by their behaviour?

Since scientific understanding on every topic will always be evolving, using current science won’t produce perfect outcomes, only the best at the time. Nevertheless, using the best-known practices ensures:
• Harm is minimised. Untested and potentially dangerous techniques are not used.
• Treatments are defensible. While all treatments carry risk, risk is minimised if the best understood and most effective treatment is chosen.
• Knowledge both for and against such treatments can be systematically accumulated.

So, if you want to be an effective practitioner, and/or citizen of a scientifically based society, you need to be trained in science. While anyone can read about a dozen interesting findings concerning science within minutes, the mechanisms of science itself are not immediately apparent and are often unintuitive, so a reasonable amount of training is required. This contrast (findings are easy to read about, but the system which created those findings is harder to understand) explains why those who do not understand science can cause such mischief.

Here are some basic ingredients of science:
1. An attitude of humility, wonder, and commitment to understanding the truth. You accept you do not know everything, you appreciate how complex the world is, and you will not rest until the truth is discovered.
2. Understanding the methods used and the methods necessary to generate evidence in science.
3. Being able to distinguish between theories and evidence in science and being committed to fully test the veracity of theories using evidence.
4. A realisation that understandings are distinct from the individuals who propose or support them.
5. Accepting that evidence-based conclusions will always be probabilistic in nature, such that science represents a continuous process of improving upon what is known.
6. All theories and evidence must be open to criticism. For this to be possible there must be complete transparency in how evidence was gathered.
7. Systems of standard communication allowing transmission of findings and criticisms to be world-wide. Every science has peer reviewed journals and conferences.
8. Merit based qualifications and grant application procedures which reward skill in research. Skill in research is demonstrated by successfully attacking existing theories and replacing them with better ones.

This is certainly not a complete list but gives a sense of what is required for sciences to be healthy and successful. A particular pseudoscience (false science) doesn’t necessarily break all these rules, but if we reverse these requirements, you can probably think of a lot of examples where this happens.

Pseudoscience is often characterised by:
1. An attitude of arrogance that the main answer is already known so there is no need for more study and reflection.
2. An acceptance of anecdotal and poorly controlled historical evidence in place of systematically collected evidence.
3. An inability to distinguish between theories and evidence, such that only supporting evidence is gathered (confirmation bias), or that attacking the evidence associated with alternative accounts is all there is supporting the theory (appeal to ignorance).
4. An “important” individual (like a cult leader in the past or present) who created the theory and is forever associated with it. Evidence for the theory may be lacking but the individual’s endorsement is considered sufficient support.
5. Probabilities (and often mathematics itself) are poorly understood and instead are replaced with certainties.
6. The evidence supporting the theory was collected behind closed doors or by “experts” who never explain what they did to collect it.
7. One way transmission of information such that it may be delivered in conferences and journals but cannot be questioned or criticised.
8. Qualifications are not formalised, or have little value, and can be bought with minimal training or supervision. Status and/or promotions are guaranteed by always supporting and never criticising currently held beliefs.

And often something extra which is essential for any pseudoscience trying to survive in a civilisation supported by science:
• An explanation for why “mainstream science” sees things very differently. Often a conspiracy theory supported as poorly as the pseudoscience itself.

Understanding how pseudoscience works by seducing people with certainties should help you understand why effortful, time-consuming science – which promises probabilistic conclusions at most – is too often ignored. A Psychologist with well-developed professional integrity is not only able to avoid pseudoscience but can assist clients who may be doing themselves and others terrible harm with maladaptive beliefs.

The biggest misunderstanding about science is that its strength arises from its collection of knowledge and the expert scientists who understand that knowledge. This is most apparent at university with first year students, because the assumption is made that the lecturer is an expert, and most things an expert says are correct.
For every thousand emails asking, “Is X true?”, lecturers might receive just one asking: “Where is the evidence supporting this?”. Ideally, science has no authorities, no experts, and no ‘sacred knowledge’ which is beyond criticism. You commit a logical fallacy (called the “appeal to authority”) if you believe something is true because someone very important said it or endorses it. Conversely you commit a logical fallacy (called an “ad hominem”) if you disagree with what someone says, but instead of attacking their claim or their evidence, you attack them for being of low status or disreputable.

In science, statements, theories and evidence should all be considered open to criticism. Something can still be criticised even if it has been believed for thousands of years (‘something is true because it has been known for a long time’ is a logical fallacy: “argument from antiquity”). One of the most useful hidden mechanisms in science which keeps this system working so well is that scientists become more famous for challenging past assumptions and advancing knowledge and understanding. Every scientist is motivated to disprove the old theories (even their own), because of a realisation that the current understanding is not the best. If criticism is not possible then progress simply cannot be made.

Science operates with an assumption that our current understanding is imperfect. Scientists can certainly seem arrogant when commenting on understandings of the world which are substantially less perfect; but no scientist believes a perfect answer has yet been found to any problem. (Dara O’Briain: “Science knows it doesn’t know everything, otherwise it would stop!”, https://www.youtube.com/watch?v=uDYba0m6ztE).
When you understand this dynamic process, you see where the power of science comes from – it’s all about criticism and progress, and becoming famous by finding problems in theories.

An appreciation of this progress can be gained by considering the history of science and understanding. Science’s greatest theories (the theory of plate tectonics, the theory of evolution by natural selection, and the germ theory of disease) were all preceded by courageous and earnest attempts to understand phenomena. When considering those histories, empathy with the humans who lived in those times is needed, because they did not know what we know now, yet they wanted to know. And because they still had to make decisions, they needed to make explicit what they thought they did know – their best understanding of the world. Certainly, if someone reaches back to ancient, long disproven concepts and begins to speak of them today as valid, the empathy and historical respect can be dropped. The gain from empathising with prior understandings, though, comes from realising that we are now in exactly the same position – there are a great many things we do not understand, and future generations will look back on the scientists working today, not with contempt at their ignorance, but with respect at their earnest attempts to work things out.

The realisation that the power of science comes from the process should better involve every student of science and every citizen of a scientifically founded society, because suddenly you become part of the process of criticising and evaluating evidence, and proposing and testing alternative explanations.
The lack of authority, and the lack of a “sacred knowledge” of science, is a radical change for many students from the fixed curriculums of high school. The Board of Studies knew best, the knowledge base was passed to your teachers who tried to impart it to you, and then your understanding of this knowledge was assessed. Certainly, at University, there is knowledge to impart; but the greater lessons to be had are about how to obtain knowledge, how to report it (citation), how to evaluate it, and how to criticise it. If you learn those lessons you can contribute to society much more meaningfully once you graduate, or at the very least, you understand your need to constantly update what knowledge you did acquire at university.

Many students struggle with the major PSYC1001 assignment, which, regardless of the specific research topic, is all about evaluating and critiquing research and the conclusions made about it. Students in High School mode panic – “What is the ‘right’ answer, where do I find it…” etc. For them, learning is about knowledge. The skill being taught is how to be a person who can make scientific decisions and arguments. A scientific argument involves statements supported by research findings. First year tutors endlessly repeat: “As long as you back it up with evidence…” to any number of enquiries along the lines of “Is this right?”.

When you understand science and have an integrated professional and personal identity, you become more independent and empowered. Suddenly search engines you used to use for nothing but finding cute images of cats become powerful tools for locating peer reviewed research. You also need to develop your research skills, learning how to evaluate sources, how to spot bias, industry intervention, logical errors, lack of evidence, lack of peer review, lack of replication, poor research design, emotive claims etc. None of this happens overnight of course.
Often students struggle with the conflict between a professional identity and their “right” to continue to make unsupported claims. Try to understand that the world is still full of possibilities, but that when putting things into practice, you absolutely should use the techniques best supported by scientific evidence. It is not “closed minded” to avoid unsupported or potentially dangerous methods. ‘Closed minded’ implies you are not open to possibilities. However, science must be open to all possibilities – an attitude that everything is open to criticism implies that strongly. While you should be open to every possibility, you must also be able to rank those possibilities in terms of probability. Open-mindedness should mean considering everything but evaluating it before accepting. Accepting everything before evaluating it is not open-mindedness, it is blind credulity. An excellent video by Qualia Soup makes this clear: http://www.youtube.com/watch?v=T69TOuqaqXI. Accusing someone of being closed minded is also an “ad hominem” – it does nothing to support your claims.

Source: http://www.youtube.com/watch?v=T69TOuqaqXI

Note importantly that this lecture series uses multiple examples to illustrate what is and is not science. You might be shocked to learn that something you assumed was true is perhaps not supported by any evidence. You could pop the idea out of the “acceptance” zone and reconsider it using your new research skills and approach to scientific evidence. If it is something personal to you though, and acceptance of it is not harming yourself or others, just wait for another example: I do not want to upset anyone. There are a LOT of examples given in the lessons and workshops and no offence is intended by their choice.
This lecture series aims to teach you how to engage in the process of science. You might not conduct any research in your life, but by the end of this lecture series I hope you can read journal articles with confidence, understand which results were and were not significant, consider for yourself whether you agree with the conclusions made, or whether the design was appropriate, and be able to offer alternative explanations or suggest further studies. While lecturers (especially in large lecture halls) look like ‘super-duper’ teachers of ‘true knowledge’ at first, hopefully your understanding of science will make you realise that, like you, they are just humans struggling to understand what is going on, and you are very much with them as part of the process of science.

Replication Crisis
Here are some links to resources if you are having trouble understanding the issues:
Is most published research false? https://www.youtube.com/watch?v=42QuXLucH3Q
https://hardsci.wordpress.com/2016/08/11/everything-is-fucked-the-syllabus/
http://sciencefriday.com/segments/the-replication-game-how-well-do-psychology-studies-hold-up/
http://www.psychologicalscience.org/publications/replication
http://news.harvard.edu/gazette/story/2016/03/study-that-undercut-psych-research-got-it-wrong/
http://www.nature.com/news/over-half-of-psychology-studies-fail-reproducibility-test-1.18248

Recommended reading for this lesson:
* Reading list: Book Chapter: The Fine Art of Baloney Detection, from the book "The Demon-Haunted World" by Carl Sagan. This is also a set reading for the Science of Psychology Tutorial, so well worth reading. Check eReserve.
* Reading list: Book Chapter: Knowing where to scratch, V.S. Ramachandran. From Phantoms in the brain: human nature and the architecture of the mind / Chapter 2.

Recommended media:
YouTube video: Open-Mindedness by Qualia Soup http://www.youtube.com/watch?v=T69TOuqaqXI

Useful links:
When looking for explanations/examples of logical fallacies: http://www.skepdic.com/
Evaluating Internet Research Resources: http://www.virtualsalt.com/evaluating-internet-research-sources/
And: https://library.georgetown.edu/tutorials/research-guides/evaluating-internet-content

Science and Statistics Lesson 2 Outline
The power of a name: Measurement and Constructs

Learning outcomes
By the end of this lesson, you should be able to:
a. Understand the need for precision in scientific constructs and concepts
b. Be able to identify, give examples of, and explain weasel words
c. Consider how each concept or construct we use in Psychology needs to be conceptually defined and established
d. Identify and understand the logical errors: ‘Reification’ and ‘the pragmatic fallacy’
e. Appreciate the notion of ‘falsifiability’
f. Understand what an ‘operational definition’ of a variable is, and why it might not always be ideal

In this lesson I aim to strongly emphasise the importance of considering the constructs we use in Science and Psychology. A construct is an idea or theory often expressed as a single word, but containing lots of assumptions and conceptual relationships. Naming something makes it seem real and valid, but we need to ensure the constructs and concepts that we use do correspond in some way to reality and are actually useful.

The lesson begins with a comparison of pre-scientific constructs with scientific constructs.
You should be able to see that scientific constructs are better approximations of reality, because they are more accurate, and they arise from a more accurate understanding of the world. Next, a series of constructs we often see in advertising is discussed, and because of their lack of meaning it should become clear that care needs to be taken when creating and using constructs. We need scientific constructs to be as clear as possible, because in science we use them to make predictions. The need for precision becomes even clearer when psychological constructs are considered, and hopefully at this point you might suddenly realise why psychology students require careful instruction in research methods and statistics – the things we want to measure are very hard to measure and may not even be real. If you weren’t wondering much earlier “is Psychology a science?” this lesson may start you wondering about the ways in which it is a science and the ways in which it is not.

Weasel words consist of vague and misleading terminology, for example:
“Scientists say that….”
“Clinical studies have shown that….”
“This medicine may help with….”

Notice how in each case nothing specific is communicated: Do the studies even exist? “May” is a word which implies something probable, but can also be used in a cowardly way: “The world may end tomorrow”….. tomorrow comes: “I never said it would, just that it might!” Weasel words are commonly used by people selling things (merchants and politicians) because they want to avoid being accountable for the claims they make. In Psychology we must strive to be precise and not vague.

There are two ways to define a construct: a conceptual definition and an operational definition. Both are needed to establish a construct before you commence your research, but they are very different. A conceptual definition involves describing a construct in terms of what it is and what it is not and how it might relate to existing theories.
Examples are given. A conceptual definition also requires careful consideration of how theoretically useful a construct is.

Reification occurs when a purely analytic or abstract relationship is treated as if it is a concrete entity. It can occur when an adjective is treated like a noun. For example, ‘luck’ is for most people a hazy concept referred to when something good happens: “That was lucky!”, but when it is reified into a thing, it can become hugely problematic, e.g. a gambling addict may have a lucky medallion which contains ‘luck’, or they may come to believe that the poker machine in a particular location contains ‘luck’ and will reward them in the future. Reification can be a huge problem in psychology: consider concepts like intelligence and personality.

Another related error when conceptually creating constructs is a failure to consider falsifiability. If you create something which can never be assessed or measured, then there will never be any way to tell if it is real – and it can never be disproven. Constructs, theories and individual predictions can all be unfalsifiable, which means progress cannot be made with them and they are arguably beyond the realm of science. Several examples are given to illustrate this point: fairies, invisible forces and many concepts proposed by Sigmund Freud.

The pragmatic fallacy, “something is true because it works”, is offered as a further example of the convoluted knots you can get into when conceptual reality is crudely tied to efficacy. For example, just because therapy based on psychoanalytic concepts helps someone feel better does not mean the concepts behind the therapy (id, ego, superego, unconscious desires) are valid.
Instead, the person may feel better having talked to someone about their problems. Similarly, if someone is politely listened to and kindly treated with needles, and feels better, that does not mean any associated concepts (chi, ki, prana, meridians etc.) are valid. Instead, the person may feel better having been treated well and acknowledged. Read folk tales about soup stones if the issues here aren’t obvious.

An operational definition of a construct is an explanation of how that construct might be measured. Given there are any number of ways to measure a construct, there are any number of ways a construct might be operationalised. How a construct is operationalised might depend on which aspect of the construct is important to the research, how much money and time the researchers have and, following from that, what kind of research design is going to be used to study the construct. The key difference from the lengthy theoretical considerations of a conceptual definition is that an operational definition involves finding a way in which the construct can be observed. You might ask: if a construct can be observed once operationally defined, then why bother with a conceptual definition? Surely the construct must be real and valid? However, an operational definition is not a construct. A construct can never be measured directly, so just choosing one of many ways to measure it does not make it real.

You can use your understanding of the operationalisation of constructs to critique research. Was the way the researchers operationalised the construct they wanted to measure the best way to do it? Do the results of the research simply arise because of the way the constructs were operationalised? Was there a better way to measure this construct?
Examples at the end of the lesson illustrate the role of the two kinds of definitions in Psychological research.

Self-report measures are widely used in Psychology but are not without their flaws, most particularly people simply being dishonest, something a well-hidden social desirability scale may solve. Also, with the internet and individuals happily giving away personal data and ‘likes’ to private companies, the huge mass of data suggests we should be able to advance our understanding of human behaviour by studying it. However, unless that study is founded on a considered approach to constructs, we might find we get better and better at predicting behaviour, but not any closer to understanding it. The Kosinski, Stillwell and Graepel (2013) paper is used as an example to illustrate this point.

Recommended reading for this lesson:
* Kosinski, M., Stillwell, D. J., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences (PNAS). http://www.pnas.org/content/110/15/5802.full

Useful links:
http://en.wikipedia.org/wiki/Operationalization
https://explorable.com/operationalization

Science and Statistics Lesson 3 Outline
Research design: A thousand zeros is still nothing

Learning outcomes
By the end of this lesson, you should be able to:
a. Identify the flaws inherent in ‘anecdotal’ evidence, and understand why a ‘case study’ provides more meaningful data
b. Understand a correlation coefficient and be able to interpret its meaning in a research paper
c. Appreciate the need for a control condition in ruling out alternate explanations
d. Be able to distinguish between ‘true experiments’, ‘quasi-experiments’ and correlational studies, and understand the pros and cons of each
e. Use the concepts of ‘random allocation’ and ‘random selection’ to distinguish between different kinds of research design and understand ‘external validity’ and ‘internal validity’
f. Understand the importance of ‘blindness’ and ‘replication’ in scientific research

This lesson is all about Research Design and goes with your Research Design tutorial. In practical terms, you need to be able to distinguish between the different types of designs when they are described to you. So, while there is not a large amount of conceptual content (as with most Science and Stats lessons), learning how to apply those concepts precisely is the challenge. The lesson also contains a lot of jargon, so the best thing you can do to prepare is to become familiar with the terminology.

Anecdotes are interpreted stories about a single occurrence in the past and are usually of no scientific value. Anecdotes share a small sample size with case studies, but that is not their biggest problem. When an anecdote is created and passed along by non-scientists, only information relevant to the interpretation the anecdote is being used to support is recorded and passed down. In terms of theory and evidence, there is no separation in an anecdote, which means the theory is supported by the only evidence which is mentioned, and the only evidence which is mentioned is in there to support the theory.
This built-in bias means that anecdotes, at most, can lead us to consider possible future research. “Collecting anecdotes” which all support one interpretation might give us the impression we are collecting evidence and building a stronger and stronger case, when in fact we are just concentrating the bias and contributing nothing to understanding – this situation gives this lesson its extra title. Other flaws with anecdotes are that they can change with each retelling and, because they are often of only a single instance or observation, whatever happened cannot be replicated by a non-prejudiced observer.

Case studies are significantly more systematic than anecdotes because they arise from an earnest attempt to understand what is going on, such that all details, whether or not they seem relevant, are recorded in a scientific manner. While the sample size is still small, the unbiased recording of all information means that a case study is never irrevocably linked to an explanation the way an anecdote is. Scientific humility (arising from an attitude of wanting to discover something rather than confirm something) means that the person taking down the case notes may or may not realise the significance of the data immediately, or ever in their lifetime. The objective nature of the notes, though, means that alternative explanations are possible later, and case studies can be cautiously added together to give insights into processes which may be confirmed later with systematic research. Case studies are often the only way to study extremely rare disorders and conditions in psychology. An example is given (Cholera) which highlights the profound advantage case studies have over anecdotes.

Correlational studies occur when at least two variables are measured from each case/person with a view to calculating a relationship between the variables.
Because the measurements are simply taken and nothing is manipulated or controlled, it is often difficult to know the direction of causation or even if there is evidence of causation. If variable X is correlated with variable Y, then X could cause Y, Y could cause X, or another variable, Z, could cause both X and Y while X and Y are otherwise unrelated. It will take a few examples given in the lesson to fully illustrate this.

While technically not part of research design, I take the opportunity at this point to run through the basic maths of correlation coefficients, so that you will be able to understand them when you see them in research papers. Note that the Greek letter rho (ρ) is the parameter (population value) of a correlation. In papers you will most likely see the Roman letter lowercase “r”, the statistic (sample value). If you understand that at a minimum each correlation involves TWO scores from each case, then you can see we can graph each case as a dot on a scatterplot (the score on one variable on the horizontal x axis, the score on the other variable on the vertical y axis). The correlation is then a description of how the dots are grouped. The sign gives the direction of the slope, and the magnitude is how clustered the cases are along the line.
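To make the idea of "two scores from each case" concrete, here is a minimal Python sketch of how a sample correlation could be calculated. It assumes the coefficient meant is Pearson's r (the notes do not name the formula), and the function name `pearson_r` and the two small data sets are hypothetical, invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return the sample correlation r for paired scores (assumes Pearson's r)."""
    if len(xs) != len(ys):
        raise ValueError("each case needs one score on each variable")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two cases are needed")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviation of each case from the mean of each variable.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    # Cases that sit above (or below) both means push r in the positive direction.
    sum_of_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    ss_x = sum(dx ** 2 for dx in dev_x)
    ss_y = sum(dy ** 2 for dy in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sum_of_products / math.sqrt(ss_x * ss_y)


# Hypothetical data: one (x, y) pair per case, i.e. one dot on the scatterplot.
hours_studied = [1, 2, 3, 4, 5]
quiz_mark = [2, 4, 5, 4, 5]
r = pearson_r(hours_studied, quiz_mark)
print(f"r = {r:.2f}")
```

The sign of the result gives the direction of the slope and its size (between 0 and 1 in absolute terms) reflects how tightly the dots cluster along a line. A value like this says nothing about whether X causes Y, Y causes X, or a third variable drives both.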
After a number of examples illustrating the issues with correlation and causation (see also http://www.tylervigen.com/spurious-correlations), the argument is made (and supported with a further example) that a control condition is the first essential requirement of inferring causation. A control condition exists to rule out other causes such as time, fatigue, the body’s own immune system, and even more issues which have not been considered (e.g., size, in the hamburger example given in the workshop). If changes/differences can be measured in the experimental condition which has only one unique, carefully controlled feature, and those changes are not found in the control condition, then you can begin to attribute the changes to the unique manipulation.

To explain all that in a systematic fashion, terminology is needed.
• An Independent Variable (IV) is usually the presumed cause in research and is manipulated by the experimenter. You can think of manipulation as the changing of the variable (present/absent, or level 1/level 2/absent etc.), OR you can think of manipulation as control over a variable such that participants can be randomly allocated to levels of the variable.
• A Dependent Variable (DV) is what will be measured by the experimenter to see if the independent variable has had any effect. In Psychology it is often a behaviour or response. Because a DV is simply measured, the researcher has no control over it.
• Random allocation occurs when participants in a study arrive at the study not belonging to any level of the Independent Variable and can be given/administered or placed into a level/condition of the Independent Variable by a random process. Complete control over the independent variable to be randomly allocated is necessary.

Using that terminology, you can see that in a correlational study there is no control over any variable.
Technically all the variables are dependent variables, although some might be considered predictor variables if they are related to the presumed cause. However, actually inferring causation strongly is simply not possible in a correlational study.

In a true experiment all independent variables of interest are controlled and able to be randomly allocated. A strong causal inference can be made in a true experiment because random allocation makes all variations between the groups cancel out. The only variation which does not cancel out is the systematic difference between levels of the independent variable. If a difference is found in the dependent variable in a true experiment, you can make a strong conclusion that differences in the independent variable caused that difference.

In a quasi-experiment at least one key variable of interest cannot be randomly allocated, but others can. In a sense a quasi-experiment is a study involving pre-existing groups. And whenever you have pre-existing groups, you will have confounds which weaken any causal inference. At the end of the study, if you find a difference in your DV between people belonging to one pre-existing group over another, you can’t say with certainty what caused that difference. Lots of examples are given of quasi-experiments, because while they are rarely well defined, they are the most common kind of research design in many fields of Psychology.

Understanding random allocation and control over variables is key to classifying each study as either a correlational study, true experiment or quasi-experiment. Notice that the lesson on classifying research designs has finished, but at no point did I need to even mention random selection.
This is because random selection is completely irrelevant to classifying designs as true experiments, quasi-experiments, or correlational studies. But I know that many students completely confuse random allocation and random selection, so in every assessment of your understanding of this material random selection will appear as information to distract you. Keep in mind it is irrelevant when classifying research designs and stay focused on the question of “How many independent variables of interest are randomly allocated?”

After multiple research design examples are given, the following terminology is explained and examples given.
• External validity is the extent to which findings from the study can be generalised to the population at large. It can depend on:
  o Sample size
  o How the sample was chosen
    ▪ Was random sampling used? Did participants self-select for the study? Is the sample likely to be biased?
  o Where and how the experiment was conducted
    ▪ How artificial was the testing location? How real did participants think it was?
  o How variable the effects being studied are
    ▪ If the effect varies very little from person to person a large sample might not be needed. If an effect varies greatly across cultural groups, they need to be represented in the sample.
• Internal validity is the extent to which changes in the dependent variable can be attributed to changes in the independent variable. If a strong causal inference can be made, internal validity is high and vice versa.

Based on what we know about research designs it can be said True Experiments have very high internal validity, whereas correlational studies have very low internal validity. Note that both internal and external validity can vary independently.
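Because random selection and random allocation are so often confused, a short Python sketch may help keep them apart. Random selection decides who gets into the study (which bears on external validity), while random allocation decides which condition each participant ends up in (which bears on internal validity). The function names, the population of 1000 made-up people and the two condition labels are all hypothetical, chosen only for illustration.

```python
import random


def random_selection(population, sample_size, rng):
    """Choose who takes part: every member of the population has an equal chance."""
    return rng.sample(population, sample_size)


def random_allocation(participants, conditions, rng):
    """Decide which condition each participant receives, by a random process."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    # Deal the shuffled participants out to the conditions in turn,
    # so group sizes stay as equal as possible.
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


rng = random.Random(2024)  # fixed seed so the sketch is repeatable
population = [f"person_{i}" for i in range(1000)]  # hypothetical population

sample = random_selection(population, 20, rng)
groups = random_allocation(sample, ["treatment", "control"], rng)

for condition, members in groups.items():
    print(condition, len(members))
```

Only the second step matters when classifying a design: a study can use random allocation on a self-selected sample (still a true experiment), or random selection with no allocation at all (still a correlational study or quasi-experiment).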
You could argue that across all studies, as studies are brought away from the real world into carefully controlled laboratory settings internal validity is raised at the cost of external validity. (I.e., as a situation becomes more controlled it also becomes more artificial.) However, some of the cleverest social psychology studies you will learn about aim to carefully control all variables, while making participants think they are in real situations.

Blindness is essential in research, because if participants and researchers know which conditions they are allocated to, bias can completely undermine the results of the study. A participant who is not blind may know or guess what is expected of them. A researcher who is not blind may not randomly allocate correctly, or their knowledge of who is in which condition may cause them to incorrectly record data. Ideally, both researchers and participants should be blind to which condition is being administered; this is called “double blind”.

Replication is when the same findings are found by an entirely independent party following the method you followed. Therefore, method sections in research reports need to be detailed. With enough detail, other scientists should be able to copy your study and find the same things, eliminating fraud and sampling errors as explanations. By this stage of semester at least a few students have usually emailed me one or two studies showing that (whatever I said does not work in lesson 1) works! And it has been shown to work in a double-blind randomised study! This misses the point of replication.
There are well-designed, double-blind studies with excellent random allocation that show all kinds of things are effective – but don’t cherry pick those and wonder why a scientific consensus has never been reached. It is probably because those effects could never be replicated by independent researchers, which implies the results were found by pure chance or bias. Be especially wary of groups who insist their results can only be replicated by people who are ‘trained’ in what they are doing. That kind of claim is not transparent and withdraws the field into pseudoscience.

NB: Dr Rebecca Pinkus will also cover issues related to replication in her Social Psychology lectures (the “replication crisis”).

Recommended reading for this lesson:
* Reading list ARTICLE: Efficacy and safety of Echinacea in treating upper respiratory tract infections in children: A randomized controlled trial. (This is recommended only as an example of good experimental design – the substance being tested is not relevant)

Science and Statistics Lesson 4A/4B Outline
Predictions and Descriptive Statistics

Learning outcomes
By the end of this lesson, you should be able to:
a. Understand the role of deriving a hypothesis from a theory and using it to make predictions in science
b. Distinguish between an ‘experimental hypothesis’, a ‘null hypothesis’ and an ‘alternate hypothesis’.
c. Identify and understand the logical error: ‘Confirmation bias’
d. Be able to calculate and interpret a ‘range’, ‘mode’, ‘median’ and ‘mean’ from a small set of numbers by hand, and understand the pros and cons of each
e. Understand that scores vary and be able to describe various shapes of distributions
f. Fully understand what a standard deviation is and be able to follow the steps in its calculation by hand

While some basic mathematics has occurred in prior lessons you will see a lot more in this one. There are a few important things to say about mathematics in psychology:

1. Fear of maths is a bigger problem than actual maths skill. In the following lessons I do not go beyond squaring a number (multiplying it by itself) and taking the square root (the opposite process) – procedures familiar to a primary school student. However, fear of maths, and lack of confidence in maths are serious issues which can ruin your career at university. So, if you notice any anxiety, act on it quickly. I recommend the Maths Learning Hub at USYD: https://www.sydney.edu.au/students/learning-hub-mathematics.html, and I also recommend you carefully complete all the exercises presented here before this lesson. If you can complete the exercises by hand, there’s not much to worry about this year. However, things do ramp up significantly in second year, so a bridging course (at the Maths Learning Hub) or a good read of the 2nd year stats textbook over the Xmas break might be useful (I don’t know what the current textbook is, but any “Statistics for Psychology” book will be fine).

2. There is no avoiding mathematics in Psychology, or science, or social science – and avoiding it in society in general is hard too. As you begin writing your research report and citing research for the first time, you might be wondering just how arguments are solved when two researchers voice different opinions. At some point, you need to calculate probabilities and effect sizes and directly compare what they found. You cannot just avoid controversies at university and write: “They found different things and have a right to their opinions” as the conclusion to every research report.

3. You might assume that any statistics course will be sufficient to replace what you need for Psychology, but that is not the case.
A worst-case scenario is a student avoiding statistics lessons because they are doing statistics in mathematics. If you do want to do extra reading for statistics, you want to look for textbooks with titles like “Statistics for Psychology and Education”, or “Statistics for the behavioural sciences”. That should tell you that in Psychology we use statistics developed for our kind of data and situations. That is why in second and third year our statistics courses are internal/core psychology units of study (and they are pre-requisites for almost every other unit of study, emphasising point 2).

This lesson is a combination of a theoretical consideration of hypotheses and prediction (4A), and some fairly concrete information about descriptive statistics (4B). Before the lessons you should try each example to ensure you can calculate a mean, mode, median, range, variance, and standard deviation.

Descriptive statistics are used to summarise a collection of data. Rather than look at a field of numbers, we want to be able to grasp their gist with a single number, or a graph. I will show you how a complex maze of numbers can be simply described in a frequency distribution: a graph which shows how many scores of each kind were obtained. I also illustrate many of these concepts with statistics from the PSYC1001 course. Here’s the data from an online tutorial quiz from a few years ago. I will use this year’s data in the lesson.
The descriptive statistics we most often use either measure the central tendency or variability of scores. Measures of central tendency include:
• The mode, which is the most common score, the most frequent score. If you’re looking at a frequency distribution the mode is therefore the tallest bar, the top of the mountain or the highest frequency. The advantage of modes is that they are real scores that actually occur, but a disadvantage is that modes depend on how you group the data. For example, if all cancers are grouped together cancer is the most common way to die, but if separate cancers are considered, road fatalities have a higher rank.
• The median is the middle score, or, if there’s an even number of scores, it is the average of the two middle scores. To find a median you need to rank the scores in order and then count down to the halfway point. So the location of the median is (N+1)/2, where N is the number of scores. If you have 5 scores, the median will be the 3rd score. The key advantage medians have over means is that they are not disproportionately affected by extreme scores. Means remain the most efficient for estimating population means, which is why you rarely hear of modes or medians beyond an introductory lesson like this one.
• The mean or average is like the balance point of a distribution of scores. Imagine all the scores on a seesaw: the mean needs to be at the point where all those scores balance. You get a mean by adding up all the scores and dividing by the number of scores. That sentence can be expressed as an equation (but this equation is not assessed). The advantage of a mean, which makes it perfect for statistics, is that it is the best measure for estimating a population mean from a sample. However, means are usually not real scores that occurred and they can also be easily affected by extreme values. Examples are given to illustrate.

What is the MODE of these sets of scores?
Set 1: 4, 5, 5, 6, 6, 7, 7, 7, 9, 10 Mode = _________
Set 2: 4, 4, 5, 6, 7, 8, 8 Mode = _________

What is the MEDIAN of these sets of scores?
Set 1: 4, 5, 5, 6, 6, 7, 7, 7, 9, 10 Median = _________
Set 2: 4, 4, 5, 6, 7, 8, 8 Median = _________

What is the MEAN of these sets of scores?
Set 1: 4, 5, 5, 6, 6, 7, 7, 7, 9, 10 Mean = _________
Set 2: 10, 10, 11, 12, 13, 14, 14 Mean = _________

To understand variability it is easiest to consider frequency distributions. The way scores are spread out can tell you a lot about your data, so there are many ways to describe this. Your distribution might be symmetrical with the same number of scores above and below the mean, distributed the same on both sides. With raw data this is usually not the case, so we say distributions have a positive skew or negative skew.

I will show many examples of skewed data (and we’ll generate some ourselves in the workshop). To test your understanding, try to predict how certain data will be skewed: Age of students in PSYC1001? Number of sex partners across the lifetime? The shape of a distribution also affects how good a measure of central tendency the mean, mode and median are.

In terms of mean, median and mode, these distributions may not differ much at all, but scores in which of these two distributions below (red or black) vary more? The descriptive statistics which describe variability attempt to capture this overall spread or dispersion of scores as a single score.
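The point that two distributions can agree on every measure of central tendency and still differ in spread can be checked with a small Python sketch. The two sets of scores below are hypothetical (they are not the red and black distributions from the lesson), and the helper name `summarise` is made up for this illustration; `mean`, `median` and `mode` come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode


def summarise(scores):
    """Collect the three measures of central tendency plus the extreme scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": mode(scores),
        "lowest": min(scores),
        "highest": max(scores),
    }


# Hypothetical scores, chosen so the central tendency is identical in both sets.
tightly_clustered = [4, 5, 5, 5, 5, 5, 6]
widely_spread = [1, 3, 5, 5, 5, 7, 9]

for label, scores in [("clustered", tightly_clustered), ("spread", widely_spread)]:
    print(label, summarise(scores))
```

Both sets have a mean, median and mode of 5, yet the scores in the second set sit much further from that centre. That difference is exactly what a measure of variability has to capture.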
The simplest measure of variability is the range, which is the difference between the highest and lowest score. Although easy to calculate, the range does not capture how any of the other scores are dispersed. Finding just one number to summarise this seems simple – because surely you could take each score, calculate how much it varies from the mean (called a deviation score) and get the average of these. The problem is you cannot calculate an average deviation score around a mean. It’s just not possible when you consider that the mean is the balance point of a distribution – the deviations of all the scores above it and below it will always add up to zero when added together. To get around this we square each deviation score before adding them together. A negative number multiplied by a negative number results in a positive number (-2 x -2 = 4) so no longer do the deviations around the mean cancel each other out. The variance of a set of scores results when you take the sum of all those squared deviation scores and divide by the number of scores, but it’s not yet a completely meaningful number, because we had to square all the deviation scores before we added them. We then take the square root of the variance to get the standard deviation of a set of scores. The standard deviation is the closest we can get to the average degree to which each score in a distribution varies from the mean, and it is the most widely used measure of variability in statistics.

What is the RANGE of these sets of scores?
Set 1: 4, 5, 5, 6, 6, 7, 7, 7, 9, 10 Range = _________
Set 2: 4, 4, 5, 6, 7, 8, 8 Range = _________

Calculate the VARIANCE and STANDARD DEVIATION of these sets of scores.
(Use a calculator with a square (x²) and square root (√) function)
Set 1: 24, 20, 18, 18, 15, 22, 15, 17, 8, 23
Set 2: 3, 20, 11, 2, 22, 19, 14, 7, 26, 16
Set 3: 18, 19, 19, 19, 18, 19, 20, 19, 20, 19

Since each standard deviation calculation involves lots of steps, you absolutely need to use the “Standard Deviation Working out Sheets” which are available on Canvas with these notes. You can see me working through one of them in a video in the online module associated with this lesson if you really need help.

You might be wondering why such a manual calculation is needed, when you could just enter each number into your smartphone and have it do all the work for you. The reason I want you to do it manually at least a few times is so you appreciate what a standard deviation actually means. It’s going to be the most used measure in statistics from this point on. If it remains a mystery where it comes from, your conceptual understanding of it will be poor to non-existent. Each of these three examples has been crafted to demonstrate both high and low variance/SD. When you reach the end of your calculations, you can look back at your completed sheet and see where your standard deviation came from. And also:
• this method of calculating standard deviation by hand is assessable in the exam,
• your smartphone or calculator may give you a different number (Excel certainly does). If you’re wondering why, it may be because it is dividing by (n-1) instead of (n), because it assumes you want to estimate the population standard deviation.

We are getting very close by this stage to marrying together the statistics and our conceptual understanding of research design.
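The hand calculation described above (deviation scores, squaring, averaging, square root), and the reason a calculator may disagree with it, can be sketched in a few lines of Python. This is only an illustration of the steps, not a replacement for the working out sheets. The scores are hypothetical and deliberately not one of the three exercise sets, and the function names `deviation_table` and `variance_and_sd` are invented here.

```python
import math


def deviation_table(scores):
    """Steps 1-3 of the hand method: mean, deviation scores, squared deviations."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [score - mean for score in scores]
    squared_deviations = [d ** 2 for d in deviations]
    return mean, deviations, squared_deviations


def variance_and_sd(scores, divide_by_n_minus_1=False):
    """Steps 4-5: average the squared deviations, then take the square root.

    With divide_by_n_minus_1=True the divisor becomes (n - 1), which is the
    version many calculators and spreadsheets report by default.
    """
    _, _, squared_deviations = deviation_table(scores)
    divisor = len(scores) - 1 if divide_by_n_minus_1 else len(scores)
    variance = sum(squared_deviations) / divisor
    return variance, math.sqrt(variance)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

mean, deviations, squared = deviation_table(scores)
print("mean:", mean)
print("sum of deviations:", sum(deviations))  # the deviations cancel out
print("sum of squared deviations:", sum(squared))

variance_n, sd_n = variance_and_sd(scores)
variance_n1, sd_n1 = variance_and_sd(scores, divide_by_n_minus_1=True)
print("dividing by n:    ", variance_n, sd_n)
print("dividing by n - 1:", variance_n1, sd_n1)
```

For these scores the by-hand method (dividing by n) gives a variance of 4 and a standard deviation of 2, while dividing by (n-1) gives a slightly larger standard deviation of about 2.14. Neither number is a mistake; they answer slightly different questions.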
However, I still need to explain how we bring our research to the point where we can test things with statistics. You might be familiar with the idea of an experimental hypothesis. This is a prediction of what will happen in your study. Arriving at an experimental hypothesis is a lot more involved than simply making something up or guessing. An experimental hypothesis should be derived either from the previous literature and findings, or from a theory which (if it’s a good scientific theory) allows for specific predictions to be made. Much is made of deriving hypotheses when we mark the PSYC1 research reports, because most of the time students write long literature reviews, but at the end act like they are tossing a coin when formulating hypotheses. You want to be able say which kind of result will support your theory, and also which kind of result will disprove your theory. If you have not even thought about which kind of result will disprove your theory, your experiment may not be designed to be able to produce such a result, in which case it is not a true test of the hypothesis and you are wasting your time1. If someone is motivated to find meaning they will cherry pick something and take it as proof. The issue at the centre of these problems is confirmation bias, the tendency humans have to only ever look for evidence which confirms what they believe. An extended example on Cognitive Dissonance is given to illustrate what happens in research to try to minimise this issue. While most people think of an experimental hypothesis when you mention “hypothesis”, there is another kind which is much less intuitive but relates to issues of confirmation and falsifiability. The null hypothesis is a construct we use in inferential statistics to help us overcome all our natural tendencies to avoid testing what we currently believe. It arises because it is far easier to find evidence to disprove something than it is to prove something. 
And because we really want to prove our experimental hypothesis, the null hypothesis is set up to fail. The null hypothesis is also very specific: we can usually accurately define what a non-effect looks like. The null hypothesis is a statement of what would be the case if nothing is happening, if the theory is wrong, if there is no effect, if there is no difference between groups, etc. All your research efforts and design skills and statistical calculations are then focused on proving that the null hypothesis is wrong. If you can disprove the null hypothesis of no effect, then you can reject it, and say that you have found something. If you cannot find evidence inconsistent with the null hypothesis though, you must retain it. The cognitive dissonance example is used to illustrate this.

¹ This is also true in argumentation. At this stage of this lecture series you hopefully have developed enough superpowers to begin challenging those around you who make silly claims. But don’t ever get caught in an unproductive argument. Before you even begin arguing about evidence ask: “Explain to me what kind of evidence I would be required to produce to get you to change your mind.” If they look stunned or confused – end the conversation, they clearly have NEVER considered an alternative explanation to their own, why would you expect them to consider yours? Or, if they simply cannot produce anything, it may be that the issue itself cannot be resolved with evidence, their claim cannot be disproven, in which case it was a pointless argument. A good scientist should always be able to tell you what kind of evidence will overturn a theory – because that’s what good scientists look for!
Note critically that retaining a null hypothesis is very different to accepting it. Examples are given to illustrate this important distinction: the hamburger study described in the previous lesson, the legal system’s presumption of innocence and even a dating example. Since retention (rather than acceptance) implies you have no loyalty or even respect for a theory, this attitude cuts to the heart of science (i.e., Lesson 1), that scientists will change their minds in a second if the right evidence comes along.

Of course, how do you go about mathematically retaining or rejecting null hypotheses? You cannot just do it by looking at means. There is a formal process which integrates everything we have learnt so far and is discussed in the next lesson.

Recommended reading for this lesson:
* Reading list: Book chapter: Research Methods in Psychology + supplement
* Reading list: Book chapter: Chapter 2 The imperfect mind, pp.22-24; from The Rubber Brain by Sue Morris and others.

Recommended media: “Media Watch – Australians are the worst gamblers” ABC 2009 – now available within the online module.

Science and Statistics Lesson 5 Outline
Variability and Inferential Statistics

Learning outcomes
By the end of this lesson, you should be able to: a. Understand and be able to interpret a ‘frequency distribution’ in its many forms b. Appreciate the role of variability in research and inferential statistics c. Be able to list the features of a ‘sampling distribution’ and understand how it differs from a distribution of raw scores d. Be able to define the ‘normal curve’ and appreciate the way in which all ‘normal curves’ are the ‘same shape’ e.
Understand how a sampling distribution is used to calculate a p-value, and be able to recognise and interpret p-values in real research f. Consider the probabilistic nature of all scientific conclusions

If you had not completed your manual calculation of a standard deviation before the previous lesson, do complete it before this one, because I wasn’t just asking you to do that to fill time, but to help you understand what a standard deviation is. There’s a large and very important conceptual leap in this lesson which requires a keen understanding of variability.

In this lesson and the next, there is a heavy reliance on graphs of distributions. Don’t go into these lessons not knowing what these mean. You need to understand that they are frequency distributions – each score is on the horizontal axis, and how often the score occurred is the height of the line above that axis. Most of them have a big bump in the middle, that’s because the most common scores are in the middle (the top of the big bump is the mode). Most of them are low at each end (sometimes called ‘tails’), that’s because scores are far less likely far away from the mean.

When the graphs become conceptual (when raw distributions become sampling distributions), the height of the line above the axis tells you the likelihood or probability of obtaining a sample mean of that value – it’s obviously more likely in the middle where the big bump is, and increasingly less likely as you move towards the tails. And at a certain point you decide that it is just too unlikely that the sample mean you obtained could have come from a sampling distribution with the hypothesised (null) mean.
I have just described inferential statistics, but you will need to understand a lot more before those previous few sentences make complete sense.

The reason why we desperately need to understand statistics in Psychology is because human behaviour varies. There’s a lot of noise in almost any psychology experiment caused by inconsistent behaviour, inexact measurement, and constructs which may not be the best way to consider a problem. The challenge is to develop a method to find patterns in that noise, so that we do not build theories on random occurrences.

The way inferential statistics works is we take a sample from a population (because it’s almost always totally impractical to measure an entire population) and we run our study on that sample. When we obtain differences/findings/effects in that sample, we have to ask an important question: Can we infer from the differences/findings/effects in our sample, that there are differences/findings/effects in the population? And the reason why we have to ask that question is because sampling variability alone can cause effects in a sample. And that is why we need to understand variability.

The ultimate objective of inferential statistics is to be able to say something like: “On the basis of what we have observed in our sample, we can make this conclusion about the population…”

When you are trying to establish that an effect is real, you can no longer rely just on means, you need more information about variability. If drawing inferences from samples about populations is the aim of inferential statistics, then you need to understand the differences between them.

The next step in trying to understand how things work is the biggest by far. It relies on your understanding of everything I have explained so far about distributions of raw scores. Once you have a thorough understanding of how that works, you are ready for the leap.

A distribution of raw scores is based on a real set of data, and I’ve tried to help you understand this is a frequency distribution of actual raw scores (X) by stacking them up in the picture above. So each point on the x-axis (horizontal) represents a raw score value, and the height of the line (or the column of x’s you can actually see here) represents how frequently the score occurred. From what we already know from descriptive statistics, the standard deviation of these raw scores is called: standard deviation. And from the many examples I’ve shown you, you can see that while the shape of a distribution of raw scores can be normal, it is most often skewed or irregular.

LARGE CONCEPTUAL LEAP AHEAD

I want you to imagine that each time a study is run, lots of raw scores are obtained, and just one mean is taken from them. So for an entire study, one mean is produced (M). It’s important to note that the study did not involve an entire population, so a sample was taken from the population, and that sample had a particular size.

Now I want you to imagine the same study is repeated, with the same sample size, but a different sample. Lots of raw scores are obtained, and just one mean is taken from them. So for that entire study, one mean is produced (M). That second mean will differ from the first because it comes from a different sample from the population.

Now I want you to imagine that the same study, of a particular sample size, is repeated a million times. Each time the entire study is repeated, a single sample mean (M) is produced.
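The thought experiment of repeating the same study many times and keeping one mean per study can be sketched as a small simulation. Everything specific in it is an assumption chosen for illustration: the population is an invented, positively skewed set of 100,000 scores, each "study" samples 25 of them, and the study is repeated 5,000 times rather than a million so that it runs quickly.

```python
import random
import statistics

random.seed(2024)  # fixed seed so the illustration is repeatable

# Hypothetical population: 100,000 positively skewed raw scores (mean near 20).
population = [random.expovariate(1 / 20) for _ in range(100_000)]

SAMPLE_SIZE = 25    # every repeated study uses the same sample size
N_STUDIES = 5_000   # stand-in for "a million" repetitions


def run_study(population, sample_size):
    """Draw one sample from the population and return its single mean (M)."""
    sample = random.sample(population, sample_size)
    return statistics.fmean(sample)


sample_means = [run_study(population, SAMPLE_SIZE) for _ in range(N_STUDIES)]

population_mean = statistics.fmean(population)
population_sd = statistics.pstdev(population)
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(f"Population mean {population_mean:.2f}, SD of raw scores {population_sd:.2f}")
print(f"Mean of the sample means {mean_of_means:.2f}, SD of the sample means {sd_of_means:.2f}")

# Crude frequency distribution of the sample means: one 'M' per 25 studies.
bin_width = 2
counts = {}
for m in sample_means:
    bin_start = int(m // bin_width) * bin_width
    counts[bin_start] = counts.get(bin_start, 0) + 1
for bin_start in sorted(counts):
    print(f"{bin_start:>3}-{bin_start + bin_width:<3} {'M' * (counts[bin_start] // 25)}")
```

Even though the raw scores in this invented population are heavily skewed, the printed column of M's has a single bump near the population mean, and the means are much less spread out than the raw scores.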
Imagine a frequency distribution of those means. That’s called a sampling distribution.

A sampling distribution is a hypothetical distribution based on a hypothetical set of sample means. Each point on the x-axis represents a sample mean value, and the height of the line (or the column of M’s you can see in this image) represents how frequently each sample mean of a particular value is expected to occur. The standard deviation of these means is called: standard error. And the shape of a distribution of a million hypothetical sample means tends to be normal regardless of the distribution of the raw scores.

Notice how a normal distribution results in a big bump in the middle. The hypothetical sample means cluster around the middle, which is the population mean. You can still obtain sample means which diverge greatly from the population mean, but the low height of the line far from the population mean tells you that such sample means are far less likely.

Imagine that no interesting effects actually exist in the population, and your null hypothesis is true. If you are drawing different samples each time your results will still vary, and you will still obtain a sampling distribution. Now of course it is unlikely that you will obtain a sample mean which diverges greatly from the null population mean, but it is still possible. What inferential statistics aims to do is to calculate the probability of obtaining the result/sample mean you did, if the null is true. If that probability is high, then most likely nothing interesting is happening and you obtained the result by chance.
If the probability is very low, then you may have found something…

You might be wondering how we can estimate the probability of obtaining a particular sample mean. We do this by counting on a property of the sampling distribution called normality. Normal distributions are all the same shape in the sense that they are symmetrical, unimodal, and have a particular spread of scores; for example, approximately two thirds of all scores in a normal distribution fall within one standard deviation of the mean. Not all sampling distributions are normal (and there are tests for normality you will learn about in 2nd year), but if we assume that they are, then we can answer the question “How likely is it that we would obtain a sample mean of this value if our null is true?” in a mathematical manner.

By this stage you might be feeling a bit exhausted with all the mathematics and concepts, so in the lesson I attempt to demonstrate that everyone does understand inferential statistics quite well with the coin toss example. It’s a great exercise to do in a large lesson hall (but in 2022 I’ll try it at the end of the first workshop) to demonstrate how everyone has a decision point: a point at which they stop saying a difference is due to chance and start saying a real effect is present.

In Psychology, we usually decide something interesting is happening when there is less than a 5% chance that we obtained the result we did by chance. The mechanics of working this out usually involve calculating a distance from the hypothesised null mean in standard error units.
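The idea of a "distance from the hypothesised null mean in standard error units" can be made concrete with a short sketch. The numbers below (a null mean of 100 and a standard error of 2) are invented for illustration, the function names are hypothetical, and the probability calculation assumes a normal sampling distribution with a known standard error; it is one simple case, not the general procedure used by statistical software.

```python
import math


def distance_in_standard_errors(sample_mean, null_mean, standard_error):
    """How many standard errors the sample mean lies from the null mean."""
    return (sample_mean - null_mean) / standard_error


def two_tailed_probability(distance):
    """Probability, under a normal sampling distribution, of a sample mean
    at least this many standard errors from the null mean in either direction."""
    return math.erfc(abs(distance) / math.sqrt(2))


# Hypothetical values, chosen only to illustrate the calculation.
null_mean = 100.0
standard_error = 2.0

for sample_mean in (101.0, 103.0, 104.5, 107.0):
    distance = distance_in_standard_errors(sample_mean, null_mean, standard_error)
    probability = two_tailed_probability(distance)
    decision = "reject the null" if probability < 0.05 else "retain the null"
    print(f"M = {sample_mean}: {distance:.2f} standard errors from the null mean, "
          f"probability = {probability:.4f} -> {decision}")
```

The further the sample mean sits from the null mean in standard error units, the smaller the probability becomes, and somewhere around two standard errors it drops below the conventional 5% decision point.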
t, z, F, χ²: all these statistics are measures of that distance on various sampling distributions, and the bigger those numbers, the less likely it is that the sample mean was actually sampled from a population where nothing was happening. For convenience, the actual probability of obtaining a sample value of a particular magnitude is expressed as a p-value.

P-values vary between 0 and 1 and express a probability. A p-value is not strictly speaking the probability that we obtained the sample mean we did by chance alone when the null is true (strictly, it is the probability of obtaining a result at least as extreme as ours if the null is true), but that’s a good enough approximation for this course. If you understand that, then you’ll understand that researchers want to obtain low p-values, which suggest their null hypothesis is false and they have actually found something.

In Psychology if the p-value is less than .05 we would normally reject the null hypothesis. If the p-value is greater than .05 we would normally retain the null hypothesis. However, .05 does not have to be the decision point. If it was 0.01 then your decision making would be more conservative (if you’re re-reading this after the lesson, think of the students – usually males – who leave their hands up right to the end of the coin toss exercise). And if it was 0.1 then your decision making would be more liberal (the students who put their hands down first).

Calculating a p-value is beyond the scope of this course, and even in later years you will usually use software to do it. However, do understand that the reason why we can calculate it is because we make assumptions about the sampling distribution and its normal shape.

Understanding the process and what a p-value is suddenly opens up the results section of every research report to you. You will notice when a p-value of a test is larger than .05 phrases like “did not reach significance” are used, and any sample mean difference associated with the test will never be mentioned again. In contrast when you see p-values less than .05 phrases like “statistical significance” are used and the result can then be freely discussed from that point on. In the lesson I will examine the results section of the paper your research report assignment is based on to demonstrate how a previously intimidating results section can now be understood.

At this point you might be a bit shocked to learn we are not making conclusions based on “solid”, “certain” findings. Every decision seems to be probabilistic in nature. Science does not deal with certainties, however. Any decision-making process cannot afford to wait until every understanding is perfect – that level of conservatism will prevent you from doing anything, ever. At the same time though, a decision-making process cannot afford to be influenced by tiny differences which could be due to chance – that level of liberalism would lead to rapid changes of direction driven mostly by random findings. How do we find the balance? The answer at the end of Lesson 5 is: .05.

But then you might also wonder: “if we reject the null hypothesis when there is less than a 5% chance nothing is happening, won’t we be wrong 5% of the time?”

Recommended reading for this lesson:
* Reading list: Book chapter: Research Methods in Psychology + supplement

Useful links: http://en.wikipedia.org/wiki/Normal_distribution

Science and Statistics Lesson 6 Outline
Practical Significance and Statistical Power

Learning outcomes
By the end of this lesson, you should be able to: a.
Define and distinguish between ‘statistical significance’ and ‘practical significance’ b. Have a more sophisticated understanding of being liberal (open-minded) or conservative (closed minded) by using concepts such as ‘Type 1 error’, ‘Type 2 error’, ‘hits’, ‘misses’, ‘false alarms’ and ‘correct rejections’. c. Understand ‘statistical power’ and the features of a research study which affect it d. Fully grasp the probabilistic nature of all scientific conclusions and consider how this leaves it vulnerable to the logical errors: ‘Appeal to ignorance’, ‘False Dichotomy’, and ‘Denialism’.

In this final lesson all the previous statistical and conceptual ideas are brought together. It is all about finding the right balance. Never again should you characterise science as ‘closed minded’ or ‘liberal’, and hopefully you will better appreciate what exactly science is doing when we draw conclusions using it.

The workshop begins with an example illustrating which kinds of questions are and are not scientific. Now you understand the probabilistic nature of decisions, you will be able to see that even questions dealing with mysteries can still be investigated.

Returning to the specific nature of that decision making, you should still be wondering why our critical cut-off convention (sometimes called a “decision rule”) is to reject null hypotheses when there is less than a 5% chance our result is due to chance, when that implies we will make an error 5% of the time. Why don’t we always make the critical cut-off 0.01 or 0.00001 or just make it 0.0000000? The answer is: there is more than one kind of error.

The error we have been deciding on, the probability of rejecting the null hypothesis if it is true, is called a Type 1 error or α. The error that inevitably blows out (which we can neither set nor accurately determine) is the probability of retaining a null hypothesis when it is false, called a Type 2 error or β. You can see in the slides that as you reduce one, the other grows.

Statistical power is simply 1-β: it is the probability of correctly rejecting a null hypothesis which is false. More power is usually desirable (too much can be a problem though, discussed later). Think of power as the ‘sensitivity’ of an experiment or study, that is, the likelihood it will detect a real effect. We can never know power, but it can be estimated. Factors which affect power include:
• The variability of the effect you are looking for (more variability = lower power)
• The size of the effect you are looking for (larger effect = more power)
• Sample size (larger sample size = more power)

Because sample size is the one factor we have most control over, usually it is what we use to control the power of a study.
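The three factors listed above can be explored with a rough simulation, which is one way power is estimated in practice when it cannot be known exactly. This is a sketch under simplifying assumptions that are not part of the lesson: scores are normally distributed, the population standard deviation is treated as known (so the test is a simple z-test), and all names and numbers (a null mean of 100, a true mean of 105, and so on) are hypothetical.

```python
import math
import random


def two_tailed_probability(distance):
    """Two-tailed probability for a distance measured in standard errors."""
    return math.erfc(abs(distance) / math.sqrt(2))


def estimate_power(true_mean, null_mean, sd, n, alpha=0.05, n_studies=2000, seed=0):
    """Proportion of simulated studies that reject the null hypothesis.

    When true_mean differs from null_mean this estimates power (1 - beta);
    when they are equal it estimates the Type 1 error rate (alpha).
    """
    rng = random.Random(seed)
    standard_error = sd / math.sqrt(n)
    rejections = 0
    for _ in range(n_studies):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        distance = (sample_mean - null_mean) / standard_error
        if two_tailed_probability(distance) < alpha:
            rejections += 1
    return rejections / n_studies


# Hypothetical scenario: the null says the mean is 100, but it is really 105.
for n in (10, 40, 160):
    print(f"n = {n:>3}: estimated power = {estimate_power(105, 100, sd=15, n=n):.2f}")

# More variability lowers power; a larger effect raises it.
print(f"sd = 30, n = 40: {estimate_power(105, 100, sd=30, n=40):.2f}")
print(f"true mean 110, n = 40: {estimate_power(110, 100, sd=15, n=40):.2f}")

# If the null is actually true, roughly alpha of the studies still reject it.
print(f"null true, n = 40: {estimate_power(100, 100, sd=15, n=40):.2f}")
```

The last line is a reminder of the Type 1 error rate: even when nothing is happening, about 5% of the simulated studies reject the null at the .05 decision point.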
However, it’s not always easy to choose the right amount of power. Too little power caused by too few participants means that the study might not be sensitive enough to detect an effect that might be of importance. But too much power caused by too many participants might make the study too sensitive, which means tiny effects of no practical significance might be found.

Statistical significance is usually proclaimed when the p-value < .05. We use it to say that the result is unlikely to be due to chance. But note that the p-value only indicates the reliability of an effect – it has nothing to say about how large the effect is.

Practical significance is determined by the size of the effect and its application. Who cares if the effect is reliable, is it useful? Does it matter? Is it worth spending a lot of money on? Examples to illustrate this idea: the “spider example” from the Stats tutorial; the height and IQ example explained in Minium and in the lesson; the HRT, vaccination and immigration examples in the lesson.

This lesson series ends with a discussion of science and uncertainty. Ever since the first lesson the certainty of science has been undermined as a myth, but now you have seen how inferential statistics works, you can appreciate that science does indeed operate via probability-based decisions, and simply does not function by claiming or focusing on certainty. Yet many scientists are often arrogantly certain, and act as if they are revealing innate truths: Dr Ross Geller is a good example – his argument with someone called Phoebe Buffay (Series 2: The One Where Heckles Dies) is shown in this lesson and demonstrates his hilarious ignorance on this topic. If you’re wondering why science operates in this manner, think back to Type 2 errors and the problem with being too conservative: we need to make decisions and make progress.
Rather than expressing certainty, the most science can do is express the boundaries of uncertainty (you will learn about confidence intervals in second year). And yet this probabilistic way of doing things, and the humility that comes with expressing uncertainty, leaves science vulnerable to the appeal to ignorance. Sometimes called the “argument from ignorance”, this logical error or strategy involves thrusting an assertion into a region of uncertainty.

When good scientists don’t know something, they do not know it – they go to bed not knowing it – and if anything that’s their motivation for going to work the next day. But that constant state of not quite knowing the exact truth is not for everyone. Lots of people demand certainty, and scientific humility about what is known and not known is derided by them as ignorance. It looks like this:

“If you cannot explain this (perfectly)… then my explanation must be right.”

Unpacking to show why it is an error…

“If you cannot fully support your theory with all the evidence in the world… then my explanation, for which I offer no evidence, must be right.”

When you see phrases like: “Scientists are baffled”, or “The science is not settled,” that is the appeal to ignorance. You can see that one of its main characteristics is that one possible explanation is scrutinised and attacked, while the other is not even examined, and is just slipped in as what must be true.

If there were only two possible answers, then attacking one would constitute evidence for the other. If you know Jack is in the bedroom or the bathroom, and you check the bedroom, then he must be in the bathroom. That’s a very closed system though. Because most questions in science have far more
than two possible answers, when a person asserts that there are only two, they are guilty of creating a false dichotomy. This is easily spotted in statements like: “Let’s debate both sides”, “Let’s have a fair and equal debate.” What they are asking for is:
• A situation where they want to be able to attack the other side, and not feel they need to establish their own case, and
• A situation where the actual weight of evidence is not going to be considered. “Fair” is used to over-represent the unsupported side.

John Oliver makes a mockery of this idea: https://www.youtube.com/watch?v=cjuGCJJUGsg

A more modern term for these kinds of errors is denialism, which implies a broader rejection of a series of claims or an entire body of evidence and theories. Examples include disease denial, vaccine denial, evolution denial, Holocaust denial, September 11th denial, climate change denial etc. Despite these highly diverse examples, the pattern is the same: there’s some imperfection in the current explanation we have (as there always will be with any kind of knowledge), and all kinds of nonsense is poked into that gap. I have even seen Heisenberg’s uncertainty principle (which is about electrons) used to deny all knowledge. It does something to our understanding of knowledge, that’s for sure, but having some uncertainty does not mean we know nothing.

Hopefully this lecture series gives you a good sense of what science is and is not, and allows you both to participate in science, but also to understand it well enough to help those around you see how it works.

Recommended reading for this lesson:
* Reading list: Book Chapter Power and Measure of effect size. NB: Power is a hard concept to understand, so this chapter may be very useful. You do not need to know how to calculate power, but understanding what it is and the factors which affect it is important.
Recommended media: “The Ascent of Man” Jacob Bronowski; Knowledge or Certainty. This is an excellent BBC documentary, some of which is available online. Paste the above into YouTube for some excerpts.

Useful links: http://en.wikipedia.org/wiki/Normal_distribution
