Thursday, December 4, 2014

Language Assessment: Topic 4: Basic Principles of Assessment


  1. What are the measures that a teacher can take to ensure high validity of language assessment in the primary ESL classroom? List the measures here, drawing on your experience as a teacher in a primary ESL classroom.

    1. Teachers and educators should take into account each pupil's reading ability, which certainly has an impact on the validity of an assessment. For example, if a pupil has a hard time reading and comprehending what a question is asking, a test or task will not be an accurate assessment of what the pupil truly knows and understands about the subject or topic learned. Teachers should therefore ensure that an assessment is at the correct reading level for the pupils. Pupils' self-efficacy also has an impact on the validity of an assessment. If pupils have low self-efficacy, or doubts about their abilities in the particular area being tested, they will typically perform poorly; their own doubts hinder their ability to accurately demonstrate knowledge and comprehension.
      Another measure that should be taken into account to ensure high validity of a language assessment is that a test must reflect what the teacher wants the class to learn; this is usually a judgmental decision. The teacher should also consider the possibility of having pupils learn from their weaknesses. A good test locates the exact areas of difficulty experienced by individual pupils, so that assistance in the form of additional practice and corrective exercises can be given in the near future. The test should provide an opportunity for the pupils to perform at their best in particular language tasks and to display their development of the language content.

  2. The ppt slides are obtained from

  3. Based on my experience, there are many ways of assessing validity, but only a few measures can ensure that it is high.

    Reliability of a test is the best measure to ensure high validity. Reliability of a test is an estimate of the consistency of its marks; a reliable test is one where, for example, a student will get the same mark if he or she takes the test, possibly with a different examiner, on a Monday morning or a Tuesday afternoon. A test must be reliable, as a test cannot be valid unless it is reliable.
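The consistency described above can be sketched quantitatively. The marks below are a made-up illustration (not from the original comment), and Pearson correlation is just one common way to estimate test-retest reliability:

```python
# Hypothetical marks for the same five pupils on two administrations
# of the same test (e.g. Monday morning vs Tuesday afternoon).
monday = [62, 75, 48, 81, 70]
tuesday = [60, 78, 50, 79, 72]

def pearson(x, y):
    """Pearson correlation: a common estimate of test-retest reliability."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(round(pearson(monday, tuesday), 2))  # ≈ 0.98: marks are highly consistent
```

A value near 1 suggests the two sittings rank and score the pupils almost identically; a low value would mean the mark depends heavily on the day or the examiner.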

    The next measure is to ensure the content validity of a test. In schools, teachers usually send the test to subject specialists, who compare the test items with the test specifications to see whether the items are actually testing what they are supposed to test, and whether they are testing what the designers say they are. In the case of a classroom quiz, of course, there will be no test specifications, and the deviser of the quiz may simply need to check the teaching syllabus or the course textbook to see whether each item is appropriate for that quiz.

    Other than that, construct validity is another measure that needs to be considered by the teacher. If a test is supposed to be testing the construct of listening, it should indeed be testing listening, rather than reading, writing and/or memory. To assess construct validity, teachers can use a combination of internal and external, quantitative and qualitative methods. An example of a qualitative validation technique would be for the test constructors to ask test-takers to introspect while they take a test, and to say what they are doing as they do it, so that the test constructors can learn what the test items are testing, as well as whether the instructions are clear, and so on. Construct validation also relates to the test method, so it is often felt that the test should follow current pedagogical theories. If the current theory of language teaching emphasises a communicative approach, for example, a test containing only out-of-context, single-sentence, multiple-choice items, which test only one linguistic point at a time, is unlikely to be considered to have construct validity.

    Ganesan Veerappan
    PPG Tesl
    Sem 6

  4. There is an important relationship between reliability and validity. An assessment that has very low reliability will also have low validity; clearly a measurement with very poor accuracy or consistency is unlikely to be fit for its purpose. But, by the same token, the things required to achieve a very high degree of reliability can impact negatively on validity. For example, consistency in assessment conditions leads to greater reliability because it reduces 'noise' (variability) in the results. On the other hand, one of the things that can improve validity is flexibility in assessment tasks and conditions. Such flexibility allows assessment to be set appropriate to the learning context and to be made relevant to particular groups of students. Insisting on highly consistent assessment conditions to attain high reliability will result in little flexibility, and might therefore limit validity.

  5. To ensure that a language assessment has high validity, I always make sure that the test is within my pupils' level of skill. Pupils must know which of the skills they have learnt will be tested by their teacher. For me this is the most important part.

    Teachers should always look back at the materials they used in designing a test: the textbook, multimedia resources, or even a piece of paper that has been used in the classroom. The syllabus also needs to be looked at, so that the assessment does not go beyond the pupils’ level.

    One way to check the validity of any language assessment is to assess the pupils before the real test, giving them some instruction beforehand. One sign of a test without high validity is that the pupils being tested ask too many questions about the assessment.

    Mohd Nizam Mohamed
    PPG TESL SEM6 2015

  6. Validity is commonly referred to as the extent to which a test measures
    what it claims to measure. To maintain a high-validity assessment, I make sure that the assessment tests what the pupils have learnt. The assessment must also be suitable for both strong and weak pupils.
    Other than that, the teacher should set clear goals to ensure that the assessment tests only what he wants to test.

  7. A valid assessment is one that measures what it is intended to measure. For example, it would not be valid to assess driving skills through a written test alone. A more valid way of assessing driving skills would be through a combination of tests that help determine what a driver knows, such as through a written test of driving knowledge, and what a driver is able to do, such as through a performance assessment of actual driving.
    Teachers frequently complain that some examinations do not properly assess the syllabus upon which they are based; in doing so they are, effectively, questioning the validity of the exam. The validity of an assessment is generally gauged through examination of evidence in the following categories: content, criterion, construct and face validity.
    A good assessment has both validity and reliability, plus the other quality attributes noted above for a specific context and purpose. In practice in the classroom, an assessment is rarely totally valid or totally reliable. A ruler which is marked wrongly will always give the same (wrong) measurements. It is very reliable, but not very valid. Asking random individuals to tell the time without looking at a clock or watch is sometimes used as an example of an assessment which is valid, but not reliable. The answers will vary between individuals, but the average answer is probably close to the actual time.
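The ruler and clock examples above can be made concrete with a small numeric sketch. The readings below are invented for illustration; the point is only the contrast between spread (unreliability) and bias (invalidity):

```python
import statistics

true_time = 12.0  # the actual time, in hours

# A wrongly marked ruler: every reading is off by the same amount.
# Zero spread (highly reliable) but a constant bias (not valid).
broken_ruler = [12.5, 12.5, 12.5, 12.5, 12.5]

# Random individuals guessing the time: answers vary widely,
# but their average is close to the truth (valid but not reliable).
guesses = [11.0, 13.5, 12.5, 10.5, 13.0]

print(statistics.pstdev(broken_ruler), statistics.mean(broken_ruler) - true_time)
# → 0.0 0.5   (no spread, but consistently wrong by half an hour)
print(round(statistics.pstdev(guesses), 2),
      round(statistics.mean(guesses) - true_time, 2))
# → 1.16 0.1  (large spread, but the average is nearly right)
```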
    In many fields, such as medical research, educational testing, and psychology, there will often be a trade-off between reliability and validity. A history test written for high validity will have many essay and fill-in-the-blank questions. It will be a good measure of mastery of the subject, but difficult to score completely accurately. A history test written for high reliability will be entirely multiple choice. It is not as good at measuring knowledge of history, but can easily be scored with great precision. We may generalize from this: the more reliable our estimate is of what we purport to measure, the less certain we are that we are actually measuring that aspect of attainment. It is also important to note that there are at least thirteen sources of invalidity, which could be estimated for individual students in test situations, though in practice they never are. Perhaps this is because their social purpose demands the absence of any error, and validity errors are usually so high that they would destabilize the whole assessment industry.
    It is well to distinguish between "subject-matter" validity and "predictive" validity. The former, used widely in education, predicts the score a student would get on a similar test but with different questions. The latter, used widely in the workplace, predicts performance. Thus, a subject-matter-valid test of knowledge of driving rules is appropriate while a predictively valid test would assess whether the potential driver could follow those rules.

    Uma Mageswari D/O Balakrishnan
    Sem 6

  8. This comment has been removed by the author.

  9. In order to ensure my test has high validity for the pupils, I always refer to the pupils’ achievement at that particular time, as this indicates the validity of the test. Flexibility in each test or assessment can also improve validity. I make sure that I base the test only on the syllabus and on what the pupils have learnt, in order to achieve the goal.

  10. Reliability is validity's cup of tea.
    The reliability of an assessment tool is the extent to which it measures learning consistently and accurately. The validity of an assessment tool is the extent to which it measures what it was designed to measure.
    It is fairly obvious that a valid assessment should have good coverage of the criteria (concepts, skills and knowledge) relevant to the purpose of the examination. The important notion here is the purpose.
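One widely used statistic for the consistency of an assessment tool (not named in the comment above, but a standard choice) is Cronbach's alpha, which asks how consistently the items of a single test measure the same thing. The marks are hypothetical:

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha: internal-consistency reliability.
    item_scores: one list of pupils' marks per test item."""
    k = len(item_scores)            # number of items
    n = len(item_scores[0])         # number of pupils

    def pvar(xs):                   # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Each pupil's total score across all items.
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    item_var = sum(pvar(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / pvar(totals))

# Hypothetical marks of four pupils on three quiz items.
items = [
    [2, 4, 3, 5],
    [3, 5, 3, 4],
    [2, 5, 4, 5],
]
print(round(cronbach_alpha(items), 2))  # ≈ 0.91: items measure consistently
```

Values above roughly 0.7 are conventionally taken to indicate acceptable consistency, though the threshold depends on the stakes of the assessment.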

  11. What to test. How to do it. Whether to test at all. Why the assessment is being made. What it should contain. The consequences for teaching, learning and administration. The quality of the proposed test material.
    The characteristics of a good test are:
    • Validity - it should measure what it is intended to measure and nothing else.
    • Reliability (a test cannot be valid unless it is reliable): if administered a second time, a reliable test would produce the same order of merit when neither learning nor teaching has intervened.
    • Discrimination: Decide first whether the primary purpose is to discriminate between testees. School exams are generally designed to discriminate as widely as possible among the testees.
    • Backwash: the effects of the test on learning and teaching. Does it have a good influence on the learning and teaching that take place before the test?

  12. The measures that a teacher can take to ensure high validity of language assessment

    1. the level of the pupils that the teacher wants to test
    2. the practicality issues of a useful assessment of language ability
    3. the objectivity of the test, which refers to the consistency with which teachers mark the answer scripts
    4. the washback effect, which refers to the impact that tests have on teaching and learning
    5. authenticity: whenever possible, the teacher should attempt to use authentic materials in testing language skills
    6. interpretability, which encompasses all the ways that meaning is assigned to the scores

  13. Validity refers to the degree to which assessment scores can be interpreted as a meaningful indicator of the construct of interest. A valid interpretation of assessment results is possible when the target construct is the dominant factor affecting a test-taker's performance on an assessment. To ensure high validity of language assessment for the primary ESL classroom, teachers should set questions based on the lessons covered, for example when preparing the end-of-year examination. There are several different ways to investigate validity, depending on the score interpretations and inferences that an assessment seeks to support.
    First, content validity refers to the extent to which questions and tasks in an assessment represent all important aspects of the target construct.
    Second, construct validity refers to the extent to which inferences can be made about the target construct based on test performance.
    Third, concurrent validity refers to the relationship between test scores from an assessment and an independent criterion that is believed to assess the same construct.
    Finally, predictive validity refers to the extent to which performance on an assessment can predict a test-taker's future performance on an outcome of interest.

    Maria binti Zainal
    PPG Tesl
    Semester 6

  14. The measures that the teacher needs to consider in ensuring high validity are:
    the pupils' ability to understand the topic;
    the level of interest that the teacher wants to test;
    the objectives of the test;
    the practicality of the test for the pupils.