Types of tests used in English Language Teaching Bachelor Paper

additional practice.

Turning to Hicks, we can present some of the useful and practical ideas
she proposes for teachers to use in the classroom. In order to combine
evaluation with assessment, she suggests involving the students directly
in the process of testing. Before testing vocabulary, the teacher can ask
the students to guess what kinds of activities could be used in the test.
The author of the paper believes that this will give them an opportunity
to envisage how they are going to be tested and to know what to expect
and, most importantly, it will reduce the fear the students might feel.
Moreover, at the end of each test the students could be asked for their
reflections: if there was a multiple-choice task, what helped them answer
correctly, their schemata or pure guessing; if there was a cloze test,
whether they guessed from the context or used some other skill, etc.
Furthermore, Hicks emphasises that such analysis will show the students
how they are tested and help to establish an appropriate test for each
student. Evaluation will benefit the teacher as well. S/he will not only
be able to discover the students’ preferences, but also find out why the
students have failed a particular type of activity or even the whole
test. The evaluation will reveal what is really wrong with the structure
or design of the test itself. Finally, the students should be taught to
evaluate the results of the test. They should be asked to spot the places
where they have failed and, together with the teacher, attempt to find
out what exactly caused the difficulties. This will lead to consolidation
of the material and maybe even to comprehension of it. Again, the
teacher’s role is essential, for the students alone are not able to cope
with their mistakes. Thus, evaluation is an indispensable element of
assessment if the teacher’s aim is to design a test that will not make
the students fail but, on the contrary, will let them anticipate its
results.

To conclude, we can add, referring to Alderson (1996:212), that the usual
classroom test should not be too complicated and should not discriminate
between the levels of the students. The test should test what was taught.
The author of the paper shares this opinion, for the students are very
different and so are their levels of knowledge. It is inappropriate to
design a test at advanced level if some of your learners hardly exceed
lower intermediate level.

Above all, tests should take into account the learners’ ability to work
and think, for each student has his/her own pace, and some students may
fail simply because they have not managed to complete the required tasks
in time.

Furthermore, Alderson (ibid.) holds that the instructions of the test
should be unambiguous. The students should clearly see what they are
asked to do and should not be frustrated during the test. Otherwise, they
will spend more time asking the teacher to explain what they are supposed
to do than completing the tasks themselves. Finally, according to Heaton
(1990:10) and Alderson (1996:214), the teacher should not use in the test
the very tasks that were studied in the classroom. They explain that when
testing we need to learn about the students’ progress, not to check what
they remember. The author of the paper concurs with this idea and assumes
that one of the aims of the test is to check whether the students are
able to apply their knowledge in various contexts. If they can, they have
acquired the new material.

Chapter 2

Reliability and validity

2.1. Inaccurate tests

Hughes (1989:2) holds that one of the reasons why tests are not favoured
is that they do not measure exactly what they are meant to measure. The
author of the paper supports the idea that it is impossible to evaluate
someone’s true abilities by tests. An individual might be a bright
student with a good knowledge of English but, unfortunately, fail the
test because of nervousness; or, vice versa, a student might have crammed
the tested material without fully comprehending it. As a result, during
the test s/he is only capable of reproducing what has been learnt with
tremendous effort, rather than demonstrating actual knowledge (which,
unfortunately, does not exist at all). Moreover, there could be an even
more disastrous case in which the student has cheated and used his/her
neighbour’s work. Apart from the above, other factors could lead to
inadequate completion of the test (a sleepless night, various personal
and health problems, etc.).

However, very often the test itself can cause the students to fail it.
Following linguists such as Hughes (1989) and Alderson (1996), we can
state that there are two main causes of a test being inaccurate:

. Test content and techniques;

. Lack of reliability.

The first means that the test’s design should correspond to what is being
tested. First, the test must contain the exact material that is to be
tested. Second, the activities, or techniques, used in the test should be
adequate and relevant to what is being tested. This means that they
should not frustrate the learners but, on the contrary, help the students
write the test successfully.

The second means that one and the same test given at different times must
yield the same scores. The results should not differ merely because time
has passed. For example, a test cannot be called reliable if the scores
obtained the first time the students took it differ from those obtained
the second time, although the learners’ knowledge has not changed at all.
Furthermore, reliability can suffer from improper test design (unclear
instructions and questions, etc.) and from the way the test is scored.
The teacher may evaluate different students differently, taking various
aspects into consideration (the level of the students, participation,
effort, and even personal preferences). If there are two markers, then
there will definitely be two different evaluations, for each marker will
have his/her own criteria for marking and evaluating one and the same
piece of work. Take the testing of speaking skills as an example. Here
one of the markers will probably treat grammar as the most significant
point to be evaluated, whereas the other will put more emphasis on
fluency. Sometimes this can lead to arguments between the markers;
nevertheless, we should never forget that the main figure we have to deal
with is still the student.
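
The two situations described above (the same test taken twice, and the
same work scored by two markers) can be illustrated with a short sketch.
The sources cited here give no formula; the sketch assumes a Pearson
correlation coefficient, which is one common way of expressing how closely
two sets of scores agree. The function name and all the scores are
invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores (out of 20) of the same five students.
first_sitting = [12, 15, 9, 18, 14]    # test taken the first time
second_sitting = [13, 15, 10, 17, 14]  # same test taken again later

# Hypothetical marks given to the same five speaking performances.
marker_a = [12, 15, 9, 18, 14]   # e.g. a marker who stresses grammar
marker_b = [16, 11, 14, 13, 12]  # e.g. a marker who stresses fluency

test_retest = pearson(first_sitting, second_sitting)
inter_marker = pearson(marker_a, marker_b)
print(f"test-retest: {test_retest:.2f}, inter-marker: {inter_marker:.2f}")
```

A coefficient close to 1 suggests that the two sets of scores rank the
students in much the same way; a low or negative one suggests that the
sittings or the markers disagree. With only five students the figures are
purely illustrative.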

2.2. Validity

Now we can turn to one of the important aspects of testing: validity.
According to Hughes, every test should be reliable as well as valid. Both
notions are crucial elements of testing. However, according to Moss
(1994), there can be validity without reliability, or the border between
these two notions can sometimes simply blur. Apart from these elements, a
good test should be efficient as well.

According to Bynom (Forum, 2001), validity deals with what is tested and
the degree to which a test measures what it is supposed to measure
(Longman Dictionary, LTAL). For example, if we test the students’ writing
skills by giving them a composition test on Ways of Cooking, we cannot
call such a test valid, for it can be argued that it tests not their
ability to write but their knowledge of cooking as a skill. It is
certainly very difficult to design a proper test with good validity;
therefore, the author of the paper believes that it is essential for the
teacher to know and understand what validity really is.

According to Weir (1990:22), there are five types of validity:

. Construct validity;

. Content validity;

. Face validity;

. Washback validity;

. Criterion-related validity.

Weir (ibid.) states that construct validity is a theoretical concept that
involves the other types of validity. Further, quoting Cronbach (1971),
Weir writes that to construct or plan a test one should research the
testee’s behaviour and mental organisation. This is the ground on which
the test is based; it is the starting point for constructing the test
tasks. In addition, Weir presents Kelly’s idea (1978) that test design
requires some theory, even if the exposure to it is only indirect.
Moreover, if we are able to define the theoretical construct at the
beginning of the test design, we will be able to use it when dealing with
the results of the test. The author of the paper assumes that a test that
is appropriately constructed from the beginning will not cause any
difficulties in its administration and scoring later.

Another type of validity is content validity. Weir (ibid.) suggests that
content validity and construct validity are closely bound and sometimes
even overlap. Speaking about content validity, we should emphasise that
it is an indispensable element of a good test. What is meant is that the
duration of classes or of the test is usually rather limited, and if we
teach a rather broad topic such as “computers”, we cannot design a test
that covers all aspects of that topic. Therefore, to check the students’
knowledge we have to select from what was taught, whether it was specific
vocabulary or various texts connected with the topic, for it is
impossible to test all of the material. The teacher should not pick out
tricky items that were either mentioned only once or not discussed in the
classroom at all, even though they belong to the topic. S/he should not
forget that the test is not a punishment or an opportunity for the
teacher to show the students that they are less clever. Hence, we can
state that content validity is closely connected with the specific items
that were taught and are supposed to be tested.

Face validity, according to Weir (ibid.), is not a matter of theory or
sample design. It is how the examinees and the administrative staff see
the test: whether it appears to them to be construct and content valid or
not. This will definitely include debates and discussions about the test;
it will involve the teachers’ cooperation and the exchange of their ideas
and experience.

Another type of validity to be discussed is washback validity, or
backwash. According to Hughes (1989:1), backwash is the effect of testing
on the teaching and learning process. It can be either negative or
positive. Hughes believes that if the test is considered to be a
significant element, then preparation for it will occupy most of the
time, and other teaching and learning activities will be ignored. As far
as the author of the paper is concerned, this is already a habitual
situation in the schools of our country, for our teachers are faced with
the centralised exams, and all they have to do is prepare their students
for them. Thus, the teacher starts concentrating purely on the material
that could be encountered in the exam papers, drawing on examples taken
from past exams. As a result, numerous interesting activities are left
behind; the teachers are concerned only with the result and forget about
the different techniques that could be introduced and later used by their
students to make dealing with the exam tasks easier, such as guessing
from the context, applying schemata, etc.

The problem arises when the objectives of the course taught during the
study year differ from the objectives of the test. As a result we will
have negative backwash: e.g. the students were taught to write a film
review, but in the test they are asked to write a letter of complaint,
which, unfortunately, the teacher has neither planned nor taught.

Negative backwash may often be caused by inappropriate test design. Later
in his book Hughes speaks about multiple-choice activities that are
designed to check the students’ writing skills. The author of the paper
finds this puzzling, for it is hard to imagine how essay writing could be
tested with the help of multiple-choice items. When testing an essay, the
teacher is first of all interested in the students’ ability to express
their ideas in writing: how this has been done, what language has been
used, whether the ideas are supported and discussed, etc. For this
purpose the multiple-choice technique is highly inappropriate.

Nevertheless, according to Hughes, apart from the negative side of
backwash there is positive backwash as well. It could be the creation of
an entirely new course designed especially to help the students pass
their final exams. A test given in the form of final exams obliges the
teacher to reorganise the course and to choose appropriate books and
activities to achieve the set goal: passing the exam. Further, he
emphasises the importance of partnership between teaching and testing.
Teaching should meet the needs of testing; that is, teaching should
correspond to the demands of the test. However, this is rather
complicated work, for, to the knowledge of the author of the paper, the
teachers in our schools are not supplied with specially designed
materials that could assist them in preparing the students for the exams.
The teachers are just given vague instructions and are left to act on
their own.

The last type to be discussed is criterion-related validity. Weir
(1990:22) holds that it concerns the link between the scores on a test
and another performance measure: either an older, established test or a
future criterion performance. The author of the paper considers that this
type of validity is closely connected with the criteria and evaluation
the teacher uses to assess the test. It could mean that the teacher has
to work out a definite evaluation system and, moreover, should explain
what s/he finds important and worth evaluating, and why. Usually the
teachers design their own system; often it consists of points that the
students can obtain for fulfilling a certain task. Later the points are
added up and converted into a mark. Furthermore, the teacher can have a
special table of points and the corresponding marks. To our knowledge,
the language teachers decide on the criteria together during a special
meeting devoted to that topic, and then keep to them for the whole study
year. Moreover, the teachers are supposed to acquaint their students with
the evaluation system so that the students are aware of what they are
expected to do.
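
The system of points and marks described above can be sketched as a small
conversion table. The thresholds, the 10-point scale, the task names and
the function names below are all invented for illustration; a real table
would be whatever the teachers agree on at their meeting.

```python
# Hypothetical table agreed by the teachers: (minimum percentage, mark).
# Both the thresholds and the 10-point scale are assumptions.
MARK_TABLE = [
    (90, 10), (80, 9), (70, 8), (60, 7), (50, 6),
    (40, 5), (30, 4), (20, 3), (10, 2), (0, 1),
]

def total_points(task_points):
    """Add up the points a student obtained for each task."""
    return sum(task_points.values())

def to_mark(points, max_points, table=MARK_TABLE):
    """Convert a points total into a mark using the first band it reaches."""
    if max_points <= 0 or not 0 <= points <= max_points:
        raise ValueError("points must lie between 0 and max_points")
    percentage = 100 * points / max_points
    for minimum, mark in table:  # bands are listed from highest to lowest
        if percentage >= minimum:
            return mark
    raise ValueError("table has no band for this percentage")

# Hypothetical student: points obtained per task in a 40-point test.
student = {"vocabulary": 12, "cloze": 8, "letter": 11}
mark = to_mark(total_points(student), max_points=40)
print(mark)
```

Keeping the table in one place means that every teacher who uses it
converts points in the same way, and it can be shown to the students in
advance so that they know what is expected of them.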

2.3. Reliability

According to Bynom (Forum, 2001) reliability shows that the test’s
