Pretest Review

The Validation Unit at Cambridge ESOL collates and analyses the pretest material.

Listening and Reading pretests

All candidate responses are analysed to establish the technical measurement characteristics of the material, i.e. to find out how difficult the items are, and how they distinguish between stronger and weaker candidates. Both classical item statistics and latent trait models are used in order to evaluate the effectiveness of the material. Classical item statistics are used to identify the performance of a particular pretest in terms of the facility and discrimination of the items in relation to the sample that was used. Rasch analysis is used to locate items on the IELTS common scale of difficulty. In addition, the comments on the material by the staff at pretest centres and the immediate response of the pretest candidates are taken into account.
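The classical item statistics mentioned above have standard definitions, which can be illustrated with a minimal sketch (this is not Cambridge ESOL's analysis software; all data and function names here are invented for illustration). Facility is the proportion of candidates answering an item correctly; discrimination is the point-biserial correlation between the item score and the candidate's total score; and the Rasch model places items on a common logit scale of difficulty.

```python
import math
from statistics import mean, pstdev

def facility(item_scores):
    """Proportion of candidates who answered the item correctly (0/1 scores)."""
    return mean(item_scores)

def discrimination(item_scores, total_scores):
    """Point-biserial correlation between an item's scores and total scores:
    high values mean stronger candidates tend to get the item right."""
    mx, my = mean(item_scores), mean(total_scores)
    sx, sy = pstdev(item_scores), pstdev(total_scores)
    cov = mean((x - mx) * (y - my)
               for x, y in zip(item_scores, total_scores))
    return cov / (sx * sy)

def rasch_probability(ability, difficulty):
    """Rasch model: probability that a candidate at `ability` answers an
    item located at `difficulty` (both on the common logit scale) correctly."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))

# Hypothetical pretest responses: rows are candidates, columns are items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
totals = [sum(row) for row in responses]
item0 = [row[0] for row in responses]
print(facility(item0))  # 3 of 4 candidates correct -> 0.75
```

An item with very high facility (nearly everyone correct) or low discrimination gives little measurement information, which is exactly what the pretest analysis is designed to detect.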

At a pretest review meeting, the statistics, feedback from candidates and teachers and any additional information are reviewed and informed decisions are made on whether texts and items can be accepted for construction into potential live versions. Material is then stored in an item bank to await test construction.
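The acceptance decision described above could be caricatured as a screening rule of the following kind (the thresholds and function are entirely hypothetical; real review decisions also weigh candidate and teacher feedback, as the text notes):

```python
def accept_item(facility, discrimination,
                facility_range=(0.2, 0.9), min_discrimination=0.3):
    """Illustrative screening rule: keep an item for potential live use
    only if its facility falls in a usable range and it discriminates
    adequately between stronger and weaker candidates. Thresholds are
    invented for this sketch."""
    lo, hi = facility_range
    return lo <= facility <= hi and discrimination >= min_discrimination

print(accept_item(0.75, 0.55))  # True: usable difficulty, discriminates well
print(accept_item(0.97, 0.10))  # False: too easy and weakly discriminating
```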

Writing and Speaking pretests

Separate batches of Writing pretest scripts are marked by IELTS Principal Examiners and Assistant Principal Examiners. At least two reports on the task performance and its suitability for inclusion in live versions are produced. On the basis of these reports, tasks may be banked for live use, amended and sent for further pretesting, or rejected.

Feedback on the trialling of the Speaking tasks is reviewed by experienced examiners, who deliver the trialling tasks, and members of the item writing team who are present at the trialling sessions. The subsequent reports are then assessed by the paper chair and Cambridge ESOL staff.

Interpretation: Building on the data collected during pretesting, the Validation Unit at Cambridge ESOL collates and analyses the test data. All candidate responses are analysed to establish the technical characteristics of the items, namely their difficulty and discrimination, using a range of specialised statistical methods and tools. Items that pass this review are stored in an item bank for use in future live tests.

Banking of Material

Cambridge ESOL has developed its own item banking software for managing the development of new live tests. Each section or task is banked with statistical information as well as comprehensive content description. This information is used to ensure that the tests that are constructed have the required content coverage and the appropriate level of difficulty.
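The item-banking idea described above can be sketched as follows. This is not Cambridge ESOL's actual software or schema; the class, field names, and values are illustrative. Each banked task carries content descriptors and a difficulty location, and a test version is assembled only from tasks that satisfy the required content coverage and target difficulty:

```python
from dataclasses import dataclass

@dataclass
class BankedTask:
    task_id: str
    skill: str         # e.g. "Listening" or "Reading"
    topic: str         # content descriptor
    difficulty: float  # location on the common logit scale

def construct_test(bank, skill, topics_needed, target, tolerance=0.5):
    """Pick one task per required topic whose difficulty lies within
    `tolerance` logits of the target test difficulty, preferring the
    closest match."""
    chosen = []
    for topic in topics_needed:
        candidates = [t for t in bank
                      if t.skill == skill and t.topic == topic
                      and abs(t.difficulty - target) <= tolerance]
        if not candidates:
            raise ValueError(f"no suitable task banked for topic {topic!r}")
        chosen.append(min(candidates, key=lambda t: abs(t.difficulty - target)))
    return chosen

# Hypothetical bank contents.
bank = [
    BankedTask("R101", "Reading", "science", 0.2),
    BankedTask("R102", "Reading", "science", 1.4),
    BankedTask("R201", "Reading", "history", -0.1),
]
test = construct_test(bank, "Reading", ["science", "history"], target=0.0)
print([t.task_id for t in test])  # ['R101', 'R201']
```

Keeping the statistics and content descriptions together in the bank is what allows content coverage and difficulty to be controlled at the same time during test construction.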

Interpretation: Cambridge ESOL has its own dedicated item banking software for managing the construction of new live test versions. Each task in the bank is stored together with a comprehensive content description and statistical information, which are used to ensure that the tests constructed from it meet the requirements for content coverage and difficulty level.

Standards Fixing

Standards fixing ensures that there is a direct link between the standard of established and new versions before they are released for use at test centres around the world.

Different versions of the test all report results on the same underlying scale, but band scores do not always correspond to the same percentage of items correct on every test form. Before any test task is used to make important decisions, we must first establish how many correct answers on each Listening or Reading test equate to each of the nine IELTS bands. This ensures that band scores on each test indicate the same measure of ability.
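The outcome of standards fixing can be sketched as a per-form conversion table: each Listening or Reading version receives its own raw-score boundaries for the nine bands, so the same band reflects the same ability even when forms differ in difficulty. The boundary values below are invented for illustration; real IELTS conversion tables are not published per form.

```python
def raw_to_band(raw_score, boundaries):
    """Map a raw score to the highest band whose minimum raw score it
    meets. `boundaries` maps band -> minimum raw score on this form."""
    band = min(boundaries)  # lowest listed band as a fallback
    for b in sorted(boundaries):
        if raw_score >= boundaries[b]:
            band = b
    return band

# Two hypothetical test forms: the harder form B needs fewer correct
# answers for the same band.
form_a = {5: 16, 6: 23, 7: 30, 8: 35, 9: 39}
form_b = {5: 14, 6: 21, 7: 28, 8: 33, 9: 38}

print(raw_to_band(28, form_a))  # band 6 on the easier form
print(raw_to_band(28, form_b))  # band 7 on the harder form
```

The same raw score of 28 yields different bands on the two forms, which is precisely the adjustment standards fixing makes so that a given band means the same level of ability on every version.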

Interpretation: Before new live test versions are constructed and released worldwide, the standards fixing stage ensures a direct link between the marking standard of the new version and that of established versions. Because every version reports results on the same nine-band IELTS scale, standards fixing guarantees that, although the raw-score boundaries for each band may differ from version to version, the same band score always reflects the same level of ability.

Live Test Construction and Grading

Live Test Release