Controlling Language Quality


You should look at this section if you have only a limited idea of what controlling language quality involves and why it is important.

If students are to have a fair chance to perform well on an assessment instrument, it is important that the language used is suitable for the target students. This means two things:

  • The vocabulary used and the level of complexity of expression must be suitable for the subject and grade level of students.
  • If the instrument is translated, the translation process must ensure that each language version has the same level of simplicity as the other language versions.

There are three common but incorrect assumptions that are often found in the development of assessment instruments. These can lead to some students being disadvantaged while others are advantaged. Such differences can undermine the validity of results, so it is important that they are avoided.

The first mistake is to assume that test developers know what terminology is suitable for students. This may not be the case and there should be a process of verification in place to ensure that the terminology is appropriate.

The second mistake is to assume that one language used in different places is the same. Again, this is not accurate. For example, the English used in Australia, Ireland and the United States is not exactly the same. Therefore, an assessment instrument for use in all three countries needs to be adapted, including changes in spelling, terminology and grammar.

The third mistake is to assume that a single translation (or reverse translation) is enough to make assessment instruments in different languages equivalent. This is not accurate because languages contain many synonyms. A translation may therefore be correct yet use words that make the assessment instrument easier or harder for students.

All of these problems can only be avoided by using a detailed process of linguistic quality control. Global best practice is to use dual translation, reconciliation and further quality checks including adaptation and verification. Importantly, all different versions of assessment instruments should be piloted with target students to identify any misunderstandings that can be corrected before the main implementation.

Ideally, test developers involve target students in the test development process. This means conducting focus groups with small groups of students, in which test developers talk through test items with the students and make sure that the language used is clear to them. These are often called ‘cognitive laboratories’ or ‘speak alouds’.

To find out more about linguistic quality control, go to #Intermediate.


You should look at this section if you have some idea of what is involved in controlling language quality but would like more details.

The way that assessment items are written can have a huge impact on student performance. When an assessment instrument includes words that are unfamiliar to students, this has a negative impact on their ability to respond accurately. This can lead to inaccurate interpretations of results, undermining the validity of an assessment programme.

For example, if a group of students performs badly on a mathematics item, this might show that they are weak in that area of mathematics. But it might instead mean that they are good in that area of mathematics but were confused by the terminology used in the item. This indicates that linguistic quality control needs to be used before assessment items are included in a test. There are some key elements that need to be included, with implications for practice:

Regional dialects – When assessing students, the differences in the way that they use the same language need to be considered. For example, there are many millions of speakers of languages such as Bangla, Hindi and Urdu. Not all of these speakers use exactly the same form of the language, due to regional dialects. This means that when assessing students in different regions, attention needs to be paid to making sure that assessment instruments are not easier for students in some regions than in others.

This means that representatives from different regions who are also experts in the way that a subject is taught at the grade levels of assessment need to be involved in adapting the assessment instrument.

Technical terminology – When students are assessed in subjects like mathematics and science, a lot of technical terminology is used. This includes simple words like ‘triangle’ and complex words like ‘calibrate’ or ‘osmosis’. These words often have synonyms. For example, ‘add 2 and 4’ and ‘what is the sum of 2 and 4?’ mean the same. But students may understand the first and not understand the second. It is very important that the terminology used in an assessment instrument is what students in that subject, grade level and region are familiar with.

This means that experts in the way that a subject is taught at the grade levels of assessment need to be involved in adapting the assessment instrument.
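As a rough illustration of how such a terminology check might be supported, the sketch below flags words in an item that do not appear in a list of terms judged familiar to the target students. The function name `flag_unfamiliar_terms` and the sample word list are hypothetical; in practice the list would come from the subject and grade-level experts described above, and a tool like this could only assist their review, not replace it. The simple pattern used here only handles words written in unaccented Latin letters.

```python
import re

def flag_unfamiliar_terms(item_text, familiar_terms):
    """Return words in item_text that are not in familiar_terms,
    in order of first appearance (case-insensitive, Latin letters only)."""
    familiar = {term.lower() for term in familiar_terms}
    flagged = []
    for word in re.findall(r"[A-Za-z]+", item_text.lower()):
        if word not in familiar and word not in flagged:
            flagged.append(word)
    return flagged

# Hypothetical list of terms that experts judged familiar to target students.
familiar = ["add", "and", "what", "is", "the", "of"]

print(flag_unfamiliar_terms("Add 2 and 4", familiar))                  # []
print(flag_unfamiliar_terms("What is the sum of 2 and 4?", familiar))  # ['sum']
```

A flagged word is only a prompt for expert review: it may be perfectly suitable, or it may need to be replaced with a more familiar synonym.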

Different languages – When students are assessed in different languages, for example in a national or international assessment programme, the different language versions of assessment instruments must be equivalent. This means that they should not be easier or harder for students in one language than in another. This is a difficult outcome to achieve.

This means that a detailed linguistic quality assurance process is required including: a translatability review; a dual translation; reconciliation; adaptation and verification. All changes must be documented for future reference.

Student interpretations – Even if a detailed linguistic quality control process has been used, it is still possible that students will interpret assessment items differently from what is intended. The only way to avoid this is to involve target students in the development and linguistic quality control process.

This means that cognitive laboratories should be used in the development of assessment items, and that thorough piloting in all regional and/or language versions of assessment instruments should be conducted, with revisions made according to how items perform during the pilot.

To find out more about linguistic quality control, go to #Advanced.


You should look at this section if you already know how to implement linguistic quality control and are interested in more details.

Controlling language quality is often excluded from assessment programmes due to time and cost limitations. This is a big mistake as it can lead to differences in how easy or difficult assessment instruments are for different groups of students. This can undermine the ability of the assessment programme to make valid comparisons across groups of students.

The process for linguistic quality control involves a number of key steps. These differ depending on whether the assessment instrument is developed in one language or in more than one:

One language: Drafting in source language → Review (subject experts, grade level experts, experts in regional dialects) → Adaptation → Verification → Piloting → Review → Revision

More than one language: Drafting in source language → Dual translation to target language → Reconciliation → Review (subject experts, grade level experts, experts in regional dialects) → Adaptation → Verification → Piloting (in each language) → Review → Revision
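The two sequences above share most of their steps, which can be made explicit in a short sketch. This is only an illustration under stated assumptions: the function name `build_workflow` and the constants are hypothetical and not part of any assessment toolkit, and the step names are taken directly from the text.

```python
# Hypothetical sketch: identifiers are illustrative, step names follow the text.
REVIEWERS = ("subject experts", "grade level experts", "experts in regional dialects")

# Steps that follow drafting (and, where relevant, translation) in both workflows.
COMMON_STEPS = [
    "Review (" + ", ".join(REVIEWERS) + ")",
    "Adaptation",
    "Verification",
    "Piloting",
    "Review",
    "Revision",
]

def build_workflow(target_languages=()):
    """Return the ordered quality control steps for an instrument.

    With no target languages, the single-language sequence is returned;
    otherwise dual translation and reconciliation are added and piloting
    is carried out in each language.
    """
    steps = ["Drafting in source language"]
    if target_languages:
        steps += ["Dual translation to target language", "Reconciliation"]
    for step in COMMON_STEPS:
        if step == "Piloting" and target_languages:
            step = "Piloting (in each language)"
        steps.append(step)
    return steps

for number, step in enumerate(build_workflow(["Bangla", "Hindi"]), start=1):
    print(number, step)
```

Writing the sequence down in one place like this can also help centralised coordination, since every language team then works from the same ordered list of steps.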

It is important that sufficient time – at least several weeks – is provided in the assessment programme for these processes to occur. Much of the review work can be done via online technology, meaning that there is no need for people to meet face-to-face. All people involved need to keep detailed records in a common format.

If an assessment instrument is to be translated into multiple languages it is essential that all processes used for each language are in alignment. This means that there must be centralised coordination in place, with careful quality assurance of each process and detailed record keeping. It is also important that all revisions and decisions are documented so that they can be referred to in future.

Reverse translation (also known as ‘back translation’) is sometimes used in assessment programmes to quality assure translations. This is not advisable. There is never a single way of translating a document – every language has synonyms and hence it is unrealistic to expect that a reverse translation will be the same as the original. Reverse translation is also likely to be more expensive than the quality assurance process described above.

Many countries include populations that speak minority languages (sometimes referred to as tribal languages). This raises significant questions for the education sector, including which language it is best to educate students in. Students whose first language is not the language of instruction may perform poorly in assessments. This may not reflect their skills and knowledge in particular subjects but may be a consequence of their language skills.

To collect valid data on the performance of students who speak a minority language, it is important to consider adaptations. This may mean providing a bilingual or multilingual assessment instrument, although this is often avoided due to expense. Other adaptations may include additional testing time, providing a list of key assessment terms in advance, supplying wordlists during assessment, and allowing access to dictionaries.

This issue is also a reminder that it is essential to ensure that language used in assessment materials is appropriate for target students. This means ensuring that all assessment material is reviewed by subject and language experts with familiarity with students at the age and grade level for which the assessment is being developed.
