Approaches to eliminating bias...
One way to identify assessment bias is to have content-knowledgeable reviewers examine each test item to determine whether it could offend or unfairly penalize certain subgroups.
High-stakes testing

For high-stakes tests, a bias review panel of 15-25 reviewers is constructed. Each reviewer is familiar with the content of the test. The panel should consist mainly of reviewers drawn from the subgroups that the test might negatively affect, and it should include both men and women.
Next, members should go through an orientation about assessment bias and have a discussion about it.
After orientation, reviewers should be asked to respond "yes" or "no" to a question similar to this for EACH test item.
"Might this item offend or unfairly penalize any group of students on the basis of personal characteristics, such as gender, ethnicity, religion, or race?"
The percentage of "no" answers is calculated for each item, giving a per-item absence-of-bias index; these indices can then be averaged to obtain an index for the whole test. The more "no" responses, the less bias reviewers believe is present in the test items. If the test is still being developed, items deemed biased by several reviewers are usually thrown out. However, an item deemed biased by even a single reviewer can also be thrown out. Reviewers are encouraged to comment on why they think an item is biased; the comment may reveal an inadequacy that other reviewers missed.
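The index calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name, the item labels, and the four-reviewer panel are hypothetical (a real panel would have 15-25 reviewers), and the whole-test figure is assumed to be a simple mean of the per-item percentages.

```python
def absence_of_bias_index(responses):
    """Return (per-item index, whole-test index) as percentages of "no" answers.

    `responses` maps an item label to the list of "yes"/"no" judgments
    given by the reviewers (hypothetical data layout).
    """
    per_item = {
        item: 100.0 * sum(1 for vote in votes if vote == "no") / len(votes)
        for item, votes in responses.items()
    }
    # Assumption: the whole-test index is the mean of the per-item indices.
    overall = sum(per_item.values()) / len(per_item)
    return per_item, overall


# Hypothetical judgments from a four-person panel on two items.
responses = {
    "item_1": ["no", "no", "no", "no"],
    "item_2": ["no", "yes", "no", "yes"],
}
per_item, overall = absence_of_bias_index(responses)
print(per_item)  # {'item_1': 100.0, 'item_2': 50.0}
print(overall)   # 75.0
```

In this made-up example, item_2 would be a candidate for removal or revision, since half of the reviewers judged it biased.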
After reviewing each test item individually, the reviewers are asked to judge the test as a whole by answering a question similar to this:
"Considering all of the items in the assessment device you just reviewed, do the items, taken as a whole, offend or unfairly penalize any group of students on the basis of personal characteristics, such as gender, ethnicity, religion, or race?"
The responses are taken into account, and, as with individual test items, reviewers are encouraged to leave a comment if they believe the test is biased. Usually, a problem that a reviewer finds can be corrected quickly. For example, if a reviewer notices that all of the word problems feature males, some of them could be changed to feature females. Only minor details would need to change, not the actual content.
Empirical bias-reduction approach
If a high-stakes test is administered to a large group of students, it is possible to gather evidence showing achievement disparities between the subgroups being tested. If certain questions show achievement gaps, those items are flagged and later reviewed for potential bias. A flag does not necessarily mean that bias exists; the item is reviewed and a decision is made before it is placed on the actual test (Popham 135).
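The flagging step can be illustrated with a rough Python sketch. The source does not say how an achievement gap is measured, so everything here is an assumption: the gap is taken as the difference between the highest and lowest subgroup proportion correct on an item, the 0.15 threshold is arbitrary, and the function name and data are hypothetical. A flagged item is only a candidate for later review, not a finding of bias.

```python
from collections import defaultdict


def flag_items(records, threshold=0.15):
    """Flag items whose subgroup proportion-correct gap exceeds `threshold`.

    `records` is an iterable of (subgroup, item, correct) tuples
    (hypothetical layout). Returns {item: gap} for the flagged items.
    """
    counts = defaultdict(lambda: [0, 0])  # (item, subgroup) -> [correct, total]
    for subgroup, item, correct in records:
        cell = counts[(item, subgroup)]
        cell[0] += int(correct)
        cell[1] += 1

    rates = defaultdict(dict)  # item -> {subgroup: proportion correct}
    for (item, subgroup), (right, total) in counts.items():
        rates[item][subgroup] = right / total

    flagged = {}
    for item, by_group in rates.items():
        gap = max(by_group.values()) - min(by_group.values())
        if gap > threshold:  # assumed, arbitrary cut-off
            flagged[item] = gap
    return flagged


# Hypothetical responses from two subgroups, A and B, on two questions.
records = [
    ("A", "q1", True), ("A", "q1", True), ("B", "q1", True), ("B", "q1", False),
    ("A", "q2", True), ("A", "q2", False), ("B", "q2", True), ("B", "q2", False),
]
flagged = flag_items(records)
print(flagged)  # {'q1': 0.5}
```

Here q1 is flagged because subgroup A answered it correctly twice as often as subgroup B, while q2 shows no gap and is left alone.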