


How To Write Science Reports & Science Practicals For Biology, Chemistry & Physics

This guide can be used by GCSE science students, and by AS Level and A Level biology, chemistry and physics students, who need help writing up science coursework as part of their syllabus. This applies to the AQA, Edexcel, WJEC, OCR, SQA and CCEA specifications. It can also be used as a general guideline by students who require help, advice and tips on how to write science practicals, scientific experiments and science reports at degree and university level. It can also help students with the writing of science experiments and reports for Medicine, Biochemistry, Biomedical Science and Forensic Science, as well as other subjects including Psychology, Ecology and Environmental Science.

The Basics - How To Write A Science Experiment, Chemistry or Biology Report

There is a general, standardised formal structure for writing science, biology and chemistry reports which you will need to follow. This section deals with the basics that all science students should be aware of when writing up chemistry coursework or biology coursework. All science practicals should be written in the impersonal past tense. Impersonal means that you should avoid using any phrases that include personal terms such as "we" or "I". Past tense means that you must describe the experiment as if it has already been carried out (with the exception of any planning section that you may need to submit as part of your report). Avoid using the future tense in any science, chemistry or biology practical, and avoid writing it as if you are describing a method or instructions for others to follow. This can be quite difficult to master at first. Here are a couple of examples of right and wrong.

Wrong Personal - "WE added five cm3 of buffer solution to tubes A and B and then I incubated both tubes in a water bath at 37 degrees centigrade."

Right Impersonal - "Five cm3 of buffer solution WAS added to tubes A and B and THEN both tubes were incubated in a water bath at 37 degrees centigrade."

Wrong Future Tense - "YOU WILL need to label 5 tubes from one to five and add 1cm3 of reagent to each." Or "WE WILL be labelling 5 tubes from one to five and ADDING 1cm3 of reagent to each."

Right Past tense - "Five tubes WERE labelled from one to five and 1cm3 of reagent WAS ADDED to each."

General Layout For Science Coursework & Scientific Experiments

The sections usually included in science reports are:-

  • Title
  • Aim or Abstract
  • Introduction
  • Method
  • Hypothesis
  • Results
  • Conclusion
  • Discussion/ Evaluation/ Methods of Improvement
  • Reference/Bibliography

Title of your Science Report

Your science report title should be short but detailed enough to accurately describe the work that has been carried out. At the top of your report you should also include the date and your name (the author) and the name of any collaborators if there were any.

Aim or Abstract

Aim - The aim section should describe what the purpose of the biology or chemistry experiment is in no more than two or three sentences. This is fine for most reports for high school up to GCSE level.

Abstract - As you begin to study at a higher level, i.e. post-16, undergraduate or postgraduate, you will need to include an abstract section instead. This is a one-paragraph summary of the entire work, including results and conclusion. In academic publications the abstract is useful because it allows others to quickly judge whether your work is relevant and of interest to them and warrants more detailed reading. It plays a similar role to the summarised content you find on the back covers of books. The most difficult aspect of writing an abstract is trying to summarise a long and complex report in a short paragraph without leaving out anything important.

Background / Introduction

This shows your understanding of the science behind the report but it must be relevant. In the science coursework background / introduction you include any background information you have found whilst researching your topic. Include relevant previous work done by yourself or others. Diagrams and images can be included if you so wish, but remember if you do use other sources such as books, other publications or internet sites then you MUST list those sources in your reference/ bibliography section.

Scientific Hypothesis

Your science experiment hypothesis should be clear and be in the form of a question that you want to find the answer to. Ideally the question should have a yes or no answer. For example in a chemistry practical the following hypothesis may apply : "Does vitamin C in orange juice oxidise over time when exposed to the air?" A clear and well written hypothesis can help point the way to what data you need to gather to help you find the answer. Once you know what data you need that helps in the experimental design. So as you can see the hypothesis is the foundation around which your report should be designed and built.

From the science experimental data you obtain (if the experiment is well planned and carried out) you should be able to say if the hypothesis has been either supported or not in your conclusion section.


Science Practical Method

Your science practical method should include a list of the equipment used and the method followed. Although you must write your method in the past tense and not as a series of instructions to others, it must be detailed enough for others to follow and repeat your work if required. This reproducibility of results by others is one of the cornerstones of the scientific method.

Science Experiment Results

Your science experimental results section should be well presented and include your data in table and graphical form. Any calculations performed on your data, including statistical tests if required, should also be in this section. Presentation is everything: all graphs should have a title and all axes should be labelled. Do not scale your graphs so that they fill the entire page and butt right up against the margin (a pet hate of mine). It is better to divide your scale by two and have a smaller, half-sized graph in the centre of the page. If you use this approach you may be able to fit more than one graph per page, allowing the reader to review your graphical data and spot trends more easily. If the graphs are on separate pages, the reader has to flip back and forth between them. You also need to choose the correct type of graph for the data you are presenting. A bar chart is ideal for comparing two groups, whilst a line graph is better for showing how enzyme activity varies with temperature.

You must resist the temptation to comment on your results in the results section; that is what the conclusion section is for. An experienced scientist will, by looking at your results, be formulating their own conclusions based on your data, and you should not influence your reader by including your own thoughts and comments here. Once the reader has reviewed your data and perhaps come to their own conclusion, they can then move on to the conclusion section and see if your conclusion and theirs agree.

Science Coursework Conclusion

This is where you review your data and state, AT LENGTH, your interpretation and arguments about what the results show. A half-page conclusion is not going to get you a good grade. You should quote the data in the results section in support of the scientific conclusion you are making, for example: "as can be seen in graph 3, there is a marked difference between group A and group B, which allows the conclusion that ....." etc.

You should state whether the hypothesis has been supported or not. Your readers can then decide if they agree or disagree with your conclusions. This is the basis of scientific debate. If the data obtained are not sufficient to support or reject the hypothesis, state why and propose further work that will help to generate more data, allowing you to draw a firmer conclusion.

Science Report Discussion / Evaluation/ Methods of Improvement

You should include here sources of error that might affect the results. Remember, a good scientist is always self-critical. If you have used measuring apparatus such as weighing scales for mass or glassware for measuring volumes, then you may need to calculate the percentage error of the measuring apparatus. The formula is: halve the smallest graduation, divide that number by the volume or mass measured, then multiply by 100 to get a percentage. If you have used a pipette to measure 20cm3 of solution and the smallest graduations on the pipette are 0.1cm3 apart, this means that the actual volume dispensed may be 0.05cm3 (which is 0.1/2) above or below 20cm3. The actual amount dispensed will be between 19.95cm3 and 20.05cm3.

To calculate the percentage error for this measurement:

0.1/2=0.05 and (0.05/20)*100 = 0.25%
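The same calculation can be wrapped in a small function, shown here as an illustrative sketch (the function name is ours, not part of any syllabus):

```python
def percentage_error(smallest_graduation, measured_value):
    """Percentage error of a measuring instrument:
    half the smallest graduation, divided by the value measured, times 100."""
    absolute_error = smallest_graduation / 2
    return absolute_error / measured_value * 100

# The pipette example from the text: 0.1 cm3 graduations, measuring 20 cm3.
print(round(percentage_error(0.1, 20), 2))  # 0.25 (%)
```

Note that the percentage error shrinks as the measured quantity grows, which is why measuring 20 cm3 with a fine pipette is more accurate than measuring 2 cm3 with the same pipette.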

Also you could propose here further work or investigations that might be able to produce further data that will support your conclusions.

Reference / Bibliography

You must include all the external sources of information you have used in compiling your report. Each source should be described in sufficient detail to allow the reader to locate and read the source themselves. One standard way of citing a section of a book, for example, is this format:

Author(s), Book Title and Edition (year of publication),Publisher, page numbers

for example

Stryer L, et al, Biochemistry 5th ed (2002), W.H. Freeman & Co Ltd, p102-105

Or for a magazine or paper in a scientific journal:

Author(s)(year),Title,Publication and volume, page numbers.

for example

Watson J.D. and Crick F.H.C. (1953), A Structure for Deoxyribose Nucleic Acid, Nature 171, 737-738

If there are more than two authors for a source, name the main or first author and use the Latin "et al.", which means "and others".

For website sources you should quote the full website address.

We hope you have found this guide on how to write a science practical useful and wish you the very best with your grades.

By Emlyn Price - Home Tutors Directory

This guide is original and is based on our own experiences of advising students during many years of tuition. It is protected by copyright. You may use it for your own personal use or for teaching purposes. It should not, however, be re-published wholly or in part on other websites or in written publications, and certainly not passed off as anyone else's work. If you have seen this article published elsewhere we would like you to let us know by contacting us here. We can then take action against them.




Biochemistry is a difficult subject for most students [1], in part because biochemistry is full of abstract concepts [2] that are difficult to understand if students cannot relate them to everyday experiences. Furthermore, the study of biochemistry consists primarily of the application of previously learned concepts to new, biological contexts. Therefore, to be successful, students must be able to make connections between the new information and their existing knowledge. However, many students who enter a biochemistry course come with misconceptions or “incorrect ideas,” which they perpetuate in a biochemical context. The result is that students struggle to learn biochemistry and likely further solidify their incorrect ideas. Incorrect ideas, then, must be identified and addressed during biochemistry instruction. Therefore, a multiple-choice instrument is being developed based on typical incorrect ideas students bring from general chemistry and general biology. This instrument can be used as a pretest and as a posttest. If used in both ways, this instrument can provide instructors with information about the extent to which teaching strategies help students to overcome those incorrect ideas identified at the beginning of the course.


Learning and Students' Incorrect Ideas

Student learning has been a topic of research for science educators, including chemical educators, for several decades. Researchers have focused on why students have difficulty learning sciences, including biochemistry [3, 4], biology [5], chemistry [6, 7], genetics [8], and physics [9]. Other researchers have focused on examining how students learn [10–13]. Different theories and models have been developed to examine how students learn and why students have difficulty learning.

In the last two decades, research in science education has focused on the fact that student learning is constructed in the mind of the learner [14, 15]. Researchers have found that students' prior knowledge has a great influence on their ability to build new scientific concepts [6, 16–18]. If students have prior knowledge that is incomplete, poorly understood, or disconnected, they are unlikely to understand the new information. Therefore, students will have difficulties applying or transferring their knowledge [15, 19, 20]. In contrast, students' ability to interpret a concept, apply it to a new situation, and transfer it to other disciplines is evidence of true understanding [20, 21]. For example, students' understanding of the different levels of protein structure (primary, secondary, tertiary and quaternary structure) and the inter- and intramolecular forces that stabilize these structures is essential to learning in biochemistry. Therefore students must understand the fundamentals of intermolecular forces including recognition of hydrogen bonds to be able to understand this biochemistry concept.

For our purpose, incorrect ideas will be defined as any student's conception that is inconsistent with the accepted scientific definition or understanding of that concept. The incorrect ideas that form the foundation of this study have been observed by numerous faculty members while teaching their courses or have been previously reported in the literature [22–24].


Research has shown that some incorrect ideas are stronger than others, and that sometimes, they are resistant to change [17]. That statement has many implications for the way science is taught since as described by Garvin-Doxas et al. “…a new concept cannot be learned until the student is forced to confront the paradoxes, inconsistencies, and limitations of the mental model that already exists in the student's mind” [25]. Different teaching strategies have been reported as a way to overcome incorrect ideas, such as the use of active learning [15, 26] or cooperative learning [27–29], the use of analogies [2, 4, 30], and the use of computer software [31–33]. However, all these teaching strategies may fall short if teachers cannot identify which incorrect ideas students bring to the course. For that reason, targeted assessments are needed to probe prior knowledge [3, 34, 35].

Several assessments, also called concept inventories, are available to measure students' incorrect ideas and learning in different subjects [36–41]. Concept inventories have been useful for science teaching. For example, the force concept inventory (FCI) was published in 1992 [37], and since then, research has demonstrated how instruction has been improved in physics with its use [40]. This instrument has served as the basis for other inventories developed for sciences including biology [36, 42], chemistry [39], genetics [40], and molecular life sciences [38, 41]. These instruments consist of multiple-choice questions that target a specific concept, which has been demonstrated to be difficult for students. The options in each question are based on previously identified misconceptions or incorrect ideas [20]. The identification of these incorrect ideas can be made through interviews [20, 42] and/or based on research literature [40, 42]. These inventories, besides measuring conceptual understanding of students, can be used to compare different types of instructional strategies [40].

Although these inventories are focused on identifying students' incorrect ideas in science, similar to our goal with this instrument, few of them include both general chemistry and general biology concepts in one instrument. Some of the items in those concept inventories have been useful for developing our instrument; however, there is little available information about the psychometric properties of the items. Without psychometric information of the items, we do not know how well they are functioning. In addition, every time an instrument is developed, the psychometric properties and the validity of the instrument scores need to be examined regardless of whether the items are newly developed or modified from another instrument. For these reasons, we are developing this instrument that is designed to measure eight different concepts from general chemistry and biology that are considered prerequisites for biochemistry. It has been developed with a specific structure where three multiple-choice questions are used to measure a single concept. Having three multiple-choice questions for each concept gives us the opportunity to have replicate trials measuring the same concept. Therefore any conclusion made about a specific concept will be based on three answers rather than only one. The instrument's psychometric properties, such as reliability and validity of the scores, will be used to examine the structure of the instrument and the extent to which it is measuring those concepts.


To assess and draw conclusions about the impact that the instruction has on students' incorrect ideas, we need an instrument that produces reliable and valid scores. Validity and reliability are very important aspects of measurement, and the quality of our assessment will depend on them [43].

Validity is defined as the degree to which an instrument's scores measure or reflect the construct the instrument is designed to measure. Validity is very important since the inferences we make about the measures will depend on the instrument scores and how well we are measuring that specific concept [44, 45]. There are different aspects of validity such as content validity, construct validity, convergent validity, criterion-related validity and discriminant validity. At this stage of instrument development, it is appropriate to focus on content and construct validity.

Content validity is concerned with the instrument's ability to include or represent the content of a particular domain [44–46]. This aspect of validity is very important in the development of any instrument, and it does not depend on students' scores. Content validity is determined by experts in the field, and it includes revising the content of an instrument for clarity, correctness, and relevance and deciding to what degree the items reflect the content domain.

Construct validity is concerned with how well the item scores for the constructs or concepts within the instrument are behaving as theory predicts [44–46]. The items in an instrument should reflect the concepts that are being measured; therefore, the scores for related items should be correlated. If the scores are not reflecting the concept as expected, the items should be modified and retested. Factor analysis is one of the most used techniques to examine construct validity. It is concerned with the internal structure of the instrument, since it is based on the relationship between the scores of items measuring the same concept. In the case of this instrument three items are measuring the same concept. Factor analysis uses the covariance among those items' scores to discover a pattern that reflects the relationship between them.

Confirmatory factor analysis (CFA) is used when the intended underlying structure of the items is known. Therefore CFA is used to determine if the number of concepts and the relationships between the items are consistent with what is expected in the proposed model [47]. This instrument was designed with a specific structure, three items per concept. The proposed model for CFA will distinguish among the different concepts and the items that are intended to be a measure of them. To determine how well the proposed model fits the data, different types of fit indices are examined. Absolute fit indices tell us how well the proposed model fits the data, comparing it with the best possible model. One such index is the chi square test of model fit (χ2). This index is very dependent on sample size: as sample size increases, the chance of observing significant lack of fit between the proposed model and the data increases [47]. Therefore, other indices that are insensitive to sample size are taken into account, such as parsimony indices, which indicate how close the proposed model is to the data. A parsimony index for categorical data is the Weighted Root Mean Square Residual (WRMR), with values less than 1.0 representing a good fit [47, 48]. Another type of index is the incremental fit index, which, unlike the others, compares the proposed model with a completely uncorrelated model. Values close to one indicate a good fit. One common incremental fit index is the Comparative Fit Index (CFI). Hu and Bentler (1999) have indicated that CFI values greater than .95 indicate a good fit [49], although some researchers use .90 [50]. These cutoffs give us an idea of how well the proposed model fits the data, so for a model to be considered acceptable we hope for values close to or better than those indicating a good fit.
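These rule-of-thumb cutoffs can be expressed as a trivial check. The threshold values below are the ones quoted above (WRMR below 1.0, CFI above .95); the function itself is our own illustrative sketch, not something from the study:

```python
def fit_acceptable(cfi, wrmr, cfi_cutoff=0.95, wrmr_cutoff=1.0):
    """Apply the rule-of-thumb cutoffs discussed above:
    CFI at or above its cutoff AND WRMR at or below 1.0 suggest good fit."""
    return cfi >= cfi_cutoff and wrmr <= wrmr_cutoff

print(fit_acceptable(0.97, 0.85))  # True: both indices within their cutoffs
print(fit_acceptable(0.90, 1.20))  # False: CFI too low and WRMR too high
```

In practice these indices are reported together with χ2, and a researcher weighs all of them rather than applying a single mechanical threshold.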

Reliability is another aspect of measurement that should be studied to determine if the instrument is functioning well. An instrument that is functioning well is one that yields scores that are consistent, that is, items that are related or measuring the same concept should produce scores that are correlated. This type of reliability is known as internal consistency, and one approach to calculate it is using Cronbach's alpha. This coefficient depends on the correlation among items, the number of items and the variance among item scores. The higher the correlation among items measuring the same concept, the higher the value for the alpha. In the same way, Cronbach's alpha increases as the number of items increases. In the literature, there are different cutoff values to determine if the test scores are reliable or not, but it depends on the test purpose [45]. The most common cutoff reported is .70 [45, 51].
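The formula behind Cronbach's alpha can be sketched as follows. This is a minimal illustration with invented 0/1 item scores; the study itself used standard statistical software for the reliability analysis:

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha from a list of per-item score lists (one list per item).

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores),
    where k is the number of items.
    """
    k = len(item_scores)
    n_students = len(item_scores[0])
    item_variances = [variance(scores) for scores in item_scores]
    totals = [sum(item[s] for item in item_scores) for s in range(n_students)]
    return k / (k - 1) * (1 - sum(item_variances) / variance(totals))

# Invented 0/1 scores: three items answered by five students.
items = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 0],
]
print(round(cronbach_alpha(items), 2))  # 0.79
```

As the formula makes visible, alpha rises when items measuring the same concept covary strongly (the total-score variance dominates the summed item variances) and when the number of items increases, which is the behaviour described above.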

This article describes the development of a multiple-choice instrument that captures basic information from prior courses, such as general chemistry and biology. It is assumed that students have completed these courses or the equivalent before entering biochemistry. The results from an administration of the instrument including descriptive statistics, reliability, and CFA are also discussed.


Context for Instrument Development

This instrument has been developed as part of the evaluation of a NSF-funded project, POGIL Biochem: Advancing Active Learning Approaches in Biochemistry. This project is currently engaged in field-testing a coherent set of Process Oriented Guided Inquiry Learning (POGIL) activities for undergraduate biochemistry with the aim of broad dissemination. A set of 20 core collaborators have been involved in different activities, such as the evaluation and the testing of the materials, as part of the dissemination of the project. These core collaborators are experienced biochemistry faculty from a variety of colleges and universities across the US. They also participated in a summer workshop in which items for this instrument were written.

Development of the Instrument

The development of an instrument that produces valid and reliable scores is an iterative process that involves several steps as shown in Fig. 1. The first step involved investigating content validity of the constructs to be measured in the instrument. We started with a list of concepts given as declarative statements. These concepts from general chemistry and general biology are considered prerequisite knowledge for biochemistry courses and are aligned with the core collaborators curricula. Three of the authors, with assistance from a group of five biochemistry instructors, compiled a list of statements that were chosen based on previous teaching experience. Additional information about biology concepts was provided by a sixth biochemist who also teaches general biology. A total of 11 statements were written, six encompassing chemistry concepts, four from biology, and one pertaining to graphing skills. A panel of nine reviewers, which included faculty from eight research universities across the US, examined the statements. Two of the reviewers were biochemists, three of them were chemists, and four were biologists. To examine the validity of the statements, the reviewers were asked to consider three questions for each statement:

  • Are they factually correct?

  • Are they clear and unambiguous?

  • Are they relevant to biochemistry learning?

Modifications to the statements were performed according to experts' suggestions.

After modification of each statement, the authors identified three incorrect ideas that students could have. For chemistry prerequisite knowledge, the incorrect ideas were identified from the literature and from the authors' and core collaborators' teaching experience. For example, students have numerous incorrect ideas related to hydrogen bonding as an intermolecular force, specifically the incorrect idea that all hydrogens are capable of hydrogen bonding. This incorrect idea was also presented in a study by Tarhan et al. [23]. Another topic that students struggle with is bond energy. Some of the incorrect ideas are that “bond formation requires energy while bond breaking releases energy” and that “energy is required in both bond forming and bond breaking” [22, 24].

The 20 biochemistry core collaborators reviewed the chosen incorrect ideas as part of the content validity process, suggesting changes based on their own teaching experiences. The final set of incorrect ideas was used in the second step of instrument development, which involved item (multiple-choice question) development by core collaborators at a summer workshop. They worked in pairs and were assigned to write three multiple-choice questions for each of two concepts, with the specific question structure described below in Table I. The resulting pool of multiple-choice questions contained 85 items.

Table I. Prescribed question structure

Stem:
  • Direct - the statement is presented directly in the question
  • Inverse - the statement is presented in an inverse way
  • Applied - the statement is presented in another context or applied to a situation

Options:
  • Four answer options - one correct answer and three incorrect ideas
  • Consistent incorrect ideas as distractors - the same in each related question

The last step in the development of the instrument involved choosing the items to be tested and the revisions of the final items. This step was performed by the authors. An evaluation of items written for the eleven statements revealed that three of the statements were too broad; as a result items based on these statements were vague or contrived. Therefore, the set of eleven statements was narrowed to eight, leaving items from the most coherent statements as part of the final instrument. Next, from the pool of remaining multiple-choice questions, a total of 24 questions were chosen as the final set to be tested. Three items matching each of the three formats described in Table I were chosen for each of the eight remaining statements. Items were chosen based on their clarity and adherence to the prescribed format. Four of the authors performed the final revisions of the items for content and clarity. The resulting instrument contained 24 multiple-choice questions, each with four options involving three common incorrect ideas as distractors and one correct choice. The definitive set of statements and the incorrect ideas are shown in Table II. We acknowledge that this set of statements represents a small portion of important concepts that are prerequisite for biochemistry, but to have an instrument with a reasonable length, we limited the number of concepts to eight.

Table II. Statements and their associated incorrect ideas

Bond energy: When a chemical bond forms, energy is released.
  • Bond formation requires energy.
  • Bond formation sometimes requires energy and sometimes releases energy.
  • The strength of the bond determines when energy is released or absorbed when bonds are formed.

Free energy: The free energy change for a process (ΔG) indicates whether or not a process is spontaneous at a given temperature.
  • The free energy change for a process indicates whether or not the process releases heat.
  • Heat is released in all spontaneous processes.
  • A spontaneous reaction proceeds quickly.

London dispersion forces: London dispersion forces are the only type of noncovalent interaction that can occur between nonpolar molecules.
  • London dispersion forces are only found in nonpolar molecules.
  • There are no attractions between nonpolar molecules.
  • A dipole is not involved in the interaction between nonpolar molecules.

pH/pKa: Comparing the pH value of an aqueous solution of a substance to the pKa values of an ionizable group gives information about the ionization state of that group.
  • At pH = pKa, the group is totally protonated or totally deprotonated.
  • When pH is below pKa species are deprotonated, or when pH is above pKa, species are protonated.
  • The ionizable groups are unaffected by pH.

Hydrogen bonding: A hydrogen bond is a noncovalent interaction typically between N, O, or F and a hydrogen atom bonded to N, O, or F.
  • All hydrogens are capable of hydrogen bonding.
  • A covalent bond with a hydrogen in it is a hydrogen bond.
  • Any polar molecule can make a hydrogen bond.

Alpha helix: The interior of an alpha helix contains atoms from the protein backbone in close contact.
  • The interior of an alpha helix contains the side chains (R-groups) of the amino acid residues.
  • The interior of an alpha helix contains water molecules.
  • The interior of an alpha helix is empty.

Amino acids: An internal (not terminal) amino acid in an unmodified peptide has no more than one fully charged group.
  • Internal amino acids have at least two charged groups.
  • Some internal amino acids have three charged groups.
  • Free amino acids have the same overall charge as peptide internal amino acids.

Protein function: Changes in amino acid sequence of a polypeptide sometimes change protein function.
  • Changes in amino acid sequence always change protein function.
  • Changes in amino acid sequence never change protein function.
  • Changes in amino acid sequence only decrease protein function.

Like other multiple-choice tests intended to provide rigorous assessment of student learning over time, the security of this instrument must be maintained so that it will remain useful and the data meaningful. Wide availability of the test on the internet would compromise security and make results difficult to interpret. Instructors interested in using the instrument in their own classrooms may contact the authors about procedures for accessing and securely administering the test.

Testing and Participants

The instrument was administered as a pretest at a research-extensive university with a focus on undergraduate and graduate education in the Midwestern United States in Spring 2010. The pretest was part of an introductory biochemistry course that is required for students majoring in biochemistry, chemistry, and for most biological sciences majors. The course is also populated with some nutrition and health science, agronomy, and food science majors. The course typically comprises 90% undergraduate students and 10% graduate students. The majority of the undergraduates are in their third year of study (75%), with some in their senior year (25%). The students were given ∼ 30 minutes to complete the pretest, and it was administered at the end of the second day of class. The instructor of the course explained the purpose and the importance of the pretest for instruction and encouraged students to complete it to their best ability. The data collected from the instrument is presented here and used to investigate the validity of the instrument's scores.

Analysis of Data

A total of 185 sets of data were received from the instrument, but only those with responses for all items (166) were used in the analysis. These 166 complete data sets were coded by assigning 0 to incorrect answers and 1 to correct answers using SAS statistical software version 9.2. Descriptive statistics were obtained and reliability analyses were performed using SPSS statistical software version 17.
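The coding step can be sketched in plain Python. This is a hedged illustration only: the paper used SAS 9.2, and the answer key shown here is hypothetical, not the instrument's actual (unpublished) key.

```python
# Hypothetical answer key for the 24 items ("a" throughout is purely
# illustrative; the real key is deliberately not published).
ANSWER_KEY = {f"item{i}": "a" for i in range(1, 25)}

def code_responses(students):
    """Drop incomplete response sets, then code each answer as
    1 (correct) or 0 (incorrect) against the key."""
    complete = [s for s in students
                if all(s.get(item) is not None for item in ANSWER_KEY)]
    return [{item: int(s[item] == key) for item, key in ANSWER_KEY.items()}
            for s in complete]

# Toy data: one complete response set (all correct) and one that is
# missing item24, which is therefore excluded from the analysis.
student_a = {f"item{i}": "a" for i in range(1, 25)}
student_b = {f"item{i}": "b" for i in range(1, 24)}
coded = code_responses([student_a, student_b])
# coded contains only student_a's 24 item scores, all equal to 1
```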

A CFA estimating how well the proposed model fits the coded data was performed in Mplus 5.2. It was run on a first-order model (eight-factor solution) in which the concepts were allowed to correlate with each other. Because the data are categorical, a Weighted Least Squares – Mean and Variance adjusted (WLSMV) estimation method, which uses the tetrachoric correlation matrix of the measured items, was employed [47, 48].


Descriptive Statistics

Descriptive statistics, including mean, standard deviation, skewness, and kurtosis, are shown in Table III. The mean obtained was 11.1 points out of 24 possible points (46%), a low percentage indicating that students do not have a strong grasp of the concepts in the test. The distribution of the scores is approximately normal, since its skewness and kurtosis values are within the range of ±1, a desirable characteristic for a good instrument.
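As a rough sketch of the ±1 screening rule, skewness and excess kurtosis can be computed directly from a set of total scores. The version below uses simple population moments; SPSS, which the authors used, reports sample-adjusted estimates, so its values differ slightly.

```python
def moments(xs):
    """Skewness and excess kurtosis from population moments."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurtosis

def roughly_normal(scores, limit=1.0):
    """Both statistics within +/- `limit` suggests approximate normality."""
    s, k = moments(scores)
    return abs(s) <= limit and abs(k) <= limit

# Toy, symmetric distribution of total scores (out of 24 points).
totals = [8, 9, 10, 10, 11, 11, 11, 12, 12, 13, 14]
# roughly_normal(totals) -> True (skewness 0, excess kurtosis -0.58)
```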


Confirmatory Factor Analysis

The construct validity for the instrument was investigated using CFA. By design, we have three multiple-choice items per concept; therefore, the model was proposed to have eight concepts or factors: bond energy, free energy, London dispersion forces, pH/pKa, hydrogen bonding, alpha helix, amino acids, and protein function.

A CFA was performed to determine how well the data fit the proposed model. The results for the eight-factor solution are shown in Table IV. The χ2 and its p value indicate a nonsignificant difference between the proposed model and the data, an excellent fit. The other fit indices, CFI and WRMR, also fall within their cutoff values (Table IV). Together, these results indicate that the model fit is very good for the eight-factor solution, a positive result considering that the instrument was designed to measure eight distinct concepts.


To determine how well the items load on each factor, the standardized factor loadings for the items in each factor or concept were analyzed. These standardized factor loadings indicate the correlation of each item with the latent variable being measured by the factor when the variance of the factor is set to one. These loadings, measured by the pattern coefficients for each concept, are presented in Table V. The standardized factor loadings from the CFA were statistically significant for the majority of the items in the eight concepts.

Concept             Item   Standardized loading
Free energy           8    0.692a
                     14    0.717a
                     19    0.896a
London dispersion     6    0.709a
                     10    0.569a
                     20    0.756a
pH/pKa                4    0.941a
                     15    0.504a
                     24    0.534a
Hydrogen bonding      1    0.649a
                     12    0.860a
                     23    0.222
Alpha helix           7    0.966a
                     16    0.963a
                     21    0.941a
Amino acids           2    0.682a
                      5    0.595a
                     13    0.432a
Protein function      9    0.501a
                     17    0.246
                     22    0.584a

a Statistically significant loading.


The reliability of the factors was assessed using Cronbach's alpha coefficient as a measure of internal consistency. The coefficients for the eight concepts are shown in Table VI. As shown in the table, three concepts (hydrogen bonding, amino acids, and protein function) had the lowest coefficients. These results indicate weak correlation among students' responses to the items in those concepts. However, Cronbach's alpha depends on the number of items, and since each concept has only three items, somewhat low values are to be expected [45, 51]. Low true variability in students' responses, including random guessing, also tends to decrease Cronbach's alpha. Perhaps the best way to think of this reliability information is that, even for the lowest Cronbach's alpha values, the three items together are still a more reliable measure than a single item would be.
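For reference, Cronbach's alpha for a k-item scale is α = k/(k−1) · (1 − Σ var(item) / var(total)). A minimal sketch of that formula follows; this is an illustration with toy data, not the authors' SPSS procedure.

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of students' item-score vectors
    (here, 0/1 scores on the k items of one concept)."""
    k = len(item_scores[0])

    def var(xs):  # population variance, as in the standard alpha formula
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([s[j] for s in item_scores]) for j in range(k)]
    total_var = var([sum(s) for s in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Perfectly consistent responses (each student answers all three items
# the same way) give alpha = 1.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
# cronbach_alpha(consistent) -> 1.0
```

Conversely, responses with no agreement across items drive alpha toward zero, which is the pattern behind the low coefficients for the three weakest concepts.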

Concept             Cronbach's alpha   Mean   SD
Bond energy              0.798         0.05   0.23
Free energy              0.659         0.23   0.43
London dispersion        0.542         0.33   0.47
Hydrogen bonding         0.306         0.12   0.33
Alpha helix              0.878         0.28   0.45
Amino acids              0.396         0.42   0.50
Protein function         0.308         0.17   0.38

The mean score and the standard deviation for each concept are presented in Table VI. The mean for each concept was calculated from students' correct answers to the three items measuring that concept and ranges from 0 to 1: a mean of 0 means that no student answered all three items correctly, and a mean of 1 means that every student did. For example, as shown in Table VI, the free energy concept had a mean of 0.23, meaning that 23% of the students answered all of the items related to that concept correctly. The means range from 0.05 for the bond energy concept to 0.42 for amino acids.
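Taking this interpretation literally (a student scores 1 on a concept only when all three of its items are correct), the concept mean is simply the proportion of such students. A small sketch with toy data:

```python
def concept_mean(item_scores):
    """Fraction of students who answered every item of a concept
    correctly, given 0/1 score vectors (one per student)."""
    all_correct = [1 if all(v == 1 for v in s) else 0 for s in item_scores]
    return sum(all_correct) / len(all_correct)

# Toy data for one concept: 2 of 4 students get all three items right.
scores = [[1, 1, 1], [1, 0, 1], [1, 1, 1], [0, 0, 0]]
# concept_mean(scores) -> 0.5
```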

To understand the distinction between reliability and evidence of understanding, consider the bond energy concept, which had the lowest mean but good reliability evidence. The items in the concept are shown in Table VII. In this case, the majority of students do not understand the concept, but very few answered one of the items correctly and the others incorrectly; this consistency across the three items is what produces the high reliability.

3. Heat is given off when hydrogen burns in air according to the equation: 2H2 + O2 → 2H2O. Which of the following is responsible for the heat? [39]
 a. Breaking the bonds in H2 and O2 gives off energy. (60%)
 b. Forming bonds to make H2O gives off energy. (16%)
 c. Both a) and b) above are responsible for the heat. (12%)
 d. Breaking bonds in O2 is responsible for the heat because they are stronger than bonds in H2. (13%)
11. Which statement about the breaking of a single chemical bond is true?
 a. Energy is released. (84%)
 b. Energy is absorbed. (6%)
 c. Energy is released or absorbed depending on the polarity of the bond being broken. (2%)
 d. Energy is released or absorbed depending on the strength of the bond being broken. (8%)
18. Which statement is always correct about the energy changes that occur during bond formation? (Cooper and Klymkowski, personal communication)
 a. Depending on the relative electronegativities of the atoms making the bond, energy may be released or absorbed during bond formation. (10%)
 b. During bond formation, energy has to be added and will be stored in the form of a bond. (77%)
 c. Bond formation releases energy to the surroundings. (8%)
 d. The strength of the bond determines whether energy is absorbed or released during bond formation. (6%)


A method for the development of an instrument based on typical incorrect ideas from general chemistry and biology that students have when entering biochemistry has been described. This method includes different measurement aspects to ensure that we are developing an instrument that will produce valid and reliable scores for our intended usage with biochemistry students. One aspect was content validity, which includes the development of the items to be tested. Another was construct validity, as measured by CFA, together with reliability, as measured by the internal consistency of the scores using Cronbach's alpha coefficient.

The instrument tested consists of 24 multiple-choice items representing eight distinct concepts: bond energy, free energy, London dispersion forces, pH/pKa, hydrogen bonding, alpha helix, amino acids, and protein function. Each concept is represented by three items, which allows us to draw conclusions based on the pattern of the students' answers. Results obtained from a CFA indicated that an eight-factor solution had an excellent fit, since the values for the fit indices CFI and WRMR are within the established cutoff values, including a nonsignificant χ2. Cronbach's alpha coefficient for each concept was calculated, and the results indicated low values for three concepts: hydrogen bonding, amino acids, and protein function. Since this method is an iterative one, the next step for the instrument will be tightening the parallel structure of the items within these concepts and re-reviewing their content validity. Finally, the instrument will be used as a pre/post-test to assess gains in student understanding of foundational concepts in chemistry and biology over the course of learning biochemistry in the context of the POGIL Biochem: Advancing Active Learning Approaches in Biochemistry project. Future work will focus on analysis of pre/post-test results as well as strategies that instructors can use to address students' incorrect ideas.

The process of instrument design described here demonstrates that a systematic plan for instrument construction, combined with rigorous, iterative analyses, can produce an assessment instrument that functions well according to psychometric standards. Our experience also illustrates the challenges inherent in the creation of an instrument. Reflecting on the process, two keys to our success stand out: 1) the creation of an explicit structure to guide the writing of multiple-choice questions, and 2) the help of a relatively large number of experienced biochemistry instructors in writing instrument items. Therefore, in addition to generating a useful assessment instrument, we hope that our method of instrument development will be generalizable and serve as a model for the design of instruments useful to biochemistry educators in the future.