Towards more replicable content analysis for learning analytics

Kitto, Kirsty; Manly, Catherine A; Ferguson, Rebecca and Poquet, Oleksandra (2023). Towards more replicable content analysis for learning analytics. In: LAK23: 13th International Learning Analytics and Knowledge Conference (LAK2023), ACM, New York, NY pp. 303–314.



Content analysis (CA) is a method frequently used in the learning sciences and so is increasingly applied in learning analytics (LA). Despite this ubiquity, CA is a subtle method, with many complexities and decision points affecting the outcomes it generates. Although appearing to be a neutral quantitative approach, coding CA constructs requires attention to decision making and context that aligns it with a more subjective, qualitative interpretation of data. Despite these challenges, we increasingly see the labels in CA-derived datasets used as training sets for machine learning (ML) methods in LA. However, the scarcity of widely shareable datasets means research groups usually work independently to generate labelled data, with few attempts made to compare practice and results across groups. A risk is emerging that different groups are coding constructs in different ways, leading to results that will not prove replicable. We report on two replication studies using a previously reported construct. A failure to achieve high inter-rater reliability suggests that coding of this scheme is not currently replicable across different research groups. We point to potential dangers in this result for those who would use ML to automate the detection of various educationally relevant constructs in LA.
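The inter-rater reliability the abstract refers to is commonly quantified with a chance-corrected agreement statistic such as Cohen's kappa (the abstract does not name the specific statistic used, so kappa here is an illustrative choice). The sketch below computes kappa from scratch for two hypothetical raters coding eight items with a binary construct; the rater data is invented for illustration only:

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    # Observed agreement: fraction of items both raters coded identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's label marginals.
    marg_a = Counter(rater_a)
    marg_b = Counter(rater_b)
    p_e = sum((marg_a[k] / n) * (marg_b[k] / n) for k in marg_a)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical binary codings of 8 items by two independent raters
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0]
rater_2 = [1, 0, 0, 1, 0, 1, 1, 1]
print(round(cohen_kappa(rater_1, rater_2), 3))  # prints 0.467
```

Here raw agreement is 75%, yet kappa is only 0.467, well below the thresholds (often around 0.7 to 0.8) typically expected before treating CA labels as reliable enough to train ML models on, which is the kind of shortfall the paper reports across research groups.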
