Reading Concordances with Algorithms

📅 Date: 28 June 2025, 9:00 AM–12:30 PM

📍 Part of: CL2025: Pre-conference workshops

🔗 More Info: CL2025 Homepage

Instructors

Prof. Dr. Stephanie Evert

Prof. Dr. Michaela Mahlberg

Nathan Dykes, M.A.

Dr. Aleksandr Piperski

Workshop description

Concordance analysis via a KWIC (Key Word In Context) display is a mainstay of contemporary corpus linguistics, serving as a bridge between qualitative and quantitative approaches to the study of text. This course offers an introduction to “concordance reading”, i.e. analysis of concordance data supported by computational algorithms. We begin by situating concordance reading in the wider context of qualitative-quantitative research. We introduce key concepts for the description of patterns in concordances (e.g. collocations and colligations) as well as different examples of concordance software (e.g. AntConc, CLiC, and CQPweb). We then focus on specific concordance reading strategies, such as selecting, sorting, and grouping concordance lines, providing formal definitions and corresponding computational algorithms. Participants will gain hands-on experience with FlexiConc, a computational library for concordance analysis to be released in autumn 2024, as well as other concordance tools. Participants will also have the opportunity to consider the potential of concordance reading for their own research contexts.
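To make the three strategies named above more concrete, here is a minimal sketch in plain Python. It does not use FlexiConc or any other concordance tool, and it is not the workshop's own implementation: the helper `kwic`, the toy text, and the window size are all assumptions chosen purely for illustration.

```python
import re
from collections import defaultdict

def kwic(tokens, node, window=4):
    """Return one (left, node, right) triple per occurrence of `node`.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            lines.append((left, tok, right))
    return lines

# Toy corpus (invented for this example)
text = ("The fire burned low. She sat by the fire and read. "
        "He lit a fire again. They sat by the fire all night.")
tokens = re.findall(r"\w+", text)

lines = kwic(tokens, "fire")

# Selecting: keep only lines whose left context contains "sat"
selected = [l for l in lines if "sat" in [w.lower() for w in l[0]]]

# Sorting: order lines alphabetically by their right context
ordered = sorted(lines, key=lambda l: [w.lower() for w in l[2]])

# Grouping: partition lines by the word immediately left of the node
groups = defaultdict(list)
for l in lines:
    key = l[0][-1].lower() if l[0] else ""
    groups[key].append(l)

# Print the sorted concordance in a simple KWIC layout
for left, node, right in ordered:
    print(f"{' '.join(left):>20}  [{node}]  {' '.join(right)}")
```

In this sketch each strategy is an ordinary operation on a list of concordance lines: a filter, a sort key, and a partition. Dedicated tools offer far richer versions of the same operations, for example sorting by several context positions at once.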

Aims:

Introduction to concordance analysis

basics of concordance analysis
fundamental terms and concepts
concordance software and its functionalities
strategies for organizing concordances (different types of selecting, ordering, and grouping)

Computational algorithms

algorithmic approaches to concordance reading

Hands-on practice

exercises in which participants try out different concordance algorithms

Analysis trees for research documentation

Prerequisites

This workshop targets an interdisciplinary audience, including students and researchers in corpus linguistics, general linguistics, computational linguistics, digital humanities, and computer-assisted language learning. We will keep the technical discussion at a manageable level to accommodate participants from both technical and non-technical backgrounds. Those interested in more advanced techniques, such as lower-level concordance processing in Python, will be directed to additional resources and follow-up materials after the session.