N. F. Blake, "The Canterbury Tales Project: The New Lineation"

The Canterbury Tales Project aims to transcribe into computer form every line found in all fifteenth-century witnesses of the poem. This naturally means having a lineation which enables the computer to recognize the same line in any manuscript and which makes sure that every line has a number which is different from any other line.

At present there are three major systems, two of which are based on the ten fragments/groups isolated by Furnivall in the nineteenth century. In the first, the ten fragments/groups are identified by the Roman numerals from I to X following the Furnivall order. In the second, which follows the Bradshaw shift in placing Fragment VII after Fragment II and amalgamating them as a single group, they are identified by the first nine capital letters of the alphabet from A to I (with Fragments II and VII sharing the letter B). The order of the fragments in the latter system therefore differs from that in the former. These two systems are usually found side by side, since the lines included in any fragment/group are the same whether the editor is using one system or the other. They are used, for example, in The Riverside Chaucer (1987) and are found in most critical works. The third system is that found in Blake's edition of the poem based on the Hengwrt manuscript (1980) and is based on the sectional grouping of the tales found in Hengwrt. Since there are twelve sections there, the numbering in this system runs in Arabic numerals from 1 to 12. This system has not been widely adopted.

When we came to start the transcription of the Wife of Bath's Prologue we followed the traditional fragment numeration, but we soon ran into difficulties. Those who have tried to use the Manley-Rickert edition of the poem (1940) know how difficult it can be, when using their apparatus criticus, to find out precisely what a given manuscript contains at any particular point. They themselves evidently toyed with the idea of developing a new lineation, but eventually decided against it. Their purpose, however, was to produce an edition of what they considered the original master copy of the poem to have contained, and so they were less concerned with those parts of the poem which they regarded as spurious because they were not in that master copy. We cannot take that view since we are recording everything from the fifteenth century, whether it turns out to be spurious or not. We do not want to prejudge that outcome, since at present we are interested simply in providing the data which would allow more informed discussion of the text's development. Therefore, a new system of lineation became imperative and had to be in place before the first CD-ROM was issued. The following paragraphs outline the general principles behind the new system.

In order to compare any line or group of lines across all witnesses, the computer must recognize which lines are identical or based on the same exemplar so that it can build up an apparatus criticus. Equally it must know in what order the lines occur in the different witnesses, so that the order of lines and any additions or omissions can also be compared. Where one line replaces another, it must recognize that these are variants which occupy one slot though they are not the same. Each line must have a distinctive label and that label must not only be unique but also identify the position of the line in a sequence. Quite apart from lines of text, there are numerous headings, sub-headings, quotations and marginal glosses, and all of these must have labels which allow them to be compared where necessary and their position in the text to be identified.

One aim of the project is to chart the development of the text, both the main body of the poem and the marginal annotations, over the fifteenth century. It is important not to choose a lineation system which prejudices the final outcome by implying one order or arrangement of lines is the starting point of the text's development. This is naturally difficult, for any base manuscript chosen for the lineation system could be taken to be programmatic. Finally, it is helpful if the new system is not too far away from those currently available, but is at the same time sufficiently distinctive that it will not be confused with them.

We have tried to implement these principles in the following ways. After a consideration of the text in all witnesses, we have divided the poem into blocks which are maintained as separate entities in all the extant complete witnesses. This means that the principal units are the tales and links. Some tale units may have the prologue or epilogue included provided that the prologue or epilogue is always attached to the tale and that it does not open or close another tale. Tale units are distinguished by two capital letters referring to the teller or the tale. Thus they may reflect the teller's title (KN for Knight, PL for Ploughman, WB for Wife of Bath, CY for Canon's Yeoman) or the name of the tale (TT for Tale of Sir Thopas, TG for Tale of Gamelyn). These abbreviations refer to whatever is included in that tale unit, which is sometimes merely the tale, but is on occasions both the tale and the prologue and/or epilogue, as is true of CY. Where confusion could arise because two pilgrims could share the same two letters (e.g. Friar and Franklin), they are distinguished, usually by following traditional abbreviations: FR for Friar and FK for Franklin.

Link units, on the other hand, are distinguished by L and an arbitrary number from 1 to 37. For example, L1 is the KN-MI link. These arbitrary numbers have been assigned for two reasons: some links may be allocated in different witnesses to different tellers (and to refer to a link by one pair of tellers might prejudice what tales the link originally joined), and some tales are joined together by different links in different witnesses. In the first case the link unit is treated as a single link even though it refers to different tellers, but in the second case the links are treated as separate units, although they are given continuous numbers in the sequence. Thus RE 95 would refer to the ninety-fifth line of the tale unit allocated to the Reeve, and L4 1 would be the first line of link unit 4, which is the link uniting TG to CO in Rawlinson 141.
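The unit labels described above can be checked mechanically. The following is a minimal sketch in Python, not the project's own software: the names `UnitRef` and `parse_unit_ref` are hypothetical, and the pattern assumes that a reference is written as the unit label, one space, and the line number, as in the examples RE 95 and L4 1.

```python
import re
from typing import NamedTuple

# Assumed written form: two capitals (tale unit) or "L" plus a number
# (link unit), then a single space and the line number.
REF = re.compile(r"^(?:(?P<tale>[A-Z]{2})|L(?P<link>\d{1,2})) (?P<line>\d+)$")


class UnitRef(NamedTuple):
    kind: str  # "tale" or "link"
    unit: str  # e.g. "RE" or "L4"
    line: int  # line number within the unit


def parse_unit_ref(text: str) -> UnitRef:
    """Split a reference such as 'RE 95' or 'L4 1' into unit and line."""
    m = REF.match(text)
    if m is None:
        raise ValueError(f"not a unit reference: {text!r}")
    line = int(m.group("line"))
    if m.group("tale"):
        return UnitRef("tale", m.group("tale"), line)
    link = int(m.group("link"))
    # The article states that link units are numbered from 1 to 37.
    if not 1 <= link <= 37:
        raise ValueError(f"link number out of range: {text!r}")
    return UnitRef("link", f"L{link}", line)


if __name__ == "__main__":
    print(parse_unit_ref("RE 95"))
    print(parse_unit_ref("L4 1"))
```

The sketch does not check a tale label against the list of units actually in use, since the full break-down of units is not given here.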

For practical purposes it is easier to start with a lineation system based on manuscript chronology, since the early manuscripts are well known and studied. To base the line numbers on a count of the lines in all the witnesses would delay the lineation system until every transcription was complete. The base text for lineation purposes is the earliest witness for that unit in accordance with current palaeographical dating. Although there may be discrepancies as to some datings and the order of some manuscripts, there is a general consensus about most. Thus we take Hengwrt as the base for much of the poem since it is widely believed to be the earliest extant manuscript. Where Hengwrt fails for the end of PA and for RT because some of the manuscript is now lost (and the same applies to Corpus Christi Oxford 198), we use Harley 7334 as the base.

The base text is followed scrupulously for lineation purposes except in a few minor instances where we are satisfied from the rhyme or stanza system that a line was intended but was inadvertently not copied. The numbering for each unit starts at 1 for the first line of the English text. Any heading for a unit is numbered 0, however long it may be. The numbering proceeds regularly to the end of that unit, with each unit being identified either by the two capitals of the tale units or L and a number for the link units. Additions in other witnesses have the base number of the preceding line, a slash and then numbering from 1. Thus WB 44/1 and WB 44/2 would be the first and second lines of an addition after WB 44 in a given witness in what is the Wife of Bath's tale unit (which in this case also includes the prologue). If the line numbering goes from WB 42 to WB 45, it would signify that WB 43 and WB 44 were missing in that witness. A variant of a line has the base number of that line, a slash and a lower-case letter starting with a. Thus WB 44/a and WB 44/b would be two variants, each different from the other and from the base, of line WB 44.
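The numbering rules for base lines, additions and variants lend themselves to a small illustration. The sketch below, in Python, is offered under stated assumptions rather than as the project's implementation: `LineLabel`, `parse_label` and `missing_base_lines` are hypothetical names, a variant is assumed to carry a single lower-case letter, and a variant is counted as filling the slot of its base line while an addition is not.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed written form: unit label, space, base number, then an optional
# "/n" (addition) or "/a" (variant) suffix.
LABEL = re.compile(
    r"^(?P<unit>[A-Z]{2}|L\d{1,2}) (?P<base>\d+)(?:/(?P<suffix>\d+|[a-z]))?$"
)


class LineLabel(NamedTuple):
    unit: str
    base: int                # 0 is the heading of the unit
    addition: Optional[int]  # n in "WB 44/n", else None
    variant: Optional[str]   # letter in "WB 44/a", else None


def parse_label(text: str) -> LineLabel:
    m = LABEL.match(text)
    if m is None:
        raise ValueError(f"not a line label: {text!r}")
    suffix = m.group("suffix")
    addition = int(suffix) if suffix and suffix.isdigit() else None
    variant = suffix if suffix and suffix.isalpha() else None
    return LineLabel(m.group("unit"), int(m.group("base")), addition, variant)


def missing_base_lines(labels: List[str]) -> List[int]:
    """Base numbers skipped within one unit of one witness.

    Only the span between the lowest and highest base number present is
    examined; lines absent before or after that span cannot be detected
    without knowing the length of the unit.
    """
    parsed = [parse_label(label) for label in labels]
    # A base line or a variant fills the slot; an addition does not.
    present = {p.base for p in parsed if p.addition is None}
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]


if __name__ == "__main__":
    print(parse_label("WB 44/1"))
    print(missing_base_lines(["WB 42", "WB 45"]))
```

For a witness whose numbering goes from WB 42 to WB 45, `missing_base_lines` reports 43 and 44, which matches the reading of such a gap given above.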

Sub-headings within the text are recorded as additions, but marginalia and glosses over the line are treated as glosses. They are identified by gl after the line number: WB 44gl would be a gloss to WB 44 and WB 44/2gl would be a gloss to WB 44/2. Where there is more than one gloss to a line, these are distinguished by a slash and a number after the gl so that WB 44gl/1 and WB 44gl/2 would be two separate glosses to WB 44.
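The gloss labels can be separated into the line they belong to and their own number in the same way. The following Python sketch is illustrative only: `parse_gloss` is a hypothetical name, and the pattern assumes the written forms shown in the examples WB 44gl, WB 44/2gl and WB 44gl/1.

```python
import re
from typing import Optional, Tuple

# Assumed written form: a line label (with optional addition or variant
# suffix), then "gl", then an optional "/n" when a line has several glosses.
GLOSS = re.compile(
    r"^(?P<line>(?:[A-Z]{2}|L\d{1,2}) \d+(?:/(?:\d+|[a-z]))?)gl(?:/(?P<n>\d+))?$"
)


def parse_gloss(text: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (glossed line label, gloss number or None).

    Returns None when the label is not a gloss label at all.
    """
    m = GLOSS.match(text)
    if m is None:
        return None
    number = int(m.group("n")) if m.group("n") else None
    return m.group("line"), number


if __name__ == "__main__":
    for label in ["WB 44gl", "WB 44/2gl", "WB 44gl/1", "WB 44gl/2", "WB 44"]:
        print(label, "->", parse_gloss(label))
```

Keeping the glossed line as a plain label means that a gloss can be looked up against the line or addition it annotates without any further parsing.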

It is hoped that this system is sufficiently transparent and close to previous numerations to make it readily understood by users of the computer texts. A full break-down of the units and their lineation will be provided in the second volume of the Occasional Papers. On the CD-ROMs, users will be able to switch between the old and new numberings.