Digitizing Gothic manuscripts

The advent of digital technology has opened the way for new methods of studying and presenting old texts. A digitized photo is divided into numerous elements (pixels, ‘picture elements’), each of which can be separately manipulated with appropriate software. As a result, the new technology offers new possibilities for filtering and singling out valuable information. It also facilitates easy presentation of the material with the help of multimedia methods and wide distribution via the Internet and CD-ROMs.

In this paper I first survey several projects presently being conducted, in various locations, on the study of old texts. Those projects are, for the most part, in their initial stages. The amount of material to be processed is so enormous that present research is still in its pilot stage.

The lion’s share of this paper is devoted to my study of the Gothic manuscripts. The idea of using digital technology for this end was proposed by Professor James Marchand in an article published in 1987, and in my work I follow his ideas. The preliminary conclusion I have reached while conducting this research is that digital technology is indeed a very useful tool for studying and presenting the manuscripts. I have also realized that the work still to be done would take more than one human lifetime.

1.  Introduction

One result of the spreading use of digital technology is the effort to make cultural heritage widely available in digital form. A manifestation of this trend is the digitization of visual arts, sciences and cultural resources, as well as the creation of online networks to provide access to them. The subject of this paper is how digital technology can be employed in deciphering and studying old texts, with emphasis on the study of the Gothic manuscripts.

The first part of the paper is a general survey of some projects currently being conducted around the world. I divide the survey into two sections; the first deals with the digital study of old texts in word-processing mode and the second with texts in image mode. The former involves mainly the creation of text corpora and the latter involves digital image processing for proper deciphering and presentation. The second part of this paper is a study of the Gothic manuscripts from a digital perspective. I describe the Gothic manuscripts and the research conducted on them from the middle of the 16th century up to the present. Every generation of scholars has used the technology of its own time for deciphering and presenting the Gothic manuscripts, and the paper includes a short survey of what has been done up until now.

The idea of using digital technology for handling the Gothic manuscripts was conceived by Professor James W. Marchand. I became interested in the subject through one of his students, Eugene Holman, who was a teacher of mine at the University of Helsinki. In this paper I follow closely the ideas Marchand proposed in an article he published on the subject in 1987.

The next step is a survey of the Gothic manuscripts and the problems encountered by those who endeavor to make them more transparent. Once the problems were defined, I examined the possibility of using existing character recognition algorithms for better deciphering, but concluded that those methods are inapplicable to this study. The research also included experimentation with existing digital filters and, again, I found them to be of little use for this study.


Consequently, I realized that I should develop other methods and appropriate software for advancing the study of the Gothic manuscripts, and possibly other old texts, with the aid of digital technology. In the third part of this paper, I describe the methods and software I developed for this end, with illustrations of the work I have conducted using them. One task was trying to restore the images of the Codex Argenteus. In this respect I believe I have succeeded; restoring one image takes around two days, which means that since there are 374 plates, they can all be restored in three to four man-years of work. The second task was to try to sort out the chaotic Ambrosian palimpsests. In this respect only the first steps have been taken. The amount of work to be done is so overwhelming that any attempt to suggest a time-frame would be futile at present. I end the paper with conclusions and suggestions for further research.

2.  Digitizing Cultural Heritage

The tradition of preserving society’s cultural heritage and enabling access to it is as old as cultural heritage itself. The first method, still used nowadays, is oral transmission. In the Passover Haggadah it is written that “… it would still be our duty to recount the story of the coming forth from Egypt; and all who recount at length the story of the coming forth from Egypt are verily to be praised.” Indeed, every Passover eve I recount, very briefly, the story to my children. The next step was mounting the words on some material like stone, clay, sheepskin, or papyrus. Whenever the means of preservation became too fragile and the text it contained was considered worthy of transmission to the next generations, it was copied onto another medium. When printing was invented, the process was streamlined and the amount of material that could be preserved grew considerably. Another innovation, photography, further enlarged the scope of the material being preserved.

One theme that often comes to the fore when the transmission of cultural heritage is considered is how to preserve the authenticity of the original item; in other words, how to remain as faithful to the original as possible when transmitting it into another medium. In the case of the Gothic text, for the first two hundred years scholars tried, with various degrees of success, to reproduce the texts in fonts based (not accurately) on the original letters. Starting from the beginning of the 19th century, the original text has been produced as a transliteration in Latin script.

One person who was not content with this state of affairs was Sydney Fairbanks. As a “fruit of experiments in teaching the scripts of the great Codex Argenteus of Uppsala to students,” he became convinced that “the monument of Gothic should be printed in Gothic letters and not transliterated into Latin characters.” Following this theme he wrote (1940: 324):

Modern scholarship should in this matter return to the sound practice of past generations. At its best, as in the Codex Argenteus, the Gothic-Wulfilan alphabet is an important and beautiful cultural heritage of the Germanic peoples. Its structure emphasizes many points in the history of the Goths themselves: the preponderantly Greek background of the alphabet looks back to early Gothic affiliations with the Eastern Empire; the runic element associates it with the first Germanic writing of any sort; and those letters which are derived from the Latin alphabet recall graphically the victorious passage of the Goths from a Greek to a Roman sphere. Transliteration into Latin letters, however exact and unambiguous this may be phonetically, reduces printed Gothic to a colorless, raceless thing from which significant national and cultural values have been erased.

He adds (p. 326):

Indeed, there is a certain grim irony in rendering the literary remains of the greatest of the early Germanic peoples in an alphabet of people and church in conflict with which Gothic culture perished.

However, to this day scholars have been studying the Gothic text mostly in its transliterated form. Although the facsimile edition, prepared in 1927 (see more details in section 4.6.), is indeed a masterpiece, in numerous places the original text is illegible, and reading the Gothic text directly from it is, to a certain extent, impractical.

The idea of using digital technology for making the Gothic text more accessible was conceived by Professor James W. Marchand. In his article (1987), which is the basis of my work, he writes (p. 25):

Our students unfortunately usually read Gothic in transliterated form, since it is so expensive to typeset it, given the fact that the fonts can only be used on those rare occasions when one prints Gothic. It is as if you taught Russian or Greek in Latin letters. Early scholars on Gothic tried to cut or cast their own fonts, with varying degrees of success. What Christopher J. Meyer and I decided to do was a radical departure from 20th century tradition, namely to teach students to read Gothic in the Gothic script. If we wanted to imitate early workers, we could do much better than they, but that would merely be a modernization and one font. We decided early on to put our Gothic out in the original script.

One more reason to work with the source material is the possibility of discovering details that were lost in the process of reproducing the text in another medium. Ultimately, any transliteration of old texts involves editing, no matter how much one strives to be faithful to the original. Some information is lost either because it cannot be rendered properly in the target medium, or because the transcriber(s) did not notice certain details. In antiquity, even simple copying of texts was not so simple. It is well known that almost all texts of antiquity were altered during the transmission process, because the scribes who copied them made mistakes, ‘corrected’ earlier scribes’ mistakes, or simply altered the text for one reason or another. In the case of the Gothic manuscripts, there was at least one case of attempted forgery (see below 4.3.).

Working with a digital version is not the same as studying the original; after all, virtual reality is never identical with the real world. However, in many cases the original piece is, for one reason or another, unavailable, and compromises must be made. The key to a successful digital reproduction is faithfulness. In my work, I try to stick to this principle as much as possible.

3.  Digitizing Old Texts

The idea of using computers for linguistic studies came up quite early in the history of digital technology. One of the first programs to be used was WordCruncher, which was originally developed at Brigham Young University as a tool for easy retrieval of words and expressions from the Bible. In those days, with the spirit of digital brotherhood still supreme, the creators donated the program to the public domain, and linguists adopted it for their use. The program, which I have also used extensively, creates an index of the words in a given text, with links to the text itself.

With the development of faster computers with bigger storage capabilities, it became possible to digitize pictures at high quality. One way to prepare a digitized photo is to film the original on microfilm and, subsequently, scan the microfilm. Another way is to lay the original directly on the scanner and let the software do the rest of the work. In my work I use both systems: I scanned the facsimile edition (1927) of the Codex Argenteus (around 800 photos) onto four CD-ROMs. For other material, to which I do not have direct access, I try to have photos or slides prepared for me and then scan them.

3.1.  Old Text in Image Mode 

For a text in a digital picture to be legible and useful, the text itself must be very clear and the picture of very high quality, that is, with a high density of pixels per inch. In photo-handling software one can enlarge the picture or manipulate it to a certain degree, but if the photo is of poor quality, then not much can be done to improve it.

In this section I describe several projects conducted in this area of research. One feature common to those projects is that each of them covers only a small fraction of the collection of which it is a part. I would imagine that the best pieces were chosen for display.

3.1.1.  The IBM Team 

As mentioned before, a group of IBM employees has been working on several projects since 1985 (Gladney et al. 1998). Here are some of the projects undertaken by the group:

El Archivo General de Indias, Sevilla (AGI). The archive houses 43,000 bundles with 86 million pages, the most complete documentation of the Spanish administration in the Americas, from Christopher Columbus until the end of the 19th century. By 1992, the team had collected 9 million digital-image pages. Original documents, some of them quite fragile, were scanned to provide fast access to them.

The authors estimate that between 30% and 40% of the original documents pose legibility problems, due mainly to their great age and rough handling. Damage includes faded ink, stains, and seepage of ink from the reverse side of documents. They indicate that such damage sometimes “make[s] it extremely difficult for a scholar to read the document.” To solve the problem, the team investigated procedures involving information filtering. In addition, end users can improve the output document. One method is zooming, which enables a closer look at certain details. A second tool enables modifying the color palette in order to reduce the effect of stains, ink fading, and bleed-through. In addition, assorted nonlinear spatial filters, which can be applied to any area of a displayed page or a selected portion, enable the intensification of faded ink or the removal of distracting background (Figure 3.1).


Figure 3.1: An example of ink bleed-through reduction and stain removal

A thorough discussion of this project, Computerization of the Archivo General de Indias: Strategies and Results, written by Pedro González, is available at:

http://www.clir.org/pubs/reports/gonzalez/contents.html.

The Vatican Library. The library contains 150,000 books and manuscripts, which include early copies of Aristotle, Euclid, and Homer. The IBM team investigated the practicality of an Internet digital library service which would allow broader access to the collection while protecting the Vatican Library’s assets. In order to find the best solutions, the team digitized a significant number of manuscripts, made them available via the Internet, and collected the views of participating scholars. Among the specifications provided by the library staff were ensuring the safeguarding of the material, permitting the inspection of higher-resolution versions, and protecting the images against unauthorized republication.

To meet the first requirement, the team developed a protective scanning environment, avoiding ultraviolet light damage. To comply with the second requirement, the images were scanned at 2600 × 3000 pixels with 24 bits/pixel for color pages and 8 bits/pixel for monochrome pages. Storage space for such a high-resolution image often exceeds 20 MB (2600 × 3000 pixels at 3 bytes per pixel comes to about 23 MB uncompressed). In order to prepare the images for Internet service, they were reduced to 1000 × 1000 pixels and compressed to data sizes of 150 KB to 250 KB. As for the third demand, the idea of offering online only low-resolution versions, of little use to would-be pirates, was rejected because it did not support research in art history. Instead, the team developed visible watermarking, which indicates the owning institution, does not remove details needed by scholars, and is difficult to remove. A more elaborate article written by the IBM team, Toward on-line, worldwide access to Vatican Library materials, can be found at: http://www.research.ibm.com/journal/rd/mintz/mintzaut.html.

Klau Library of Hebrew Union College. The collection consists of nearly 750,000 volumes of Hebraica and Judaica from the 10th century to the present, including Biblical codices, communal records, legal documents and scientific works. The work on this project involves the same technology used at the Vatican Library, with watermarking (see Figure 3.2) and reduced size for access on the Internet.


Figure 3.2: The First Cincinnati Haggadah from the 15th century, with watermarking

3.1.2.  The Advanced Papyrological Information System

The Advanced Papyrological Information System (APIS) is an integrated information system created for dealing with ancient papyri, ostraca (potsherds with writing), tablets, and similar artifacts. The project is sponsored by the American Society of Papyrologists, operated by a consortium of several American universities, and funded in part by the National Endowment for the Humanities. The system is still under construction and includes only small portions of the collections of the institutions involved. The idea pursued by the creators of this system is to construct a “virtual” library with digital images and a detailed database of catalog records. The database will provide information concerning the external and internal characteristics of each fragment, corrections to previously published papyri, and republications.

Back to Hickey’s procedures for creating a corpus. According to him, the next step is normalization of the text. This process consists of replacing variants of a grammatical form with a single form chosen by external consensus, reached by the corpus compilers. However, there is an ‘almost ideological dislike’ of normalization, particularly on the part of medieval scholars. The advantage of normalization is in producing texts without undue linguistic difficulties. Normalization can be achieved by creating a database of all the occurrences of variant forms vs. normalized forms and using proper software to substitute the former with the latter. Normalization is an optional operation, and the users of the corpus can perform it at will, while leaving the original text unimpaired.
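
To make the mechanics concrete, here is a minimal sketch, in C++, of such a substitution pass. It is not Hickey’s actual tool; the file names and the two-column format of the variant list are my own assumptions for illustration.

#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // variants.txt is assumed to hold one "variant normalized" pair per line,
    // as agreed upon by the corpus compilers.
    std::unordered_map<std::string, std::string> norm;
    std::ifstream table("variants.txt");
    std::string variant, normal;
    while (table >> variant >> normal)
        norm[variant] = normal;

    // Substitute word by word; unknown forms pass through unchanged,
    // so the original text is left unimpaired.
    std::ifstream text("corpus.txt");
    std::string word;
    while (text >> word) {
        auto it = norm.find(word);
        std::cout << (it != norm.end() ? it->second : word) << ' ';
    }
    std::cout << '\n';
}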

Moreover, since tagging must be done ‘by hand,’ even different members of the same team may, at least theoretically, mark the same text elements in slightly different manners. Hickey seems to have developed a program designed to tag text in automatic, semi-automatic or manual modes. I would have to use the program myself in order to believe that it is indeed useful. Hickey maintains that one either tags completely or not at all. In fact, tagging a text is such an enormous undertaking that not all corpora are actually tagged.

Compilers of corpora may include text-relevant information, placed at the top of a file, which includes such data as the author’s name, the name of the sample, text type, dialect, date of the original, etc. One can generate a database from this header information and, with proper software, be able to look for certain grammatical elements.

When dealing with old texts of Germanic languages, e.g. Old and Middle English, Gothic, High or Low German, one encounters the problem of special characters, for example ash, eth and thorn, which are not included in the old 7-bit ASCII set. The compilers of the Helsinki corpus presented them as a+, d+ and t+, respectively. The disadvantage of this method is awkward readability. In the 8-bit set, these letters appear as æ, ð, and þ.
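
The conversion between the two conventions is mechanical. The sketch below, my own and not part of any corpus toolkit, maps the Helsinki digraphs onto the corresponding ISO-8859-1 code points (0xE6 for æ, 0xF0 for ð, 0xFE for þ):

#include <cstddef>
#include <iostream>
#include <string>

// Replace the 7-bit digraphs a+, d+ and t+ with the corresponding
// 8-bit ISO-8859-1 characters (ash, eth and thorn).
std::string to_latin1(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (i + 1 < in.size() && in[i + 1] == '+') {
            if (in[i] == 'a') { out += '\xE6'; ++i; continue; }  // ash
            if (in[i] == 'd') { out += '\xF0'; ++i; continue; }  // eth
            if (in[i] == 't') { out += '\xFE'; ++i; continue; }  // thorn
        }
        out += in[i];
    }
    return out;
}

int main() {
    std::cout << to_latin1("t+a+t") << '\n';  // prints the word with thorn and ash
}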

One useful tool for executing data mining in a corpus is a corpus manager, which includes a table of contents, a search facility for both single words and tagged strings, and an editor for correcting mistakes or making changes.

‘Normalization’ of a text would involve creating some kind of unified text, that is, a text where the same words are spelled in the same manner and grammatical structures are identical. The idea of normalizing a text is not new and was practiced by scribes for centuries. One result of this habit is that the books of antiquity have come down to us in different versions.

Another problem is the reconstruction of ancient texts. In many cases where manuscripts remained extant, the handwritten script often became blurred or simply disappeared. One way to solve the problem has been to reconstruct the missing portions. However, there is no way to prove the correctness of a reconstruction. What makes the situation more complicated is that without proper marking, one cannot distinguish between a genuine text and a reconstructed one (which may be completely wrong). When it comes to ‘corrections’ of old texts, the field is wide open. For example, in the case of the Gothic manuscripts, even when the text is clear, there are places where it includes obvious mistakes made by the original scribes. In the manuscripts there are also corrections, but it is not clear who made them, the Goths themselves or latter-day ‘editors.’ Consequently, no one can always be sure whether the corrections are indeed ‘correct.’

I maintain that ‘normalization’ of old texts is simply throwing out the baby with the bath water. In fact, philologists are greatly interested in spelling and grammatical variations and changes. It is from those details that they actually deduce their theories.

4.  The Gothic Manuscripts

In the great debate concerning the nature of God, which had flared up in the early Church, the Goths happened to be Arians. As it turned out, they were on the losing side and, as such, were destined for history’s dustbin. In practice, this meant that nobody studied or copied their writings, and whether through intentional destruction or simply through neglect, very little of their heritage has survived. The main reason why anybody would be interested in what is left of their legacy is its linguistic value; Gothic is the oldest Germanic language of which we have written evidence. As a result, the people who actually study Gothic are mostly linguists. I became interested in the subject while studying English philology, that is, the history of the English language, at Helsinki University.

4.1.  The History of the Manuscripts

Apparently, over a period of 40 years during the fourth century, a Gothic bishop, Wulfila, prepared a translation of almost all of the Bible into the Gothic language. In order to accomplish this task he had to invent letters. None of the original manuscripts has survived, and the lion’s share of the texts we have nowadays is apparently from the sixth century. The best preserved manuscript is the so-called Codex Argenteus – the ‘Silver Book.’ The manuscript originally contained at least 336 leaves, of which 188 remain extant; 187 are preserved at the Library of Uppsala University, and one leaf, found in 1970, is in Speyer, Germany. The text is written on both sides of the parchment; some letters, like those of the first line of an Eusebian canon, are written in gold and the rest in silver, hence the name – argenteus, ‘silver’ in Latin.

The other major source of Gothic texts is the Ambrosian codices, which are located, for the most part, in the Ambrosian Library in Milan. Those manuscripts are palimpsests. After the Goths were defeated, the parchments they left behind were reused for other writings; the monks washed or scraped the old Gothic sheepskins clean and wrote over them with Latin texts. In Greek, palimpsest means “rescraped.” Fortunately, one can still see, with varying degrees of difficulty, the Gothic text underneath. The Ambrosian codices contain altogether 346 pages and 10 leaves, divided into five groups:

A 192 pages containing parts of the Epistles

B 154 pages containing parts of the Epistles

C 2 leaves containing fragments of St. Matthew 25-27

D 3 leaves containing parts of Nehemias 5-7

E 5 leaves containing part of the Skeireins.

In addition, there are a few more remains of Gothic which are not palimpsests: the Veronese marginal notes, the Deeds, and the Salzburg-Vienna Manuscript.

4.2.  The Gothic Alphabet

It is generally assumed that Wulfila himself devised the Gothic alphabet; however, there is no consensus concerning the sources and models the Goth used while designing the letters. The alphabet contains 27 symbols, of which 25 are used both as numbers and letters, while two serve only as numbers. Consonant gradation, that is, letters deleted in compliance with grammatical rules, is marked with a bar over the place of the deleted letter. As has become the custom, the Gothic text is rendered with Latin fonts with a few changes (see below). With the advent of digital technology, Gothic fonts were created by Boudewijn Rempt of Yamada and are freely available.1 The font can be easily installed on a PC, although not on a Sun terminal with a Unix operating system; indeed, this is one of the reasons why I have to write this paper on a PC. For the transliteration of the Gothic text, one uses the ISO-8859-1 (ISO Latin 1) character set, which includes such characters as Þ, þ, Ï, ï. For rendering the Gothic hwair one uses hv. As in the Gothic original, the end of a sentence is marked with a point at the middle height of the letters: ·, which is also a character in the ISO-8859-1 standard.

4.3.  The Gothic Text

The Gothic text is written in scriptio continua, that is, with no white spaces between words. To illustrate the problem, here is an example from English: GODISNOWHERE can be read as GOD IS NOW HERE or GOD IS NOWHERE. The interpretation depends on the context, a fact which adds an element of semantics to the process of deciphering the text. The ends of sentences, parts of sentences, phrases, etc. are marked with points. The division of the text follows the Eusebian canons, and the end of such a canon is most often marked with a colon. New chapters are numbered with numerical letters, and if a new chapter happens to start on a new line, the first letter is capitalized. Other than this, there are no capital letters in the text, not even to mark proper nouns. There are differences in style and form among the various codices.

Wulfila followed an ancient practice and abbreviated terms which were considered holy (nomina sacra). This phenomenon is known from the Greek and Latin texts which, in all probability, were in front of him. As a result, the Gothic text abounds with abbreviations: guþ ‘God’ is abbreviated as gþ, iesus is shortened to is, iesu becomes iu, xristus is xs or xaus, xristu is rendered as xu, frauja ‘Lord’ is written fa. The same practice is employed whenever these terms appear in declined forms: guþs becomes gþs, guþa is gþa, fraujan is rendered fan, fraujins is shortened to fins, iesuis reads iuis.

For the most part, there are established rules as to how the text is divided into words; however, the transliterations of the text in use are ‘normalized,’ that is, they do not necessarily follow the manuscripts exactly. For example, as mentioned above, the original text uses abbreviations for holy terminology; nevertheless, the various editors filled in the complete words. There is a very limited vocabulary of known Gothic words, no more than a few thousand. For comparison, a medium-sized English dictionary contains 70,000-100,000 entries; the behemoth Oxford English Dictionary contains around 450,000 and its online edition over a million, including slang, neologisms (new words), etc. In order to fill some of the lexical gaps, philologists reconstruct words, marking them with an asterisk (*). Here two problems arise. Firstly, different scholars come up with different suggestions, and where does one find a native speaker of Gothic to tell us which version is correct?

And secondly, over the centuries of Gothic studies, not all editors indicated that their readings of blurred spots were actually reconstructions, and not genuine readings of the text. As a result, one cannot really trust the existing transliterations.

And there is still another problem: Kleberg (1984: 20) writes:

Once, in the 1670s, a falsifier was at work on the Codex. By scraping out letters and painting over them with silver paint, he made some textual alterations, apparently with the aim of providing support for some of Olof Rudbeck’s theories about “Great Sweden.” It is perhaps impossible to prove who the culprit was. Olof Rudbeck himself would seem to be above suspicion.

Johansson (1955: 16) suggested that the letter w ‘w’ was altered to a ‘a’ and a ‘a’ to l ‘l’ (Figure 4.1).

Figure 4.1: The suggested alterations

The apparent goal was to change the word ‘ubizwai,’ which means “porch,” into ‘ubizali’ – “Uppsala.” Examining the text in the facsimile edition (Figure 4.2), one can observe that at least the bottom left side of the letter a ‘a’ was deliberately cut and made into l:


Figure 4.2: A photo of the original manuscript


which makes the word ‘ubizwli.’ In the Biblical text (St. John 10:23) it is written that “And Jesus walked in the temple in Solomon’s porch.” The falsifier apparently intended to place Jesus in Uppsala as well. From the digital point of view, applying character recognition algorithms and methods as a tool for studying such a text is next to impossible (see below 5.1).

4.4.  Studies and Presentations of the Manuscripts

The first publication concerning a Gothic text, presently known as the Codex Argenteus, appeared in 1569. It was written by Johannes Goropius Becanus (Origines Antwerpianae), who probably obtained his knowledge from Georg Cassander and Cornelius Wouters. In 1597 a Dutchman, Bonaventura Vulcanius, published the text, bearing for the first time the title ‘Codex Argenteus.’ The publication was appended with text in Gothic fonts, prepared as woodcuts, followed by a transliteration in Latin fonts and a Latin translation. In Junius’ edition of 1665, the Gothic script is accompanied by a text in Latin fonts. For the Gothic script, Junius used special fonts which, in many details of design, were quite divergent from the corresponding letters in the Codex Argenteus. In 1677, matrices of those fonts were presented to Oxford University and were later used in several publications (see below).

In Georg Stiernhielm’s edition (1671) the transliteration of the text is in Latin fonts, the Icelandic and Swedish translations appear in the so-called ‘Gothic’ letters, and the Latin translation is, naturally, in Latin fonts. In 1737, Lars Roberg, an Uppsala physician, drew and made a woodcut of one page of the manuscript and prepared several impressions of it. The woodcut is still extant at the Linköping Diocesan and Regional Library (Kleberg 1984: 23). The page was included in Benzelius’ edition (1750). The few lines below (Figure 4.3) are a mirror rendering I prepared from a photo (Munkhammar 1998: 178).


Figure 4.3: A woodcut prepared by Lars Roberg in 1737


In Benzelius’ edition the text is rendered in Gothic script, accompanied by a text in Latin fonts. Elaborate copperplate facsimiles were included in Knittel’s editio princeps (1762) of the Wolfenbüttel fragments (Codex Carolinus) of Paul’s Epistle to the Romans. Starting with the Ihre-Zahn edition (1805), the tradition of printing the Gothic text with types based on the Gothic writing gradually vanished. One page of the Codex Argenteus, apparently prepared by an artist, appears in Uppström’s edition of the text (1854-7). Below (Figure 4.4) is a digital rendering of a small portion of the page, which barely conveys the beauty of the drawing in Uppström’s book.


Figure 4.4: A rendering from the middle of the 19th century

In 1903 leaf 7, and in 1906 leaves 3, 4, and 8 of the Skeireins were made available in photo facsimile, followed in 1910 by the Giessen fragment and in 1914 by the Wolfenbüttel fragments. As time went by, there were plans for a reproduction of the Codex Argenteus by woodcut or copperplate engraving, but none of them materialized. In 1927, a facsimile edition of the Codex was prepared and published (see below). In 1991, Marchand posted on the Internet one page prepared by digital methods (Figure 4.5). Apparently the rendering is based on a photo from the facsimile edition of 1927.


Figure 4.5: Marchand’s digital restoration

After restoring the letters, Marchand smoothed and filled contours and eliminated flyspecks and thumbprints with a “paint” program (Time, September 9, 1991, p. 9). In 1998, under an agreement signed between Tampere University of Technology and the Library of Uppsala University, I scanned the facsimile edition of the Codex Argenteus onto 4 CD-ROMs, to be used in my research. In addition, I received two slides containing photos prepared recently with advanced photographic technology, which I converted into digital form. As part of my work, I have been developing software and making digital restorations of several plates from the facsimile edition (I elaborate on this subject below).

The first publication of the Ambrosian codices appeared in 1819, prepared by Monsignor (later Cardinal) Angelo Mai and Count Carlo Ottavio Castiglione. The text is in the original Gothic script, followed by a transliteration in Latin fonts. For the Gothic script, Mai and Castiglione used matrices of the Junian type, the same type which had been given to Oxford University. Figure 4.6 displays a few lines from the Ambrosian text set in the Junian type. The letters are indeed neat; however, they are inaccurate.


Figure 4.6: The Junian type

In Massmann’s edition of the Skeireins (1833), Gothic fonts are no longer used and the Gothic text is rendered in Latin fonts. The style that eventually emerged is a combination of Latin and Scandinavian fonts; for example, the Gothic thorn is rendered with the Scandinavian þ (the Latin equivalent of th). In 1936, a facsimile edition of the Ambrosian codices was published by Galbiati and de Vries. In my research I use scanned renderings of some of those photos. In the early 1960s, the Ambrosiana collection was microfilmed. Of this collection I have scanned one black-and-white photo of very good quality, published by Gabriel (1965). It is a rendering of a palimpsest which contains St. Jerome’s Commentary on Isaias, written over Wulfila’s Gothic version of St. Paul’s Epistle to the Galatians. It appears that more recently the codices were photographed again. I have seen some of these photos; however, I have not used them in this study.

4.5.  Re-examining the Work of Earlier Scholars

One challenge facing contemporary researchers is to re-examine the work of earlier generations of scholars concerning the exact text of the Gothic manuscripts. Ebbinghaus (1985: 30) wrote:

The question of the text is the most distressing one we are facing today in Gothic studies. In the discussion of Streitberg’s ‘Wörterbuch’ above I have pointed out how uncertain we still are about the Italian mss. The editions of codd. Ambros. which were made on the basis of direct inspection (Castiglione, Massmann, Uppström, von der Gabelentz & Loebe [with help from Castiglione], Bernhardt, and Streitberg [with Braun]) all contain an unknown number of erroneous readings. Only Bennett’s edition of the Skeireins is satisfactory.

The Skeireins, which Ebbinghaus singled out, is the Gothic commentary on the Gospel of John, comprising eight palimpsest leaves, each written on both sides, with two columns per side. Under the title The scribal form of the text and its Modern “improvements,” Bennett, who did the most recent deciphering, wrote (1960: 1):

Among works that have been metamorphosed by overzealous editing, few compare with the Gothic commentary on the gospel of John. The known leaves of this treatise comprise only 800 lines averaging about 13 letters each. Yet, if every word that scholars have added, deleted, replaced, transposed, or otherwise altered were to be counted separately, the total number of emendations would be approximately 1500. Proportionally, there is at least one modification for every seven letters in the manuscript. The extent to which the commentary has been transformed by emendations, ranging from changes in individual forms to rephrasing of entire passages, can be appreciated only through examining the text of one edition after another, but even the gross statistical evidence is significant.

Bennett’s work with the manuscript started in 1948 and continued, with pauses, for ten years. He had access to the original manuscripts, and for the deciphering process he used photographic methods, which he found dependable (1960: 25). As it turned out, no similar project has been undertaken since, which means that the rest of the palimpsests, more than 300 pages, need to be thoroughly checked. This state of affairs is the source of Ebbinghaus’ comment that “the question of the text is the most distressing one we are facing today in Gothic studies.”

4.6.  Features of the Material

The Codex Argenteus was written in silver ink on leaves of parchment dyed purple. The Swedish linguist Johan Ihre suggested that the Codex was printed with hot stamps. He introduced this theory in a preface to the first dissertation of his pupil Erik Sotberg (Ulphilas illustratus, Uppsala 1752). However, this theory did not find support among other scholars, and although it is mentioned later in less serious literature on the Codex, it apparently died with Ihre himself (Munkhammar 1998: 159, 171). Observing the beauty of the Codex and comparing it to other works of antiquity, one indeed wonders how the text was produced. While preparing the facsimile edition (1927), the researchers released the pages from their binding, so that each leaf could be photographed more conveniently and accurately. As a result, it became possible to compare the different pages of the manuscript side by side. Systematic comparison revealed that two scribes were engaged in the production of the Codex, one being responsible for the Gospels of Matthew and John (manus I) and the other for the Gospels of Luke and Mark (manus II). As it turned out, there were differences not only in the style of the writing, but also in the quality of the original writing, which resulted in different exposure times being required when photographing the different Gospels. Curiously, the distinction in exposure time did not follow the division between the scribes; the exposure times were much shorter for the texts of the Gospels of St Matthew and St Luke than for St John and St Mark (Codex Argenteus Upsaliensis 1927: 122, as corrected by Friedrichsen 1930: 190).

Friedrichsen suggested the following chain of events. Assuming that both scribes were working side by side, one of them started on the Gospel of Matthew and the other on Luke. The second scribe seems to have been less careful than the first. Once both were done with their respective initial assignments, it was decided to employ an ink containing a higher proportion of silver for the text of the remaining Gospels of John and Mark, while still continuing to use the thinner ink for the decorative columns and arcs at the foot of the text. One possible consequence of this theory is that, under illumination, the contrast between the text and the background is greater in John and Mark than in the other books. Friedrichsen speculated (p. 192) that “this silver was troublesome and costly to prepare, and it may have needed more than one application, possibly by the artist-scribes, zealous for the perfection of their work, before a reluctant treasury granted the increased expenditure on silver.” One result of the change in the quality of the silver is that in the fluorescence photography (see below) the writing in St. Matthew and St. Luke stands out but feebly against the background, whereas in St. John and St. Mark the contrast is much greater.

In general, the leaves exhibit different kinds of damage, such as discoloration and flaking of the silver, flaking of the gold, and penetration of the ink from one side of the leaf to the other. After investigating and experimenting with different methods, the scholars Theodor Svedberg and Ivar Nordlund concluded that none of them would satisfactorily solve all the difficulties. As it turned out, two methods proved decidedly superior to the rest (Codex Argenteus Upsaliensis, 119). The first consisted of photographing with reflected ultra-violet light of wavelength 366 nm, and the second was fluorescence photography, in which the fluorescence is excited at the same wavelength, 366 nm. Since these two methods complement one another in certain respects, the compilers of the facsimile edition decided to include a reproduction of each page by both methods. The scholars rejected the idea of retouching the plates, since it would necessarily introduce a factor tending to make the photograph no longer a faithful reproduction of the original manuscript. It was also evident that those methods did not successfully reproduce the gold in the text. Therefore, it was decided to add a supplementary collection of photographs on a smaller scale, done with different methods which better reproduced the lines inscribed in gold. For this collection, three methods of photography were used: photography with a yellow filter, with secondary X-rays, and with oblique illumination. The second method, using X-rays, turned out to be the most fruitful in reproducing traces of gold. This part of the work was done at the X-ray department of the University Hospital in Uppsala.

In addition, to avoid disturbances in the pictures caused by the penetration of the silver from the reverse side when the fluorescence method was used, the exposure time was increased. However, as a result, the lines written in gold, the paragraph signs, and the decorative columns with their parallel figures were often overexposed. This disadvantage was remedied to some degree by covering the text with one or more layers of transparent paper during the printing from the plates.

One more feature of the manuscripts that emerged during the work was that, of the two sides of the parchment, the flesh-side (that is, the inner side of the animal’s skin) had always faded more than the hair-side. As a result, the average exposure time for the latter was somewhat longer than for the former. Another feature was that in the ultra-violet method, the structure of the parchment was revealed in the photograph. The facsimile edition is indeed a masterpiece. One of the scholars who took part in the project, the chemist Theodor Svedberg (1884-1971), was honored in 1926 with the Nobel Prize in Chemistry for developing the ultracentrifuge to facilitate the separation of colloids and large molecules.

The Ambrosian manuscripts are palimpsests, that is, the sheepskins on which the original Gothic script was written were washed clean and written over in Latin. Figure 4.7 displays an example of a portion of a palimpsest.


Figure 4.7: A palimpsest


In this case the Gothic text is relatively clear. Often it is quite hard to distinguish between the original Gothic text and the Latin letters written over it (see below).

A facsimile edition of the Ambrosian codices was published in 1936 by Galbiati and de Vries. As far as I can see, the introduction, written in Latin, does not give any technical details concerning the photographing methods. Each page is reproduced in one form only. For this research, I use digital renderings scanned from photographic slides made from this edition. Apparently, a few years ago the Ambrosiana prepared a new collection of photos of the manuscripts. They guard it zealously, and although I have seen some of the photos, I have not used them. In any case, unlike the facsimile edition of the Codex Argenteus, it seems that no special techniques were employed in the photographing, and I doubt whether those photos reveal more information than those in the Galbiati and de Vries edition.

The Ambrosian codices were written by several scribes. As a result, there are several types of letters. Figure 4.8 displays some variations of letters in the various manuscripts, as rendered in the introduction to the facsimile edition of the Codex Argenteus.


Figure 4.8: Variations of letters in the various manuscripts

According to Marchand (1987: 26) there are 22 hands in the Gothic manuscripts. In addition, the text is accompanied by several notations. The chapter numbers are marked with symbols which include, in the middle, letters carrying numerical value (Figure 4.9).


Figure 4.9: Chapter numbers


As mentioned before, sacred terminology is abbreviated. In the Codex Argenteus the abbreviation is marked with a bar above it (Figure 4.10).

Figure 4.10: An abbreviation mark


In the manuscripts one finds ligatures, that is, two or more letters joined together to form one character or type. Figure 4.11 displays the word mahts, St Matthew VI:13, plate 9, line 11 (the Codex Argenteus).

Figure 4.11: A ligature

The text was corrected in several ways by different hands and, apparently, in different eras. Figure 4.12 demonstrates one such correction.


Figure 4.12: A correction

During the centuries, some of the silver and gold letters of the Codex Argenteus faded or were contaminated by other material; see for example Figure 4.13.


Figure 4.13: A faded and contaminated text

In several places some of the text is simply missing (Figure 4.14).


Figure 4.14: A torn parchment

From the digital image processing point of view, the manuscripts present a plenitude of different problems; as a result, different approaches must be sought out and experimented with – for each problem its own solution. It is quite possible that present technology does not offer solutions to some of the problems.

5.  Examining Various Digital Image Processing Methods

In his article (1987), Marchand articulates the idea that following the personal computer revolution, the individual humanist “now has available to him computing power far beyond that of mainframes of the sixties” and is, as a result, liberated from the mainframe, from the keepers of the mainframe and from the “tyranny of the programmer.” In this section I examine different digital image processing methods which might be relevant to my work on the Gothic manuscripts.

5.1.  The Gothic Text as a Candidate for OCR Methods

The first step in any OCR process is removing noise from the text; otherwise the optical device will mistake noise for characters or will not be able to distinguish between the letters and the noise. There are two kinds of Gothic text, the Codex Argenteus and the palimpsests. As far as the latter are concerned, the image is for the most part ‘noise.’ I cannot imagine any way to separate the Gothic letters from the other elements on the page to such an extent that OCR methods could be tried (see chapter 7). As for the Codex Argenteus, it takes me around two days to clean one plate (see chapter 6); however, by the time the text is clear, no more character recognition is needed. Moreover, the amount of text to be cleaned is finite and final. The best tool for character recognition is still the human eye, not the computer. As a human being, I see no reason to be distressed over this state of affairs.

6.  Studying the Gothic Manuscripts

In his article (1987: 18), Marchand describes one project where digital image enhancement techniques were employed in order to read manuscripts which cannot be read by the naked eye. The first part of the process is obtaining a picture of the manuscript, digitizing it, and loading it into the computer’s memory. The software assigns to the picture 256 levels of gray, each of them addressable. Marchand writes: “One can then ask the machine to do whatever one wants with each level of gray, give it any color, ignore it, etc. If, for example, the original manuscript is a palimpsest, in which one text is written over another, it is possible to separate the scripts. We are frequently able to read letters and words in Gothic manuscripts, for example, which have resisted all previous efforts. This takes, of course, enormous computing power … The recent advent of FFT (fast Fourier transform) programs for the personal computer will lighten our burden tremendously.”

In this chapter I examine existing filters, as well as trying to follow Marchand’s ideas concerning the digital handling of the Gothic manuscripts.

6.1.  Examining Existing Filters

It seems that the path to proceed in this research is through the spatial domain, which is the composite of pixels (picture elements) constituting the image. Spatial domain methods are procedures that operate directly on those pixels (Gonzalez & Woods, 1992: 162). In general, the fundamental function is:

g(x,y) = T[f(x,y)]

where f(x,y) is the input image, g(x,y) is the processed image, and T is an operator on f(x,y). In most cases, the operator is a mathematical calculation, such as convolution or correlation, where the multiplier is a square matrix of size 3×3 or 5×5 whose center is moved from pixel to pixel, starting at the upper left corner. One such method, known as mask processing, lets the values of f in a predefined neighborhood determine the value of g at (x,y); the values in this predefined mask determine the nature of the process, such as image sharpening. Another approach involves transforming the values of pixels above and below a certain threshold or range in order to produce higher contrast in the vicinity of that threshold or range. Since the transformation at each point depends only on the gray level at that point, this method belongs to the category of point processing.
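
As an illustration of mask processing, the following sketch of mine slides a 3×3 mask over a grayscale image stored row by row; the mask weights, supplied by the caller, determine whether the operation sharpens, smooths, or detects edges:

#include <algorithm>
#include <vector>

// For each interior pixel, g(x,y) is the weighted sum of f's 3x3
// neighborhood around (x,y), clamped to the valid gray-level range.
std::vector<unsigned char> apply_mask(const std::vector<unsigned char>& f,
                                      int width, int height,
                                      const double mask[3][3]) {
    std::vector<unsigned char> g(f.size(), 0);
    for (int y = 1; y < height - 1; ++y) {      // border pixels are left black
        for (int x = 1; x < width - 1; ++x) {
            double sum = 0.0;
            for (int j = -1; j <= 1; ++j)
                for (int i = -1; i <= 1; ++i)
                    sum += mask[j + 1][i + 1] * f[(y + j) * width + (x + i)];
            g[y * width + x] =
                static_cast<unsigned char>(std::clamp(sum, 0.0, 255.0));
        }
    }
    return g;
}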

Ultimately, the aim of this study is to highlight the specific range of gray levels which constitutes the Gothic writing and to eliminate everything else. One approach would be the enhancement of the edges of the letters. There exist various sharpening spatial filters designed for enhancing edges that have been blurred; one such category is derivative filters (Gonzalez & Woods, 1992: 197). With the help of Matlab, I will examine the applicability of three such filters (Roberts, Prewitt and Sobel) to the decipherment of the Gothic manuscripts.
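
For reference, these are small derivative masks. The forms below are the standard ones found in the image processing literature; the second Prewitt and Sobel components are obtained by transposition:

// The two Roberts cross operators, and one gradient component each
// of the Prewitt and Sobel masks (the other component is the transpose).
const int roberts_1[2][2] = { {  1,  0 },
                              {  0, -1 } };
const int roberts_2[2][2] = { {  0,  1 },
                              { -1,  0 } };
const int prewitt_x[3][3] = { { -1,  0,  1 },
                              { -1,  0,  1 },
                              { -1,  0,  1 } };
const int sobel_x[3][3]   = { { -1,  0,  1 },
                              { -2,  0,  2 },
                              { -1,  0,  1 } };
// An edge map is obtained by thresholding the gradient magnitude,
// commonly approximated as |Gx| + |Gy|.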

Figure 6.1 displays in grayscale mode an image taken from St. Mark, chapter 5.


Figure 6.1: One line from the Codex Argenteus


The image is read into Matlab with the command:

I = imread('path/file_name.file_mode');

and the filter is applied with the appropriate Matlab command:

I = edge(I, 'sobel');

Figure 6.2 shows the results obtained by this process.


Figure 6.2: The line after being processed by the Sobel filter (Matlab)


Matlab requires that, before the filter is applied, the photo be converted to black-and-white mode. The result of the filtering is a binary picture where 1 is white and 0 is black. Using the Roberts and Prewitt filters in Matlab produces results identical to those of Sobel. The results produced by Matlab are inadequate, since important information has not been reproduced in the filtered version: the bars over the abbreviations marking ‘Jesus’ (is) and ‘God’ (gþ), which are quite clear in the original photo. The program Gimp (an acronym for GNU Image Manipulation Program), which works on the Unix platform, also provides a Sobel filter, and its results are better (Figure 6.3).


Figure 6.3: The line after being processed by the Sobel filter (Gimp)

For examining a palimpsest, I used a portion of the Gothic calendar (Figure 6.4).


Figure 6.4: A photo of a portion of the Gothic calendar before filtering

The results obtained by Matlab’s Sobel (Figure 6.5) are, for all practical purposes, useless.


Figure 6.5: Matlab’s rendering of a portion of the Gothic calendar with the Sobel filter

Gimp’s Sobel produces a better image (Figure 6.6) than the one created by Matlab, though still not adequately useful.


Figure 6.6: Gimp’s rendering of a portion of the Gothic calendar with the Sobel filter

With Laplacian filtering, Gimp produces an image (Figure 6.7) which is of no practical use.


Figure 6.7: Laplacian filtering

Those filters are too crude for dealing with the Gothic manuscripts. The main problem is the abundance of noise, which the filters are unable to separate properly from the crucial data. One reason for their inadequacy is the fact that the difference between important and irrelevant pixels may amount to merely one or a very few gray levels. In other words, there is a need for a filter which enables fine tuning. Since such a filter does not seem to exist, I decided to devise my own software for better handling of the problems.

6.2.  Manipulating Grayscale Images

Following Marchand’s idea, I wrote a program in C++ which does “whatever one wants with each level of gray, give it any color, ignore it, etc.” The first step in manipulating an image is converting it into ASCII mode. Next, the program reads the input file, retrieves the first four lines (the header) and transfers them, unaltered, to the output file. After that, the program reads each number from the input file, turns it into an integer and checks it against the arguments given to it. If the condition is fulfilled, the pixel receives a new value given as a parameter:

if ((Pixel <= Upper_Pixel) && (Pixel >= Lower_Pixel)) {
    Pixel = Target_Pixel;   // the value falls inside the given range
}

In other words, if the input pixel is equal to the given upper or lower value or lies between them, it receives the new value given by the variable Target_Pixel. The basic idea is to threshold the photo; the program enables thresholding any combination of pixel values. The key to being able to use the program is knowledge of Gothic: one has to know what to look for, what to emphasize and what to erase. The process of handling the manuscripts is slow and tedious, with each step examined iteratively; that is, if the results are insufficient, one returns to the previous step and modifies the arguments. The biggest advantage of this method is that it allows the examination and alteration of even a single pixel. Once overlapping Gothic and Latin letters are identified, one can search for the borders between them by carefully calibrating the amount of change. This way of proceeding does not guarantee a successful decipherment, but it may give a better grasp of the text.
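
For concreteness, the skeleton of the program is sketched below. It assumes the image has been converted to an ASCII format whose header occupies the first four lines (as described above); the program name and argument order are mine, for illustration only.

#include <cstdlib>
#include <iostream>
#include <string>

// Usage: threshold lower upper target < input.pgm > output.pgm
int main(int argc, char* argv[]) {
    if (argc != 4) {
        std::cerr << "usage: threshold lower upper target\n";
        return 1;
    }
    const int lower  = std::atoi(argv[1]);
    const int upper  = std::atoi(argv[2]);
    const int target = std::atoi(argv[3]);

    // Transfer the four header lines, unaltered, to the output.
    std::string line;
    for (int i = 0; i < 4 && std::getline(std::cin, line); ++i)
        std::cout << line << '\n';

    // Read each pixel value and re-map it if it falls inside the range.
    int pixel;
    while (std::cin >> pixel) {
        if (pixel >= lower && pixel <= upper)
            pixel = target;
        std::cout << pixel << '\n';
    }
}

Invoked as, say, threshold 40 90 0, it performs the kind of darkening operation used for Figure 6.10 below.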

The Gothic text has been preserved in different forms and qualities, and each needs its own process. Here are a few examples. To start with, Figure 6.8 displays a line from the Codex Argenteus:


Figure 6.8: A line from the Codex Argenteus

Figure 6.9 presents the histogram of the line above.


Figure 6.9: The histogram of the line in Figure 6.8

After a process of trial and error, I gave all the pixels between 40 and 90 the value of 0, which, in practice, means that the letters were darkened (Figure 6.10).


Figure 6.10: The line of Figure 6.8 after pixel manipulation

The new histogram is displayed in Figure 6.11.


Figure 6.11: The histogram of the photo in Figure 6.10

Next, I gave all the pixels between 205 and 254 the value of 255 (Figure 6.12).


Figure 6.12: The line after whitening the background pixels

and its histogram is presented in Figure 6.13.


Figure 6.13: The histogram of the photo in Figure 6.12

The tool I prepared has given me full control over the values of the pixels. One may argue whether the processed image is better than the original; in any case, during the process no significant information was lost. In the next example (Figure 6.14), the original photo was partially distorted by stains and possibly by seepage from the other side of the parchment.


Figure 6.14: A contaminated line from the Codex Argenteus

Figure 6.15 displays the same line after darkening the pixels between 0 and 80 and giving all those between 150 and 255 the value of 255 (white).


Figure 6.15: The line in Figure 6.14 after pixel manipulation

To clarify one letter, I handled it separately. Using the original photo, I gave the pixels between 64 and 69, between 53 and 57, and between 106 and 112 the value of 255 (see Figure 6.16).


Figure 6.16: Handling one letter

In the ‘negative’ form one can distinguish the letter u. The text reads, as expected, “xs sunus gþs” (‘Xristus son of God’).

The next example (Figure 6.17) is from a palimpsest (Ephesians 1:21). The Gothic text is between two lines written in Latin.


Figure 6.17: A line from a palimpsest

Figure 6.18 displays the same line after blackening the pixels between 0 and 100 and whitening all those above 130.


Figure 6.18: The line of Figure 6.17 after blackening the letters

With the help of tools provided by Photoshop, I cropped out those Latin letters which could clearly be separated from the Gothic ones (Figure 6.19). The Gothic text reads: “ja allaize namne·” (‘and every name’).


Figure 6.19: The line in Figure 6.18 after cleaning

In the next example I endeavor to show that, despite an accepted reading, a certain word does not appear in the text. Figure 6.20 displays the months-line of the Gothic calendar (frames added).


Figure 6.20: The months-line of the Gothic calendar

In the right frame one can read ‘frumajiuleis ·l·’. In the left frame, following a reading from 1833, it is generally accepted that the word ‘naubaimbair’ exists. After whitening all the pixels above 100, almost all the pixels in the right frame remain extant; the left frame, however, is void of intelligible Gothic letters (Figure 6.21).


Figure 6.21: The months-line of the Gothic calendar after whitening the pixels above 100

The most difficult task is to clarify those Gothic lines which are just under the Latin text. Below (Figure 6.22) is an example taken from the calendar.


Figure 6.22: A line from the Gothic calendar

One way to separate the two scripts is to take a small portion, one or two letters at a time, and manipulate the pixels until the borders of the Gothic letters are distinguishable from the Latin ones. The first two letters are displayed in Figure 6.23.


Figure 6.23: Two letters before ‘clearing’

After whitening the pixels between 110 and 254, as well as those with the values 44, 45, 52-55 and 62-65, some borders emerged (Figure 6.24): they belong to the Gothic ku.


Figure 6.24: The letters after ‘clearing’

Another way would be to detect the edge by giving the pixels between 60 and 80 the value of 0 (Figure 6.25).

Figure 6.25: Edge enhancement

Obviously, this is not much of an improvement on the original; however, it assists in reading the text. One should also take into account the possibility that in some places the Gothic and the Latin texts have merged so thoroughly that there is no way to separate them.

Manipulating Color Photos

Enlarging the depth of the pixels from 8 bits (grayscale – 256 variations) to 24 bits (RGB – truecolor) enables a more delicate manipulation. At first I simply extended the program to handle three successive numbers the same way it handled one. However, I soon found out that, instead of automatically changing the value of each number, a far more practical solution would be to insert the three numbers into a vector and restrict the conditions for their manipulation. The resulting C++ code is:

if ((RGB[0] <= Upper_Pixel_Red)   && (RGB[0] >= Lower_Pixel_Red)   &&
    (RGB[1] <= Upper_Pixel_Green) && (RGB[1] >= Lower_Pixel_Green) &&
    (RGB[2] <= Upper_Pixel_Blue)  && (RGB[2] >= Lower_Pixel_Blue))
{
    // All three channels lie inside their ranges: recolor the pixel.
    RGB[0] = Target_Pixel_Red;
    RGB[1] = Target_Pixel_Green;
    RGB[2] = Target_Pixel_Blue;
}

In other words, the new values given in the parameters are applied only if each of the three color channels lies within its given range.
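As an illustration, the following sketch applies this condition across a whole 24-bit photo. The interleaved R-G-B buffer layout and the function name threshold_rgb are my assumptions, not a description of the actual program:

#include <cstddef>

// Recolor every pixel whose red, green and blue values each fall
// inside their respective [lower, upper] ranges. 'image' is assumed
// to be an interleaved 24-bit RGB buffer holding 'pixels' pixels.
void threshold_rgb(unsigned char* image, std::size_t pixels,
                   const unsigned char lower[3],
                   const unsigned char upper[3],
                   const unsigned char target[3])
{
    for (std::size_t i = 0; i < pixels; ++i) {
        unsigned char* RGB = image + 3 * i;   // one pixel: R, G, B
        if (RGB[0] >= lower[0] && RGB[0] <= upper[0] &&
            RGB[1] >= lower[1] && RGB[1] <= upper[1] &&
            RGB[2] >= lower[2] && RGB[2] <= upper[2]) {
            RGB[0] = target[0];
            RGB[1] = target[1];
            RGB[2] = target[2];
        }
    }
}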

Digital Restoration of the Codex Argenteus

In 1940 Fairbanks wrote (314) that “since modern photography has brought the chief manuscripts to almost everyone’s door, these monuments of the artistic and cultural past of one of the greatest of the Germanic peoples cannot be ignored.” As it has turned out, the facsimile editions of both the Codex Argenteus (1927) and the Ambrosian Codices (1936) did not advance the study of Gothic. One reason is, no doubt, a shift of interest within linguistics: Gothic was a major topic of study in the 19th century; however, starting with the Swiss linguist F. de Saussure (1857-1913), the founder of structural linguistics, the foci of interest moved to other linguistic fields. The second reason is that one cannot really read directly from the photos. After examining the facsimile edition of the Codex Argenteus, I concluded that around 70-80% of the text is legible, around 5% is illegible, and the rest can be read with varying degrees of difficulty. For reading the more difficult lines, one should compare and examine the fluorescent and ultraviolet photos and, if there is a photo in the supplement section, the X-ray, yellow-filter or oblique-illumination photos (see 4.6).

Fairbanks also suggested the creation of Gothic fonts based on the script of the Codex Argenteus. He wrote (1940: 327):

It is easily within the power of the capable and sensitive designer on the staff of more than one modern type-foundry to produce on the basis of the Codex Argenteus a showing which would make it forever unnecessary to design another Gothic font. CA, now happily within the reach of many in photofacsimile, provides the model of a definitive Gothic printing font, and it is to be hoped that we may not have long to wait for so important and attractive an adjunct to Germanic philology.

As it turned out, Fairbanks’s hope did not materialize until a new technology – the digital one – became ubiquitous. As mentioned before, a TrueType Gothic font was created by Boudewijn Rempt of the Yamada Language Center at the University of Oregon. The font is free, easily obtainable4 and easy to install. With it I can write Gothic on this computer:

A b g d e q z h v i k l m n j u p r t v f x o

Indeed, one can use this font for writing Gothic although, I must admit, I could not find all the promised letters on the keyboard. If I understand correctly, they may be found on a Macintosh.

In principle, one can reconstruct the Gothic text with this set of letters. However, although I have not tried it, I figure it would be quite hard to reproduce the form of the original manuscript. In order to restore the original style there are, in my opinion, two ways. The first is simply to redraw the text. One such restoration exists, attached to Uppström’s edition (see above 4.4.); I assume that this particular page (St. Matthew VI: 9-16) was chosen because it includes the Lord’s Prayer. The second way is digital restoration, as suggested by Marchand, who also created one such page (see also above 4.4.). Using the software I prepared and assembled, I set about following this path. Up to now I have restored four plates, and my expertise is obviously not polished yet; I figure that with more experience I will become more proficient. After all, the software accomplishes a great deal of the work; the rest is digital handicraft.


The first step in the restoration process is to choose the mode of photo to work with: fluorescent, ultraviolet or, if one exists in the supplement, a third kind. From the limited experience I have gathered, the fluorescent image is easier to work with than the others. If one of the other alternatives seems to be of better quality, I negate the colors, so that the writing is dark and the background bright. With the aid of the software, the threshold of the image is adjusted as far as possible: the text is rendered black and the background white. The basic idea is to threshold different segments one at a time. The main problem is that different parts of the page are of different quality. One way to solve the problem is to cut the page into separate segments; as far as the Codex Argenteus is concerned, I have not done this yet, but while dealing with the palimpsests I actually cut each line into separate photos (see below). Experience shows that after several thresholdings, no more than five, extra cleaning in one spot may delete essential information in other spots of the photo.
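The negation step mentioned above is the simplest of these operations. A minimal sketch, assuming the photo is held as a raw buffer of 8-bit channel values (the same assumption works for grayscale and for interleaved RGB):

#include <cstddef>

// Invert every channel value so that dark writing becomes bright
// and a bright background becomes dark.
void negate(unsigned char* image, std::size_t bytes)
{
    for (std::size_t i = 0; i < bytes; ++i)
        image[i] = 255 - image[i];
}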

The next step is going through the processed photo, cleaning the noise that is left, checking that no information has disappeared, and restoring with a digital brush the letters, or parts of letters, that have disappeared from the original leaves during the centuries since the Codex was prepared. This part of the process takes around two days. I assume that with more experience it might take less but, since for the time being I am restoring pages which are in relatively good condition, it is very likely that as my work proceeds I will have to spend more time on restoring individual letters. After the cleaning and checking are done, I change the color of the letters into silver, and of those which were originally written in gold into yellow, following a tradition established by Uppström and Marchand. The reconstruction of missing letters, or parts of them, is marked in green. The original Codex was written on purple parchment, and for this reason both Uppström and Marchand painted the background in this color. Marchand wrote (1987: 26):

Imitating hands did not suit us, for we wanted more. We wanted to reconstitute a manuscript, to make a replica of the Codex Argenteus which looked more like the Codex than it itself did. That is, we wanted a manuscript which looked like the original ca. 520 A.D., one which people could read … The results are good enough to fool experts.

In this respect, I decided to deviate from the established tradition. In my view, a computer screen, with due respect, is not a parchment, and what looks beautiful on one medium does not necessarily look the same on another. Moreover, since most printers are still black and white and, even with color printers, one does not always know what will come out, instead of a purple background I mark the contour of the letters with purple pixels. In this manner I enhance the edges of the letters and, in general, sharpen the appearance of the text (Figure 6.26), even if the printing is done on a black-and-white printer.


Figure 6.26: A restored portion of the Codex Argenteus

In order to enhance the contour, I wrote another function. The basic idea is that in the vicinity of an edge one pixel has the color of the background, in this case white, while the pixel on the edge of the letter is not white. The function retrieves two pixels at a time, pushes them into a vector of six places (three for each pixel) and then checks whether the first three have the same color as the background while the other three do not. If the condition is true, the color of the edge pixel is changed to the required color:

if ((RGB[0] == First_Pixel_Red) && (RGB[1] == First_Pixel_Green) && (RGB[2] == First_Pixel_Blue) &&
    (RGB[3] != First_Pixel_Red) && (RGB[4] != First_Pixel_Green) && (RGB[5] != First_Pixel_Blue))
{
    // The first pixel has the background color and the second does not:
    // the second pixel lies on the edge, so it receives the contour color.
    RGB[3] = Second_Pixel_Red;
    RGB[4] = Second_Pixel_Green;
    RGB[5] = Second_Pixel_Blue;
}

The program also checks for the opposite situation, where the first pixel is part of the letter and the second belongs to the background. In this case, too, the edge of the letter receives the contour color given in the parameters:

else if ((RGB[0] != First_Pixel_Red) && (RGB[1] != First_Pixel_Green) && (RGB[2] != First_Pixel_Blue) &&
         (RGB[3] == First_Pixel_Red) && (RGB[4] == First_Pixel_Green) && (RGB[5] == First_Pixel_Blue))
{
    // The first pixel lies on the letter and the second has the background
    // color: the first pixel receives the contour color.
    RGB[0] = Second_Pixel_Red;
    RGB[1] = Second_Pixel_Green;
    RGB[2] = Second_Pixel_Blue;
}

If the pair of pixels in the vector fulfills neither of these alternatives, the program leaves the values of the pixels unchanged.

To cover the situation where both pixels of a pair lie just before the edge or just after it, I added a function which skips the first pixel of the photo and starts retrieving two pixels at a time from the second pixel. Another problem is that the functions advance row after row and therefore do not check the tops and bottoms of the letters. To solve this problem, I simply rotate the image so that the columns become rows; running both functions again, all edges are enhanced. After this process is complete, all that is left to do is rotate the photo back to its original position. The last step is to change the mode of the photo into one displayable in HTML and insert it into my homepage: http://www.cs.tut.fi/~dla/C_A_Restore/restore.html.
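The following sketch gathers these pieces into one horizontal pass. The row-by-row buffer layout and the names are my assumptions; the skipped-pixel variant is folded into a start parameter instead of a separate function, and ‘not background’ is tested as ‘any channel differs’, which is slightly broader than the per-channel inequalities in the listings above:

#include <cstddef>

// One horizontal contour pass over an interleaved 24-bit RGB image.
// 'first' is the background color (here white), 'second' the contour
// color. Calling the pass with start = 0 and then start = 1 covers both
// pair alignments; for the tops and bottoms of the letters the image is
// rotated, both passes are run again, and the image is rotated back.
void mark_contours(unsigned char* image,
                   std::size_t width, std::size_t height,
                   const unsigned char first[3],
                   const unsigned char second[3],
                   std::size_t start)
{
    for (std::size_t y = 0; y < height; ++y) {
        unsigned char* row = image + 3 * width * y;
        for (std::size_t x = start; x + 1 < width; x += 2) {
            unsigned char* RGB = row + 3 * x;  // two adjacent pixels, 6 bytes
            bool left_bg  = RGB[0] == first[0] && RGB[1] == first[1] &&
                            RGB[2] == first[2];
            bool right_bg = RGB[3] == first[0] && RGB[4] == first[1] &&
                            RGB[5] == first[2];
            if (left_bg && !right_bg) {        // background, then letter
                RGB[3] = second[0]; RGB[4] = second[1]; RGB[5] = second[2];
            } else if (!left_bg && right_bg) { // letter, then background
                RGB[0] = second[0]; RGB[1] = second[1]; RGB[2] = second[2];
            }
        }
    }
}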

In 1970, one leaf of the Codex Argenteus was found in the Cathedral of Speyer, Germany. A good photo of it was published in Munkhammar’s book (1998: 98). I scanned the photo and for a few hours tried to clean it with the software I had developed, but to no avail. My conclusion is that some kind of filtering, such as ultraviolet photography, must be performed before one attempts further manipulation.

The measurements and tests that have been performed show that the best contrast can be achieved in the near-infrared band of the spectrum.

Deciphering the Palimpsests

Success in deciphering the palimpsests means that someone else failed in the task given to him, namely, erasing any trace of Gothic from the expensive parchments. The opposite is also true: failure to decipher the Gothic text means that somebody did his work properly, whatever one may think of the assignment he was given. What I am trying to say is that the palimpsests are a chaos. To study a photo of a palimpsest, I first cut it into lines, and each line I divide into three parts. Using the software at my disposal, I try to find segments of letters. Unlike the handling of the Codex Argenteus, in studying the palimpsests I do not try to remove the noise, for the simple reason that, from the point of view of the monks who washed away the original Gothic text, the noise is exactly what I am interested in.

Here is a study of the first few lines of plate 50a of Codex Ambrosiana A (Romans XIII: 13-14 and XIV: 1-5). If I agree with the present reading, I reproduce the corresponding letter in black; if I cannot confirm the reading, I reproduce it in outline form. Places that I consider problematic I mark with a question mark. For finding possibly similar words I use the search engine of Project Wulfila (see above 3.2.2.), and for comparison with the Biblical text I use the King James Version.

In order to enable a more efficient examination of the photo, I divide each line into three parts and enlarge them.


Figure 6.27: Plate 50a, line 1

Figure 6.28: Plate 50a, line 1, part 1


Figure 6.29: Plate 50a, line 1, part 2


Figure 6.30: Plate 50a, line 1, part 3


It seems as if there are some letters in the third part (Figure 6.30). However, according to the Biblical text there should be none. Apparently, these are traces of letters from the other side of the parchment.


Figure 6.31: Plate 50a, line 2


Figure 6.32: Plate 50a, line 2, part 1


Figure 6.33: Plate 50a, line 2, part 2


Figure 6.34: Plate 50a, line 2, part 3


Figure 6.35: Plate 50a, line 3

Figure 6.36: Plate 50a, line 3, part 1

The present reading leikis ‘body’ (Figure 6.36) is well attested and semantically correct. However, the space between the letters seems too wide and, in addition, there seems to be something else there. I raise the possibility that the word there is leihtis ‘flesh’, with the combination of ‘h’ and ‘t’ written as one letter (a ligature, see above 4.6.). In fact, the King James Version uses the word ‘flesh’: “…and make no provision for the flesh” (Romans XIII: 14). The word leihtis appears once in the Gothic text, in II Corinthians 1: 17: “…do I purpose according to the flesh…”


Figure 6.37: Plate 50a line 3, part 2

In this section (Figure 6.37) I could not discern any Gothic letter. I would suggest that the transliterated text “munni taujaiv” is a reconstruction.


Figure 6.38: Plate 50a, line 4

Figure 6.39: Plate 50a, line 4, part 1

Since this is the beginning of the line (Figure 6.39), some text must have been written there. One possibility is that the reconstructed word taujaiv is divided into two parts and the second part is at the beginning of this line.


Figure 6.40: Plate 50a, line 4, part 2


It seems as if there are still Gothic letters in this line (Figure 6.40); however, according to the Biblical text there should be none.

In the process of deciphering the Gothic manuscripts, the best tools are still sharp eyes, experience and a good knowledge of Gothic. One also needs a concordance or a search engine for exploring different alternatives in other sections of the text. Parts of the text are highly unintelligible, and my guess is that they were just as unintelligible almost two hundred years ago, when the first studies of the Ambrosian codices were conducted. Without better photos, such as those prepared for the facsimile edition of the Codex Argenteus, it would be extremely hard to discern more details in those spots. However, careful digitization may help in distinguishing which parts of the text were actually read and which parts are reconstructions.

Preliminary specifications for Wulfila

When making the preliminary plan for this research, my original intention was to conduct it with Matlab. However, I very soon realized that I would need to write my own software. In addition to this software, I use three different image processing programs: Photoshop, xv and the Gimp; the first has only a PC version and the other two are designed to work on Unix operating systems. Presently I work in front of two screens, one belonging to a PC and the other to a Unix terminal, and to transfer data from one system to the other I use the program WS_FTP. Obviously, it is not very convenient to move from one program to another and from one operating system to the next. Eventually I will have to combine all the functions I need into one program that works on a single operating system. I call the designated program Wulfila, in honor of the Gothic bishop who created such an excellent translation of the Bible into a language that, before him, had no letters. In this chapter I specify the features I plan to include in this program. The format of this specification document is based on the Finnish Standards Association standard SFS-EN ISO 9001, which follows the international standard ISO 9001:1994, “Quality systems – Model for quality assurance in design, development, production, installation and servicing”; the form used here is an adaptation made at Tampere University of Technology. The specifications are actually outlines: in the next stage, this section will be separated from this paper into an independent document and enlarged.

Conclusions

Digital technology offers new possibilities for handling old texts. However, there are no magic solutions for reading and presenting them. One can only hope that the current hierarchy of priorities, which places communication and medical applications high on the ladder, will leave some room for studies in the domain of cultural heritage.

In this paper I have tried to demonstrate that digital technology assists in restoring images of old texts and may help in deciphering those hard spots in the manuscripts which escaped the tools used by earlier generations. In addition, this technology lends the process a high level of transparency: in earlier periods, very few people had access to the original text, and the scientific community concerned had to trust their judgments. With the advent of photography, the text became more widely available; indeed, for this work I use photographs of the originals. With the wide use of CD ROMs and easy access to the Internet, there are now even greater possibilities for providing access to the manuscripts and their studies. In fact, each scholar can, even with existing software, carry out a great deal of digital manipulation.

The Gothic manuscripts present two main challenges: restoring and deciphering. The Codex Argenteus has been widely and extensively studied, and the transliterated text in use is generally reliable. Earlier attempts to restore it for smooth reading did not go beyond a few plates. With the software I have prepared, it takes me around two days to restore one plate; I figure it would take three to four full years of work to restore all the leaves. Since the last person who used Gothic in daily life died well over one thousand years ago, the restoration process will end there.

With the palimpsests, any discussion of restoration is in vain; the most pressing problem is producing filtered photos to work with. Since the holders of the manuscripts, the curators of the Ambrosiana in Milan, are uncooperative, one possible path of research would be to try to imitate optical filters with digital methods. To imitate the effect of such a filter on a 24-bit picture would be impossible, but flattening the picture to 8 bits (256 colors) may produce some results. The first step would be to take a photo, with, e.g., an ultraviolet filter, of a palette of 256 colors and compare the transformed colors with the originals. The next step would be to prepare a program that transforms a photo into a ‘filtered’ one. Writing a program with one ‘if’ and 255 ‘else if’s is quite tedious and time-consuming, and the only consolation is knowing that the computer will have to work much harder every time it processes an image. The main drawback of such an attempt is that one actually needs the original manuscript: genuine ultraviolet and infrared radiation can sometimes reveal information that cannot be seen or photographed in white light. However, since the original manuscript is not available and, moreover, ultraviolet radiation may damage the parchment, one must also experiment with unconventional methods.
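As a side note, the chain of 256 branches could be avoided by storing the measured filter response in a lookup table; the sketch below is my suggestion for such a shortcut, not part of the planned program, and assumes that the comparison described above has already filled a 256-entry table:

#include <cstddef>

// 'filtered' maps each of the 256 palette indices to the value the
// optical filter was measured to produce for that color; applying the
// table replaces the long if/else-if chain with a single array lookup.
void apply_filter_table(unsigned char* image, std::size_t size,
                        const unsigned char filtered[256])
{
    for (std::size_t i = 0; i < size; ++i)
        image[i] = filtered[image[i]];
}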

It is true that digital technology cannot solve all the problems encountered while studying old manuscripts; however, it offers new methods and possibilities which were unavailable to earlier generations of scholars.