Computer Analysis of Pentateuchal Composition

According to this Yahoo article, a group of Israeli scholars has developed an “authorship attribution” computer program that divides the Pentateuch between two authors, agreeing around 90% of the time with the traditional academic divisions (which identify a priestly and a non-priestly source). The results were presented at a recent Association for Computational Linguistics conference in Portland, Oregon. As a control, the researchers jumbled up Ezekiel and Jeremiah to see how well the program could pull them apart. The program, according to the researchers, divided the books “almost perfectly.” A couple of other interesting results from the experiments:

– Genesis 1 is non-priestly

– Isaiah is divided into two authors responsible for Isa 1–32 and 33–66, respectively
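To make the control experiment concrete, here is a toy sketch of the general idea: mix chunks from two sources, split them into two groups without telling the program which is which, and then check the split against the known answer. This is not the researchers' method. The features (raw word frequencies), the two-centroid cosine clustering, and the placeholder chunks (invented tokens standing in for passages from the two books) are all assumptions made for illustration.

```python
from collections import Counter
import math

def vectorize(chunk, vocab):
    """Relative word frequencies of one chunk over a fixed vocabulary."""
    counts = Counter(chunk.lower().split())
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocab]

def cosine(a, b):
    """Cosine similarity between two frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def two_way_split(chunks, iterations=10):
    """Assign each chunk to one of two groups (labels 0 and 1)."""
    vocab = sorted({w for c in chunks for w in c.lower().split()})
    vecs = [vectorize(c, vocab) for c in chunks]
    # Seed the groups with the first chunk and the chunk least similar to it.
    far = min(range(len(vecs)), key=lambda i: cosine(vecs[0], vecs[i]))
    centroids = [vecs[0], vecs[far]]
    labels = [0] * len(vecs)
    for _ in range(iterations):
        labels = [
            0 if cosine(v, centroids[0]) >= cosine(v, centroids[1]) else 1
            for v in vecs
        ]
        for k in (0, 1):
            members = [v for v, lab in zip(vecs, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def agreement(labels, truth):
    """Share of chunks grouped correctly, ignoring which group is called 0 or 1."""
    same = sum(lab == t for lab, t in zip(labels, truth))
    return max(same, len(labels) - same) / len(labels)

# Hypothetical placeholder chunks: "author A" favours alpha/beta/gamma,
# "author B" favours delta/epsilon/zeta. Real input would be passages of text.
mixed_chunks = [
    "the alpha beta alpha gamma",    # A
    "the delta epsilon delta zeta",  # B
    "alpha the gamma beta beta",     # A
    "zeta the epsilon delta delta",  # B
    "epsilon zeta the zeta delta",   # B
    "gamma alpha the alpha beta",    # A
]
known_sources = [0, 1, 0, 1, 1, 0]

split = two_way_split(mixed_chunks)
score = agreement(split, known_sources)
```

The `agreement` score plays the role of the “almost perfectly” check: it only has meaning because the true sources of the jumbled chunks are known in advance, which is exactly what the Pentateuch itself lacks.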

In related news, I was speaking today with Kent Clarke about the beta release of BibleWorks 9 (he is in charge of the BibleWorks Manuscript Transcription Project), and he mentioned that the software makes it possible to instantly compare all similarities and differences between manuscripts. They’ve only been able to run a limited number of full New Testament manuscripts so far, but he said that the main manuscript traditions are clearly aligned (Alexandrian, Byzantine, Western, Caesarean [a minor group only attested for Mark?]), as are subgroups that scholars have suggested in the past. His project

seeks to provide over the next four years scholarly-produced transcriptions of approximately two hundred New Testament papyri, uncial, and minuscule manuscripts. These new transcriptions, which will be based upon high quality digital images of the actual manuscripts or their facsimiles, will serve as the foundation for the development of a New Testament Textual Criticism software application. This project will (1) develop a module that incorporates new technology and processes that more accurately and effectively allow for manuscript transcription and collation; (2) provide extremely accurate representations of the manuscripts being transcribed; (3) make the raw transcription data available without charge for personal or academic use; (4) effectively enable program users to immediately compare, contrast, and fully collate any desired selection—either partial or full text—of these manuscripts; (5) allow for a broad range of detailed statistical queries relating but not limited to such issues as textual affiliation and congruity of the New Testament manuscript tradition; (6) link directly to the manuscript transcriptions their corresponding high resolution digital images where publication permissions have been granted; (7) be “open-ended” in that ongoing transcription and imaging work, as well as the recording of relevant extra-textual features such as sociological and codicological details, can be seamlessly incorporated into the software; and (8) serve as the groundwork for published volumes containing full transcriptions of each manuscript, as well as complete collations for each book of the New Testament.
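For readers wondering what comparing and collating transcriptions might look like in practice, here is a minimal sketch using Python's standard `difflib`. It is not the BibleWorks implementation, about which I know nothing beyond the description above. The function names and the three short English "witnesses" are invented placeholders, not readings from any real manuscript.

```python
from difflib import SequenceMatcher

def collate(base_words, witness_words):
    """List the places where a witness differs from the base text."""
    matcher = SequenceMatcher(None, base_words, witness_words, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replace / delete / insert
            variants.append(
                (tag, " ".join(base_words[i1:i2]), " ".join(witness_words[j1:j2]))
            )
    return variants

def agreement_ratio(a_words, b_words):
    """Rough 0-to-1 similarity between two transcriptions, word by word."""
    return SequenceMatcher(None, a_words, b_words, autojunk=False).ratio()

# Hypothetical transcriptions, already split into words.
ms_a = "and he went into the house".split()
ms_b = "and he came into the house".split()
ms_c = "and he went into house".split()

variants_ab = collate(ms_a, ms_b)  # one substitution: "went" -> "came"
variants_ac = collate(ms_a, ms_c)  # one omission: "the"
```

Computing `agreement_ratio` for every pair of manuscripts gives a simple similarity table, and grouping manuscripts with high mutual agreement is one naive way to ask whether families like the ones Kent described show up. Real textual affiliation work would need to weigh variants rather than just count them.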

10 responses to “Computer Analysis of Pentateuchal Composition”

  • diglot

    Wait, so are you saying that the program did come up with a Caesarean text type in Mark? That is definitely interesting. I remember reading that some have claimed there is evidence for the Caesarean text type in the Catholic Epistles and in John as well.

    Though, most hypotheses concerning the Caesarean text center around the gospel of Mark and P45 & W.

    • Daniel O. McClellan

      Kent told me that the four main text types are represented, as well as some proposed subgroups. I told him I was only familiar with three text types. He told me about the fourth and said it was limited to Mark. I’d have to talk to him again before I could say much more on that specific issue.

  • Bryan Kerr

    I was wondering how they set up the program. Do you know if they programmed their computer to search for possible authorial sources based on the same parameters that scholars use to separate the text currently? If that is the case, wouldn’t it make sense that the computer program would come up with similar findings? Just wondering.

  • John Anderson

    While this MIGHT be helpful in confirming suspicions of biblical scholars, all it ultimately reveals is that there are literary fractures in the text. This we knew, and this we identified quite ‘easily’ and often throughout history. Unfortunately, this program tells us nothing about WHAT the divisions are (i.e., P, non P, pre-P, etc.).

  • Alan Hooker

    What do you think about the result that Gen 1 is non-priestly? If that’s true, some recent essays I’ve handed in mis-attribute Gen 1 😉

    • Daniel O. McClellan

      I read through the actual publication, and I couldn’t find any discussion of this. I think it’s an interesting find, but without knowing anything more about it I would say it could be part of that margin of error.

  • Gavriel

    I’d like to see this program used on:

    –Modern authors
    –With multiple works
    –Decades of writing
    –Multiple types of works (poetry, non-fiction, fiction, etc.).
