Help:Transcription Guidelines: Difference between revisions

Transcribe Bentham: A Collaborative Initiative


TB Editor (talk | contribs)
No edit summary
TB Editor (talk | contribs)
 
(98 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Introduction ==
== Introduction ==
----
----
This page offers Guidelines for users of the Transcribe Bentham Transcription Desk.  Here, you can find specific directions about how best to transcribe Bentham's writings, and how to [[#Why_Encode.3F|encode]] specific features and phenomena of the manuscripts.


Some of this may seem daunting, but do not be afraid of having a go at transcription - it is impossible to break anything, and any errors you might make can easily be reverted! You can always [mailto:transcribe.bentham@ucl.ac.uk send an email] to the Transcribe Bentham team with any queries.
''[http://blogs.ucl.ac.uk/transcribe-bentham/guidelines/ Download a printable PDF of these guidelines]''


We recommend opening this page in a new browser tab or window so you can refer to the guidelines while you are transcribing.
This page offers Guidelines for users of Transcribe Bentham. Here you can find directions about how best to transcribe Bentham's writings, and how to encode specific features of the manuscripts.
Some of the information below may seem daunting at first, but do not be afraid to have a go—it is impossible to break anything, and any errors you might make can easily be reverted.


This document was first written in May 2010 and was last updated in December 2017. The guidelines have evolved slightly over time in line with editorial discussions about transcription and encoding.
The Guidelines are divided into four sections:
* The [[Getting Started]] guide gives an overview of the transcription process, from start to finish.
* The [[#Basic Principles| Basic Principles]] guide explains the essentials of transcription and encoding.
* The [[#Core Guidelines| Core Guidelines]] describe the manuscript features that users will encounter most frequently, and how to deal with them.
* The [[#Supplementary Guidelines| Supplementary Guidelines]] discuss the treatment of less-frequently occurring features of the manuscripts, such as ligatures, symbols, and foreign-language words.


The guidelines are divided into three sections: Getting Started, Core and Supplementary guidelines.
We are very grateful to our transcribers for their hard work. Visit our [http://blogs.ucl.ac.uk/transcribe-bentham/credits/ Credits] page to find out how our volunteers are acknowledged for the work they do.
* [[Getting Started]] gives an overview of the transcription process with instructions and video tutorials.
* [[#Core Guidelines| Core Guidelines]] describe the manuscript features that users will encounter most frequently, and how to deal with them. Such features include additions, deletions, and notes.
* [[#Supplementary Guidelines| Supplementary Guidelines]] discuss the treatment of less-frequently occurring features of the manuscripts, such as ligatures, symbols, and foreign-language words.


== User preferences ==
Need more information? Check out our [[Help:Contents|Help pages]], or [mailto:transcribe.bentham@ucl.ac.uk email us] - we are always happy to help.


Once you have created an account and logged in, you can click 'preferences' near your username to localise how you view Transcribe Bentham.
__TOC__
 
== Basic Principles ==
=== Transcribing ===
 
Transcription refers to the text that the user reads from the Bentham manuscript and then copies into the Transcription Box. When transcribing text, your aim should be to produce a transcription which represents the text of the manuscript as accurately as possible. Reproduce Bentham’s spellings, capitalisation, and punctuation exactly as they appear on the manuscript, even if they seem incorrect. For instance, Bentham and his scribes frequently got accents on foreign letters incorrect or omitted them altogether. These mistakes should not be corrected, nor should any contracted words be expanded (e.g. ‘Mr’ to ‘Mister’) or any symbols be rendered as words (e.g. ‘&’ to ‘and’).


* '''User profile''': Here you can add your real name, and change the language in which the website is displayed.
Do your best to transcribe as accurately as possible, but do not worry too much if you cannot transcribe everything on a page. The TB Editor may be able to help improve your transcript, as may other volunteers.
* '''Date and time''': Here you can change the timezone in which you are based, to ensure that the time of edits is localised for you. Under 'Time zone', find the nearest city to you and click 'save'.


== Getting Started ==
Once you have transcribed a page, a final proofread often makes it easier to spot any errors. Thinking about the sense of the words on the page can help too—although please bear in mind that some of Bentham's papers do not make a lot of sense right away, especially when taken out of context.


Visit the [[Getting Started]] page for some brief guidelines on transcription, and video tutorials on using the transcription interface and encoding.  
For help in deciphering Bentham’s handwriting, please take a look at our Palaeography Skills page.


== Using the transcription interface ==
If you are unsure about how to transcribe something, do what you think is correct or send us an email at transcribe.bentham@ucl.ac.uk and we will try our best to help you as soon as possible.


For a more detailed description of the transcription interface, and the use of its various features, please visit the [[using the transcription interface]] page.
=== Encoding ===


__TOC__
We ask volunteers to encode their transcripts in Text Encoding Initiative (TEI)-compliant XML format, an industry standard method for encoding electronic texts. Encoding can be done simply by clicking the buttons on your transcription toolbar.
<!-- == Video Tutorial ==
----
Before you begin to read the Guidelines, '''please watch the video below'''. To view a full-screen version of the video, click on the video while it is playing to open it in YouTube, and then click the 'Full Screen' button at the bottom right of the video window.


<videoflash>_B-TLHaXxJ4</videoflash> -->
By encoding your transcripts, you are helping to create a richer learning resource—owing to your efforts, researchers who are interested in Bentham's writing process, deletions and revisions, will gain valuable insight they might not otherwise have had.
Encoded transcripts also allow for more powerful and refined searching.


== Why Encode? ==
Tags are used to identify parts of the transcription and usually come in pairs, known as ‘opening’ and ‘closing’ tags. If, for instance, a user wished to note that the word ‘utility’ was deleted from a manuscript they were transcribing, it would be tagged as <del>utility</del>.
----
In the past, scholars who have transcribed the manuscripts of Jeremy Bentham have done so with a standard word-processing tool (most recently, Microsoft Word). These transcriptions were undertaken for the purpose of providing text for the editors of the various volumes of ''The Collected Works of Jeremy Bentham''. This is still one of the intended benefits of the Transcribe Bentham Initiative, but this will result in slightly different transcriptions.


For one, because the earlier transcriptions were always produced with an eye towards their eventual publication in print format, diplomatic transcriptions (which aimed to represent faithfully every textual aspect of the manuscript) were not a priority. This was particularly the case if the editor of the volume was doing the transcription work: the editor might see a deleted passage in the manuscript, realise that it would not form part of the printed volume, and thus leave it out of the transcription.
To apply a set of tags, highlight a word or passage and click the appropriate button in the toolbar. Opening and closing tags will appear around your highlighted text. A closing tag can be identified by a forward slash after the ‘<’. Sometimes only single tags are needed (such as for the ‘<gap/>’ and ‘<lb/>’ tags)—to apply these, simply click the relevant button in the toolbar.
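
As a rough illustration of the difference between paired and single tags, the line below combines both. The wording is invented for this example and does not come from a manuscript:

  <nowiki>the greatest <del>utility</del> happiness of the <gap/> number<lb/></nowiki>

Here the <nowiki><del></del></nowiki> pair surrounds the word it describes, while <nowiki><gap/></nowiki> and <nowiki><lb/></nowiki> each stand alone and enclose nothing.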


The implicit assumption of the Transcribe Bentham Initiative (which is prioritising the production of diplomatic transcriptions) is that while the transcriptions are a means to the publication of further volumes of ''The Collected Works of Jeremy Bentham'', they are also a valuable and interesting end, in and of themselves. The embodiment of this assumption will come with the linking of the transcriptions produced in the Transcription Desk to the [http://www.benthampapers.ucl.ac.uk/ Bentham Papers Database].
To see how the markup has been applied to your transcript, click the ‘Preview’ tab in the transcription interface.


Not only will the future Bentham editor be supplied with fully transcribed manuscripts from which to produce new editions, but the scholar who is interested in examining Bentham's writing processes, his deletions, revisions, and marginalia, will be afforded the opportunity to pursue this interest in an unmediated fashion.
Encoding may seem challenging at first but you will soon get the hang of it. As you become more confident, you may prefer to type the markup on your keyboard rather than using the transcription toolbar.


The value that the encoded transcriptions will add includes the possibility of more powerful and refined searching than a simple full-text search. Rather than simply searching for every occurrence of the word "panopticon", a user may wish to see where "panopticon" occurs only in marginal summaries - encoding, which can identify and mark features such as marginal summaries, will facilitate such a search. The transcriptions that result from the Transcribe Bentham Initiative will be encoded in TEI-compliant XML, which is the de facto standard for encoding electronic texts in the humanities academic community, and is a widely-used and -supported non-proprietary format. To learn more about TEI, visit the website of the [http://www.tei-c.org/index.xml/ Text Encoding Initiative].
For more information, please take a look at our Encoding page. If you are interested in learning more about TEI, we recommend visiting the TEI by Example website, which contains a number of tutorials and exercises.


== Core Guidelines ==
== Core Guidelines ==
----
----
=== Transcribing ===
When transcribing the text, your aim should be to produce a transcription which represents the text of the manuscript as accurately as possible. Reproduce Bentham's capitalisation and punctuation exactly as they appear on the manuscript, even if they seem incorrect to you. Bentham and his scribes frequently get diacritics on foreign letters (accents, umlauts, cedillas, etc.) wrong or miss them out altogether; these should not be corrected, or added to the text if missing. Do not expand any contracted words (e.g. 'Mr.' to 'Mister') or verbalise symbols (e.g. '<nowiki>&</nowiki>' to 'and'). Changes to the text may only be made in the case of line-end hyphenation, as described below.
Emendation may be appropriate in a critical text, but the primary purpose of Transcribe Bentham is not to produce critical texts: it is to represent, in typographic form, the textual inscriptions of Bentham's manuscripts. At a future time, these transcriptions may form the basis for critical editions of Bentham's writings, at which point the editor may choose to emend material, to normalise punctuation, and so on, but such practices should not occur here.
Transcription might seem so self-explanatory that it does not require a definition. Since some scholarly editing manuals have, in the past, advocated (and in some cases, encouraged) standardisation, alteration, and silent correction of certain manuscript features, it is very important to know Transcribe Bentham's policies on such matters before you begin transcribing.


=== Headings ===
=== Headings ===


If the page you are transcribing includes a title or heading, you may identify this feature by highlighting the transcribed text of the heading and clicking the [[File:Jb-button-heading.png]] button on the toolbar. This will surround the heading with <nowiki><head></head></nowiki> tags. Bentham occasionally provides more than one heading: in this instance, simply apply separate <nowiki><head></head></nowiki> tags to each heading.
If the page you are transcribing includes a title or heading, you may identify this feature by highlighting the transcribed text of the heading and clicking the   button on the toolbar.
This will surround the heading with <head></head> tags. Bentham occasionally provides headings that span multiple lines—in this instance, simply add a <lb/> tag to mark the line break/s in the heading.
Using multiple heading tags within a single manuscript may create problems for the TB Editor when converting the text to XML and may slow down the checking process, so this should be avoided wherever possible.


[[File:Heading_example.jpg|500px|thumb|right|Headings]]
[[File:Heading_example.jpg|500px|thumb|right|Headings]]
Line 77: Line 73:




  <nowiki><head>Annuity Notes Proposed Advertisement on proposed publication on painless fees</head></nowiki>
  <nowiki><head>1818 April 15<lb/>
Annuity Notes Proposed Advertisement on proposed publication on painless fees</head></nowiki>


===Footers===


====Footers====
Page numbers and other details which sometimes appear in footers on Bentham's manuscripts should be marked up in the same manner as notes (i.e. using <note></note> tags).
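
A footer might therefore be encoded along the following lines. The footer text shown is hypothetical and only illustrates the shape of the markup:

  <nowiki><note>Page 3</note></nowiki>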


Page numbers and other details which sometimes appear in footers on Bentham's manuscripts should be marked up in the same manner as headings.
=== Paragraphs ===


Bentham occasionally quartered or divided his page into sections by drawing lines across the page. To identify when Bentham begins a new section, users should insert a [[#Page_Breaks|page break]] by clicking [[File:Jb-button-pagebreak.png]]. If a section has a footer, mark this up as a heading before inserting the page break.  
Once a paragraph from a manuscript has been transcribed, it may be identified by highlighting the text of the paragraph with the cursor and clicking the [[File:Jb-button-paragraph.png]] button on the toolbar. This surrounds the text with <p></p> tags.


=== Paragraphs ===
All text that is not included in heading or note tags should be enclosed by <p></p> tags.


Once a paragraph from a manuscript has been transcribed, it may be identified by highlighting the text of the paragraph with the cursor and clicking the [[File:Jb-button-paragraph.png]] button on the toolbar. This surrounds the text with <nowiki><p></p></nowiki> tags. Any text not included in heading or note tags should be enclosed by <nowiki><p></p></nowiki> tags, even if a discrete paragraph is not physically represented on the manuscript image (i.e. even if the text is a continuation of a paragraph from the previous manuscript or if it continues to the next manuscript).
If the word at the beginning of a paragraph has been indented, the indentation does not need to be reproduced in the transcript. The first word of the paragraph can be typed right next to the opening <p> tag.
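
For instance, the paragraph used in the page break example in these guidelines begins directly after the opening tag, with no space or indentation reproduced:

  <nowiki><p>England has long been regarded...</p></nowiki>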


=== Line Breaks ===
=== Line Breaks ===


In order to preserve the lineation of the manuscripts, a line break should be inserted directly after the final word or punctuation mark of each line. In order to do this, click the [[File:Jb-button-lb.png]] button on the toolbar: this inserts an <nowiki><lb/></nowiki> tag. It is important to note that the <nowiki><lb/></nowiki> tag does not have opening and closing tags, as it is a [http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#CORS5 milestone element], which marks a place in a text and does not have any content.
In order to preserve the lineation of the manuscripts, a line break should be inserted directly after the final word or punctuation mark of each line. In order to do this, click the [[File:Jb-button-lb.png]] button on the toolbar—this inserts an <lb/> tag.  
 
It is important to note that the <lb/> tag does not have opening and closing tags, as it is what is called a ‘milestone element’, which marks a place in a text and does not have any content.
Once you have added an <lb/> tag, please press return and begin the next line of the transcript on a new line in the Transcription Box. This will make it easier for you to follow the text you are transcribing, and for the TB Editor to check your work quickly.
 
If the tag is written as <lb> rather than <lb/>, then all text following the incorrect <lb> tag will not be displayed by the Transcription Desk when the transcript is saved. To correct this problem, simply find the incorrect line break tag and add the '/' to it.
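
The difference is a single character. As a sketch, using the line from the hyphenation example in these guidelines:

  Incorrect: <nowiki>in which a difference in the point<lb></nowiki>
  Correct:   <nowiki>in which a difference in the point<lb/></nowiki>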


Some users have noted an error occurring when using line break tags. If the tag is written as <nowiki><lb></nowiki> rather than <nowiki><lb/></nowiki>, then all text following the incorrect <nowiki><lb></nowiki> tag will not be displayed by the Transcription Desk when the transcript is saved. To correct this problem, simply find the incorrect line break tag and add the '/' to it.  
Line break tags are not required at the end of a paragraph or at the end of a manuscript—simple </p> or </note> tags are fine.
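
A short sketch of a complete paragraph, with placeholder wording rather than text from a manuscript, showing where line break tags are and are not needed:

  <nowiki><p>first line of the paragraph<lb/>
  second line of the paragraph<lb/>
  final line of the paragraph</p></nowiki>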


==== Line-end Hyphenation ====
==== Line-end Hyphenation ====


When a hyphenated word appears at the end of a line, transcribe the word without the hyphen, and insert the <nowiki><lb/></nowiki> tag by clicking [[File:Jb-button-lb.png]] after the complete word.
When a hyphenated word appears at the end of a line, transcribe the full word without the hyphen, along with any punctuation that immediately follows it, and then insert the <lb/> tag by clicking [[File:Jb-button-lb.png]].
 


[[File:Line-end_hyphenation.jpg|500px|thumb|right|Line-end hyphenation]]
[[File:Line-end_hyphenation.jpg|500px|thumb|right|Line-end hyphenation]]
Line 110: Line 114:
  in which a difference in the point<nowiki><lb/></nowiki>
  in which a difference in the point<nowiki><lb/></nowiki>


If the hyphenated word is followed directly by a punctuation mark, include the punctuation mark before the <nowiki><lb/></nowiki> tag.
By transcribing hyphenated words in this way, you are making it easier for them to be picked up in keyword searches.


=== Page Breaks ===
=== Page Breaks ===


Like line breaks, a page break is indicated in markup with a milestone element: <nowiki><pb/></nowiki>. When transcribing a folio that contains more than one page ([[JB/027/124/001]], for example), a page break should be inserted to mark the point at which one page ends and another begins.
Like line breaks, a page break is indicated in markup with a milestone element: <pb/>. When transcribing a folio that contains a double page (JB/027/124/001, for example), a page break should be inserted to mark the point at which one page ends and another begins.
 
In order to do this, position the cursor at the relevant point in the transcription and click the [[File:Jb-button-pagebreak.png]] button on the toolbar: this will insert a <nowiki><pb/></nowiki> tag.


To do this, position the cursor at the relevant point in the transcription and click the [[File:Jb-button-pagebreak.png]] button on the toolbar: this will insert a <nowiki><pb/></nowiki> tag.
A <nowiki><pb/></nowiki> tag does not need to be inserted at the end of a single page.


In [[JB/027/124/001]], the page break would be recorded thus:
In [[JB/027/124/001]], the page break would be recorded thus:
Line 123: Line 129:
   positively assert</p>
   positively assert</p>
   <pb/>
   <pb/>
   <head>C</head>
   <head>C<lb/>
   <head>Prefat.</head>
   Prefat.</head>
   <p>England has long been regarded...</p></nowiki>
   <p>England has long been regarded...</p></nowiki>


Bentham occasionally quartered or divided large sheets into sections by drawing lines across the page. To identify when Bentham begins a new section, users should insert a page break by clicking [[File:Jb-button-pagebreak.png]]. If a section has a [[#Footers|footer]], mark this up as a [[#Headings|heading]] before inserting the page break.  
Bentham occasionally quartered or divided large sheets into sections by drawing lines across the page. To identify when Bentham begins a new section, users should insert a page break by clicking [[File:Jb-button-pagebreak.png]]. If a section has a [[#Footers|footer]], mark this up as a note before inserting the page break.  


Lines drawn across single sheets (e.g. [[JB/071/049/002]]) should not be considered as page breaks.
Lines drawn across single sheets (e.g. [[JB/071/049/002]]) should not be considered as page breaks.


=== Additions ===


=== Additions ===
The [[File:Jb-button-add.png]] button in the toolbar is used to mark a part of the text that was added to the manuscript after the surrounding text was written. This method may be used to mark additions, whether they are added above or (very rarely) below the line. The exception to this is [[#Marginal_Notes_.26_Summaries|marginal additions]], which are described below.


In its simplest form, the [[File:Jb-button-add.png]] button in the toolbar is used to mark a part of the text that was added to the manuscript after the surrounding text was written. The exception to this is [[#Marginal_Notes_.26_Summaries|marginal additions]], which are described below. This method may be used to mark additions, whether they are added above or (much less frequently) below the line. Highlight the addition and click the [[File:Jb-button-add.png]] button to surround it with <nowiki><add></add></nowiki> tags, as in the example below:
Highlight the addition and click the [[File:Jb-button-add.png]] button to surround it with <nowiki><add></add></nowiki> tags, as in the example below:


[[File:Addition_example.jpg|500px|thumb|right|Addition]]
[[File:Addition_example.jpg|500px|thumb|right|Addition]]
Line 153: Line 160:
=== Deletions ===
=== Deletions ===


Where a word or a sequence of words has been deleted in the manuscript, highlight the relevant text and click the [[File:Jb-button-del.png]] button in the toolbar. This will surround the text with <nowiki><del></del></nowiki> tags.
Where a word or a sequence of words has been struck-through in the manuscript, highlight the relevant text and click the [[File:Jb-button-del.png]] button in the toolbar. This will surround the text with <nowiki><del></del></nowiki> tags.


[[File:Deletion_example.jpg|500px|thumb|right|Deletion]]
[[File:Deletion_example.jpg|500px|thumb|right|Deletion]]
Line 165: Line 172:
  artificial: <nowiki><del>tables of it's population:</del></nowiki> tables of the
  artificial: <nowiki><del>tables of it's population:</del></nowiki> tables of the


Exercise common sense when deciding on the extent of deletions. Where the strikethrough does not physically cancel a punctuation mark that is apparently part of the deletion, you may assume that it forms part of the deletion. If in doubt about a particular example, you may email the editors at transcribe.bentham@ucl.ac.uk.
Just do what you think is best when deciding on the extent of deletions. Where the strikethrough does not physically cancel a punctuation mark that is apparently part of the deletion, you may assume that it forms part of the deletion. If in doubt about a particular example, you may [mailto:transcribe.bentham@ucl.ac.uk send an email] to the TB Editor.
 
In some instances, entire pages or paragraphs are crossed out (e.g. [[JB/027/029/003]]), which indicates where Bentham or his scribes have used a particular passage when putting together a work. Text which is struck through in this manner should not be enclosed in deletion tags.


In some instances, entire pages or paragraphs are crossed out (e.g. [[JB/027/029/003]]), which indicates where Bentham or his scribes have used a particular passage when putting together a work. Text which is struck through in this manner should not be enclosed in deletion tags.
Just as it is best to avoid additions within additions, please also avoid using deletions within deletions—both of these practices prevent the transcription preview from displaying correctly and cause issues when we save your transcripts as .xml files.


=== Complex Additions and Deletions ===
=== Complex Additions and Deletions ===


Transcribers will quickly become aware of instances of more complex intervention in the manuscripts, often where there is a combination of added and deleted text. One such example might be called 'substitution', where text added above the line is intended to replace text that is deleted with a strikethrough.
Transcribers will quickly become aware of instances of more complex intervention in the manuscripts, often where there is a combination of added and deleted text. One such example is called 'substitution', where text added above the line is intended to replace text that is deleted with a strikethrough.
 
In marking substitutions, simply identifying text that has been added and text that has been deleted in its proper sequence will suffice.


[[File:Substitution_example.jpg|500px|thumb|right|Substitution]]
[[File:Substitution_example.jpg|500px|thumb|right|Substitution]]


The [http://www.tei-c.org/index.xml TEI] provides guidelines about encoding such phenomena with the <nowiki><subst></nowiki> element, but for the purposes of this project, simply identifying text that is added and text that is deleted will suffice.
The [http://www.tei-c.org/index.xml TEI] provides guidelines about encoding such phenomena with the <nowiki><subst></nowiki> element, but for the purposes of this project, simply identifying text that is added and text that is deleted will suffice.
Transcribers are advised that when ordering substitutions like this, the deleted text should be transcribed first, followed by the added text, following the implicit order in which the respective parts originally appeared in the manuscript.


For example, once the relevant parts of text from the example above have been tagged, the transcription will look like this:
For example, once the relevant parts of text from the example above have been tagged, the transcription will look like this:


  <nowiki><del>[To bring]</del><add>I will reduce</add></nowiki> the question at once
  <nowiki><del>[To bring]</del> <add>I will reduce</add></nowiki> the question at once
 
For the sake of consistency, transcribers are advised that when ordering substitutions like this, the deleted text should be transcribed first, followed by the added text, following the implicit order in which the respective parts originally appeared in the manuscript.


=== Catchwords ===
=== Catchwords ===
Line 197: Line 208:




   <nowiki><p>...in the <add>act</add> can not <add>be</add></p></nowiki>
   <nowiki><p>in the <add>act</add> can not<lb/>
<add>be</add></p>
</nowiki>


=== Illegible Text ===
=== Illegible Text ===


In the course of transcribing, you may encounter text that is illegible, either because Bentham's handwriting is difficult to read, or because it has been obscured by a strikethrough. There are slightly different ways to deal with each instance.
In the course of transcribing, you will inevitably encounter text that is illegible, either because Bentham's handwriting is difficult to read, or because it has been obscured by a strikethrough or cut off at the edge of the manuscript. There are slightly different ways to deal with each instance.
 
If a word or sequence of words on the manuscript is illegible, but has not been deleted, it may be identified by clicking the [[File:Jb-button-gap.png]] button in the toolbar. This inserts a <nowiki><gap/></nowiki> tag.  


==== Undeleted ====
Note that <gap/>, like <lb/>, is a milestone element, and does not have any content—as such it does not have opening and closing segments.


If a word or sequence of words on the manuscript is illegible, but has not been deleted, it may be identified by clicking the [[File:Jb-button-gap.png]] button in the toolbar. This inserts a <nowiki><gap/></nowiki> tag. Insert one <nowiki><gap/></nowiki> tag for each illegible word, if it is possible to distinguish the number of illegible words in a sequence.
If it is possible to distinguish the number of illegible words in a sequence, insert one <gap/> tag for each illegible word.
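
For example, if two consecutive words cannot be read, the line might be encoded as below. The surrounding words are invented for illustration:

  <nowiki>the right of <gap/> <gap/> is not</nowiki>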


==== Deleted ====
==== Deleted ====


If the word or sequence of words is illegible because it has been deleted or struck through on the manuscript, you should use the <nowiki><gap/></nowiki> tag in conjunction with <nowiki><del></nowiki> tags to indicate the reason for illegibility.
If the word or sequence of words is illegible because it has been deleted or struck through on the manuscript, you should use the <gap/> tag in conjunction with <del> tags to indicate the reason for illegibility.


[[File:Illegible_example.jpg|500px|thumb|right|Illegible text]]
[[File:Illegible_example.jpg|500px|thumb|right|Illegible text]]
Line 220: Line 235:


  But of that which remained, <nowiki><del><gap/></del></nowiki> as not
  But of that which remained, <nowiki><del><gap/></del></nowiki> as not
Note that <nowiki><gap/></nowiki>, like <nowiki><lb/></nowiki>, is a [http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#CORS5 milestone element], and does not have any content.


=== Questionable Reading ===
=== Questionable Reading ===


Where you have provided a transcription that you are not entirely certain about, this uncertainty may be registered by highlighting the word or sequence of words in question, and clicking the [[File:Jb-button-unclear.png]] button on the toolbar. This will surround the relevant text with <nowiki><unclear></unclear></nowiki> tags.
Where you have provided a transcription that you are not entirely certain about, this uncertainty may be noted by highlighting the word or sequence of words in question, and clicking the [[File:Jb-button-unclear.png]] button on the toolbar. This will surround the relevant text with <nowiki><unclear></unclear></nowiki> tags.


[[File:Unclear_example.jpg|500px|thumb|right|Questionable reading]]
[[File:Unclear_example.jpg|500px|thumb|right|Questionable reading]]
Line 239: Line 252:
   
   
  as <nowiki><unclear>particular</unclear></nowiki> as <nowiki><unclear>possible</unclear></nowiki>
  as <nowiki><unclear>particular</unclear></nowiki> as <nowiki><unclear>possible</unclear></nowiki>
=== Ampersands ===
Bentham uses the ampersand sign (i.e. ‘&’) quite frequently in his manuscripts. When it occurs in a manuscript that you are transcribing, click the [[File:Jb-button-ampersand.png]] button on the toolbar—this will add a piece of code (‘&amp;’) which will render the ampersand correctly in the saved transcription.
You cannot simply type an '&' character on your keyboard because, in the XML on which TEI encoding is based, an ampersand acts as an ‘escape character’: it signals that the characters immediately following it are to be read as a code rather than as ordinary text. Escape characters are not used within Transcribe Bentham.
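
As a small sketch, with an invented phrase, a manuscript reading of ‘pleasure & pain’ would appear in the Transcription Box as:

  <nowiki>pleasure &amp; pain</nowiki>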


=== Marginal Notes & Summaries ===
=== Marginal Notes & Summaries ===


Bentham wrote in the margins of a manuscript for two main purposes: to add text to a portion of the manuscript that was already written, or to provide a summary of the text opposite.
Bentham often wrote in the margins of a manuscript for two main purposes: either to add text to a portion of the manuscript that was already written, or to provide a summary of the text adjacent to it for the purpose of structuring his work.
 
In the first of these instances, Bentham often used a symbol in the main text of the manuscript to identify the point of attachment of the note: the symbol would then be reproduced at the text of the note in the margin. When this occurs, transcribe the text of the marginal note at the relevant point of attachment in the main text of the manuscript. Then, in order to identify it as a marginal note, highlight the text, and click the [[File:Jb-button-note.png]] button. This will surround the text with <nowiki><note></note></nowiki> tags.


==== Marginal Notes ====
You can include <lb/> tags inside the <note></note> tags to indicate if the note includes several lines of text.


In the first of these instances, Bentham often used a symbol in the main text of the manuscript to identify the point of attachment of the note: the symbol would then be reproduced at the text of the note in the margin. When this occurs, transcribe the text of the marginal note at the relevant point of attachment in the main text of the manuscript. Then, in order to identify it as a marginal note, highlight the text, and click the [[File:Jb-button-note.png]] button. This will surround the text with <nowiki><note></note></nowiki> tags.
When a symbol is not provided for the note at the point of attachment, you should encode the note at the point in the main text at which you think it is relevant. If in doubt, it is best to place the <note> at the end of the paragraph which it appears alongside.


[[File:Marginal_note_example.jpg|500px|thumb|right|Marginal note]]
[[File:Marginal_note_example.jpg|500px|thumb|right|Marginal note]]
Line 262: Line 283:
  complement of punishment that is judged
  complement of punishment that is judged


When a symbol is not provided for the note at the point of attachment, you should encode the note at the point in the main text at which you think it is relevant.
The <note> tags will generally be nested within <p> tags. In rare circumstances, a note will apply to a heading, and will then appear nested within <head> tags.
 
Marginal summaries are intended to provide a brief summary of adjacent text in Bentham's manuscripts. They are usually written in pencil and can be difficult to transcribe. Marginalia does not need to be transcribed. But if you would like to, it should be transcribed and encoded in the same fashion as marginal notes and placed before the paragraph to which it corresponds.
The <nowiki><note></nowiki> tags will generally be nested within <nowiki><p></nowiki> tags. In rare circumstances, a note will apply to a heading, and will then appear nested within <nowiki><head></nowiki> tags.


==== Marginal Summaries ====
==== Marginal Summaries ====


Marginal summaries are intended to provide a brief summary of adjacent text in Bentham's manuscripts. They are usually written in pencil, and should not be included in your transcription. If the marginalia is written in ink it should be transcribed and encoded in the same fashion as marginal notes, as suggested above.
Marginal summaries are intended to provide a brief summary of adjacent text in Bentham's manuscripts. They are usually written in pencil and can be difficult to transcribe. Marginalia does not need to be transcribed. But if you would like to, it should be transcribed and encoded in the same fashion as marginal notes and placed before the paragraph to which it corresponds.  


=== Underlined Text ===
=== Underlined Text ===


When a word has been underlined in the manuscript, you may identify it by highlighting the relevant text and clicking the [[File:Jb-button-underline.png]] button on the toolbar: this will surround the text with tags containing an attribute: <nowiki><hi rend="underline"></hi></nowiki>
When a word has been underlined in the manuscript, you may identify it by highlighting the relevant text and clicking the [[File:Jb-button-underline.png]] button on the toolbar—this will surround the text with the following tags: <hi rend="underline"></hi>
 
You may occasionally encounter pieces of text that have double or multiple underlinings. You may simply tag these in the same fashion as single-underlined text.


[[File:Underline_example.jpg|500px|thumb|right|Underline]]
[[File:Underline_example.jpg|500px|thumb|right|Underline]]
Line 289: Line 311:
  But <nowiki><hi rend="underline">where</hi></nowiki> and <nowiki><hi rend="underline">when</hi></nowiki>
  But <nowiki><hi rend="underline">where</hi></nowiki> and <nowiki><hi rend="underline">when</hi></nowiki>


You may occasionally encounter pieces of text that have double or multiple underlinings. You may simply tag these in the same fashion as single-underlined text.


=== Superscript ===
=== Superscript ===


Text in superscript is distinct from text that has been added to the manuscript after the surrounding text has been written; the latter should always be marked up by using the [[File:Jb-button-add.png]] button. A common example of superscript is seen in ordinal numbers, where the letters often appear above the line (3<small><sup>rd</sup></small>).
Text in superscript is distinct from additions, where, as described above, alternative text has been added to the manuscript after the surrounding text has been written.  
A common example of superscript is seen in ordinal numbers, where the letters often appear above the line (e.g. 3<small><sup>rd</sup></small>).


To encode an instance of superscript, highlight the relevant text and hit the [[File:Jb-button-superscript.png]] button on the transcription toolbar. This will surround the text with this piece of code: <nowiki><hi rend='superscript'></hi></nowiki>, as in the examples below:
To encode an instance of superscript, highlight the relevant text and click the [[File:Jb-button-superscript.png]] button on the transcription toolbar. This will surround the text with a piece of code (‘<hi rend='superscript'></hi>’), as in the following examples:


[[File:Superscript example1.jpg|500px|thumb|right|Superscript 1]]
[[File:Superscript example1.jpg|500px|thumb|right|Superscript 1]]
Line 319: Line 341:
  <nowiki>a 5<hi rend='superscript'>th</hi> ingredient</nowiki>
  <nowiki>a 5<hi rend='superscript'>th</hi> ingredient</nowiki>


=== Unusual Spellings ===
=== Unusual Spellings and Abbreviations ===
 
There are instances in the manuscripts where Bentham employs an unusual spelling for a familiar word—these may include previously-acceptable spellings which are no longer in use or idiosyncratic misspellings.
Where unusual words or abbreviations occur, they may be encoded by highlighting the relevant word and clicking the [[File:Jb-button-sic.png]] button on the toolbar. This will result in <sic></sic> tags being added around the word or words.
 
If you encounter a word that appears to have an unfamiliar spelling, you may refer to this list of [[Alternative_spellings|unusual spellings]] to see whether it is one that Bentham used frequently.
 
The <sic></sic> tags should not be used for familiar contractions like ‘it's’, ‘don't’, ‘they're’, and so on. Bentham also uses abbreviations or contractions in words such as ‘employ'd’ or ‘suppos'd’—these should generally be tagged as unusual spellings, but abbreviations such as ‘Ch.’ for ‘Chapter’, for instance, do not need to be tagged as such.
 


There are occasional instances in the manuscripts where Bentham employs an unusual spelling for a familiar word: these may include previously-acceptable spellings which are no longer in use, or idiosyncratic misspellings. Where they occur, they may be encoded by highlighting the relevant word and clicking the [[File:Jb-button-sic.png]] button on the toolbar. This will result in <nowiki><sic></sic></nowiki> tags surrounding the word.


[[File:Unusual_spelling.jpg|500px|thumb|right|Unusual spelling]]
[[File:Unusual_spelling.jpg|500px|thumb|right|Unusual spelling]]
Line 334: Line 363:
  <nowiki><sic>compleat</sic></nowiki> code of laws
  <nowiki><sic>compleat</sic></nowiki> code of laws


<nowiki><sic></nowiki> should also be used to encode archaic contractions in words such as employ'd or suppos'd. You should not use it for familiar contractions like it's, don't, they're, and so on.


If you encounter a word that appears to have an unfamiliar spelling, you may refer to this list of [[Alternative_spellings|unusual spellings]] to see whether it is one that Bentham used frequently. You may also add words to this list to benefit other transcribers.


== Supplementary Guidelines ==
== Supplementary Guidelines ==
Line 342: Line 369:
=== User Comments ===
=== User Comments ===


In the event that you encounter something in the course of your transcription that is not covered by these Guidelines, you should email the Transcribe Bentham project at transcribe.bentham@ucl.ac.uk stating the name of the manuscript (found at the top of the page, e.g. JB/088/002/003) and describing the precise nature of your discovery.
In the event that you encounter something in the course of your transcription that is not covered by these Guidelines, you should insert a comment in the transcription to alert Transcribe Bentham editors and other transcribers to the issue. In order to do this, you should click the [[File:Jb-button-comment.png]] button on the toolbar. This will generate these characters: <!-- -->. You should type your comment between the dashes. The text of your comment will not appear in the saved transcription but will remain present in the Transcription Box.  


It may also be useful to insert a comment in the transcription to alert Transcribe Bentham editors and other transcribers to the problem. In order to do this, you should type your comment inside these characters: <nowiki><!--   --></nowiki>, which are generated by clicking the [[File:Jb-button-comment.png]] button on the toolbar.
<nowiki><!-- There is an unusual feature at this point in the manuscript --></nowiki>


<nowiki><!-- There is an unusual feature at this point in the manuscript --></nowiki>
If you have questions about an unusual feature in a manuscript, [mailto:transcribe.bentham@ucl.ac.uk send an email] to the TB Editor with the name of the manuscript (found at the top of the page, e.g. JB/088/002/003) and information about the nature of your discovery.


The text of your comment will not appear in the saved transcription once you have saved the page, but will remain present in the transcription box.


=== Non-English Language ===
=== Foreign Language ===


While transcribing Bentham's manuscripts, you may encounter languages other than English: this may occur in isolated words, brief passages, or longer sections of writing. You may encode such instances by highlighting the relevant non-English text, and clicking the [[File:Jb-button-foreign.png]] button on the toolbar. This will surround the text with <nowiki><foreign></foreign></nowiki> tags.
While transcribing Bentham's manuscripts, you may encounter languages other than English: this may occur in isolated words, brief passages, or longer sections of writing. You may encode such instances by highlighting the relevant non-English text, and clicking the [[File:Jb-button-foreign.png]] button on the toolbar. This will surround the text with <nowiki><foreign></foreign></nowiki> tags.
Where non-English words include diacritics such as accents (é) or circumflexes (ô), these should be transcribed wherever possible. You can produce such characters in Microsoft Word (or a similar programme) using keyboard shortcuts or the 'Symbols' menu. You can then copy and paste the character into the Transcription Box. Alternatively, you could copy and paste the character from another website.


[[File:Non-English_language.jpg|500px|thumb|right|Non-English language]]
[[File:Non-English_language.jpg|500px|thumb|right|Non-English language]]
Line 365: Line 394:
  <nowiki><foreign>d'une fantaisie contrariée</foreign></nowiki>
  <nowiki><foreign>d'une fantaisie contrariée</foreign></nowiki>


=== Ampersands ===
=== Dashes ===
 
You will often encounter dashes of varying lengths in the manuscripts—either hyphens (-), en-dashes (–), or em-dashes (—).


Bentham uses the ampersand sign (&amp;) quite frequently in his manuscripts. When it occurs in a manuscript you are transcribing, you should click the [[File:Jb-button-ampersand.png]] button on the toolbar: this will add a piece of code (& amp;) which will render the ampersand correctly in the saved transcription.
Use your discretion to determine whether a Bentham dash is best represented by a hyphen, an en-dash or an em-dash. There is no need to worry too much about which is which, as long as some form of dash is included in the transcript.


The reason you cannot simply type a '&amp;' character on your keyboard is that in markup, '&amp;' is an escape character which invokes an alternative interpretation on subsequent characters in a character sequence.
For a hyphen or en-dash, you may simply type a hyphen (-) into the transcription box. For an em-dash, you should click the [[File:Jb-button-mdash.png]] button on the toolbar. This will insert a Unicode character code (‘&#x2014;’) which will enable the representation of the em-dash in your web browser.
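
To see what the inserted character code stands for, the following minimal sketch decodes it with Python's standard `html` module. This is only an illustration of how a numeric character reference resolves; it is an assumption that a web browser resolves it the same way when displaying the saved transcript, and the variable names are invented.

```python
import html

# The code inserted by the toolbar button, as quoted in the guideline above.
em_dash_code = "&#x2014;"

# Decode the numeric character reference into the character it names.
decoded = html.unescape(em_dash_code)

print(len(decoded))              # a single character
print(f"U+{ord(decoded):04X}")   # its Unicode code point
```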


=== Dashes ===
=== Pencil markings ===


Occasionally, you will encounter dashes of varying lengths in the manuscripts. In general, these may correspond to the en-dash and the em-dash. In printing houses, pieces of type that held the letters 'n' and 'm' were used as units for measuring and estimating the amount of printed matter in a line or page. Thus, a dash the width of a letter 'n' became known as an en-dash, while a longer dash, the width of a letter 'm', was called an em-dash.
Most manuscript pages were imprinted with a University College London stamp in the process of being catalogued. Two numbers are usually written in pencil inside this stamp to indicate the box and folio number of that particular page. Neither the stamp nor the pencil numbers should be transcribed.


Use your discretion to judge whether a Bentham dash is best represented by an en- or em-dash. For an en-dash, you may simply type a hyphen (-) into the transcription box; if inserting an em-dash, you should click the [[File:Jb-button-mdash.png]] button on the toolbar. This will insert a Unicode character code (& #x2014;) which will enable the representation of the em-dash in web browsers.
Any other pencil markings which appear on a page—which may include marginal summaries, headings, and corrections—do not need to be transcribed. However, if you can read and would like to transcribe any text written in pencil, you are free to do so. Please add a User Comment before any text written in pencil, like so: <nowiki><!-- text written in pencil --></nowiki>


=== Ligatures ===
=== Ligatures ===


Bentham occasionally uses ligatures in his writing, e.g. &#230; and &#339;. These do not occur very frequently, but should you encounter one, you should simply transcribe the individual letters of the ligature ('oe' rather than '&#339;'), and insert a User Comment containing the word 'ligature' directly afterwards.
A ligature or diphthong is a character where two or more letters are joined together (such as ‘æ’). Bentham occasionally uses ligatures in his writing. Should you encounter one, you should simply transcribe the individual letters of the ligature ('oe' rather than 'œ'), and insert a User Comment containing the word 'ligature' directly afterwards.


  oeconomy<nowiki><!-- ligature --></nowiki>
  oeconomy<nowiki><!-- ligature --></nowiki>
Wherever an ‘æ’ diphthong appears, this may be inserted using the Special Characters drop-down menu on the toolbar.


=== Symbols ===
=== Symbols ===


The case with symbols used in the manuscripts is similar to that of ligatures: Bentham uses a number of symbols (e.g. section sign: &#167;), with varying regularity. If it is possible to reproduce a symbol from the keys on your keyboard, you should do so; otherwise, you should simply register its presence with a User Comment: <nowiki><!-- symbol --></nowiki>
Bentham used a number of symbols in his writings, including section symbols (§) and manicules (☞). These may be added using the Special Characters drop-down menu on the toolbar. If it is possible to reproduce other symbols from the keys on your keyboard or by copying and pasting from another website, you should do so. Otherwise, you should simply register the presence of a symbol with a User Comment: <!-- symbol -->.
 
=== Brackets ===
 
In your transcription, you may represent the various types of brackets used in the manuscripts, including parentheses ( ), square brackets [ ], or braces { }. Take care not to use angle brackets < >, as these are used only for markup elements.
 
== Terms Used in the Guidelines ==
----
The following is an explanation of some of the terms used in these Guidelines, which may be unfamiliar to some users. It is important to understand these terms in order to grasp the full import of the Guidelines.


=== Transcription ===
=== Tables ===


Transcription refers to the text that the user reads from the Bentham manuscript and then copies into the Transcription Box.
Bentham sometimes presented information in tabular form, in rows and columns.
It is difficult to replicate the format of a table in Transcribe Bentham, so you should concentrate on making sure that the text from the table is reproduced accurately in your transcription.
Depending on the shape of the table, it may make more sense to transcribe the text row-by-row or column-by-column. You should note the presence of tabular text in the transcript by inserting a User Comment before the table: <nowiki><!-- the following text appears in a table--></nowiki>.


=== Encoding / Markup ===
[[File:Table.png|500px|thumb|right|Text written in the form of a table]]


Encoding and Markup are terms which may be used interchangeably. They refer to tags that are included in the transcription in order to identify features of the text and manuscript in a manner that allows them to be processed by a computer.


If you click the buttons on the transcription toolbar, markup will appear in the transcription box alongside the text that you have entered, for example:


whatever <nowiki><add>just</add></nowiki> remark may


It is very important that you do not delete or alter any of the markup that appears in the transcription box.


=== Tags ===


A tag is a string of characters surrounded by angle brackets, i.e. "<" and ">". Tags are used to identify part of the transcription, and usually come in pairs, known as "opening" and "closing" tags. A closing tag can be identified by a slash after the "<".


If, for instance, a user wished to note that the word "utility" was deleted from a manuscript they were transcribing, it would be tagged thus: <nowiki><del>utility</del></nowiki>.


Users will not have to type tags into the Transcription Box: they will be automatically generated by highlighting the relevant part of the text, and clicking a button in the Transcription Toolbar.


=== Elements ===


The element is the core part of the tag, and occurs after the "<". In <nowiki><del>utility</del></nowiki>, "del" is the element.


=== Attributes ===


Tags may also contain attributes, which describe the element in more detail. The attribute appears after the element in the opening tag (it never appears in the closing tag), and is followed by an attribute value (see below).


For example, to note the manner in which the word "utility" was deleted, the following attribute may be used: <nowiki><del rend="strikethrough">utility</del></nowiki>. An element may have multiple attributes, each separated by a single space.


=== Values ===


An attribute value is a word or short phrase that classifies the element in terms of a particular attribute. It is contained within quotation marks and preceded by an equal sign. In the example above, "strikethrough" is the value.
=== Printed text ===


=== Nesting ===
The Bentham collection contains a significant number of printed texts, including Parliamentary bills and contemporary pamphlets. Sometimes pages contain a mixture of printed and handwritten text, in which case you can include a User Comment to note where printed text appears on a page.


In order for a computer to be able to process a TEI document effectively, it must be well-formed. This means that it must obey certain syntax rules, one of which is that tags are nested properly, and do not overlap.
Printed text may be transcribed according to the same guidelines as handwritten manuscripts. Transcribing printed text is a good way to get used to TEI encoding, but please be aware that, unlike Bentham’s own text sheets, printed texts within the Bentham
The best way to think about nesting is that it works on a radial principle, from the centre outwards, without overlapping.


The following example is not well-formed, because the <nowiki><del></nowiki> element opens before the <nowiki><add></nowiki> element closes; thus, the tags are not correctly nested:
Papers will not be included in The Collected Works of Jeremy Bentham.


<nowiki><add><del></add></del></nowiki>
=== Brackets ===


A correctly-nested formulation of the same tags might look like either of the following:
In your transcription, you may represent the various types of brackets used in the manuscripts, including parentheses ( ), square brackets [ ], or braces { }. Take care not to use angle brackets < >, as these are used only for markup elements.
 
This document was first written in May 2010 and was last updated in January 2020.  
<nowiki><add><del></del></add>
<del><add></add></del></nowiki>
 
Consider the following example, in which the word "direct" has been added to the manuscript, and subsequently deleted:
 
[[File:Deleted_addition_example.jpg|500px|thumb|right|Deleted addition]]
 
If we consider the sequence of actions logically, "direct" must first have been added to the manuscript, and then deleted. The encoding can implicitly register this sequence, by first registering the addition and then the deletion. Two sets of tags are used for this purpose, and they must be nested properly:
 
 
of immediate <nowiki><del><add>direct</add></del></nowiki> use,
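
The nesting rule can be checked mechanically. The sketch below is an illustration using Python's standard XML parser, not the Transcription Desk's own validation (which these Guidelines do not describe). It wraps a fragment in a paragraph element and reports whether it is well-formed. The helper name `is_well_formed` is invented, and the overlapping variant of the "direct" example is constructed here for contrast.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Wrap the fragment in <p></p> and report whether it parses as XML."""
    try:
        ET.fromstring(f"<p>{fragment}</p>")
        return True
    except ET.ParseError:
        return False


# Overlapping tags: <add> closes while <del> is still open inside it.
overlapping = "of immediate <add><del>direct</add></del> use,"

# Properly nested tags, as in the example above.
nested = "of immediate <del><add>direct</add></del> use,"

print(is_well_formed(overlapping))
print(is_well_formed(nested))
```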
 
== Additional Advice ==


*[[Help:User_Accounts|Manage your user account]]
The Guidelines have evolved slightly over time in line with editorial discussions about transcription and encoding.
*[[Help:Contents|General advice on using the site]]
If anything in these Guidelines is unclear, please [mailto:transcribe.bentham@ucl.ac.uk send an email] to the Transcribe Bentham Editors.
----


[[Category:Help|Transcription Guidelines]]
[[Category:Help|Transcription Guidelines]]

Latest revision as of 18:58, 6 December 2023

Introduction


Download a printable PDF of these guidelines

This page offers Guidelines for users of Transcribe Bentham. Here you can find directions about how best to transcribe Bentham's writings, and how to encode specific features of the manuscripts. Some of the information below may seem daunting at first, but do not be afraid to have a go—it is impossible to break anything, and any errors you might make can easily be reverted.

The Guidelines are divided into four sections:

  • The Getting Started guide gives an overview of the transcription process, from start to finish.
  • The Basic Principles guide explains the essentials of transcription and encoding.
  • The Core Guidelines describe the manuscript features that users will encounter most frequently, and how to deal with them.
  • The Supplementary Guidelines discuss the treatment of less-frequently occurring features of the manuscripts, such as ligatures, symbols, and foreign-language words.

We are very grateful to our transcribers for their hard work. Visit our Credits page to find out how our volunteers are acknowledged for the work they do.

Need more information? Check out our Help pages, or email us - we are always happy to help.

Basic Principles

Transcribing

Transcription refers to the text that the user reads from the Bentham manuscript and then copies into the Transcription Box. When transcribing text, your aim should be to produce a transcription which represents the text of the manuscript as accurately as possible. Reproduce Bentham’s spellings, capitalisation, and punctuation exactly as they appear on the manuscript, even if they seem incorrect. For instance, Bentham and his scribes frequently got accents on foreign letters incorrect or omitted them altogether. These mistakes should not be corrected, nor should any contracted words be expanded (e.g. ‘Mr’ to ‘Mister’) or any symbols be rendered as words (e.g. ‘&’ to ‘and’).

Do your best to transcribe as accurately as possible, but do not worry too much if you cannot transcribe everything on a page. The TB Editor may be able to help improve your transcript, as may other volunteers.

Once you have transcribed a page, a final proofread often makes it easier to spot any errors. Thinking about the sense of the words on the page can help too—although please bear in mind that some of Bentham's papers do not make a lot of sense right away, especially when taken out of context.

For help in deciphering Bentham’s handwriting, please take a look at our Palaeography Skills page.

If you are unsure about how to transcribe something, do what you think is correct or send us an email at transcribe.bentham@ucl.ac.uk and we will try our best to help you as soon as possible.

Encoding

We ask volunteers to encode their transcripts in Text Encoding Initiative (TEI)-compliant XML format, an industry standard method for encoding electronic texts. Encoding can be done simply by clicking the buttons on your transcription toolbar.

By encoding your transcripts, you are helping to create a richer learning resource—owing to your efforts, researchers who are interested in Bentham's writing process, deletions and revisions, will gain valuable insight they might not otherwise have had. Encoded transcripts also allow for more powerful and refined searching.

Tags are used to identify parts of the transcription and usually come in pairs, known as ‘opening’ and ‘closing’ tags. If, for instance, a user wished to note that the word ‘utility’ was deleted from a manuscript they were transcribing, it would be tagged as <del>utility</del>.

To apply a set of tags, highlight a word or passage and click the appropriate button in the toolbar. Opening and closing tags will appear around your highlighted text. A closing tag can be identified by a forward slash after the ‘<’. Sometimes only single tags are needed (such as for the ‘’ and ‘
’ tags)—to apply these, simply click the relevant button in the toolbar.

To see how the markup has been applied to your transcript, click the ‘Preview’ tab in the transcription interface.

Encoding may seem challenging at first but you will soon get the hang of it. As you become more confident, you may prefer to type the markup on your keyboard rather than using the transcription toolbar.

For more information, please take a look at our Encoding page. If you are interested in learning more about TEI, we recommend visiting the TEI by Example website, which contains a number of tutorials and exercises.

Core Guidelines


Headings

If the page you are transcribing includes a title or heading, you may identify this feature by highlighting the transcribed text of the heading and clicking the button on the toolbar. This will surround the heading with <head></head> tags. Bentham occasionally provides headings that span multiple lines—in this instance, simply add a <lb/> tag to mark the line break/s in the heading. Using multiple heading tags on each manuscript may create problems for the TB Editor when converting the text to XML and may slow down the checking process, so should be avoided wherever possible.

Headings








<head>1818 April 15<lb/> 
Annuity Notes Proposed Advertisement on proposed publication on painless fees</head>

Footers

Page numbers and other details which sometimes appear in footers on Bentham's manuscripts should be marked-up in the same manner as notes (i.e. using <note></note> tags).

Paragraphs

Once a paragraph from a manuscript has been transcribed, it may be identified by highlighting the text of the paragraph with the cursor and clicking the button on the toolbar. This surrounds the text with <p></p> tags. All text that is not included in heading or note tags should be enclosed by <p></p> tags. If the word at the beginning of a paragraph has been indented, the indentation does not need to be reproduced in the transcript. The first word of the paragraph can be typed right next to the opening <p> tag.

Line Breaks

In order to preserve the lineation of the manuscripts, a line break should be inserted directly after the final word or punctuation mark of each line. In order to do this, click the button on the toolbar—this inserts an <lb/> tag.

It is important to note that the <lb/> tag does not have opening and closing tags, as it is what is called a ‘milestone element’, which marks a place in a text and does not have any content. Once you have added an <lb/> tag, please press return and begin the next line of the transcript on a new line in the Transcription Box. This will make it easier for you to follow the text you are transcribing, and for the TB Editor to check your work quickly.

If the tag is written as <lb> rather than <lb/>, then all text following the incorrect <lb> tag will not be displayed by the Transcription Desk when the transcript is saved. To correct this problem, simply find the incorrect line break tag and add the '/' to it.

Line break tags are not required at the end of a paragraph or at the end of a manuscript—simple </p> or </note> tags are fine.
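
For transcribers who type markup by hand, a quick search for the incorrect form can help before saving. The following is a minimal sketch only: the function name `fix_line_breaks` and the sample text are assumptions made for illustration, and it is not a tool provided by the Transcription Desk.

```python
import re


def fix_line_breaks(text: str) -> str:
    """Rewrite <lb> (with or without inner spaces) as the self-closing <lb/>."""
    # The pattern cannot match <lb/>, because a '/' follows 'lb' there.
    return re.sub(r"<lb\s*>", "<lb/>", text)


# Hypothetical transcript with one incorrect and one correct line break tag.
transcript = "of the Law, cannot<lb>\npositively assert<lb/>"

print(fix_line_breaks(transcript))
```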

Line-end Hyphenation

When a hyphenated word appears at the end of a line, transcribe the full word without the hyphen, along with any punctuation that immediately follows it, and then insert the <lb/> tag by clicking the relevant button on the toolbar.


Line-end hyphenation

In the example opposite, the word 'circumstance' is hyphenated at the end of the first line. The transcription should read as follows:



customs, religion of the inhabitants, every circumstance<lb/>
in which a difference in the point<lb/>

By transcribing hyphenated words in this way, you are making it easier for them to be picked up in keyword searches.
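
The rule can be sketched as a small routine. This is an illustration under stated assumptions rather than part of the Transcription Desk: the function name is invented, the split ‘circum-’ / ‘stance’ is a guess at how the manuscript divides the word, and the routine treats every line-final hyphen as a word break, so a genuine dash at the end of a line would need to be handled by hand.

```python
def join_line_end_hyphens(lines: list[str]) -> list[str]:
    """Move the second half of a line-end hyphenated word up to the first line,
    drop the hyphen, and end every line with an <lb/> tag."""
    lines = [line.strip() for line in lines]
    for i in range(len(lines) - 1):
        if lines[i].endswith("-"):
            # The first token of the next line is the rest of the word,
            # together with any punctuation attached to it.
            tail, _, rest = lines[i + 1].partition(" ")
            lines[i] = lines[i][:-1] + tail
            lines[i + 1] = rest
    return [line + "<lb/>" for line in lines]


# Hypothetical reading of the two manuscript lines in the example above.
manuscript_lines = [
    "customs, religion of the inhabitants, every circum-",
    "stance in which a difference in the point",
]

for line in join_line_end_hyphens(manuscript_lines):
    print(line)
```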

Page Breaks

Like line breaks, a page break is indicated in markup with a milestone element: <pb/>. When transcribing a folio that contains a double page (JB/027/124/001, for example), a page break should be inserted to mark the point at which one page ends and another begins.

In order to do this, position the cursor at the relevant point in the transcription and click the button on the toolbar: this will insert a <pb/> tag.

A <pb/> tag does not need to be inserted at the end of a single page.

In JB/027/124/001, the page break would be recorded thus:

 <p>...we who are not of the Profession of the Law, cannot<lb/>
  positively assert</p>
  <pb/>
  <head>C<lb/>
  Prefat.</head>
  <p>England has long been regarded...</p>

Bentham occasionally quartered or divided large sheets into sections by drawing lines across the page. To identify when Bentham begins a new section, users should insert a page break by clicking the relevant button on the toolbar. If a section has a footer, mark this up as a note before inserting the page break.

Lines drawn across single sheets (e.g. JB/071/049/002) should not be considered as page breaks.

Additions

The button in the toolbar is used to mark a part of the text that was added to the manuscript after the surrounding text was written. This method may be used to mark additions, whether they are added above or (very rarely) below the line. The exception to this is marginal additions, which are described below.

Highlight the addition and click the button to surround it with <add></add> tags, as in the example below:

Addition







whatever <add>just</add> remark may

Deletions

Where a word or a sequence of words has been struck-through in the manuscript, highlight the relevant text and click the button in the toolbar. This will surround the text with <del></del> tags.

Deletion




artificial: <del>tables of it's population:</del> tables of the

Just do what you think is best when deciding on the extent of deletions. Where the strikethrough does not physically cancel a punctuation mark that is apparently part of the deletion, you may assume that it forms part of the deletion. If in doubt about a particular example, you may send an email to the TB Editor.

In some instances, entire pages or paragraphs are crossed out (e.g. JB/027/029/003), which indicate where Bentham or his scribes have used a particular passage when putting together a work. Text which is struck through in this manner should not be enclosed in deletion tags.

Just as it is best to avoid additions within additions, please also avoid using deletions within deletions—both of these practices prevent the transcription preview from displaying correctly and cause issues when we save your transcripts as .xml files.
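For transcribers who like to check their markup outside the Transcription Desk, the rule above can be sketched as a small Python check. This is only an illustration and not part of the Transcribe Bentham toolbar: the function name find_nested is hypothetical, and the sketch assumes the fragment is otherwise well-formed markup.

```python
import xml.etree.ElementTree as ET

def find_nested(fragment, tags=("add", "del")):
    """Return the tag names that occur nested inside an element of the same name."""
    # Wrap the fragment in a dummy root so that it parses as a single document.
    root = ET.fromstring(f"<root>{fragment}</root>")
    nested = []
    for tag in tags:
        for outer in root.iter(tag):
            # iter() yields the element itself first, so ignore that match.
            if any(inner is not outer for inner in outer.iter(tag)):
                nested.append(tag)
                break
    return nested

# The deletion example from these Guidelines contains no nesting.
print(find_nested("artificial: <del>tables of it's population:</del> tables of the"))
# A made-up fragment with a deletion inside a deletion is reported.
print(find_nested("<del>one <del>two</del></del>"))
```

An empty list means that no addition sits inside another addition and no deletion sits inside another deletion.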

Complex Additions and Deletions

Transcribers will quickly become aware of instances of more complex intervention in the manuscripts, often where there is a combination of added and deleted text. One such example is called 'substitution', where text added above the line is intended to replace text that is deleted with a strikethrough.

In marking substitutions, simply identifying text that has been added and text that has been deleted in its proper sequence will suffice.

Substitution

The TEI provides the <subst> element for encoding such phenomena, but for the purposes of this project it is enough to tag the deleted text and the added text separately.

When encoding a substitution, transcribe the deleted text first, followed by the added text. This follows the implicit order in which the two parts were written in the manuscript.

For example, once the relevant parts of text from the example above have been tagged, the transcription will look like this:

<del>[To bring]</del> <add>I will reduce</add> the question at once
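The ordering rule can also be expressed as a short Python sketch. The helper name substitution is hypothetical and is not something the Transcription Desk provides; it simply places the deleted text before the added text and escapes any bare ampersand.

```python
from xml.sax.saxutils import escape

def substitution(deleted, added):
    """Build markup for a substitution: deleted text first, then added text."""
    # escape() turns a bare '&' into '&amp;' so the result stays valid markup.
    return f"<del>{escape(deleted)}</del> <add>{escape(added)}</add>"

print(substitution("[To bring]", "I will reduce") + " the question at once")
```

Run on the example above, this reproduces the tagged line shown in these Guidelines.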

Catchwords

A catchword is the first word of the following page, inserted at the lower right-hand corner of a manuscript folio, below the last line. Catchwords appear quite frequently in Bentham's writings, and should be encoded in the same fashion as an addition, as in the example below:

Catchword





 <p>in the <add>act</add> can not<lb/> 
<add>be</add></p>

Illegible Text

In the course of transcribing, you will inevitably encounter text that is illegible, either because Bentham's handwriting is difficult to read, or because it has been obscured by a strikethrough or cut off the edge of the manuscript. There are slightly different ways to deal with each instance.

If a word or sequence of words on the manuscript is illegible, but has not been deleted, it may be identified by clicking the button in the toolbar. This inserts a <gap/> tag.

Note that <gap/>, like <lb/>, is a milestone element and does not have any content; as such it does not have opening and closing segments.

If it is possible to distinguish the number of illegible words in a sequence, insert one <gap/> tag for each illegible word.

Deleted

If the word or sequence of words is illegible because it has been deleted or struck through on the manuscript, you should use the <gap/> tag in conjunction with <del></del> tags to indicate the reason for illegibility.

Illegible text




But of that which remained, <del><gap/></del> as not

Questionable Reading

Where you have provided a transcription that you are not entirely certain about, this uncertainty may be noted by highlighting the word or sequence of words in question, and clicking the button on the toolbar. This will surround the relevant text with <unclear></unclear> tags.

Questionable reading






as <unclear>particular</unclear> as <unclear>possible</unclear>

Ampersands

Bentham uses the ampersand sign (i.e. ‘&’) quite frequently in his manuscripts. When it occurs in a manuscript that you are transcribing, click the button on the toolbar. This will add a piece of code (‘&amp;’) which will render the ampersand correctly in the saved transcription.

The reason you cannot simply type a '&' character on your keyboard is that an ampersand is what is called an ‘escape character’ in TEI encoding, which applies an alternative interpretation to any subsequent characters in a given sequence. Escape characters are not used within Transcribe Bentham.
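The effect of a bare ampersand can be seen with any XML parser. The sketch below uses Python's standard library as a stand-in; it is an assumption that the Transcribe Bentham software behaves in a comparable way, and the phrase ‘pleasure & pain’ is a made-up sample rather than a quotation from a manuscript.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as XML once wrapped in <p> tags."""
    try:
        ET.fromstring(f"<p>{fragment}</p>")
        return True
    except ET.ParseError:
        return False

print(is_well_formed("pleasure & pain"))      # bare ampersand: rejected by the parser
print(is_well_formed("pleasure &amp; pain"))  # encoded ampersand: accepted

# Once parsed, the encoded form is read back as a plain ampersand.
print(ET.fromstring("<p>pleasure &amp; pain</p>").text)
```

The first call reports a failure and the second succeeds, which is why the toolbar button inserts the encoded form instead of the bare character.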

Marginal Notes & Summaries

Bentham often wrote in the margins of a manuscript for two main purposes: either to add text to a portion of the manuscript that was already written, or to provide a summary of the text adjacent to it for the purpose of structuring his work.

In the first of these instances, Bentham often used a symbol in the main text of the manuscript to identify the point of attachment of the note: the symbol would then be reproduced at the text of the note in the margin. When this occurs, transcribe the text of the marginal note at the relevant point of attachment in the main text of the manuscript. Then, in order to identify it as a marginal note, highlight the text, and click the button. This will surround the text with <note></note> tags.

You can include <lb/> tags inside the <note></note> tags to mark line breaks where the note runs over several lines of text.

When a symbol is not provided for the note at the point of attachment, you should encode the note at the point in the main text at which you think it is relevant. If in doubt, it is best to place the note at the end of the paragraph alongside which it appears.

Marginal note





a former chapter be true <del><add>just</add></del>, that <note>even<lb/>
in a civilised<lb/>
life</note> the whole<lb/>
complement of punishment that is judged

The <note> tags will generally be nested within <p></p> tags. In rare circumstances, a note will apply to a heading, and will then appear nested within <head> tags.

Marginal Summaries

Marginal summaries are intended to provide a brief summary of adjacent text in Bentham's manuscripts. They are usually written in pencil and can be difficult to transcribe. Marginal summaries do not need to be transcribed, but if you would like to transcribe them, they should be encoded in the same fashion as marginal notes and placed before the paragraph to which they correspond.

Underlined Text

When a word has been underlined in the manuscript, you may identify it by highlighting the relevant text and clicking the button on the toolbar. This will surround the text with <hi rend="underline"></hi> tags.

You may occasionally encounter pieces of text that have double or multiple underlinings. You may simply tag these in the same fashion as single-underlined text.

Underline







But <hi rend="underline">where</hi> and <hi rend="underline">when</hi>


Superscript

Text in superscript is distinct from additions, where, as described above, text has been added to the manuscript after the surrounding text was written. A common example of superscript is seen in ordinal numbers, where the letters often appear above the line (e.g. the ‘rd’ in ‘3rd’).

To encode an instance of superscript, highlight the relevant text and click the button on the transcription toolbar. This will surround the text with <hi rend='superscript'></hi> tags, as in the following examples:

Superscript 1
Superscript 2









Happ.<hi rend='superscript'>ss</hi> and Unhapp.<hi rend='superscript'>ss</hi>
a 5<hi rend='superscript'>th</hi> ingredient

Unusual Spellings and Abbreviations

There are instances in the manuscripts where Bentham employs an unusual spelling for a familiar word. These may include previously acceptable spellings which are no longer in use, or idiosyncratic misspellings. Where unusual words or abbreviations occur, they may be encoded by highlighting the relevant word and clicking the button on the toolbar. This will result in <sic></sic> tags being added around the word or words.

If you encounter a word that appears to have an unfamiliar spelling, you may refer to this list of unusual spellings to see whether it is one that Bentham used frequently.

The <sic></sic> tags should not be used for familiar contractions like ‘it's’, ‘don't’, ‘they're’, and so on. Contractions such as ‘employ'd’ or ‘suppos'd’ should generally be tagged as unusual spellings, but abbreviations such as ‘Ch.’ for ‘Chapter’ do not need to be tagged.


Unusual spelling





<sic>compleat</sic> code of laws


Supplementary Guidelines


User Comments

If you encounter something in the course of your transcription that is not covered by these Guidelines, you should insert a comment in the transcription to alert Transcribe Bentham editors and other transcribers to the issue. To do this, click the button on the toolbar. This will generate these characters: <!-- -->. Type your comment between the dashes. The text of your comment will not appear in the saved transcription but will remain present in the Transcription Box.

<!-- There is an unusual feature at this point in the manuscript -->
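Comments of this kind are ordinary XML comments, which most XML tools leave out when reading the text. The Python sketch below illustrates this with the standard library parser; the function name visible_text is hypothetical, the sample line joins two examples from these Guidelines purely for illustration, and it is an assumption that the saved transcription is processed in a similar way.

```python
import xml.etree.ElementTree as ET

def visible_text(fragment):
    """Return the text of a transcription fragment, ignoring XML comments."""
    # ElementTree's default parser discards comments while parsing.
    root = ET.fromstring(f"<root>{fragment}</root>")
    return "".join(root.itertext())

sample = (
    "<p>England has long been regarded"
    "<!-- There is an unusual feature at this point in the manuscript --></p>"
)
print(visible_text(sample))
```

The printed text contains the transcribed words only, while the comment remains in the markup for editors to read.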

If you have questions about an unusual feature in a manuscript, send an email to the TB Editor with the name of the manuscript (found at the top of the page, e.g. JB/088/002/003) and information about the nature of your discovery.


Foreign Language

While transcribing Bentham's manuscripts, you may encounter languages other than English: this may occur in isolated words, brief passages, or longer sections of writing. You may encode such instances by highlighting the relevant non-English text, and clicking the button on the toolbar. This will surround the text with <foreign></foreign> tags.

Where non-English words include diacritics such as accents (é) or circumflexes (ô), these should be transcribed wherever possible. You can produce such characters in Microsoft Word (or a similar programme) using keyboard shortcuts or the 'Symbols' menu. You can then copy and paste the character into the Transcription Box. Alternatively, you could copy and paste the character from another website.


Non-English language





<foreign>d'une fantaisie contrariée</foreign>

Dashes

You will often encounter dashes of varying lengths in the manuscripts—either hyphens (-), en-dashes (–), or em-dashes (—).

Use your discretion to determine whether a Bentham dash is best represented by a hyphen, an en-dash or an em-dash. There is no need to worry too much about which is which, as long as some form of dash is included in the transcript.

For a hyphen or en-dash, you may simply type a hyphen (-) into the transcription box. For an em-dash, you should click the button on the toolbar. This will insert a Unicode character code for the em-dash (U+2014), which enables it to be displayed in your web browser.

Pencil markings

Most manuscript pages were imprinted with a University College London stamp in the process of being catalogued. Two numbers are usually written in pencil inside this stamp to indicate the box and folio number of that particular page. Neither the stamp nor the pencil numbers should be transcribed.

Any other pencil markings which appear on a page—which may include marginal summaries, headings, and corrections—do not need to be transcribed. However, if you can read and would like to transcribe any text written in pencil, you are free to do so. Please add a User Comment before any text written in pencil, like so: <!-- text written in pencil -->

Ligatures

A ligature or diphthong is a character where two or more letters are joined together (such as ‘æ’). Bentham occasionally uses ligatures in his writing. Should you encounter one, you should simply transcribe the individual letters of the ligature ('oe' rather than 'œ'), and insert a User Comment containing the word 'ligature' directly afterwards.

oeconomy<!-- ligature -->

Wherever an ‘æ’ diphthong appears, this may be inserted using the Special Characters drop-down menu on the toolbar.

Symbols

Bentham used a number of symbols in his writings, including section symbols (§) and manicules (☞). These may be added using the Special Characters drop-down menu on the toolbar. If it is possible to reproduce other symbols from the keys on your keyboard or by copying and pasting from another website, you should do so. Otherwise, you should simply register the presence of a symbol with a User Comment (for example, <!-- symbol -->).

Tables

Bentham sometimes presented information in tabular form, in rows and columns. It is difficult to replicate the format of a table in Transcribe Bentham, so you should concentrate on making sure that the text from the table is reproduced accurately in your transcription. Depending on the shape of the table, it may make more sense to transcribe the text row-by-row or column-by-column. You should note the presence of tabular text in the transcript by inserting a User Comment before the table: <!-- the following text appears in a table-->.

Text written in the form of a table








Printed text

The Bentham collection contains a significant number of printed texts, including Parliamentary bills and contemporary pamphlets. Sometimes a page contains a mixture of printed and handwritten text; in that case you can include a User Comment to note where printed text appears on the page.

Printed text may be transcribed according to the same guidelines as handwritten manuscripts. Transcribing printed text is a good way to get used to TEI encoding, but please be aware that, unlike Bentham’s own text sheets, printed texts within the Bentham Papers will not be included in The Collected Works of Jeremy Bentham.

Brackets

In your transcription, you may represent the various types of brackets used in the manuscripts, including parentheses ( ), square brackets [ ], or braces { }. Take care not to use angle brackets < >, as these are used only for markup elements.

This document was first written in May 2010 and was last updated in January 2020. The Guidelines have evolved slightly over time in line with editorial discussions about transcription and encoding. If anything in these Guidelines is unclear, please send an email to the Transcribe Bentham Editors.
