TSX is an exciting new development in crowdsourced transcription, in which you can, if you so wish, take advantage of Handwritten Text Recognition (HTR) technology in the course of transcribing. By using TSX you will not only contribute to the work of Transcribe Bentham by transcribing manuscripts written and composed by Jeremy Bentham, but also add to our understanding of how HTR can best be of benefit to transcribers.
Below, you will find instructions and guidelines on how to use TSX.
Some of this may seem daunting, particularly the introduction of HTR technology. But please do not be afraid of having a go at using TSX: it is impossible to break anything! We are very much looking forward to hearing what you think via the optional online survey, or via email at transcribe.bentham@ucl.ac.uk.
We recommend that you access TSX using Google's Chrome web browser, running on Windows, for the optimum experience. TSX will run on Firefox, though navigating images using the mouse-wheel currently does not work.
There are known issues with running TSX in Internet Explorer, and on macOS. Future development will ensure compatibility with all major browsers and operating systems. Please accept our apologies for any inconvenience which this current incompatibility may cause.
Manuscripts are accessed via a file tree in the ‘Manuscript collections’ tab.
Simply click the white arrow next to the ‘Bentham’ collection, and a list of manuscript collections will be displayed. You may find that occasionally the white arrow next to 'Bentham' does not appear. If this happens, please refresh your browser window - it may take a few refreshes, but it should appear eventually. Click on any of the following collections to access Bentham manuscripts:
These batches contain a total of 800 images. More images will be added in due course. Please note: images uploaded to TSX have not been uploaded to Transcribe Bentham, so there is no risk of the duplication of work. Transcripts produced using TSX will also go towards the general aims of Transcribe Bentham, namely the transcription of the Bentham Papers.
To select a manuscript, simply click on the desired thumbnail and the transcription interface will load. The default status for manuscripts is 'In Progress'. If the transcript area already contains text when you load a manuscript, then someone has already transcribed it. In future development, we will indicate more clearly which manuscripts are untranscribed, partially-transcribed, and complete.
The image pane consists of the image viewing window, and four buttons. The image and transcription areas are linked: clicking on a line of text in the manuscript image will highlight a line in the ‘Transcript’ pane, and vice versa, ensuring that you can easily keep your place in both.
You can turn on TSX's auto-scroll feature by clicking the relevant button in the image pane, after which the image viewer will automatically move as you transcribe each line.
To navigate the image, move the mouse pointer to an area where there is no text. When the pointer turns to a four-way arrow, hold down the left mouse button and drag the image.
You can zoom in or out from the keyboard, or by using your mouse wheel. The image can also be returned to its default position and zoom level.
The ‘Transcript’ pane consists of the transcription area, and a toolbar by which you can add TEI mark-up to your transcript to indicate the various features of the manuscript. It also features three tabs, ‘Edit’, ‘Preview’, and ‘Diffs’, which will be explained below.
We recommend that you click 'save' at regular intervals to save your transcript as you go along.
Even if you do not wish or need to utilise the HTR technology when transcribing, you can still take advantage of a number of other useful features offered by the TSX interface.
The manuscript has been automatically segmented into lines by the HTR technology. If you click on the first numbered line in the transcription area, the line of the manuscript to be transcribed will be highlighted (and vice versa). Once you have transcribed a line, simply press 'Return' or the down arrow on your keyboard to move to the next line.
Please note: on occasion, some lines may have been imperfectly segmented or be out of reading order. This issue is caused by the technology being unable to correctly identify individual lines (it has particular trouble with text written in pencil). We apologise for any inconvenience which this may cause. A future version of TSX should, we hope, provide the facility to re-order and identify more clearly lines in the images.
Those familiar with Transcribe Bentham will instantly recognise the toolbar above the transcription area, which allows you to add text encoding to indicate the various features of the manuscript. To apply the tags simply highlight a portion of the text, or a place in the text, and press the relevant button on the toolbar.
Unlike in Transcribe Bentham, in TSX the toolbar contains no line-break or page-break tags: the segmentation of images into lines automatically applies these tags. A further advantage of the TSX interface is that you can easily distinguish between your transcribed text, which is black, and the text encoding, which has coloured formatting. For more details on encoding, please visit our encoding with TSX page.
You can switch between the three tabs in the transcription interface at your convenience.
'Edit' is the default tab, in which you transcribe and encode a manuscript.
Preview
Clicking the 'Preview' tab will show you how your encoded transcript will look when saved, and shows the various encoded features of the manuscript.
Differences tab
Clicking the 'Diffs' tab will show how your version of the transcript differs from the last saved version (if there is one).
One of the key features of TSX is the facility to request a transcript of a manuscript, which is generated automatically by the HTR technology. It is very unlikely that the HTR transcript will be perfect, and so you should check and correct the transcript against the original manuscript.
If you are new to transcribing, or do not have the time to fully transcribe a manuscript, this option may be particularly useful. Simply click on the transcript pane in order to request a transcript.
The accuracy of the HTR transcript will vary depending on a number of factors, including the page layout, the neatness of the handwriting, the complexity of the language used, and whether the text is written in ink or pencil. It is unlikely, for instance, that the HTR technology will be able to cope with complex manuscripts written by Bentham himself.
In the example above, the HTR engine has produced a useful transcription of the manuscript. However, it does require correction. For instance, the HTR engine thinks that the final part of the fifth line reads 'In tally there is no getting one', whereas it in fact reads 'In fact there is no getting on'.
Once you have been through the transcript and corrected the errors, you can add the text encoding, and then save your transcript for the final time. Click the email button to send an email indicating that your transcript is ready for checking.
The other key feature of TSX is interactive transcription. Anyone who has ever transcribed a historic manuscript will have encountered a word, several words, or sections of the manuscript which are frustratingly indecipherable. TSX's interactive transcription facility seeks to help transcribers with this problem, by offering multiple suggestions for a word or words in a particular line. Interactive transcription may also help new transcribers to get to grips with deciphering the peculiarities of Bentham's hand (and that of others).
Imagine, in the example below, that you are able to transcribe the first five words of the eighth line, but are unable to read the following word. To take advantage of interactive transcription, ensure that there is a space after your last transcribed word and press 'Tab' on your keyboard.
A drop-down menu will now appear, presenting you with potential alternatives for the next word. If any of the suggestions look correct, you can either click the word with your mouse, or use the arrow keys on your keyboard to highlight the word, and then press 'Tab' again. You can, if you wish, continue to work your way through the line a word at a time.
It is also possible to request suggestions for an entire line at a time (or the remainder of a line if you have already transcribed the first few words). Simply press 'Shift' and 'Tab' together on your keyboard, and the drop-down menu will appear with a number of suggested alternatives.
Again, to select the suggestion which you think looks correct (if any) either click on it, or highlight it using the arrow keys on the keyboard and press 'Tab' again.
Once you have transcribed and encoded the transcript, click 'save' for the final time. Click the email button to send an email indicating that your transcript is ready for checking.