Handheld dictionaries or PEDs

Handheld electronic dictionaries, also known as "pocket electronic dictionaries" or PEDs, resemble miniature clamshell laptop computers, complete with full keyboards and LCD screens. Because they are intended to be fully portable, they are battery-powered and made with durable casing material. Although produced all over the world, handheld dictionaries are especially popular in Japan, Korea, China, and neighbouring countries, where they are the dictionary of choice for many users learning English as a second language. Features of handheld dictionaries include stroke order animations, voice output, handwriting recognition for Kanji (Chinese characters used in Japanese writing) and Kana (the Japanese syllabic scripts, hiragana and katakana), language-learning programs, a calculator, PDA-like organizer functions, encyclopedias, time zone and currency converters, and crossword puzzle solvers. Dictionaries that contain data for several languages may have a "jump" or "skip-search" feature that allows users to move between the dictionaries when looking up words, and a reverse translation function that allows further look-ups of words displayed in the results. Many manufacturers license dictionary content from an established database, such as the Merriam-Webster Dictionary and Thesaurus, while others use a proprietary database compiled by their own lexicographers. Many devices can be expanded to cover further languages by purchasing additional memory cards.

Online dictionaries

There are several types of online dictionary, including:

* Aggregator sites, which give access to data licensed from various reference publishers. They typically offer monolingual and bilingual dictionaries, one or more thesauruses, and technical or specialized dictionaries. Examples include TheFreeDictionary.com and Dictionary.com;

* 'Premium' dictionaries available on subscription, such as the Oxford English Dictionary;

* Dictionaries from a single publisher, free to the user and supported by advertising. Examples include Collins Online Dictionary, Duden Online, Larousse bilingual dictionaries, the Macmillan English Dictionary, and the Merriam-Webster Learner's Dictionary;

* Dictionaries available free from non-commercial publishers (often institutions with government funding). Examples include the Algemeen Nederlands Woordenboek (ANW) and Den Danske Ordbog.

Some online dictionaries are regularly updated, keeping abreast of language change. Many have additional content, such as blogs and features on new words. Many dictionaries for special purposes, especially for professional and trade terminology, and for regional dialects and language variations, are published on the websites of organizations and individual authors. Although they are often presented in list form without a search function, they are nevertheless electronic dictionaries because of the way in which the information is stored and transmitted.


Lecture 3. Translation Memory

A translation memory, or TM, is a type of database that stores segments that have been previously translated. A translation-memory system stores the words, phrases and paragraphs that have already been translated and makes them available to aid human translators. The translation memory stores each source segment together with its corresponding translation as a language pair called a "translation unit".

Some software programs that use translation memories are known as translation memory managers (TMM).

Translation memories are typically used in conjunction with a dedicated computer-assisted translation (CAT) tool, a word processing program, a terminology management system, a multilingual dictionary, or even raw machine translation output.

A translation memory consists of text segments in a source language and their translations into one or more target languages. These segments can be blocks, paragraphs, sentences, or phrases. Individual words are handled by terminology bases and are not within the domain of TM.
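The structure described above can be pictured with a small sketch. It is only an illustration, not the storage format of any particular TM product; the class names `TranslationUnit` and `TranslationMemory`, the `add` method, and the sample sentences are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TranslationUnit:
    """One source segment and its translations, keyed by target language code."""
    source_lang: str
    source_text: str
    targets: dict[str, str] = field(default_factory=dict)


@dataclass
class TranslationMemory:
    """A flat collection of translation units (illustrative only)."""
    units: list[TranslationUnit] = field(default_factory=list)

    def add(self, source_lang, source_text, target_lang, target_text):
        # Reuse the existing unit for the same source segment, so that one
        # segment can carry translations into several target languages.
        for unit in self.units:
            if unit.source_lang == source_lang and unit.source_text == source_text:
                unit.targets[target_lang] = target_text
                return unit
        unit = TranslationUnit(source_lang, source_text, {target_lang: target_text})
        self.units.append(unit)
        return unit


tm = TranslationMemory()
tm.add("en", "Press the power button.", "de", "Drücken Sie die Ein/Aus-Taste.")
tm.add("en", "Press the power button.", "fr", "Appuyez sur le bouton d'alimentation.")
print(len(tm.units))                 # a single unit holds both translations
print(sorted(tm.units[0].targets))   # the target languages stored for it
```

Keeping all target languages under one source segment mirrors the "one or more target languages" wording above; a real system would also record metadata such as dates and authors, which this sketch leaves out.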

Research indicates that many companies producing multilingual documentation use translation memory systems. In a 2006 survey of language professionals, 82.5% of the 874 respondents confirmed that they used a TM. TM use correlated with text types characterised by technical terms and simple sentence structure.

Using translation memories

The program breaks the source text (the text to be translated) into segments, looks for matches between segments and the source half of previously translated source-target pairs stored in a translation memory, and presents such matching pairs as translation candidates. The translator can accept a candidate, replace it with a fresh translation, or modify it to match the source. In the last two cases, the new or modified translation goes into the database.

Some translation memory systems search for 100% matches only; that is, they retrieve only segments that match entries in the database exactly. Others employ fuzzy matching algorithms to retrieve similar segments, which are presented to the translator with the differences flagged. Note that typical translation memory systems compare source-language text only: the new segment is matched against the source side of the stored pairs, not against their translations. The flexibility and robustness of the matching algorithm largely determine the performance of the translation memory, although for some applications the recall rate of exact matches can be high enough to justify the 100%-match approach.
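The difference between exact and fuzzy matching can be sketched as follows. This is a minimal illustration that uses the similarity ratio from Python's standard `difflib` module as a stand-in for a fuzzy matching algorithm; commercial TM systems use their own scoring methods. The `memory` dictionary, the function names, and the 0.75 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory TM: source segment -> stored translation.
memory = {
    "Remove the battery cover.": "Entfernen Sie die Batterieabdeckung.",
    "Insert two AA batteries.": "Legen Sie zwei AA-Batterien ein.",
}


def exact_match(segment):
    """Return the stored translation only if the source text matches exactly."""
    return memory.get(segment)


def fuzzy_matches(segment, threshold=0.75):
    """Return (score, stored_source, translation) tuples at or above the threshold, best first."""
    hits = []
    for source, target in memory.items():
        # Only the source side of each stored pair is compared.
        score = SequenceMatcher(None, segment, source).ratio()
        if score >= threshold:
            hits.append((score, source, target))
    return sorted(hits, reverse=True)


def flag_differences(segment, stored_source):
    """List the (operation, new_text, stored_text) spans where the two segments differ."""
    matcher = SequenceMatcher(None, segment, stored_source)
    return [(op, segment[i1:i2], stored_source[j1:j2])
            for op, i1, i2, j1, j2 in matcher.get_opcodes() if op != "equal"]


query = "Insert four AA batteries."
print(exact_match(query))                 # None: no 100% match exists
best = fuzzy_matches(query)[0]
print(best[1], "->", best[2])             # closest stored pair
print(flag_differences(query, best[1]))   # spans the translator must adjust
```

A 100%-match system would stop after `exact_match`, while a fuzzy system would offer the closest stored pair together with the flagged differences so that the translator can adapt it.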

Segments where no match is found will have to be translated by the translator manually. These newly translated segments are stored in the database where they can be used for future translations as well as repetitions of that segment in the current text.
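The workflow described in this section (segment the text, look each segment up, translate what is missing, and store the result for reuse) can be sketched roughly as below. The sentence splitter is deliberately naive, only exact matches are considered, and the `translate_manually` callback, which stands in for the human translator, is an assumption of this example rather than part of any real tool.

```python
import re


def split_segments(text):
    """Naive sentence segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def translate_document(text, memory, translate_manually):
    """Translate segment by segment, consulting and updating the memory.

    `memory` maps source segments to translations; `translate_manually`
    stands in for the human translator.
    """
    output, manual_count = [], 0
    for segment in split_segments(text):
        if segment not in memory:
            # No match: the translator supplies a translation, which is stored
            # so that later repetitions in the same text are filled automatically.
            memory[segment] = translate_manually(segment)
            manual_count += 1
        output.append(memory[segment])
    return output, manual_count


memory = {"Turn off the device.": "Schalten Sie das Gerät aus."}
human_answers = {"Unplug the cable.": "Ziehen Sie das Kabel ab."}  # simulated translator

text = "Turn off the device. Unplug the cable. Turn off the device. Unplug the cable."
translated, manual = translate_document(text, memory, human_answers.__getitem__)
print(manual)           # 1: only the first "Unplug the cable." needed manual work
print(len(translated))  # 4: every segment received a translation
```

The second occurrence of "Unplug the cable." is served from the memory, which is the within-document reuse mentioned above.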

Translation memories work best on texts that are highly repetitive, such as technical manuals. They are also helpful for translating incremental changes in a previously translated document, corresponding, for example, to minor changes in a new version of a user manual. Traditionally, translation memories have not been considered appropriate for literary or creative texts, for the simple reason that there is so little repetition in the language used. However, some translators find them of value even for non-repetitive texts, because the database resources created are useful for concordance searches to determine appropriate usage of terms, for quality assurance (no empty segments), and for simplifying the review process (source and target segments are always displayed together, whereas translators have to work with two documents in a traditional review environment).
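A concordance search of the kind mentioned above can be illustrated with a very small sketch: given a term, it lists every stored pair whose source side contains that term, so the translator can see how it was rendered before. The `concordance` function and the sample data are assumptions for illustration only.

```python
def concordance(memory, term):
    """Return (source, target) pairs whose source text contains `term`, ignoring case."""
    needle = term.casefold()
    return [(src, tgt) for src, tgt in memory.items() if needle in src.casefold()]


memory = {
    "Close the cover.": "Schließen Sie die Abdeckung.",
    "The cover must be locked.": "Die Abdeckung muss verriegelt sein.",
    "Switch on the printer.": "Schalten Sie den Drucker ein.",
}

for src, tgt in concordance(memory, "Cover"):
    print(src, "->", tgt)
```

Here both stored sentences containing "cover" are returned, which shows the translator that the term has consistently been rendered as "Abdeckung".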

If a translation memory system is used consistently on appropriate texts over a period of time, it can save translators considerable work.

Main benefits

Translation memory managers are most suitable for translating technical documentation and documents containing specialized vocabularies. Their benefits include:

* Ensuring that the document is completely translated (translation memories do not accept empty target segments)
* Ensuring that the translated documents are consistent, including common definitions, phrasings and terminology. This is important when different translators are working on a single project.
* Enabling translators to translate documents in a wide variety of formats without having to own the software typically required to process these formats.
* Accelerating the overall translation process; since translation memories "remember" previously translated material, translators have to translate it only once.
* Reducing costs of long-term translation projects; for example the text of manuals, warning messages or series of documents needs to be translated only once and can be used several times.
* For large documentation projects, savings (in time or money) thanks to the use of a TM package may already be apparent even for the first translation of a new project, but normally such savings are only apparent when translating subsequent versions of a project that was translated before using translation memory.
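The first two benefits in the list above, completeness and consistency, lend themselves to simple automated checks. The sketch below is an assumed, simplified version of such a check over a list of (source, target) pairs; the function name `qa_report` and the sample data are hypothetical.

```python
from collections import defaultdict


def qa_report(pairs):
    """Check (source, target) pairs for empty targets and inconsistent translations."""
    # Completeness: sources whose target segment is empty or whitespace only.
    empty = [src for src, tgt in pairs if not tgt.strip()]
    # Consistency: sources that have been translated in more than one way.
    variants = defaultdict(set)
    for src, tgt in pairs:
        if tgt.strip():
            variants[src].add(tgt)
    inconsistent = {src: sorted(tgts) for src, tgts in variants.items() if len(tgts) > 1}
    return empty, inconsistent


pairs = [
    ("Warning: hot surface.", "Warnung: heiße Oberfläche."),
    ("Warning: hot surface.", "Achtung: heiße Oberfläche."),
    ("Do not open the casing.", ""),
]
empty, inconsistent = qa_report(pairs)
print(empty)         # segments still lacking a translation
print(inconsistent)  # segments translated in more than one way
```

Such a report is particularly useful when several translators contribute to one project, since it surfaces diverging renderings of the same source segment.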

Main obstacles

The main problems hindering wider use of translation memory managers include:

* The concept of "translation memories" is based on the premise that sentences used in previous translations can be "recycled". However, a guiding principle of translation is that the translator must translate the "message" of the text, and not its component "sentences".
* Translation memory managers do not easily fit into existing translation or localization processes. In order to take advantage of TM technology, the translation processes must be redesigned.
* Translation memory managers do not presently support all documentation formats, and filters may not exist to support all file types.
* There is a learning curve associated with using translation memory managers, and the programs must be customized for greatest effectiveness.
* In cases where all or part of the translation process is outsourced or handled by freelance translators working off-site, the off-site workers require special tools to be able to work with the texts generated by the translation memory manager.
* Full versions of many translation memory managers can cost from US$500 to US$2,500 per seat, which can represent a considerable investment, although lower-cost programs are also available. Some developers produce free or low-cost versions of their tools with reduced feature sets that individual translators can use to work on projects set up with full versions of those tools. (Note that there are freeware and shareware TM packages available, but none of these has yet gained a large market share.)
* The costs involved in importing the user's past translations into the translation memory database, training, as well as any add-on products may also represent a considerable investment.
* Maintenance of translation memory databases still tends to be a manual process in most cases, and failure to maintain them can result in significantly decreased usability and quality of TM matches.
* As stated previously, translation memory managers may not be suitable for text that lacks internal repetition or which does not contain unchanged portions between revisions. Technical text is generally best suited for translation memory, while marketing or creative texts will be less suitable.
* The quality of the text recorded in the translation memory is not guaranteed; if the translation of a particular segment is incorrect, the incorrect translation is in fact more likely to be reused the next time the same source text, or a similar source text, is translated, thereby perpetuating the error.
* TM suggestions may also influence the translated text, probably without the translator being aware of it. Different languages order the logical elements of a sentence differently, and a translator presented with a multiple-clause sentence that is already half translated is less likely to rebuild the sentence completely.
* There is also a potential for the translator to deal with the text mechanically sentence-by-sentence, instead of focusing on how each sentence relates to those around it and to the text as a whole.
* Translation memories also raise certain industrial relations issues as they make exploitation of human translators easier.