Description of the Elements and Attributes in the OLIF DTD; date: Fri Feb 8 10:40:59 2002
Name Type Description
abbrev e The abbrev element holds data about an abbreviated form of the entry string (note that abbreviations may also be encoded as cross-references). Example use: ERP
abbrevHandling e The abbrevHandling element holds data about the way abbreviations are represented. Two options exist: via the abbrev element or via a crossRefer element. Example use: we use both the abbrev element and the crossRefer element
address e The address element holds data about a postal address of the distributor.
AdminLang a The AdminLang attribute holds data about the default language for the administrative and informative elements 'note' and 'prop'. The value of the AdminLang attribute must be one of the ISO 3166/639 language identifiers (2 or 3-letter code) or one of the standard locale identifiers (2 or 3-letter language code, dash, 2-letter territory/country code). Example use: en
adminStatus e The adminStatus element classifies the administrative status of an entry relative to a given work environment. Example values: ver
aspect e The aspect element classifies verbal aspect. Example values: perf, iter
aspectDCS e The aspectDCS element holds data about a user-extended scheme for describing the aspect of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
auxType e The auxType element classifies the auxiliary type for an auxiliary verb. Example values: have, faire
availability e The availability element holds data about the availability of an OLIF file, for example, any restrictions on its use or distribution, its copyright status, etc. A company may use 'Available upon written agreement' to indicate that the OLIF file may not be freely redistributed.
body e The body element groups a list of entries which contain linguistic/lexical/terminological data categories for entry strings/designators.
byteCount e The byteCount element holds data about the size of the OLIF document including its tags, in its representation as a text file encoded in the character set mentioned in the encoding attribute of the XML declaration. This is useful for calculating media requirements or file download times.
ByteCountUnit a The ByteCountUnit attribute classifies the unit in which the byte count is measured. Possible values: bytes - bytes; kb* - kilobytes; mb - megabytes; gb - gigabytes
canForm e The canForm element holds the entry string, represented in canonical form in accordance with OLIF guidelines. Example use: success story
case e The case element classifies grammatical case. Example values: d, a, loc
changePOS e The changePOS element holds data about the part of speech of an element being added or deleted. Example values: noun, adj
changeType e The changeType element holds data related to the type of change. Example values: change-role, add-in-target
changeValue e The changeValue element holds data about the string or data category being changed. Example values: active, subj-dobj
company e The company element holds information about the company/organisation for which the entry is valid. Example use: LongDistanceRunners Ltd.
conceptCount e The conceptCount element holds data about the number of concepts in the OLIF document.
conceptHierarchyDCS e The conceptHierarchyDCS element holds data about a user-extended scheme for describing the concept hierarchy/ontology of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
confidence e The confidence element holds data from terminology extraction. The value of the confidence element indicates how confident the term extraction program is that the candidate really is a term. Example values: 0.99, high
contentInfo e The contentInfo element groups data categories related to the practice adopted for encoding quotation marks, abbreviations etc.
context e The context element holds data about one of the following: a) the context for a given translation of a source word/phrase into a target word/phrase; b) the context for a structural change in the target language. Example values: pp, genobj
contextStmt e The contextStmt element groups multiple related contexts (contexts can be connected by means of logical operators).
CreaDate a The CreaDate attribute holds data about the date of the creation of the element. Its value must be in ASCII, in the format YYYYMMDDThhmmssZ (e.g. 19970811T133402Z for August 11th 1997 at 13:34:02, i.e. 1:34:02 pm). This is one of the options described in ISO 8601:1988. The value should be given in Coordinated Universal Time (UTC; as indicated by the terminal Z). Example use: 19970811T133402Z
CreaId a The CreaId attribute holds data about the user who created the element. Example use: Lars Nauter
CreaTool a The CreaTool attribute holds data about the tool that created the OLIF document. Its possible values are not specified in OLIF but each tool provider will publish the string identifier it uses. Example use: CoolTermExtract
CreaToolVersion a The CreaToolVersion attribute holds data about the version of the tool that created the OLIF document. Its possible values are not specified in OLIF but each tool provider will publish the string identifier it uses. Example use: 2.14
crLinkType e The crLinkType element classifies the relation between the entry from which the link originates and the entry to which the link points. The possible relations include ISO relations (most of which formally apply to concepts rather than the terms themselves; they have been adapted here for the purposes of OLIF) and the analysis contained in EuroWordNet (July, 2000). Example values: synonym, antonym
crLinkTypeDCS e The crLinkTypeDCS element holds data about a user-extended scheme for describing the types of cross-references between OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
crossRefer e The crossRefer element groups the data categories for cross-references. Cross-references define relations between the given entry (link source) and other entries in the lexicon (link target) in the same language.
dataCatReg e The dataCatReg element groups data categories for extensions to extensible OLIF data categories (like ptOfSpeech). The idea is that whenever a user chooses to make use of a user extension (and for example supplies his own tag set for part-of-speech), he explains the overall listing of the data categories and values he uses (for example via a URL that he puts into the ptOfSpeechDCS element of the dataCatReg element). The dataCatReg element contains several data category specifications (DCS).
date e The date element holds data about a date. Its value must be in ASCII, in the format YYYYMMDDThhmmssZ (e.g. 19970811T133402Z for August 11th 1997 at 13:34:02, i.e. 1:34:02 pm). This is one of the options described in ISO 8601:1988. The value is preferably given in Coordinated Universal Time (UTC; as indicated by the terminal Z). The DateValue attribute can be used to specify the date in an arbitrary format.
DateValue a The DateValue attribute holds data about a date in ISO 8601 format.
DCSType a The DCSType attribute classifies a data category specification. Possible values: replacement - replace existing OLIF values; extension - extend (add to) the predefined OLIF values.
definition e The definition element holds a prose definition of the entry string. Example use: Collection of interfaces usable by a programmer
degree e The degree element classifies adjectival degree type. Example values: comp, sup
depSynonym e The depSynonym element holds data about a rejected or deprecated synonym of the entry string. Example use: IS-H
distributor e The distributor element holds data about the person or institution who distributes the OLIF document.
DistributorType a The DistributorType attribute classifies a distributor. Possible values: person - name of a person; place - name of a place; org - name of an organization article in a periodical; cmp - name of a company
eAddress e The eAddress element holds data about an electronic address of the person or institution who distributes the OLIF file. Note that more than one occurrence of this tag can appear, so that multiple addresses (possibly of different types) can be included.
EAddressType a The EAddressType attribute classifies the electronic address (email address, web site, ftp site, etc.). Possible values: email* - the value is an electronic mail address; url - the value is a URL
entry e The entry element groups all of the linguistic/lexical/terminological data categories related to a single entry string/designator.
entryCount e The entryCount element holds data about the number of entries in the OLIF document.
entryFormation e The entryFormation element classifies the shape/structure of the entry string. Example values: abb, acr
entrySource e The entrySource element holds data about the entry source, or the lexicon/termbase that the entry originated from. Example use: TermDB for software package X
entryStatus e The entryStatus element classifies the entry status of an entry within a given lexicon/termbase (note that there exists a separate data category for the administrative status). Example values: word
entryType e The entryType element classifies the entry string as being a product name, trademark, or orthographic variant (note that orthographic variants may also be encoded as cross-references). Example values: trademark, orth-var
equival e The equival element holds data about the degree of transfer relationship between words/phrases in two different languages. Example values: full, partial
example e The example element holds data about a sample text or portion of text that contains the entry string as an illustration of usage. Example use: ERP is on the rise again.
fax e The fax element holds data about the fax number of the person or institution who distributes the OLIF file (preferably in a format conformant to ITU-T/CCITT Recommendation E.123).
fileDesc e The fileDesc element groups data categories relating to physical features of the OLIF instance (document).
fileExtent e The fileExtent element groups data categories related to counts of items (for example number of entries) in the contents of the OLIF instance.
fileId e The fileId element holds data about a unique identifier (e.g. a globally unique identifier) of the OLIF file. Example use: 011000358700000683362001E.xml
fileName e The fileName element holds data about the name of the OLIF file. Example use: olifForAgency14Jan02.xml
gender e The gender element classifies grammatical gender. Example values: m, f
generalDC e The generalDC element groups general data categories. General data categories are optional elements that can be used in any of the top-level OLIF groups for entries (mono, crossRefer, or transfer).
geogUsage e The geogUsage element holds data about the geographical usage, or dialect, of the entry string. Example values: CA, GB
head e The head element holds data about the head word in a multiword/phrasal entry string. Example use: infotype (planned compensation infotype)
header e The header element groups data categories that hold information about the data that has been encoded (thus, header holds meta-data).
idNo e The idNo element holds data about a number (e.g. ISBN) used to identify an OLIF document.
IdNotype a The IdNoType attribute holds data about a name or abbreviation (e.g., isbn) identifying what type of identifying number is given. Possible values: isbn* - the value is an International Standard Book Number (ISBN)
inflection e The inflection element holds data about the inflection pattern(s) of the entry string (or its head in case of a multiword/phrasal entry). Example use: book, 16
inflectionDCS e The inflectionDCS element holds data about a user-extended scheme for describing the inflection of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
InflectionDCSType a The InflectionDCSType attribute classifies the way inflection information has been encoded. Possible values: classDesignator - reference to a code/designator from a classification scheme; inflectsLike - example
keyDC e The keyDC element groups the five key data categories whose values uniquely identify an entry.
KeyDCUniversalId a The KeyDCUniversalId attribute holds data about a universal identifier (i.e. one which is unique, not only in the user's environment but worldwide) of a grouping of OLIF key data categories. This identifier can for example be used in cross-references.
KeyDCUserId a The KeyDCUserId attribute holds data about a user-defined identifier of a grouping of OLIF key data categories. This identifier can for example be used in cross-references.
langIdUse e The langIdUse element holds data about the way language identifiers have been used. Possible values: region_standard - the region part of a locale (e.g. the CA in FR_CA) has been used even if the term also exists in the unrestricted locale (e.g. French as a whole); region_exception - the region part of a locale has been used only if the term does not exist in the unrestricted locale.
language e The language element encodes the language to which the entry string belongs. Example values: fr, en
locInfo e The locInfo element holds data about localization-relevant information (e.g. product version, component name, operating system platform, or build number).
logOp e The logOp element holds data about a logical operator. Possible values: AND - for trRestrictStmt and structChangeStmt; OR - for trRestrictStmt; NOT - for trRestrictStmt
logOpAnd e The logOpAnd element holds data about the logical operator AND.
mapping e The mapping element groups a mappingValue and a mappingTarget. The mappingValue should be used for the item designated by the mappingTarget.
mappingTarget e The mappingTarget element holds data about an item to which a replacement should be applied.
mappingValue e The mappingValue element holds data about a replacement string that is used in a mapping.
modDate e The modDate element holds data about the date on which the entry was last modified. Example use: 20011115T140324Z
mono e The mono element groups the monolingual data within an entry.
monoAdmin e The monoAdmin element groups the administrative data within a monolingual entry.
monoDC e The monoDC element groups optional data categories for administrative, morphological, syntactic and semantic data.
monoMorph e The monoMorph element groups the morphological information within a monolingual entry.
monoSem e The monoSem element groups the semantic information within a monolingual entry.
monoSyn e The monoSyn element groups the syntactic information within a monolingual entry.
mood e The mood element classifies verb mood or mode. Example values: imper, cond
morphStruct e The morphStruct element holds data about the morphological structure of the entry string (note the possibilities provided for multiwords by means of the synStruct element). Example use: #[[gebrauch+s]:[gegen+stand]]#
morphStructDCS e The morphStructDCS element holds data about a user-extended scheme for describing the internal morphological structure of entry strings/designators (see the comment for the ptOfSpeechDCS element for more information).
name e The name element holds data about a name (e.g. of a distributor or owner).
natGender e The natGender element classifies the biological gender associated with the entry. Example values: m, f, un
note e The note element holds data about a note, or commentary, on an entry by a lexicographer/terminologist. Example use: Never translate this.
NoteType a The NoteType attribute holds data for categorizing notes (e.g. 'for localizer', 'for quality management').
number e The number element classifies grammatical number. Example values: sg, du
olif e The olif element is the base document element of a document in Open Lexicon Interchange Format (OLIF).
OlifVersion a The OlifVersion attribute holds data about the version of OLIF to which the XML instance (document) conforms. The OLIF Consortium publishes the string identifier that might be used for the OlifVersion attribute.
OrigFormat a The OrigFormat attribute holds data about the format of the file from which the OLIF document has been generated. The format specification may include a product name and even a version tag. This may lead to format specifications like the following: LOGOS-eSense, LOGOS-LDE-1.1, LOGOS-LDE-1.2
originator e The originator element holds data about the individual who originated the entry. Example use: Christopher Columbus
orthVariant e The orthVariant element holds data about an orthographic variant of the entry string (note that orthographic variants may also be encoded as cross-references). Example use: auf Grund
orthVariantType e The orthVariantType element classifies the type of orthographic variant that the target of a cross-reference represents (currently only used for German; used for example to list old/new spelling). Example values: german-4
orthVariantTypeDCS e The orthVariantTypeDCS element holds data about a user-extended scheme for describing the orthographic variants of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
owner e The owner element holds data about the person or institution that owns the OLIF document.
OwnerType a The OwnerType attribute classifies an owner. Possible values: natPerson - name of a person; place - name of a place; org - name of an organization article in a periodical; cmp - name of a company
person e The person element classifies grammatical person. Example values: first, sec
phraseType e The phraseType element classifies the phrasal type of an entity. Example values: mw
prep e The prep element holds data about prepositions that further specify syntactic frame elements. Example use: into, about, from, mit, wegen, ausser
product e The product element holds data about a product for which an entry is valid. Example use: Spreadsheet3005
project e The project element holds data about a project for which an entry is valid. Example use: localization of product X from English into German
prop e The prop element holds data about non-standard (proprietary) information in an OLIF document. It may be used for communicating tool-specific information.
PropLang a The PropLang attribute holds data about the language used in a prop element.
PropType a The PropType attribute holds data about the kind of data a prop element represents.
ptOfSpeech e The ptOfSpeech element classifies the part-of-speech represented by the entry string. In cases of phrases/multiword entries, the value for part-of-speech depends on the function of the phrase/multiword within a clause; the part-of-speech of the head element often indicates the part-of-speech value for the entire phrase/multiword string. Example values: noun, verb
ptOfSpeechDCS e The ptOfSpeechDCS element (DCS is short for data category specification) holds data about a user-extended scheme for describing the part-of-speech of OLIF entries. Users can for example describe their additional part-of-speech tags by means of a URL or by means of CDATA sections. Example uses: http://www.company.com/nlp/ptOfSpeech/projectX.htm
publStmt e The publStmt element groups data categories related to the distributor and the owner of the OLIF document. The publStmt element also gives supplementary information about the OLIF document (e.g. copyright protection).
PubStatus a The PubStatus attribute classifies the current availability of the OLIF data. Possible values: restricted - the text is not freely available; unknown* - the status of the text is unknown; free - the text is freely available
QuotMarkForm a The QuotMarkForm attribute classifies the standardization of quotation marks. Possible values: std - use of quotation marks has been standardized and open and close quote marks are distinct; nonStd - open and close quote marks are represented indiscriminately; unknown* - use of quotation marks is unknown
quotMarkInfo e The quotMarkInfo element holds data about editorial practice adopted with respect to quotation marks. Example use: our open quote is '!' and our closing quote is '$'
QuotMarkRet a The QuotMarkRet attribute classifies the convention used for retaining quotation marks. Possible values: none - no quotation marks have been retained; some - some quotation marks have been retained; all - all quotation marks have been retained
Region a The Region attribute holds data about the territories within which rights related to the OLIF data apply. Possible values: world* - the text is freely available; eu - European Union only
replacements e The replacements element groups data categories for string replacements that should be applied to the document. The replacements element helps to compress data and might for example specify one value for the date element of a list of 1000 elements.
semReading e The semReading element classifies readings for entries with identical values for canonical form, language, part-of-speech, and subject field. Example values: color, definite space
semReadingDCS e The semReadingDCS element holds data about a user-extended scheme for describing the semantic reading information of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
semType e The semType element classifies an entry string with respect to a semantic type classification structure. Example values: anim-hum-pn, cnc-class
semTypeDCS e The semTypeDCS element holds data about a user-extended scheme for describing the semantic types of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
structChange e The structChange element groups data categories related to a change in the target language vis-a-vis the source structure based on the transfer restriction having been satisfied. Structural changes are definable for the following parts-of-speech: noun, verb, adjective, preposition.
structChangeStmt e The structChangeStmt element groups multiple related structural changes (which can be connected via the logical operator AND).
subjField e The subjField element classifies the knowledge domain to which the lexical/terminological entry is assigned. Example values: agriculture, aviation
subjFieldDCS e The subjFieldDCS element holds data about a user-extended scheme for describing the subject field information of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
syllabification e The syllabification element holds data about the syllable boundaries within the entry string. Example use: do-cu-men-ta-ry, li-be-ra-li-ty
syllabificationMarkInfo e The syllabificationMarkInfo element holds data about editorial practice adopted with respect to syllabification in the original. Example use: we use '*' as marker
synFrame e The synFrame element classifies the syntactic frame for the entry string (subcategorisation). Example values: subj-imps-opt, dobj-opt
synFrameDCS e The synFrameDCS element holds data about a user-extended scheme for describing the syntactic frames of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
synPosition e The synPosition element classifies the unmarked positioning of the entry string syntactically. Example values: prenoun, cl-init
synStruct e The synStruct element holds data about the constituent structure of a multiword entry string (note the possibilities provided for single words by means of the morphStruct element). Example use: [[adj][noun]] (General Ledger)
synStructDCS e The synStructDCS element holds data about a user-extended scheme for describing the syntactic structures of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
synType e The synType element classifies the general syntactic behavior of the entry string. Example values: cnt, refl, attrib
synTypeDCS e The synTypeDCS element holds data about a user-extended scheme for describing the syntactic type of OLIF entries (see the comment for the ptOfSpeechDCS element for more information).
telephone e The telephone element holds data about the telephone number of the person or institution who distributes the OLIF file (preferably in a format conformant to ITU-T/CCITT Recommendation E.123).
tense e The tense element classifies verb tense. Example values: pres, fut
termCount e The termCount element holds data about the number of terms (generally defined as those entries which are both not general vocabulary and distinguished from one another by the values of the key data categories) in the OLIF document.
termExtractInfo e The termExtractInfo element holds data which is relevant for terminology extraction (e.g. name and size of corpus to which term extraction has been applied).
test e The test element holds data about a single test.
testDC e The testDC element holds data about a data category to which a test pertains. Example values: semType, tense
testStmt e The testStmt element groups multiple related tests (connected by means of logical operators).
testType e The testType element holds data about the type of test. Example values: string, datacat
testValue e The testValue element holds data about the string or data category being tested in the context(s) (e.g. 'sg' if the test is on the data category for grammatical number). Example values: anim-hum, sg
timeRestrict e The timeRestrict element holds data about a time restriction, or the period of time during or since which usage of the entry is valid. Example use: 20011115T140324Z/20011215T140324Z
transfer e The transfer element groups data categories which define bilingual transfer relations between the given entry and other entries in the lexicon in different languages (cf. crossRefer elements, which point to entries in the same language).
transType e The transType element classifies the transitivity type of a verb. Example values: trans, ditrans
TrDefault a The TrDefault attribute holds data about the default transfer.
trRestrict e The trRestrict element groups data categories for a single transfer restriction.
trRestrictStmt e The trRestrictStmt element groups multiple related transfer restrictions (e.g. alternatives connected via the logical operator OR).
TrTarget a The TrTarget attribute holds data about the target entry of a transfer relationship.
updater e The updater element holds data about the individual who last modified the entry. Example use: Jessica King
usage e The usage element holds data about a usage note for the entry string. Example use: Never use this when talking about ERP.
userDesignat e The userDesignat element holds a user designator of an entry string. The userDesignat element can be used if a need exists to represent the entry string not just in canonical form.
valDefault e The valDefault element holds data about the default value for one specific data category. Example use: The example below shows how to set the default for the data category 'product' to the string 'OLIF Converter': OLIF Converter
ValDefaultRefName a The ValDefaultRefName attribute holds data about the name of the element, attribute or entity to which a value default is related.
ValDefaultRefType a The ValDefaultRefType attribute classifies the OLIF item to which a value default refers. Possible values: el - element; att - attribute; en - entity.
valueDefaults e The valueDefaults element groups information about the default values for various data categories. Whenever an OLIF entry does not specify a value for one of these data categories, information from the valueDefaults element should be applied.
verbPart e The verbPart element holds data about verb particles that further specify syntactic frame elements. Example use: down, up, over
workflowInfo e The workflowInfo element holds data about user-specific workflow support. Example use: to be validated by 31 Dec 2001 at the latest
workflowInfo e The workflowInfo element holds data about workflow-related information such as the task that is currently being performed, its deadlines, and the person responsible for executing the task.
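
The timestamp format described for CreaDate, date and modDate (YYYYMMDDThhmmssZ, in UTC) can be read with standard date handling. The following is a minimal Python sketch and not part of the OLIF specification. The function names parse_olif_timestamp and parse_time_restrict are hypothetical, and reading a timeRestrict value as a start/end pair separated by '/' is an assumption based only on the example value given for that element.

```python
from datetime import datetime, timezone
from typing import Tuple

# Layout of the YYYYMMDDThhmmssZ form described for CreaDate, date and modDate.
OLIF_TIMESTAMP_FORMAT = "%Y%m%dT%H%M%SZ"


def parse_olif_timestamp(value: str) -> datetime:
    """Parse a YYYYMMDDThhmmssZ string into a timezone-aware UTC datetime.

    Raises ValueError if the string does not match the expected layout.
    """
    naive = datetime.strptime(value, OLIF_TIMESTAMP_FORMAT)
    # The terminal Z marks UTC, so attach the UTC timezone explicitly.
    return naive.replace(tzinfo=timezone.utc)


def parse_time_restrict(value: str) -> Tuple[datetime, datetime]:
    """Split an assumed 'start/end' value into two UTC datetimes."""
    start, end = value.split("/")
    return parse_olif_timestamp(start), parse_olif_timestamp(end)
```

For instance, parse_olif_timestamp("19970811T133402Z") yields 11 August 1997 at 13:34:02 UTC, matching the example given for CreaDate.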
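
The AdminLang description allows either a 2 or 3-letter language code, or such a code followed by a dash and a 2-letter territory/country code. A rough shape check for such values might look like the Python sketch below. It is an illustration only: the function name has_language_identifier_shape is hypothetical, and the check looks at the form of the string without consulting the ISO code lists, so it accepts codes that are well-formed but not actually registered.

```python
import re

# 2 or 3 letters, optionally followed by a dash and a 2-letter territory code.
_LANG_ID_SHAPE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{2})?")


def has_language_identifier_shape(value: str) -> bool:
    """Return True if value is shaped like 'en', 'deu' or 'fr-CA'.

    Only the shape is checked; membership in the ISO code lists is not.
    """
    return _LANG_ID_SHAPE.fullmatch(value) is not None
```

Note that the langIdUse description writes a locale with an underscore (FR_CA); this sketch follows the dash form stated for AdminLang and therefore rejects the underscore form.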
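
The valueDefaults description states that a default applies whenever an entry does not specify a value for a data category. That rule can be sketched in Python as follows. This is an assumption-laden illustration rather than an OLIF implementation: entries and defaults are modelled as plain dictionaries keyed by data category name, the function name apply_value_defaults is hypothetical, and no OLIF XML is parsed.

```python
from typing import Dict, Optional


def apply_value_defaults(
    entry: Dict[str, Optional[str]],
    defaults: Dict[str, str],
) -> Dict[str, str]:
    """Return a copy of entry with missing data categories filled from defaults.

    A data category counts as missing when its key is absent or its value is None.
    Values the entry specifies itself always win over the defaults.
    """
    merged = dict(defaults)
    for name, value in entry.items():
        if value is not None:
            merged[name] = value
    return merged


# Usage: the default for 'product' is applied only where the entry is silent.
defaults = {"product": "OLIF Converter"}
without_product = apply_value_defaults({"canForm": "success story"}, defaults)
with_product = apply_value_defaults(
    {"canForm": "ERP", "product": "Spreadsheet3005"}, defaults
)
```

Here without_product receives the default product 'OLIF Converter', while with_product keeps its own value 'Spreadsheet3005'.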