ISO 10303-21:2016(E)
In the exchange structure, a token is a special token, a keyword, a simple data type encoding, or an IETF encoding.
The special token "ISO-10303-21;" shall be used to open an exchange structure, and the special token "END-ISO-10303-21;" shall be used to close an exchange structure.
The special token "HEADER;" shall be used to open the optional header section of an exchange structure, and the special token "ENDSEC;" shall be used to close the header section of an exchange structure.
The special token "ANCHOR;" shall be used to open the optional anchor section of an exchange structure, and the special token "ENDSEC;" shall be used to close the anchor section of an exchange structure.
The special token "REFERENCE;" shall be used to open the optional reference section of an exchange structure, and the special token "ENDSEC;" shall be used to close the reference section of an exchange structure.
The special token "DATA" shall be used to open the optional data sections of an exchange structure, and the special token "ENDSEC;" shall be used to close the data sections of an exchange structure.
The special token "SIGNATURE" shall be used to open the optional signature sections of an exchange structure, and the special token "ENDSEC;" shall be used to close the signature sections of an exchange structure.
The special token dollar sign ("$") is used to represent an object whose value is not provided in the exchange structure.
The special token asterisk ("*") is used to represent an object whose value is not provided in the exchange structure but can be derived from other values according to rules given in the EXPRESS schema (see 12.2.6).
The special tokens semicolon (";"), parentheses ("(", ")"), comma (",") and solidus ("/") are used to punctuate the exchange structure.
Keywords are sequences of graphic characters indicating an entity or a defined type in the exchange structure. Keywords shall consist of capital letters, digits, low lines, and possibly an exclamation mark "!". The exclamation mark shall occur at most once, and only as the first character in a keyword.
Keywords may be schema-defined keywords or user-defined keywords. Keywords that do not begin with the exclamation mark are schema-defined keywords. Keywords that begin with the exclamation mark are user-defined keywords. A schema-defined keyword is the identifier for a named type (an entity data type or a defined type) in the EXPRESS schema governing the exchange structure. The meaning of a user-defined keyword is a matter of agreement between the partners using the exchange structure.
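As an informal illustration (not part of the normative text), the keyword rules above can be sketched in Python. The function name `classify_keyword` and the returned labels are hypothetical. The sketch checks only the character set and the position of the exclamation mark as described in the prose; it does not consult Table 2 or any EXPRESS schema.

```python
import re

# Optional leading "!", then one or more capital letters, digits or low lines.
_KEYWORD = re.compile(r"!?[A-Z0-9_]+")


def classify_keyword(token: str) -> str:
    """Return 'user-defined', 'schema-defined' or 'invalid' for a token."""
    if not _KEYWORD.fullmatch(token):
        return "invalid"
    return "user-defined" if token.startswith("!") else "schema-defined"
```

Because the exclamation mark is only accepted by the pattern in the first position, a token such as `MY!TYPE` is reported as invalid.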
Six simple data type encodings are used in exchange structures: integer, real, string, instance name, enumeration and binary.
An integer shall be encoded as a sequence of one or more digits, as prescribed in Table 2, optionally preceded by a plus sign "+" or a minus sign "-". Integers shall be expressed in base 10. If no sign is associated with the integer, the integer shall be assumed to be positive.
EXAMPLE
Valid integer expressions | Meaning |
---|---|
16 | Positive 16 |
+12 | Positive 12 |
-349 | Negative 349 |
012 | Positive 12 |
00 | Zero |
Invalid integer expressions | Problem |
---|---|
26 54 | Contains spaces |
32.0 | Contains full stop |
+ 12 | Contains space between plus sign and digits |
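As an informal illustration, the integer encoding described above can be checked with a short Python sketch. The name `parse_integer` is hypothetical, and the pattern is derived from the prose rather than from Table 2.

```python
import re

# Optional sign followed by one or more decimal digits, nothing else.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_integer(text: str) -> int:
    """Parse an integer encoding, raising ValueError if it is malformed."""
    if not _INTEGER.fullmatch(text):
        raise ValueError(f"not a valid integer encoding: {text!r}")
    return int(text, 10)  # base 10; leading zeros are accepted
```

With this sketch, `parse_integer("012")` yields 12 and `parse_integer("+ 12")` raises `ValueError`, matching the tables above.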
A real shall be encoded as prescribed in Table 2. The encoding shall consist of a decimal mantissa optionally followed by a decimal exponent. The decimal mantissa consists of an optional plus sign "+" or minus sign "-", followed by a sequence of one or more digits, followed by a full stop ".", followed by a sequence of zero or more digits. A decimal exponent consists of the latin capital letter E optionally followed by a plus sign "+" or minus sign "-", followed by one or more digits.
NOTE 1 No attempt is made to convey the concept of precision in this part of ISO 10303. Where a precise meaning is necessary, the sender and receiver of the exchange structure should agree on one. Where a precise meaning is required as part of the description of an entity data type, this meaning should be included in the entity data type definition in the EXPRESS schema.
NOTE 2 Under certain conditions, transfer of clear text files via electronic mail attachment has been observed to corrupt the full stop in a real value. See A.2.2 for recommendations.
EXAMPLE
Valid real expressions | Meaning |
---|---|
+0.0E0 | 0.0 |
-0.0E-0 | 0.0, same as the example above |
1.5 | 1.5 |
-32.178E+02 | -3217.8 |
0.25E8 | 25 million |
0.E25 | 0. |
2. | 2. |
5.0 | 5.0 |
Invalid real expressions | Problem |
---|---|
1.2E3. | Decimal point not allowed in exponent |
1E05 | Decimal point required in mantissa |
1,000.00 | Comma not allowed |
3.E | Digit(s) required in exponent |
.5 | At least one digit must precede the decimal point |
1 | Decimal point required in mantissa |
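As an informal illustration, the mantissa and exponent rules above translate into a single regular expression. The name `parse_real` is hypothetical, the pattern is derived from the prose rather than from Table 2, and converting to a Python `float` is a choice of this sketch that says nothing about precision.

```python
import re

# sign? digits "." digits* ( "E" sign? digits )?
_REAL = re.compile(r"[+-]?[0-9]+\.[0-9]*(?:E[+-]?[0-9]+)?")


def parse_real(text: str) -> float:
    """Parse a real encoding, raising ValueError if it is malformed."""
    if not _REAL.fullmatch(text):
        raise ValueError(f"not a valid real encoding: {text!r}")
    return float(text)
```

The full stop is mandatory in the pattern, so `1` and `1E05` are rejected, while `2.` and `0.E25` are accepted.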
A string shall be encoded as an apostrophe "'", followed by zero or more characters from the basic alphabet, and ended by an apostrophe "'". The null string (string of length zero) shall be encoded by two consecutive apostrophes "''". Within a string, a single apostrophe shall be encoded as two consecutive apostrophes. Within a string, a single reverse solidus "\" shall be encoded as two reverse solidi "\\".
As specified in 5.2, the octet representation of the characters at code points U+0080 to U+10FFFF is given by UTF-8. These characters may be encoded as hexadecimal digits (see HEX in Table 2) using control directives defined in 6.4.3.3 when compatibility with previous editions of ISO 10303-21 is desired. Characters not in the basic alphabet shall be encoded using the control directives defined in 6.4.3.2, 6.4.3.3 and 6.4.3.4. The WSN of control directives for encoding strings is given in Table 4.
NOTE Under certain conditions, transfer of clear text files via electronic mail attachment has been observed to corrupt a full stop in a string value. See A.2.2 for recommendations.
In ISO/IEC 8859, G(x/y) is the notation for the character in "column" x "row" y, i.e., code value (16 · x) + y, in the code table. Each part of ISO/IEC 8859 is identical to the ISO/IEC 10646 code points U+0000 to U+007F in positions G(00/00) through G(07/15). The various parts of ISO/IEC 8859 differ in the symbols of the extended character set — positions G(10/00) through G(15/14). To include characters from the extended character set in a string requires the use of control directives.
NOTE The control directives described in this section are retained for compatibility with previous editions of ISO 10303-21. It is recommended that all ISO/IEC 8859 characters be converted to corresponding ISO/IEC 10646 values.
The PAGE control directive — reverse solidus latin capital letter S reverse solidus ("\S\") followed by a LATIN_CODEPOINT character (see Table 1) — is used within a string to allow a character in the basic alphabet to represent the character in the corresponding position in the ISO/IEC 8859 extended alphabet. The PAGE control directive shall be interpreted in the string as the single character G((x+8)/y), where G(x/y) is the basic alphabet character following the "\S\". That is, if the basic alphabet character has code value v, it shall be interpreted as the character with code value v + 128.
The control directive reverse solidus latin capital letter P UPPER reverse solidus shall indicate that, for this string only, the subsequent reverse solidus latin capital letter S reverse solidus control directives shall be interpreted as referring to the extended alphabet defined in that part of ISO/IEC 8859 indicated by the value of UPPER. The capital letter referred to shall be one of the following letters: "A", "B", "C", "D", "E", "F", "G", "H", "I". In this context, the latin capital letter A identifies ISO/IEC 8859-1; latin capital letter B identifies ISO/IEC 8859-2, etc. If this control directive does not appear within a string, the value "A" shall be assumed for all PAGE control directives; i.e., the extended alphabet shall be that specified in ISO/IEC 8859-1.
EXAMPLE
String as stored | Effective contents | Comments |
---|---|---|
'CAT' | CAT | |
'Don''t' | Don't | |
'''' | ' | |
'' | string of length zero | |
'\S\Drger' | Ärger | |
'h\S\ttel' | hôtel | |
'\PE\\S\*\S\U\S\b' | Њет | Cyrillic, 'Nyet' |
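As an informal illustration, the following Python sketch resolves doubled apostrophes, doubled reverse solidi and the two ISO/IEC 8859 directives described above. The name `decode_page_string` is hypothetical. The sketch assumes that the character following "\S\" lies in the range U+0020 to U+007E and that Python's `iso8859-N` codecs are an acceptable stand-in for the corresponding parts of ISO/IEC 8859. Any other reverse solidus sequence is rejected rather than interpreted.

```python
PAGE_LETTERS = "ABCDEFGHI"  # "A" = ISO/IEC 8859-1 ... "I" = ISO/IEC 8859-9


def decode_page_string(stored: str) -> str:
    """Resolve '' and doubled reverse solidi plus the S and P directives."""
    if len(stored) < 2 or stored[0] != "'" or stored[-1] != "'":
        raise ValueError("a string is delimited by apostrophes")
    body = stored[1:-1]
    part = 1  # default alphabet: ISO/IEC 8859-1
    out = []
    i = 0
    while i < len(body):
        if body.startswith("''", i):        # encoded apostrophe
            out.append("'")
            i += 2
        elif body.startswith("\\\\", i):    # encoded reverse solidus
            out.append("\\")
            i += 2
        elif body.startswith("\\S\\", i) and i + 3 < len(body):
            code = ord(body[i + 3])
            if not 0x20 <= code <= 0x7E:
                raise ValueError("directive needs a basic alphabet character")
            # code value v is read as v + 128 in the selected 8859 part
            out.append(bytes([code + 128]).decode(f"iso8859-{part}"))
            i += 4
        elif (body.startswith("\\P", i) and i + 3 < len(body)
              and body[i + 2] in PAGE_LETTERS and body[i + 3] == "\\"):
            part = PAGE_LETTERS.index(body[i + 2]) + 1
            i += 4
        elif body[i] in "'\\":
            raise ValueError(f"unsupported or malformed encoding at {i}")
        else:
            out.append(body[i])
            i += 1
    return "".join(out)
```

The selected alphabet is a local variable, which reflects the rule that the P directive applies to the current string only.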
This part of ISO 10303 specifies control directives that allow encoding of ISO/IEC 10646 characters as a sequence of hexadecimal characters. These control directives may be used in place of UTF-8 encoded characters when compatibility with previous editions of the exchange structure encoding is desired.
The control directive reverse solidus latin capital letter X digit two reverse solidus "\X2\" shall be followed by multiples of four hexadecimal characters. Each multiple of four hexadecimal characters shall be interpreted as a 16-bit number giving an integer position within the UCS codespace.
The control directive reverse solidus latin capital letter X digit four reverse solidus "\X4\" shall be followed by multiples of eight hexadecimal characters. Each multiple of eight hexadecimal characters shall be interpreted as a 32-bit number giving an integer position within the UCS codespace.
The control directive reverse solidus latin capital letter X digit zero reverse solidus "\X0\" shall be used to indicate the end of the "\X2\" or "\X4\" hexadecimal character sequence.
NOTE This use of eight hexadecimal characters in the "\X4\" encoding predates the restriction of the UCS codespace to a maximum value of 10FFFF. The first two characters in each eight character group will always be digit zero.
EXAMPLE
String as stored | Code point | Character |
---|---|---|
'\X2\03C0\X0\' | U+03C0 | greek small letter pi (π) |
'\X2\03B103B203B3\X0\' | U+03B1 U+03B2 U+03B3 | greek small letters alpha, beta, gamma (αβγ) |
'\X4\0001F638\X0\' | U+1F638 | grinning cat face with smiling eyes (an emoticon, 😸) |
'\X4\0001F6380001F596\X0\' | U+1F638 U+1F596 | grinning cat face with smiling eyes, raised hand with part between middle and ring fingers (two emoticons, 😸 🖖) |
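As an informal illustration, the "\X2\" and "\X4\" directives can be resolved with a regular expression substitution. The name `decode_hex_runs` is hypothetical. The sketch operates on the text between the delimiting apostrophes, assumes upper-case hexadecimal digits, handles only these two directives, and does not account for a doubled reverse solidus immediately preceding a directive.

```python
import re

# "\X2\" or "\X4\", a run of hexadecimal characters, then the "\X0\" terminator.
_HEX_RUN = re.compile(r"\\X([24])\\([0-9A-F]*)\\X0\\")


def decode_hex_runs(body: str) -> str:
    """Replace each hexadecimal run with the characters it encodes."""
    def expand(match: "re.Match[str]") -> str:
        width = 4 if match.group(1) == "2" else 8
        digits = match.group(2)
        if not digits or len(digits) % width != 0:
            raise ValueError(f"run must be a multiple of {width} hex characters")
        return "".join(
            chr(int(digits[i:i + width], 16))
            for i in range(0, len(digits), width)
        )

    return _HEX_RUN.sub(expand, body)
```

A run whose length is not a multiple of four (for "\X2\") or eight (for "\X4\") raises `ValueError` instead of being decoded partially.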
The control directive reverse solidus latin capital letter X reverse solidus "\X\", followed by two hexadecimal characters, shall be used for UCS code points U+0000 to U+001F and code point U+007F. This control directive may be used in place of UTF-8 encoded code points U+0080 to U+00FF when compatibility with earlier editions of the exchange structure encoding is desired.
NOTE The characters defined by ISO/IEC 10646 and ISO/IEC 8859-1 are identical within this range.
EXAMPLE
String as stored | Effective contents | Comments |
---|---|---|
'see \X\A7 4.1' | see § 4.1 | Contains section sign. |
'line one\X\0Aline two' | line one, then line two on a new line | Contains line feed control character. |
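As an informal illustration, the two-character hexadecimal form shown in the table above can be resolved with a one-line substitution. The name `decode_x_directive` is hypothetical. The sketch operates on the text between the delimiting apostrophes, assumes upper-case hexadecimal digits, and leaves every other directive untouched.

```python
import re

# "\X\" followed by exactly two hexadecimal characters.
_X_DIRECTIVE = re.compile(r"\\X\\([0-9A-F]{2})")


def decode_x_directive(body: str) -> str:
    """Replace each two-digit hexadecimal directive with its character."""
    return _X_DIRECTIVE.sub(lambda m: chr(int(m.group(1), 16)), body)
```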
The maximum length of a string as stored in an exchange structure is 32769 octets, including the beginning and ending apostrophes. If embedded quotation marks, reverse solidi, apostrophes, print control directives (see clause 12) or characters encoded according to 6.4.3.2, 6.4.3.3, or 6.4.3.4 are included in the string as stored, the maximum length of the effective contents of the string will be less than 32767 graphic characters. The effective contents is the sequence of graphic characters after these encoding conventions have been resolved.
An occurrence name shall be a constant instance name, a constant value name, an entity instance name or a value instance name.
NOTE 1 This edition of this part of ISO 10303 allows constant values, constant entities, value instances and entity instances to be named and referenced in an exchange structure. Previous editions only allowed entity instances to be named and referenced (see clause 4.3).
A constant instance name shall be encoded as a number sign, "#", followed by an UPPER character, followed by a sequence of UPPER or DIGIT characters.
Constant instance names are references to entity instances defined in the EXPRESS schema. If there are multiple EXPRESS schemas defined in the file_schema of the exchange structure then the constant instance name shall reference an entity instance defined in the first schema (see clause 8.2.4).
The WSN for constant instance names is given in Table 2 in the CONSTANT_INSTANCE_NAME production.
EXAMPLE
Valid name expressions | Meaning |
---|---|
#FARADAY | Reference to constant named FARADAY in the EXPRESS schema |
#INCH | Reference to constant named INCH in the EXPRESS schema |
Invalid name expressions | Problem |
---|---|
#23 | Name begins with a digit |
#INCHES | INCHES is not defined in the EXPRESS schema |
#PI | PI is defined as a value in the EXPRESS schema |
#Inch | All letters must be normalized to upper case |
Constant instance names may be used in RHS_OCCURRENCE productions only (see Table 2).
A constant value name shall be encoded as an at sign, "@", followed by an UPPER character, followed by a sequence of UPPER or DIGIT characters.
Constant value names are references to values defined in the EXPRESS schema. If there are multiple EXPRESS schemas defined in the file_schema of the exchange structure then the constant value name shall reference a value defined in the first schema (see clause 8.2.4).
The WSN for constant value names is given in Table 2 in the CONSTANT_VALUE_NAME production.
EXAMPLE
Valid name expressions | Meaning |
---|---|
@PI | Reference to the value of PI as defined in the EXPRESS schema |
@E | Reference to the value of E as defined in the EXPRESS schema |
Invalid name expressions | Problem |
---|---|
@23 | Name begins with a digit |
@INCH | INCH is defined as an ENTITY instance in the EXPRESS schema |
@Pie | All letters must be normalized to upper case |
Constant value names may be used in RHS_OCCURRENCE productions only (see Table 2).
An entity instance name shall be encoded as a number sign, "#", followed by a sequence of DIGIT characters. At least one character shall not be "0". Leading zeros are not significant. An entity instance name shall not use the same integer as a value instance name.
NOTE 1 The integer spaces for ENTITY_INSTANCE_NAME and VALUE_INSTANCE_NAME are not permitted to overlap because both types may be referenced using a URI, for example "<abc.stp#123>" (see clause 10.2.7).
NOTE 2 Leading zeros in entity instance names are ignored so "#001" is the same identifier as "#1".
The WSN for entity instance names is given in Table 2 in the ENTITY_INSTANCE_NAME production.
EXAMPLE
Valid name expressions | Meaning |
---|---|
#12 | Names or refers to entity with identifier 12 |
#023 | Names or refers to entity with identifier 23 |
Invalid name expressions | Problem |
---|---|
#Faraday | Contains non-numeric character |
#439A6 | Contains non-numeric character |
#+23 | Contains '+' sign |
#00.1 | Contains decimal point |
74 | Does not begin with a number sign |
Entity instance names are used as references to entity instances. Both forward and backward references are permitted. An entity instance name may be defined in the reference section (see clause 10) or a data section (clause 11). Entity instance names may be used in LHS_OCCURRENCE and RHS_OCCURRENCE productions (see Table 2).
A value instance name shall be encoded as an at sign, "@", followed by a sequence of DIGIT characters. At least one character shall not be "0". Leading zeros are not significant. A value instance name shall not use the same integer as an entity instance name.
NOTE This edition of this part of ISO 10303 allows instance names to be assigned to values so that values can be defined in external files. See annex K for examples.
The WSN for value instance names is given in Table 2 in the VALUE_INSTANCE_NAME production.
EXAMPLE
Valid name expressions | Meaning |
---|---|
@12 | Names or refers to value with identifier 12 |
@023 | Names or refers to value with identifier 23 |
Value instance names are used as references to values. A value instance name shall be defined in the reference section only (see clause 10). Value instance names may be used in LHS_OCCURRENCE and RHS_OCCURRENCE productions (see Table 2).
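As an informal illustration, the four kinds of occurrence name can be told apart lexically. The name `classify_occurrence_name` is hypothetical, and the sketch assumes that UPPER means the letters A to Z and DIGIT means 0 to 9, since Table 2 is not reproduced here. Whether a constant is actually defined in the EXPRESS schema, and whether an entity instance name and a value instance name share an integer, cannot be decided from a single token and is not checked.

```python
import re

# "#" or "@", then either a constant name or a run of digits.
_OCCURRENCE = re.compile(r"([#@])(?:([A-Z][A-Z0-9]*)|([0-9]+))")


def classify_occurrence_name(token: str):
    """Return (kind, key), or None if the token is not an occurrence name."""
    match = _OCCURRENCE.fullmatch(token)
    if match is None:
        return None
    sigil, constant, digits = match.groups()
    if constant is not None:
        kind = "constant instance name" if sigil == "#" else "constant value name"
        return (kind, constant)
    number = int(digits)  # leading zeros are not significant
    if number == 0:       # at least one digit shall not be "0"
        return None
    kind = "entity instance name" if sigil == "#" else "value instance name"
    return (kind, number)
```

Returning the integer as the key makes `#001` and `#1` compare equal, in line with the note on leading zeros.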
An enumeration value shall be encoded as a sequence of latin capital letters or digits, beginning with a latin capital letter, with a full stop "." before and after the sequence. The meaning of a given enumeration value is determined by the EXPRESS schema and its associated definitions from the enumeration type declarations.
NOTE Under certain conditions, transfer of clear text files via electronic mail attachment has been observed to corrupt the full stop at the start or end of an enumeration value. See A.2.2 for recommendations.
EXAMPLE
Valid enumeration expressions | Meaning |
---|---|
.STEEL. | Indicates a value of STEEL |
Invalid enumeration expressions | Problem |
---|---|
.RED | Missing ending full stop |
.123. | Does not start with an alphabetic character. |
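As an informal illustration, the enumeration encoding can be checked as follows. The name `enumeration_value` is hypothetical, and the pattern follows the prose above (latin capital letters and digits only) rather than Table 2.

```python
import re
from typing import Optional

# Full stop, a capital letter, further capital letters or digits, full stop.
_ENUMERATION = re.compile(r"\.([A-Z][A-Z0-9]*)\.")


def enumeration_value(token: str) -> Optional[str]:
    """Return the name between the full stops, or None if malformed."""
    match = _ENUMERATION.fullmatch(token)
    return match.group(1) if match else None
```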
A binary is a sequence of bits (0 or 1). A binary shall be encoded as determined by the following procedure.
NOTE This is a binary to hexadecimal conversion.
EXAMPLE
Binary value | Representation |
---|---|
'null' or 'empty' | "0" |
0 | "30" |
1 | "31" |
111011 | "23B" |
100100101010 | "092A" |
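The individual steps of the procedure are not reproduced in this text. As an informal illustration, the Python sketch below follows a rule that is consistent with every row of the table above: the first hexadecimal digit gives the number of zero bits (0 to 3) added on the left to reach a multiple of four bits, and the remaining digits are the padded bits in hexadecimal. The name `encode_binary` and the use of a string of "0" and "1" characters as input are assumptions of the sketch.

```python
def encode_binary(bits: str) -> str:
    """Encode a string of '0'/'1' characters as a quoted hexadecimal binary."""
    if any(bit not in "01" for bit in bits):
        raise ValueError("a binary consists of the bits 0 and 1 only")
    fill = -len(bits) % 4            # zero bits added on the left: 0, 1, 2 or 3
    padded = "0" * fill + bits
    hex_digits = "".join(
        format(int(padded[i:i + 4], 2), "X")   # one hex digit per four bits
        for i in range(0, len(padded), 4)
    )
    return f'"{fill}{hex_digits}"'
```

For the six-bit value 111011, two fill bits are added, giving 0011 1011, so the encoding is the digit 2 followed by 3B.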
The following encodings are used in the anchor, reference and signature sections.
A resource shall be encoded as a URI preceded by a less-than sign, "<" and followed by a greater-than sign, ">".
The WSN for resources is given in Table 2 in the RESOURCE production.
NOTE 1 In the anchor section the resource is on the right of the equals sign ("=") and the anchor name is on the left (see clause 6.5.4).
EXAMPLE 1
Valid expression in the anchor section | Meaning |
---|---|
<picture> = <a.jpeg>; | Sets anchor "picture" to the resource <a.jpeg> |
<BOM> = <b.xml#123>; | Sets the anchor "BOM" to the resource <b.xml#123> |
NOTE 2 A resource in the reference section must resolve to an entity instance or a value instance. See clause 10 for the resolution process.
EXAMPLE 2
Valid expression in the reference section | Meaning |
---|---|
#10 = <a#b>; | Sets entity instance 10 to the entity identified by the resource <a#b> |
@20 = <c#d>; | Sets value instance 20 to the value identified by the resource <c#d> |
A UNIVERSAL_RESOURCE_IDENTIFIER token of Table 2 shall meet the requirements defined by the IETF (see 3.1.7.1).
EXAMPLE
External Reference | Example Usage |
---|---|
<http://www.giant.com/examples/part.stpnc#first_workpiece> | Reference to a workpiece in a STEP-NC file stored at the given world wide web address |
<building.ifc#first_floor> | Reference to a floor in an IFC building on the current server |
<file:///c:/users/jt_files/assembly.jt.#first_shape> | Reference to a shape in a JT file |
A URI_FRAGMENT_IDENTIFIER token of Table 2 is the name following the number sign, "#", in a Universal Resource Identifier.
EXAMPLE
Universal Resource Identifier | Fragment Identifier | Example Usage |
---|---|---|
<http://www.tool_vendor.com/mill.stp#tool_tip> | tool_tip | Fragment identifier for a point at the tip of a cutting tool |
<#first_floor> | first_floor | Fragment identifier for a floor in the current exchange structure |
<http://www.plumber.com/structure.ifc#3F2504E0-4F89-11D3-9A0C-0305E82C3301> | 3F2504E0-4F89-11D3-9A0C-0305E82C3301 | Fragment identifier defined by a UUID (see annex G) |
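As an informal illustration, a resource token can be split into the part before the number sign and the fragment identifier after it. The name `split_resource` is hypothetical. The sketch only strips the angle brackets and splits at the first "#"; it does not validate the URI against the IETF requirements.

```python
from typing import Optional, Tuple


def split_resource(token: str) -> Tuple[str, Optional[str]]:
    """Split '<uri#fragment>' into (uri without fragment, fragment or None)."""
    if len(token) < 2 or token[0] != "<" or token[-1] != ">":
        raise ValueError("a resource is delimited by '<' and '>'")
    base, separator, fragment = token[1:-1].partition("#")
    return base, (fragment if separator else None)
```

An empty first element, as returned for `<#first_floor>`, corresponds to a fragment in the current exchange structure.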
An anchor name shall be encoded as a URI Fragment identifier preceded by a less-than sign, "<" and followed by a greater-than sign, ">". At least one character in a URI Fragment identifier that references an anchor name shall not be a digit.
NOTE 1 URI Fragment identifiers defined as digits are assumed to be references to occurrence names in exchange structures defined by previous editions of ISO 10303-21. See 10.2.7.
An anchor name that meets the requirements of annex G is a Universally Unique IDentifier (UUID).
NOTE 2 Anchors defined by a UUID can be found without a URI because they are universally unique. See 10.2.2.
The WSN for anchor names is given in Table 2 in the ANCHOR_NAME production. Anchor names are used to define identifiers that can be externally referenced (see clause 9).
EXAMPLE
Valid expression in the anchor section | Meaning |
---|---|
<a> = 3.142; | Sets anchor "a" to 3.142 |
<b> = @10; | Sets anchor "b" to value @10 |
<c> = #20; | Sets anchor "c" to entity #20 |
<ad3f1724-19cf-4d19-94ef-eed90b7b4dde> = 2.71828; | Sets anchor with the UUID "ad3f1724-19cf-4d19-94ef-eed90b7b4dde" to 2.71828 |
<2f0cb220-355d-11e5-a2cb-0800200c9a66> = @30; | Sets anchor with the UUID "2f0cb220-355d-11e5-a2cb-0800200c9a66" to value @30 |
<3f553e90-355d-11e5-a2cb-0800200c9a66> = #40; | Sets anchor with the UUID "3f553e90-355d-11e5-a2cb-0800200c9a66" to entity #40 |
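As an informal illustration, the two rules stated above for anchor names (angle bracket delimiters, and at least one character that is not a digit) can be checked as follows. The name `describe_anchor_name` and its return labels are hypothetical. Annex G is not reproduced here, so the sketch assumes that a UUID is written in the common 8-4-4-4-12 hexadecimal layout used in the examples; it does not check which characters a URI fragment identifier may contain.

```python
import re

# Assumed UUID layout: 8-4-4-4-12 hexadecimal characters.
_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def describe_anchor_name(token: str) -> str:
    """Return 'uuid', 'plain' or 'invalid' for an anchor name token."""
    if len(token) < 3 or token[0] != "<" or token[-1] != ">":
        return "invalid"
    name = token[1:-1]
    if all(ch in "0123456789" for ch in name):
        return "invalid"  # all-digit fragments are read as occurrence names
    return "uuid" if _UUID.fullmatch(name) else "plain"
```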
A tag name shall be encoded as a sequence of UPPER, LOWER and DIGIT characters. The first character shall be an UPPER or LOWER character.
The WSN for tag name is given in Table 2 in the TAG_NAME production. Tag names associate additional information with anchors. This information is not part of the information model (see 9.2.8).
NOTE Tag names are allowed in this edition of this part of ISO 10303 so that programmers can create data structures to optimize traversals when an information model is distributed across many exchange structures linked by anchors and references.
EXAMPLE
Valid expression in the anchor section | Meaning |
---|---|
<plate_edge> = #20 {preparation:<WELD_DC.XML>} | Associates edge at #20 with file WELD_DC.XML using the tag name "preparation" |
EXAMPLE
Base64 encoding of a message digest |
---|
873b48e9dd16ec9c7a8423faba7e75a7a9d19ea07abce2808d94b3176ee8bd60 |
© ISO 2016 — All rights reserved