cache Scratch pad computer memory reserved for a specific function. See font cache for a specific application of this practice.
calligraphy The art of hand lettering or handwriting. In typography, calligraphy strictly denotes a style reminiscent of being written with a brush or broad pen and producing unconnected letterforms.
Typographically speaking, the difference between calligraphy and script is that script has letterforms which look connected or with a suggestion that they were meant to be, whereas calligraphy has unconnected letterforms or letterforms which look like they were intended to be unconnected.
Some typeface designs are calligraphic in appearance. These include: American Uncial, Banco, Dom, Lydian, Ondine, and Reporter No. 2.
candrabindu A Devanagari sign appearing as a 'period' or 'dot' slightly above a 'breve'.
canonical ordering A property of characters determining the priority of order given to a character relative to other characters when a string of text data is generated. This property comes into effect if a composite character is decomposed into its component parts; if a ligature, digraph, trigraph, or n-graph character is split, or in some scripts where vowel characters are used in spelling but do not appear in the spelling order.
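As a minimal sketch of how this property behaves in practice, the following uses Python's standard unicodedata module (an illustration chosen here, not something the glossary prescribes). Each character carries a numeric combining class, and Unicode decomposition (NFD) reorders adjacent combining marks so that lower classes come first. The sample string and the helper name combining_classes are assumptions made for the example.

```python
import unicodedata

def combining_classes(text):
    """Return the canonical combining class of each character in text."""
    return [unicodedata.combining(ch) for ch in text]

# 'a' followed by COMBINING ACUTE ACCENT (class 230, above)
# and COMBINING DOT BELOW (class 220, below), in "typed" order.
typed = "a\u0301\u0323"

# Canonical decomposition sorts the marks by combining class.
ordered = unicodedata.normalize("NFD", typed)

print(combining_classes(typed))    # classes in typed order
print(combining_classes(ordered))  # classes after canonical ordering
```

Base characters have class 0 and are never moved; only runs of marks with non-zero classes are reordered.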
cantillation marks Diacritical marks added in Hebrew text to specify cadence and pronunciation of Hebrew. They are most often used in the Arts or for teaching Hebrew to new students.
cap height or Capheight The standard height of capitals in a typeface, usually measurable from the height of E or H.
capline or cap line or capital line An imaginary line at cap height parallel to the baseline.
caret or caret mark Literally, a character glyph ( ^ ) used as an editing notation to indicate a missing word or portion of text. More generally, a term used to indicate a text position marker. A 'caret' character is not placed in Unicode. Actual characters which could be used for a 'caret' are <005e> ( 'asciicircum' ), <02c4> ( 'arrowheadupmod' ), or <2227> ( 'logicaland' ).
caret point A position between two characters in text where a cursor is permitted to be placed. A cursor may not be permitted, for various reasons, to be placed at some positions between characters.
case A property of certain alphabets and scripts whereby each letterform has more than one variant form relating to height; and a property of characters, letterforms and character glyphs. The only alphabets and scripts which include character glyphs having more than one case classification are: Armenian, Cyrillic, Georgian, Greek, and Roman. Alphabets and scripts having character glyphs with only one case classification include: Arabic, bopomofo (Chinese), Hebrew, hiragana (Japanese), Indic, IPA phonetic alphabet, katakana (Japanese), Lao, Thai, and Tibetan.
There are five classifications of case: UPPERCASE, Titlecase, lowercase, mixed, and caseless. UPPERCASE and lowercase have commonly understood meanings, but Titlecase is uncommon and arises with digraphs, trigraphs, or n-graphs as follows. Considering a digraph 'DZ' (which occurs in Eastern European languages), the UPPERCASE would be 'DZ', the lowercase would be 'dz', and the Titlecase, used where the first letterform only is to be capitalized, would be 'Dz'. Since a digraph, trigraph, or n-graph is managed as a single character, all three forms are distinct, and should be represented in a font resource. The mixed classification applies to alphabets which are comprised of a mixture of UPPERCASE and lowercase letterforms, each letter only having one case form rather than two. An example of this is a Single Alphabet Uncial typeface. The caseless classification applies to letterforms which have only a single case, but which are incorporated in alphabets or scripts which have the property of case. Examples would be letterforms comprising the IPA phonetic alphabet or certain letterforms used in African and Turkish variants of the Roman alphabet which only have a single case classification form.
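The three forms of the 'DZ' digraph can be observed directly in Unicode, where they are encoded as single characters at <01f1>, <01f2>, and <01f3>. The sketch below uses Python's built-in string methods and the standard unicodedata module; the constant names and the helper case_forms are assumptions made for illustration.

```python
import unicodedata

DZ_UPPER = "\u01f1"  # LATIN CAPITAL LETTER DZ
DZ_TITLE = "\u01f2"  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
DZ_LOWER = "\u01f3"  # LATIN SMALL LETTER DZ

def case_forms(ch):
    """Return the UPPERCASE, Titlecase, and lowercase forms of one character."""
    return {"upper": ch.upper(), "title": ch.title(), "lower": ch.lower()}

# General categories: Lu = uppercase, Lt = titlecase, Ll = lowercase.
for ch in (DZ_UPPER, DZ_TITLE, DZ_LOWER):
    print(unicodedata.name(ch), unicodedata.category(ch), case_forms(ch))
```

Each of the three characters maps to the same three forms, which is why all of them need to be present in a font resource that supports the digraph.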
Character glyphs which are usually considered caseless include such symbols as mathematical operators, monetary units, numerals, and punctuation. [These symbols, however, are sometimes designed with UPPERCASE and lowercase ( « small cap » ) styles so as to appear more size consistent in text using UPPERCASE and lowercase character glyphs; but they are still functionally caseless.]
Text strings which are considered case sensitive undergo character changes when the case is changed. A character change could be one-to-one, one-to-many, or many-to-one.
Numerals normally are caseless. However, in today's modern electronic information age, numerals are often mixed with alphabet letterforms. Since software often treats a text string or text character depending on its case, numerals sometimes need to be assigned a case class depending on the circumstances. For example, in the case of text used in certain computer language instructions, numerals might be considered caseless; whereas for text in a database which may be searched, numerals might be considered as UPPERCASE.
case conversion See case folding.
case folding The process of combining characters of more than one case in a text string into characters of only one case. The most common example of case folding (but by no means the only example) is the substitution of a lowercase character for an UPPERCASE/Titlecase character in a text string, or the reverse.
Examples of the use of case folding include (but are not limited to):
(1) Search processes
If the search text string has UPPERCASE only or both UPPERCASE and lowercase characters, a lowercase character is substituted for each UPPERCASE character in the string before the string is submitted for search. This is termed case conversion, and is the most basic strategy for implementing case folding.
(2) Electronic publishing document layout
A heading text string, typically set with UPPERCASE beginning word characters and lowercase for the remaining characters, may be reset as all UPPERCASE (where UPPERCASE characters are substituted for all lowercase characters) or as « all caps » (where « small cap » characters replace all lowercase characters).
(3) Internationalized Internet domain naming
A domain name string is to be represented only in a single case, achievable through case folding. In 2000, the protocol or model to achieve this has yet to be determined by the international standards organizations involved.
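The search use of case folding in item (1) can be sketched as follows. This is a minimal illustration using Python's built-in str.casefold(), which applies Unicode full case folding; the function name find_caseless and the sample strings are assumptions made for the example, not part of any product described here.

```python
def find_caseless(needle, haystack):
    """Return the index of needle in haystack, ignoring case, or -1.

    Both strings are case-folded before comparison. The index refers to
    the folded haystack, which can differ from the original when folding
    changes the length of the text (for example, sharp s becomes 'ss').
    """
    return haystack.casefold().find(needle.casefold())

# Simple case conversion would be enough for this search...
print(find_caseless("CALLIGRAPHY", "The art of Calligraphy"))

# ...but not for this one: lower() leaves the sharp s untouched,
# while casefold() expands it to 'ss' so the match succeeds.
print("Hauptstra\u00dfe".lower().find("strasse"))
print(find_caseless("strasse", "Hauptstra\u00dfe"))
```

Folding both the search string and the searched text keeps the comparison symmetric, which plain lowercase substitution does not guarantee.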
The process of substituting a lowercase character for an UPPERCASE/Titlecase character may not be straightforward for the following reasons:
(1) An UPPERCASE character may not have a lowercase equivalent in a particular encoding. [An example of this is the old Greenlandic character 'kra' (at Unicode <0138>)].
(2) Different languages which share the same script may not have the same UPPERCASE/lowercase equivalent pairs. [An example of this is French, which, by convention, uses either E/é or É/é, but English would always use É/é in foreign-derived words.]
(3) Some languages might use different pairs. [An example of this is Turkish, which uses 'Idot'/i and I/'dotlessi' as pairs, but English uses I/i as a pair.]
(4) Some script characters may not have a lowercase form or they may be caseless.
The process of substituting an UPPERCASE character for a lowercase character may not be straightforward for the following reasons:
(1) An UPPERCASE character may not exist in a particular character set or encoding. [An example is 'Ydieresis'/ÿ, where ÿ exists in the ISO 8859-1 character set, but 'Ydieresis' does not exist.]
(2) An UPPERCASE ligature character equivalent to a lowercase ligature character may not exist, and therefore a one-to-one character substitution cannot be made. [Examples of this type are fi or ß. These must be replaced by UPPERCASE character pairs FI and SS, respectively.]
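Several of the difficulties listed above can be reproduced with Python's built-in, locale-independent case methods. The sketch below is illustrative only; the helper name encodable is an assumption, and "latin-1" is Python's codec name for ISO 8859-1.

```python
def encodable(text, encoding):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# One-to-many: a single lowercase character becomes an UPPERCASE pair.
print("\u00df".upper())   # sharp s
print("\ufb01".upper())   # fi ligature

# The UPPERCASE form may fall outside the character set in use:
# y-dieresis is in ISO 8859-1, its UPPERCASE form (U+0178) is not.
y_upper = "\u00ff".upper()
print(encodable("\u00ff", "latin-1"), encodable(y_upper, "latin-1"))

# 'kra' (U+0138) has no case partner, so conversion leaves it unchanged.
print("\u0138".upper() == "\u0138")

# Language-specific pairs: the default rules pair I with i, and turn
# dotted capital I (U+0130) into 'i' plus a combining dot above.
# Turkish I/dotless-i pairing would need language-specific tailoring.
print("I".lower(), [hex(ord(c)) for c in "\u0130".lower()])
```

Because these methods ignore language, an application that must honor Turkish or similar conventions needs additional rules on top of them.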
case fraction A fraction designed as a single character.
caseless One of the classifications of case.
case mapping The classification of characters in a character set or encoding by case.
case rules A set of rules applied to letterform pairs to determine whether or not they will be kerned based on the cases (upper or lower) of both letterforms.
castoff or cast off A calculation to determine how much space typeset copy will take up.
catchline A temporary headline placed near the top of a galley for identification purposes.
categories See domain name categories.
CCD An abbreviation for Charge-Coupled Device. A light-sensitive semiconductor chip which detects light and produces binary data in response. Arrays of CCDs are used in flatbed scanners, digital still cameras, and digital video cameras to produce monochrome or color images. When used to produce color images, either three sets of CCD arrays (for red, green, and blue) with dichroic mirrors are required (for cameras), or a single set of CCDs with movable color filters and multiple passes is required (for scanners). CCDs are relatively cheap and rugged. The alternative is to use photomultiplier tube ( « PMT » ) technology (PMTs being analog devices), and convert the data to digital form. Although photomultiplier tubes result in a slightly better image representation, the tradeoffs of increased expense, shorter lifetime, and a more mechanically-delicate nature are less attractive in a profit-driven society.
CCITT An abbreviation for International Telegraph and Telephone Consultative Committee. A standards group which covers data transmission and communication standards using telegraph and telephone technologies.
CDE An abbreviation for Common Desktop Environment, a graphical user interface implemented on a Motif / X Window System installation. CDE is an extension of capability and features installable over Motif.
CDF An abbreviation for Channel Definition Format.
.CDF file A file format containing Channel Definition Format data.
CDRA or CDRA mappings Character Data Representation Architecture. A set of IBM character encodings and codepages consistent with EBCDIC.
CDSL An abbreviation for Consumer DSL. Supports 56 kbps V.90 modem access when ADSL is not available.
CE An abbreviation for « Central European. » With respect to font technology, this usually refers to a character set (such as ISO 8859-2 or Windows CP1250) which supports Latin characters required for some of the Central European languages. This designation does not apply to Production First Software Language Group character sets, because those support languages based on the actual characters required without any geographic proximity considerations.
cell text See terminal font.
cellular computing or cellular computer A computer system characterized by:
CERN An acronym for Conseil Europeen pour la Recherche Nucleaire [European Council for Nuclear Research].
CFF An abbreviation for Compact Font Format.
.CFG file A configuration file which is used to control the installation of a system resource or utility under Microsoft Windows.
CGI An abbreviation for Common Gateway Interface, a capability of invoking scripting computer languages to write control programs for Web servers and Web sites. CGI scripts must be written specifically for the server software being used. Therefore, they are not usually portable and cannot be used for a portable Web site.
An advantage of using CGI scripts is that the actual script commands are inaccessible to casual viewing or copying, since they are not downloaded in order to be executed. This advantage, however, is only due to one of the many major failures of the W3C committee in building needed capabilities (for file security, in this case) into HTML.
challenged user See print legibility for the visually-impaired, Unified Web Site Accessibility Guidelines, and Web accessibility guidelines. See also intelligent agents.
Chameleon A font data lossy compression and representation technology which reduces the amount of data (and, therefore, the storage requirement) of a font file.
It was originally developed by Ares Software Corp. to enable an end-user to generate variations on the design of a font, and then enable more fonts to be used with a given amount of printer memory. Adobe Systems later purchased Ares Software, and has built Chameleon into PostScript Level 3. There, it is used to compress font data for the standard core fonts usually supplied with a PostScript interpreter. Adobe previously developed a technology (MultipleMaster format) which enables an end-user to alter or change certain design aspects of a font. Several foundries, including Production First Software, produce fonts in the MultipleMaster format.
The disadvantage of Chameleon technology is that it is « lossy » and therefore cannot render outlines with arbitrarily high accuracy, introducing artifacts. These artifacts are in addition to artifacts generated by the digitizing error inherent in output devices.
chandrabindu An alternate spelling of 'candrabindu', a Devanagari sign.
Chang Jei An input method for Chinese using letters A through Y, each representing a radical, with X used to construct more complex radicals.
channel A technology by which data is transmitted to a target client or user periodically ( « netcasting » ) over the Internet or an intranet. It is an implementation of push technology.
Channel Definition Format Microsoft-architected implementation of channel technology using XML. It was incorporated by the W3C into channel standards for the Web.
character [typical] The abstract term used for any type of mark or design on a page, such as letters of an alphabet, numerals, punctuation marks, diacritical marks, filled geometric shapes, or other meaningful designs. Some of these designs can be extremely complex, composed of hundreds of curved or straight line segments and filled geometric shapes.
Formally, the object correctly designated by the term « character » is independent of style and is a designated number, which then is used as a pointer or index to the specific object (a set of software instructions or data, a cast metal impression block, etc.) which is the actual agent for making visible marks. The term is often applied colloquially and erroneously to the visible shape (or glyph) which is the result of the display of a character by use of an agent, such as a font or impression block.
Characters have various properties such as: canonical ordering; case; combining class; directionality; letter class; mathematical; mirroring; name; numeric value class; spacing; special; and width. The name property is used internal to fonts resources and other software in the process of rendering text. The numeric value is the way characters are specified in
character behavior Applied to an encoding standard such as Unicode, sets of rules which define how text characters should be manipulated or should behave. This includes, but is not limited to: bidirectional text behavior; decomposition rules for precomposed or composite characters; handling of missing characters, non-spacing characters, surrogate characters; and script-dependent text-handling rules such as for line-breaking, conjoin order, interlinear annotation, or other similar topics.
character bounding box A bounding box of an individual character glyph. The coördinates of character bounding boxes are listed in AFM files.
character collection A standardized collection of characters which serves a specific purpose. Loosely speaking, a character collection is the same as a character set, although the latter term more specifically refers to implementation on a computer.
character fit See letterfit.
character/glyph model A concept which emphasizes the principle that an encoding is a link between a logical text string (information introduced by a keyboard or input method editor) and a glyph library (font resource); that this character-to-glyph mapping can be one-to-one, one-to-many, or many-to-one; and that a mapping is flexible and arbitrary.
An example of a one-to-one mapping would be the text character [65]decimal or <0041> which would map to a glyph of the Latin letter A.
An example of a one-to-many mapping would be the precomposed text character [225]decimal or <00e1> mapping to glyphs a and ' so as to render as á. This could also be handled by using two text characters sequentially (mapping to a and ' in a one-to-one); but there are typographic problems with this approach.
An example of a many-to-one mapping would be two text characters [102]decimal or <0066> (which maps to the glyph f alone), and [108]decimal or <006c> (which maps to the glyph l alone), mapping to the ligature glyph fl.
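The three kinds of mapping in the examples above can be sketched as lookup tables. This is only a toy model of the idea, not how any particular font format or rendering engine works; the glyph names ('A', 'a', 'acute', 'f', 'l', 'fl'), the '.notdef' fallback, and the function name map_to_glyphs are all assumptions made for illustration.

```python
# Hypothetical glyph names; a real font resource defines its own.
ONE_TO_ONE = {"\u0041": ["A"], "\u0066": ["f"], "\u006c": ["l"]}
ONE_TO_MANY = {"\u00e1": ["a", "acute"]}        # precomposed a-acute
MANY_TO_ONE = {("\u0066", "\u006c"): ["fl"]}    # f + l ligature

def map_to_glyphs(text):
    """Map a logical text string to a list of glyph names."""
    glyphs = []
    i = 0
    while i < len(text):
        pair = tuple(text[i:i + 2])
        if pair in MANY_TO_ONE:            # many-to-one takes priority
            glyphs.extend(MANY_TO_ONE[pair])
            i += 2
            continue
        ch = text[i]
        if ch in ONE_TO_MANY:
            glyphs.extend(ONE_TO_MANY[ch])
        elif ch in ONE_TO_ONE:
            glyphs.extend(ONE_TO_ONE[ch])
        else:
            glyphs.append(".notdef")       # no glyph available
        i += 1
    return glyphs

print(map_to_glyphs("Afl\u00e1"))
```

Checking the many-to-one table first is a design choice of this sketch: it ensures the ligature is preferred whenever both of its component characters appear in sequence.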
If the example above for the one-to-many mapping were recast to illustrate a mapping which would render í, then an additional complication would be revealed: the character [105]decimal or <0069> cannot be used to map to the i glyph as the base, because the base ' i ' glyph must not have a dot over it. Therefore, the glyph actually used must be 'dotlessi', or an ' i ' without a dot. On the other hand, a composite character using a diacritical mark below the baseline still requires that the base character glyph i have the dot over it. Since both 'i' and 'dotlessi' are present in Unicode, this is not a distinction which must be made based on a base glyph variant, but rather a different base character. This requires an additional mapping, although this case can still be represented by algorithmic mapping rules. In the case of the one-to-many mapping of ' j ' with ' circumflex ', the situation gets even more complicated because a 'dotlessj' is required as the base glyph, and this is not encoded in Unicode.
Sometimes for a composite character, the base character glyph also changes depending on the shape of the diacritical mark. This is a discretionary typeface style-dependent trait, thereby not able to be recognized by an algorithmic mapping rule, because base glyph variants must be used. An example of this can be found in the Production First Software typeface LafayettePF. But this must be handled by the rendering engine (or, hypothetically, by an extremely smart font resource) or by some other means than by using the mechanisms identified by the character/glyph model.
One of the fundamental principles upon which the character/glyph model is based is that an encoding should be used to represent characters spelled in text and used as lexical elements, with no notion of mapping glyph variants, ligatures, or uncommon dingbats. This principle, however, has not been strictly adhered to, and, in fact, has been broken often in many encodings (even in the newer encodings such as Unicode), because of the reality that application software which can perform glyph mapping management has not been widely available. An alternative to the character/glyph model is the direct encoding model.
The principles of the character/glyph model were first enabled with the development of PostScript, and its font machinery. PostScript fonts use the character/glyph model with the notion of encoding and the capability of reëncoding. Applications which can perform automatic ligature substitution would also make use of this model.
character properties Character properties describe the attributes and distinguishing characteristics of characters. The following character properties are described here: canonical ordering; case; combining class; directionality; letter class; mathematical; mirroring; name; numeric value class; spacing; special; and width.
character set A specification of a collection of characters used as a reference for a font or other software.
(another definition follows)
character set (typical) The entire collection of different characters that an application, font, or keyboard can represent or produce. The extent of a character set which can be handled is usually smallest with an application and largest with a keyboard (when the keyboard is used along with an input method editor). Theoretically, the size of the character set which can be handled by a keyboard driving an input method editor is infinite. A character set is often larger than an encoding, but the reverse can never be true without a mechanism to deal with characters the encoding cannot find in the character set. Although a character set may consist of the same characters as an encoding, the basic distinction is that a character set does not have any notion of ordering, whereas the purpose of an encoding is to impart an ordering to a character set as part of a character glyph retrieval mechanism.
character set standards For some standards, this is really a misnomer, as the term « encoding standard » is more appropriate since it is an encoding standard which has been adopted as the composition of a character set. Intended encoding standards are indicated with ' * '. Variations of a character set standard used only in certain proprietary platform versions (such as systems sold by Apple, IBM, NEC and others) are not included, unless they are used cross-platform.
character set supplement An addendum or addition of extra characters to some reference character set. This is usually accomplished by a descendant font, a supplementary font, or additional data blocks or tables, depending on the font format.
characters per inch See pitch.
characters per pica Used to compare the relative width of text set by different typefaces. There are different algorithms used to derive cpp, but they are all intended to determine the average number of 12 point characters required to fill a 12 point space for typical text.
character spoof An alternative legal character representation, most commonly for composite (precomposed) characters. Example: Á is represented by the (Unicode) character <00C1> in composite form. It can also be represented, under the rules of Unicode, by a decomposed form: <0041><0301> . In some software systems, both the composite and the decomposed forms are equivalent and produce the same rendered form for visual presentation. Under those software systems, if the original composing software was looking for the composite form, but is presented with the decomposed form, the decomposed form may also work.
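The equivalence of the composite and decomposed forms in the example above can be checked with Unicode normalization. The sketch below uses Python's standard unicodedata module; the function name equivalent is an assumption, and comparing in NFC form is one reasonable choice among several.

```python
import unicodedata

composite = "\u00c1"         # A-acute as one precomposed character
decomposed = "\u0041\u0301"  # A followed by COMBINING ACUTE ACCENT

def equivalent(a, b):
    """Compare two strings after normalizing both to composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# A plain comparison sees two different character sequences...
print(composite == decomposed)

# ...while a normalized comparison treats them as the same text.
print(equivalent(composite, decomposed))
```

Software that compares strings without normalizing first treats the two forms as different text even though they render identically.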
character subset or character subsetting A part of the collection of different characters that an application, font, or keyboard can represent or produce.
character width or character glyph width The amount of horizontal distance on a page or in a line of text that a particular character glyph takes up, before any adjustments are made. The character width is located by an origin (at the lower left for left-to-right writing systems, or at the lower right for most right-to-left writing systems). The character width advances the current point or cursor when laying out text, so that the origin of the next character width is located at the end of the previous one. Most character glyph widths include some spacing on the left and right sides of a character glyph design (called side bearings).
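The way character widths advance the current point can be sketched in a few lines for left-to-right text. The width values below are made-up numbers in arbitrary font units, and the function name glyph_origins is an assumption; no adjustments such as kerning are applied.

```python
def glyph_origins(advance_widths, start=0):
    """Return the origin of each glyph and the final cursor position.

    Each glyph is placed at the current point, which then advances by
    that glyph's character width (left-to-right layout assumed).
    """
    origins = []
    x = start
    for width in advance_widths:
        origins.append(x)
        x += width
    return origins, x

# Hypothetical widths for three glyphs.
origins, end = glyph_origins([600, 250, 500])
print(origins, end)
```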

(another definition follows)
character width A property of characters related to the representation of CJKV script glyphs in fixed-width multibyte encoding schemes. Since ideographic glyphs are nearly square but other script letterform glyphs are mostly not square, and since most multibyte encodings encode both; two designs for, say, English, are sometimes encoded: a wide square design, and a narrow design (usually half the width). The character width property indicates whether a character assigned to a specific codepoint represents a narrow or wide glyph design. Originally, multibyte encoding schemes were variable width, with narrow characters representing non-CJKV ranges having 1-byte code points and wide characters representing CJKV ranges having 2-byte code points. Newer encodings (like Unicode) are fixed 2-byte or 4-byte width.
CharMapML An abbreviation of Character Map Markup Language, which is designed to represent character encoded character mapping tables.
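The narrow/wide property described under character width is exposed in Unicode as the East Asian Width property, which can be inspected with Python's standard unicodedata module. The sketch below is illustrative; the helper name is_wide and the choice of sample characters are assumptions. The Shift JIS codec is used only to show the older variable-width byte lengths.

```python
import unicodedata

def is_wide(ch):
    """Return True for characters classed as Wide (W) or Fullwidth (F)."""
    return unicodedata.east_asian_width(ch) in ("W", "F")

samples = {
    "narrow Latin A": "\u0041",
    "fullwidth Latin A": "\uff21",
    "ideograph": "\u6f22",
    "halfwidth katakana A": "\uff71",
}

for label, ch in samples.items():
    print(label, unicodedata.east_asian_width(ch), is_wide(ch))

# In a variable-width multibyte encoding such as Shift JIS, the narrow
# Latin letter takes one byte and the ideograph takes two.
print(len("\u0041".encode("shift_jis")), len("\u6f22".encode("shift_jis")))
```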
charset A contraction of the words « character set. » The term has been used, unfortunately, to indicate both character set and encoding. The confusion is due to the fact that an encoding essentially defines a « character subset, » because an encoding chooses certain, specific characters from a character set, and assigns them to specific code points. The character set is often much larger in terms of the number of characters than the number of characters assigned from it in the encoding.
chase In hot type, a metal frame used to lock together metal type bodies, leading, and other components to make up a plate for printing.
choke The slight recession of the inner or outer edges of a solid or halftoned shape in a process color image, so that it will overlap slightly with a shape in another process color image. The slight overlap eliminates unwanted gaps which may occur due to imperfect registration of the printed process color images. The process of applying a choke may be done in conjunction with applying a spread, which is the opposite. These processes, called « trapping, » can be done automatically by trapping software or manually by an operator or designer during page setup, page design, or drum scanning. Sometimes, automatic trapping is unsatisfactory, and trapping must be done manually.
cHTML An abbreviation of Compact HTML. A subset of HTML used in the wireless communication arena.
Chu-Han Vietnamese for « Han character. »
Chu Nom or Chu Nôm A Vietnamese script comprised of Chu-Han Vietnamese ideographs.
cicero A unit of measurement: 1 cicero = 12 didot points.
CID-keyed font A PostScript font format (either base or composite) developed by Adobe Systems which references glyph descriptions directly by number from a random-access type data block. This scheme allows a character to be rasterized or displayed much faster for fonts having character sets with thousands of characters and where these characters are usually accessed in almost a random fashion. This was developed for Chinese, Japanese, and Korean ( « CJK » ) character collections.
The CJK character collections can be viewed as a « dense » array with « sparse » access (because few consecutively-used ideographs are located adjacent in the array). Conversely, accessing a large character set (like Unicode) for non-CJK scripts (languages using the Arabic, Armenian, Cyrillic, Georgian, Greek, Hebrew, Indic, Roman, Thai/Lao and other alphabets) is repeatedly using a relatively small collection of characters which are located not too far apart (« dense » access). Dense access of a dense array suffers a much smaller performance penalty for fonts that are not CID-keyed than sparse access of a dense array on most computer systems. This is due to the nature of buffered disk and memory operation.
CID-keyed fonts dispense with the notion of a character name, since all glyphs are referred to by the CharacterID number. This has advantages and disadvantages. The advantage is that politically-incorrect and language-script dependent naming issues are avoided. The disadvantages include the lack of a positively identifiable human-readable name for each glyph and the lack of consistency with previous PostScript formats, which used character names like 'ampersand'. Creating new CID-keyed fonts using « borrowed » characters from other CID-keyed fonts is also much more complicated. Finally, calling characters from a font in PostScript by character name only (overriding a font encoding) is impossible, unless an extension to PostScript is made in the future.
CIE color space See color space.
cipher A monogram whose component letterforms are intricately interwoven.
(another definition follows)
cipher A writing which has been subjected to the process of encryption so that it is no longer human readable. Sometimes the decryption key is also called a cipher. Text which is encrypted is termed « ciphertext ».
ciphertext Text which has been encrypted. See cipher and encryption.
CIP3 Acronym for International Cooperation for the Integration of Prepress, Press, and Postpress. A consortium of over 30 graphic arts software and hardware producers which has developed an embedded data file format called PPF (Print Production Format), for use in data file formats such as PDF and PostScript, to generate automated, optimized workflow.
CISC An acronym for Complex Instruction Set Computer. The computer microprocessor architecture developed by Zilog and used in all the little-endian-based chips from the Zilog Z-80 to the Intel Pentiums. Compare with RISC and EPIC, which are newer and faster technologies.
citizen A term applied to a systems agent or resource (such as a font resource) when it comes to describing its behavior or impact on the operating system. Good behavior or no adverse impact on the system is termed a « good citizen; » while the opposite traits would be termed a « bad citizen. »
CJK or CJKV An abbreviation for Chinese, Japanese, Korean, and Vietnamese.
classification See typeface classification.
CLAUI Acronym for Cultural and Linguistic Adaptability and User Interfaces Coordination Committee.
ClearType A Microsoft driver and rasterizer technology, based on subpixel anti-aliasing, which enables fonts to render more clearly on some LCD and TFT screens. This is achieved by adjusting the individual red, green, and blue color portion of the pixels around the interface of a glyph outline and the background. In order for this to be done optimally, additional data tables must be built into the (TrueType or OpenType) font. ClearType can achieve a little over twice the apparent resolution of on-screen text. It should be mentioned that as digital displays increase in resolution, as they inevitably will, the advantage of ClearType will disappear and ClearType will eventually become unnecessary. The graphic below depicts text set three ways: bitmap without anti-aliasing, bitmap with standard (whole pixel) anti-aliasing, and bitmap with ClearType (subpixel) anti-aliasing.

ClearType technology makes no improvement on cholesteric LCDs, because the color elements are stacked. However, cholesteric LCDs can be made physically much smaller, which allows much higher resolution displays exceeding 200 dpi to be manufactured. [200 dpi is approximately twice the average resolution of computer color display devices of the late 1990s.]
clone A functionally identical but structurally different « duplicate » of a hardware or software product. Many clones (both hardware and software) turn out not to be functionally identical under certain, sometimes hard to identify, circumstances. Clones are usually less costly than the original product.
close up An editing mark used to indicate a reduction of white space between words or the spacing between text characters.
CMAP An ASCII text resource file used with PostScript CID-keyed fonts which encodes a CID-keyed font as a new font resource instance to the PostScript Level 2 or greater interpreter.
Any number of CMAPs can be used for a given font. However, a CMAP must be composed based on the specific character content of a CID-keyed font. If a number of CID-keyed fonts have the same character content, a given CMAP can be used with all of them.
CMOS An acronym for Complementary Metal Oxide Semiconductor. A type of semiconductor chip material.
cmap table A data table in a TrueType or OpenType font which serves as an encoding between characters and glyphs.
CML An abbreviation for Chemical Markup Language, a markup language application of XML for the exchange of information on the molecular descriptions of chemical substances.
CMYK An abbreviation for Cyan, Magenta, Yellow, and blacK, the usual process colors.
CMYK color space See color space.
codec An abbreviation for a compression/decompression software utility or operating system resource. Codecs are used by many software applications from word processor and page layout applications to paint and Web browser applications. Codecs are used to generate compressed audio, image, or video data files; and then used to play or display them.
coded character set Another name for codepage.
codepage A collection of selected characters, arranged in a specific order, that supports one or more alphabet scripts and functions; and is made available to the input method editor or keyboard. Usually, the collection is not more than 256 characters (so that all characters can be expressed by 8 bits or 1 byte) and they are usually all from one font. The codepage operates functionally between the keyboard driver or input method editor and the encoding process. That is, after the encoding process assigns character numbers to a collection of glyphs, a codepage may alter the order of and/or exclude characters. A codepage works at the character level, and therefore affects all fonts used. Codepages are usually loaded at the system level, and on some operating systems only one is available at any given time, and, therefore, on any given document page. Sometimes, changing the keyboard mapping not only has the effect of changing the keyboard driver, but also, changing the codepage.
Production First Software has extended this concept using the LanguageGroup Kit to enable codepages which span more than one font and can address more than 256 characters. The Production First Software implementation of codepages is not at the system level; and more than one codepage may be used on the same document page.
TrueType font formats allow inclusion of a list of codepages which the font may support. At least one application (Microsoft Word97 and beyond) makes use of this data. However, since codepage definition and support is usually implemented at the system level, it seems unwise to evolve a system where the only codepage support from an application is triggered by the codepage list built into a font. With this methodology, if additional codepages are designed and implemented in an operating system, preëxisting fonts would not have the information built-in, even if the character content of those fonts would support the new codepages. A better methodology is to be able to install codepage support through the application, with partial or full support by any particular font depending on its character content. The Production First Software LanguageGroup Kit allows this for PostScript fonts with applications like PageMaker, without the necessity of building in any data about codepages in the font itself.
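As a rough illustration of why the choice of codepage matters, the sketch below interprets the same single byte under two different codepages. The byte value and the two codepage names (cp1252 and cp437, as shipped with Python's standard codecs) are chosen only for illustration and are not part of the LanguageGroup Kit.

```python
# Illustrative assumption: one byte, <a8>, decoded under two stock codepages.
RAW = bytes([0xA8])

def decode_with(codepage: str, data: bytes = RAW) -> str:
    """Interpret the same bytes under the named codepage."""
    return data.decode(codepage)

for name in ("cp1252", "cp437"):
    ch = decode_with(name)
    # The same 8-bit value selects a different character in each codepage.
    print(f"<a8> under {name}: {ch!r} (U+{ord(ch):04X})")
```

The byte does not change; only the table used to interpret it does, which is why a document created under one codepage can display incorrectly under another.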
codepage switching The technique of switching keyboard character sets and character arrangements, for keyboard layout conventions of different countries, by specifying a country code.
code point The numerical representation or property of a character used to represent its location in a codepage or encoding.
code point notation Code point notation is expressed more commonly in hexadecimal (numbers of base 16) than in decimal (numbers of base 10).
For 1-byte encodings and codepages, this is indicated here as <nn> (for example <a8>) or 0xNN (for example 0xA8 ).
For 2-byte encodings (such as Unicode or ISO/IEC/10646 in UCS-2 format), U+NNNN or <nnnn> is used (an example being U+20A8 or <20a8>).
For 4-byte encodings (such as ISO/IEC/10646 in UCS-4 format), U-NNNNNNNN or I-NNNNNNNN or <nnnnnnnn> is used (an example being U-005D20A8 or I-005D20A8 or <005d20a8>).
The notation convention used in this Encyclopædia is usually the < ... > forms with lowercase letters in the hexadecimal numbers. Occasionally, the U+, U-, or I- forms with UPPERCASE letters in the hexadecimal numbers will be used.
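The notations described above can be generated mechanically. The following is a minimal sketch; the function names angle_form and prefixed_form are hypothetical and serve only to reproduce the examples given in this entry.

```python
def angle_form(code_point: int, width: int = 4) -> str:
    """Return the < ... > form with lowercase hexadecimal digits."""
    return f"<{code_point:0{width}x}>"

def prefixed_form(code_point: int, prefix: str = "U+", width: int = 4) -> str:
    """Return a prefixed form (0x, U+, U-, I-) with UPPERCASE hexadecimal digits."""
    return f"{prefix}{code_point:0{width}X}"

print(angle_form(0xA8, 2), prefixed_form(0xA8, "0x", 2))        # 1-byte forms
print(angle_form(0x20A8), prefixed_form(0x20A8))                # 2-byte forms
print(angle_form(0x005D20A8, 8), prefixed_form(0x005D20A8, "U-", 8))  # 4-byte forms
```

The width argument is the number of hexadecimal digits: 2 for 1-byte, 4 for 2-byte, and 8 for 4-byte code points.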
code space The extent, list, or range of code points represented by an
COLD An acronym for Computer Output to Laser Disk. A methodology of storing data.
cold type Used to designate type set by photocomposition or other computer-driven typesetting methodologies (including desktop publishing). The alternative is « hot type, » which usually refers to generating cast metal type.
CoolType An Adobe technology announced in 1998 combining a PostScript 3 rasterizer (or parts of it); and Type 1, OpenType, and TrueType font and printer driver interfaces. The areas of use include page layout applications, operating systems, and related font utilities. CoolType could be considered a replacement for Adobe Type Manager (ATM). The first CoolType software was incorporated into a page layout application called InDesign in late 1999.
colonial typeface A somewhat informal classification of typeface designs which have rough outlines to simulate poor casting techniques and poor printing results common in the U.S. colonial era. Examples include: Caslon Antique, MisionPF Antique, Old Claude, Poliphilus, and 16th Century Roman.
color calibration See GATX, GretagMacbeth, or IT8.
colophon An inscription usually placed either on or right after the title page, or at the end of a book, document, paper, report, (or, now, Web page) which provides information on the calligraphic or typographic composing details.
color gamut The spectrum of colors that a device or system can reproduce.
color palette or color pallet A collection of specific colors that are available for an image in a digital representation. For example, the color palette of HTML-related image formats consists of as few as 216 colors; whereas the color palette representable in 8 bit color is 256 colors; in 24 bit color is 16 777 216 colors; and in 36 bit color is 68 719 476 736 colors. Generally speaking, if a color is specified which is incapable of being represented by an application, output device, or other rendering system due to being excluded from the available color palette, it usually is rendered by dithering.
Continuous tone images (such as those made by photography) display patchy areas of color rather than continuous tones if millions of colors are not reproducible.
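The palette sizes quoted above follow from the bit depth: n bits can distinguish 2 to the power n colors. The short sketch below checks the figures; the function name palette_size is hypothetical. The 216-color figure is assumed here to be the common « Web-safe » palette of six levels per red, green, and blue channel.

```python
def palette_size(bits: int) -> int:
    """Number of distinct colors representable with the given bit depth."""
    return 2 ** bits

for bits in (8, 24, 36):
    print(f"{bits} bit color: {palette_size(bits):,} colors")

# Assumption: 216 colors arise from 6 levels for each of 3 channels.
print("six levels per channel:", 6 ** 3, "colors")
```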
color or color space The collection of all possible colors reproducible by a specific environment or recipe.
Specifically, with no adjectives in front of the term, color space implies the complete collection of colors in the spectrum of white light. When adjectives are applied (for example: CIE color space, CMYK color space, Land color space, RGB color space), the term denotes all colors reproducible by that particular color reproduction technology.
COM An acronym for Component Object Model, a client/server communication protocol used in middleware, architected by Microsoft. A competing protocol is CORBA, an international standard.
combining character A character whose character glyph usually is used along with another character's glyph (which character may or may not be a base character) and combined into a single glyph. The distinction between using a combining character glyph along with another character glyph, and a ligature is that in the former case, the combining character glyph usually does not entirely give up its visual form. However, when a base character glyph is used to create a ligature glyph, the resulting ligature glyph could be visually quite different from using the composing base character glyphs individually.
A combining character is usually not used alone in regular text, although it could be under special circumstances. When a combining character glyph is used along with a base character glyph, the resulting character is called a composite character.
composite character A text character whose corresponding glyph is constructed from more than one base character glyph; can be decomposed into more than one base character; or whose corresponding glyph is internally constructed within a font by using more than one character glyph, at least one of which is usually a base character glyph. A composite character is also known as a « precomposed character. » An example of a possible composite character is Á (Aacute), which can be composed of the characters A and ´ (acute). Composite characters in Production First Software fonts can be composed of an unlimited number of base characters. Some ordinary PostScript fonts might define what should be a composite character as a base character. This increases the amount of data unnecessarily.
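The Aacute example can be tried with the Unicode normalization forms in Python's standard unicodedata module. This is only a sketch of decomposition and recomposition as Unicode defines them; note that Unicode decomposes Aacute into A plus the combining acute <0301>, not the spacing acute character, and this has no connection to how Production First Software fonts build their glyphs.

```python
import unicodedata

def decompose(text: str) -> str:
    """Split composite characters into base and combining characters (form NFD)."""
    return unicodedata.normalize("NFD", text)

def compose(text: str) -> str:
    """Recombine base and combining characters where a composite exists (form NFC)."""
    return unicodedata.normalize("NFC", text)

aacute = "\u00c1"
parts = decompose(aacute)
print([f"<{ord(c):04x}>" for c in parts])   # base character, then combining mark
print(compose(parts) == aacute)             # recomposition restores the composite
```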
Certain scripts use a large number of composite characters (with some of the composite characters being ligatures). These include Arabic, Hangul, and Roman. Currently, there is a big controversy as to whether, or how many, composite characters should be placed in multibyte encodings, such as Unicode and ISO/IEC/10646. The alternative to placing composite characters in encodings is to rely on artificial intelligence algorithms in composing software (word processors, page layout programs) or font resources to properly represent composite characters on-the-fly. This strategy is not simple to implement for any script if text justification or track kerning is employed. Even if neither technique is used, it is not simple to implement (in fact, theoretically impossible) for Roman scripts when pair kerning is utilized. The reason is that when a sequence of characters is spelled out to represent a composite character with another character, it is impossible to precisely determine which two adjacent characters are to be kerned without human intervention for each and every character pair. Oftentimes, positioning of base characters to create a composite character on-the-fly requires discretionary information as to positioning. Usually, this is the purview of the typeface designer. Therefore, the text rendering algorithm must obtain positioning information from a data file, which is different for each typeface. Some typefaces use different designs of diacritical marks depending upon the base character; and some typefaces use a different design of the base character glyph depending on the diacritical mark. These are yet additional permutations which must be recorded as data and then used by the positioning algorithm. For some typefaces, certain composite characters are replaced with ligatures, again at the discretion of the typeface designer. An example of all these effects can be seen in the Production First Software typeface LafayetteU0PF. 
The use of composite characters eliminates these problems, complications, and obstacles. It should be noted that using pair kerning in document text can increase the comprehension speed by up to 50%, and typically reduce document viewing size (or the number of pages) by up to 30%. Using composite characters could reduce document file sizes by 10% to 30% depending on what percentage of text characters are composite characters.
The disadvantages of using composite characters turn out to be relatively minor. One disadvantage consists of the number of additional code points required in the encodings. This, however, does not translate into larger documents or file sizes unless the composite character additions require more bytes to represent a larger encoding space. Document size is proportional to code point byte width for native representation; but some schemes like UTF-16 can substantially reduce this penalty. The only other significant disadvantage is a possible increase in the amount of kern pair data (if a significant number of composite characters are kerned). Kern pair data is a sparse matrix whose size is proportional to the square of the number of characters kerned; but kern pair data is only font-dependent, not document-dependent. However, most typefaces do not include a significant number of kern pairs having composite characters, except typefaces produced by Production First Software.
combining class A property of characters, which is a priority number assigned to a character used to specify how a character combines with another character typographically. It is important for using multiple diacritical marks on the same base character, and the use of vowels in complex scripts.
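In Unicode this property is called the canonical combining class, and Python's standard unicodedata module exposes it through the combining() function. The sketch below, which uses two diacritical marks chosen only for illustration, shows the class values and how they determine the order of multiple marks on one base character when text is decomposed.

```python
import unicodedata

ACUTE = "\u0301"      # combining acute accent, placed above the base character
DOT_BELOW = "\u0323"  # combining dot below, placed below the base character

for mark in (ACUTE, DOT_BELOW):
    print(f"<{ord(mark):04x}> combining class {unicodedata.combining(mark)}")

# A base character has class 0; marks are sorted by ascending class on decomposition.
typed = "a" + ACUTE + DOT_BELOW
ordered = unicodedata.normalize("NFD", typed)
print([f"<{ord(c):04x}>" for c in ordered])
```

Because the dot below has the lower class value, it is placed before the acute accent regardless of the order in which the marks were entered.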
combining mark A combining character which is a diacritical mark or tonal.
Communicator A Web browser and Email communication software developed by Netscape Inc. It is available for free from Netscape's Website.
Common Desktop Environment See CDE.
Compact Font Format A PostScript, nearly completely binary, font format which allows one or more Chameleon, CID-keyed, MultipleMaster, single, or synthetic fonts to be bundled together in a compact representation. Metrics are handled in the same manner as previous PostScript font formats.
Compatibility Zone A zone in ISO/IEC/10646 and Unicode which contains character variants in use in other standards which should not or cannot be unified with characters in the standard script blocks.
complex script A catch-all term referring to scripts which require special processing in typical use (other than a dedicated change in writing direction), relative to plain text, because of some complexity in their nature. Examples include scripts which form conjoins (such as the Indic scripts), scripts which require contextual analysis for glyph selection (such as Arabic, Indic, or Korean); and scripts which may have alternating writing directions or unusual line-break requirements (such as Mongolian).
complex document or compound document A document composed of text, graphics, images, sound bites, animations, movies, or other non-textual components.
compression scheme A technique for reducing the size of a file by using a specific algorithm to eliminate some of the data. Depending on the nature of the data and the algorithm, if the eliminated data does not result in any change in representable content (no unreconstructable loss of data representation), the scheme is termed « non-lossy » or « lossless. » If a change in representable content occurs (an unreconstructable loss of data representation), the scheme is termed « lossy. »
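The distinction can be sketched in a few lines. The lossless half uses Python's standard zlib module; the lossy half is a deliberately crude quantizer invented for this illustration (the name lossy_quantize and the step size are assumptions), not any real image or audio codec.

```python
import zlib

def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress; the result is identical to the input."""
    return zlib.decompress(zlib.compress(data))

def lossy_quantize(samples: list[int], step: int = 16) -> list[int]:
    """Round 0-255 sample values down to a multiple of step; detail is discarded."""
    return [(s // step) * step for s in samples]

text = b"abc" * 100
print(len(text), "bytes compress to", len(zlib.compress(text)), "bytes")
print(lossless_round_trip(text) == text)   # nothing was lost

samples = [0, 17, 250]
print(lossy_quantize(samples))             # original values cannot be reconstructed
```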
composite font A font whose font definition points to other fonts in a hierarchical manner. While a composite font can be said to be « composed of other fonts, » this terminology is misleading, because composite fonts do not require duplication of data.
conventional font The term here denotes a single-byte font, or a font accessed by single-byte characters.
core font A font which is bundled with another software product.
composite font extension A feature available on some PostScript Level 1 interpreters which allows the use of Type 0 composite fonts. This feature is also known as the « Kanji font extension. »
compound ligature A ligature which is constructed from a base character and another ligature from an action not within the font.
compressed Refers to a typeface which is a narrower version of another typeface but with nearly the same vertical stem widths.
Compression schemes are usually encountered with and utilized for audio and image files, and, rarely, for entire text or entire document files. Common image formats which utilize compression include .GIF, .JPEG, .PCX, and .TIFF, although compressed TIFF is rarely used at this time. A compressed file is considered to be « coded, » and a decompressed file to be « decoded, » although this terminology is imprecise. A separate software utility which performs these functions is sometimes called a « codec. »
A number of compression schemes are known, and some are commonly-used: ASCII-85 (PostScript), Discrete Cosine Transform (JPEG, PostScript, TIFF), Fractal, Huffman (CCITT Group 3, CCITT Group 4, JPEG), JPEG compression (JPEG, MPEG-1, MPEG-2), Lempel-Ziv-Welch (GIF, TIFF, .ZIP files), Run-length (Dr. Halo, Macintosh Paint, Windows Paint, PhotoShop, PostScript, Sun Raster Data, SGI, Targa), Stuffit (.SEA and .SIT files), Wavelet, Zip (.ZIP files), and Zlib (.PNG files).
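Of the schemes listed, run-length is the simplest to show. The following is a minimal sketch of the general idea only: each run of identical bytes is stored as a (count, value) pair. It does not reproduce the exact byte layout used by any of the file formats named above, and the function names are hypothetical.

```python
from itertools import groupby

def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical byte values into a (count, value) pair."""
    return [(len(list(group)), value) for value, group in groupby(data)]

def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for count, value in runs)

row = b"aaabcc"
runs = rle_encode(row)
print(runs)
print(rle_decode(runs) == row)   # the scheme is lossless
```

Run-length schemes pay off on data with long runs, such as rows of a bitmap with large areas of a single color, and can enlarge data that has few repeated values.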
Computer Style A typeface structural style having the characteristics of simulating computer-generated output, such as printout, OCR, and illuminated displays.
comstock Same meaning as inline.
condensed Refers to a typeface which is a uniformly narrower version of another typeface either by design of the font or by manipulation from within an application.
conjoin A property of glyphs representing a string of characters in certain alphabets joining together before being represented or presented in visual form. Arabic and Indic alphabets require horizontal or vertical conjoining in correct typographic representation. The noun describing the result of a conjoin is a « conjoint » (also known as a « conjunct » ).
Conjoins are often implemented in operating systems by simply placing a conjoined character glyph in the proper position, overlapping it slightly with adjacent glyphs in some cases. There is a problem with this type of implementation: if an application specifies outlined or textured conjoined characters, this cannot be done properly because the conjoined glyph outlines are not continuous. Conjoined glyphs must then be reconstructed with continuous outlines, thereby removing overlapping outlines, usually by a graphics application. An alternative treatment is to replace the conjoined pair with a ligature; but this has practical problems in that the number of ligatures required is equal to the number of pair combinations of characters which conceivably could conjoin. Conjoint ligatures are common in Indic scripts, because the conjoined forms are visually different than the isolated forms. Conjoint ligatures are less common in Arabic, not being used in everyday text communications.
conjoint See conjoin.
conjunct See conjoin.
Conseil Europeen pour la Recherche Nucleaire Also known as « CERN » and « European Laboratory for Particle Physics, » the scientific institute located in France and Switzerland which performs nuclear physics research on fundamental nuclear particles, the building blocks of nature. Much of the work involves designing and operating particle accelerators ( « atom smashers » ) to produce a variety of heavy particles for study. In 1990, CERN developed the World Wide Web, a graphical extension of the Internet.
contextual commerce A construction strategy of Web sites in which the opportunity for transactional processing is only associated with certain, specific elements of page content. A simplified, hypothetical example would be a link to a transactional page for the purchase of specific products included in a list of suggested products, some of which would be free.
contextual editing The checking of data for correctness. A spell checker is an example of contextual editing. Another example is form validation in HTML, wherein form data is checked to determine if it fits within limits for the particular item of information collected.
contextual processing The task of selecting (other than by simple character/glyph mapping) the correct glyph or shaping a glyph on-the-fly to represent a text character and determining the layout location to render it. This process can be optional and relatively simple in some scripts like Greek, Hebrew, or Latin (for ligature substitution or for substituting a medial form), or it can be complicated for scripts like Arabic (selecting one of several forms and adding kashida glyphs), Egyptian hieroglyphics (mirroring and selecting different forms), Indic (selecting different forms, relocating and joining glyphs), and Mongolian (selecting different forms, joining and relocating glyphs, modulating the writing direction). Scripts requiring these capabilities are said to possess contextual sensitivity.
In Arabic, Armenian, Cyrillic, Georgian, and Latin, ligatures must be specified based on a sequence of more than one character. Ligature substitution is a relatively simple contextual processing task. For example, in Latin, the ligature glyph fi replaces individual glyphs for f and i. In Arabic, the proper form of the glyph (initial, medial, final, or isolate) must be determined by analyzing a text string representing a word. In hieroglyphics, it is necessary to determine whether the right or left hand glyph image is required. In both hieroglyphics and Mongolian, the writing direction of a particular text string must be determined. In Indic, the proper glyph must be selected based on the placement; and the placement (above, below, or beside the previous glyph) must be determined based on other characters in the text string. A script that requires contextual processing in typical use is usually called a complex script. Contextual processing is usually performed by an input method editor.
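Ligature substitution, the simplest of these tasks, can be sketched as a longest-match replacement over a character string. The table below is an assumption made for illustration: it uses the Latin presentation-form ligature characters that Unicode happens to encode, whereas a real text renderer would substitute glyphs inside the font rather than characters in the text.

```python
# Hypothetical substitution table: character sequence -> ligature character.
LIGATURES = {
    "ffi": "\ufb03",
    "ffl": "\ufb04",
    "ff": "\ufb00",
    "fi": "\ufb01",
    "fl": "\ufb02",
}

def substitute_ligatures(text: str, table: dict[str, str] = LIGATURES) -> str:
    """Replace each matching sequence with its ligature, longest sequence first."""
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No ligature begins at this position; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)

print([f"<{ord(c):04x}>" for c in substitute_ligatures("find")])
```

Trying the longest sequences first ensures that « ffi » is replaced as a whole rather than as « ff » followed by « i ».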
contour Another name for inline.
contraction Strictly speaking, in character space, the replacement of more than one character in a string by a single character. This could be considered (only improperly) an analog of a ligature in glyph space because the replacement (character or glyph) or the objects consumed (characters or glyphs) may not correspond.
contrast The change in stroke thickness between the thickest stem or curved stroke portion and the thinnest stroke or thinnest curved stroke portion.
A typeface is said to exhibit high contrast when the aforementioned difference is greatest. Typeface examples of this are Bodoni Poster and Thorogood Roman. The opposite extreme is a typeface with no contrast. Typeface examples of this are BernalPF and Courier.
Copyright Convention See the Berne and Pan American Copyright Convention.
CORBA An acronym for Common Object Request Broker Architecture, a client/server communication protocol used in middleware. A competing protocol is COM (architected by Microsoft).
corner mark A right angle-shaped mark used as a registration or cut mark.
Corporate Use subarea A portion of the ISO/IEC10646/Unicode encoding Private Use area starting from
counter The open area between vertical stems in a letterform.
counterform A term sometimes used to denote the space between letterform parts enclosed by an imaginary rubber band stretched around the outside of the letterform.
country code A code used to specify a particular keyboard character set and layout for the language of a particular country.
CP An abbreviation for Code Page or codepage.
CPI An abbreviation for Characters Per Inch. See pitch.
cpp An abbreviation for characters per pica.
CPSI An abbreviation for Configurable PostScript Interpreter. A PostScript interpreter which runs on a host or server microcomputer system, generates bitmap data, and sends the bitmap data to an output device which serves strictly as a marking engine.
The advantages of CPSI are that it is easily upgradeable and it has direct access to font resources and input files stored on the system hard disk or other mass storage devices. The main disadvantage over embedded (firmware) PostScript interpreters is, so far, throughput. Throughput depends on high speed processing and the ability to quickly send huge amounts of bitmap data to the output device. The processing speed penalty occurs because specialized firmware can be highly optimized. The processing speed issue, however, will improve as microprocessors get faster, which is happening almost continuously. The penalty of sending huge amounts of bitmap data to the output device can only be reduced, however, if faster data transfer protocols are designed. The data transfer rates are affected by the system bus and disk subsystem interface. These protocols cannot improve in speed almost continuously like microprocessors, because they make use of standardized specifications, which do not change frequently.
Cray Research A company spun off from Control Data Corporation (CDC) of Minneapolis, Minnesota, USA in 1972, made famous by developing commercial scientific supercomputers in the 1970's-80's, including the Cray-I, Cray-I-XMP48, and the Cray-2. The Cray's were characterized by a circular CPU compartment approximately 7 feet wide and 6 feet high, surrounded by a circular arrangement of cushioned seats, and weighing approximately 8 tons, because there were so many semiconductor memory units packed so densely. The resulting short electrical connections added to the speed of the processor. The original Cray-I was a single processor system, but the Cray-I-XMP48 and Cray-2 were multiple processor systems with user-programmable parallel computing. The Cray's originally ran a Control Data Corporation (CDC) operating system, their lineage and architecture somewhat related to the CDC-6600 and 7600 developed by Seymour Cray, who later left CDC to found Cray Research. Varieties of Unix later ran on the Cray's. The Cray's were mainly used for compute-intensive Monte Carlo and deterministic mathematical modelling tasks in such fields as nuclear weapons design, nuclear fusion research, particle and radiation transport, aerosol transport, weather forecasting, molecular chemistry and pharmacologic research, fluid dynamics, structure stress design calculations, earthquake prediction, and movie animation.
With the advent of ever-faster microcomputer chips, the speed advantage of the Cray's was continuously being eroded, and the expensive operation (air conditioning, electricity, space) helped the demise of that technology. Cray Research, however, continued to develop supercomputers using microprocessor chips in massively parallel processor (MPP) configurations. In 1995, a Cray-I computer, retired from Lawrence Livermore Laboratories in Livermore, California, was advertised in the San Francisco Examiner for sale as scrap. It was purchased by an entrepreneur for US$10,000. In 1996, Silicon Graphics Inc. (SGI) purchased Cray Research for US$740 million, as well as Paragraph International in 1997, a Russian software company owned by world chess expert Garry Kasparov, who later lost a widely-publicised chess tournament to IBM's MPP RS/6000-based supercomputer. The purchases nearly bankrupted SGI, sending SGI stock plummeting from US$25/share to US$8/share, having not yet recovered by 2000. In January 2000, SGI sold Cray Research for US$100 million to Tera Computer Company of Seattle, Washington, USA. Tera was founded in 1987 by Jim Rottsolk and Burton Smith to develop supercomputers.
crawling robot A software utility, not under the control of an end-user, that examines a number of Web sites to determine if a specific action is to be taken. Crawling robots take a variety of forms:
.CRD file See bitmap formats.
.CRE file See bitmap formats.
CRM An abbreviation for Customer Relationship Management. A customer retention strategy of setting up a centralized database that customer support and customer call centers can access from any place in the world.
crossbar A generally horizontal stroke which completely intersects a generally vertical stroke. Examples can be found in: A, T, f, t.
cross-format A designation given to a software product package which contains versions of the product in different formats, or one version which acts as if it were equivalent to versions of different formats.
cross head A heading interposed within body text for separating or identifying different subject areas of the text.
cross-platform A nebulous term which can imply consistency of architecture in the issues of data exchange, or identical functioning, on different platforms. « Cross-platform » does not imply « platform-independence », where the same identical software product can function on different platforms.
cross-script A term designating a property which spans more than one script. An example would be cross-script pair kerning, which is present in most Production First Software Unicode and UCS-4 typefaces.
cross-script kerning This refers to pair kerning where the characters of the pair belong to different scripts. Combinations include, but are not limited to, Cyrillic-Latin, Cyrillic-Greek, Greek-Latin, and Hebrew-Latin. This arises due to hybrid alphabets, which are used in some African, eastern European, and native American languages.
crossware A term used for electronic communication software which enables cross-communication between Internet and intranet systems.
crotch The area immediately above the junction of the two diagonal stems of the letter 'M'.
CRT An abbreviation for Cathode Ray Tube. A device for displaying images consisting of a large glass container, containing a vacuum, with a phosphor screen at one end and a high-voltage electron gun at the other end. It is used in televisions and desktop computer monitors.
CTP An abbreviation for Computer-To-Plate. See platesetter.
.CUR file See bitmap formats.
cursive See script.
(another definition follows)
cursive or cursive writing A flowing style of writing (« handwriting ») where letterforms are continuously linked. Some primary schools are no longer teaching cursive writing, because there are questions about its importance in the computer age, where keyboard buttons, mouse clicks, and touch-sensitive screens are used to create text.
cursor A moving position indicator on a computer monitor screen.
CUS See Corporate Use subarea.
Cyrillic Extended A Production First Software supplementary Cyrillic and Old Cyrillic character set which includes historic and religious variants, ligatures, numerals, upper and lower case diacritical marks, and symbols not in the