Library of Congress
Pinyin Conversion Project

An Outline of Pinyin Conversion

What converted, what did not convert, and immediate cleanup tasks


Authority Records

Bibliographic Records


Authority Records (September 28, 2000)

  1. Scope of Conversion
  2. The conversion covers all name and series authority records in National Authority File with headings and references which include 2 or more Wade-Giles syllables in the following subfields:

    100$a, $t
    110$a, $b, $t
    111$a, $c, $e, $t
    130$a, $n, $p
    151$a
    The presence of one such subfield made a record eligible for conversion; for example:
    110 10 $a China (Republic : 1949- ). $t Laws, etc. (Liu fa ch'üan shu : 1985 ed.)

    110 20 $a Chung-kuo kuo min tang. $b Boston Branch
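
The eligibility rule above can be sketched in Python. This is only an illustration under stated assumptions, not the conversion program itself: the table of tags and subfield codes is copied from the list above, while the record shape (a list of tag and subfield tuples) and the caller-supplied `count_wg_syllables` function are hypothetical stand-ins for the real record format and syllable analysis.

```python
# Tags and subfield codes copied from the scope statement above.
ELIGIBLE_SUBFIELDS = {
    "100": {"a", "t"},
    "110": {"a", "b", "t"},
    "111": {"a", "c", "e", "t"},
    "130": {"a", "n", "p"},
    "151": {"a"},
}


def is_eligible(fields, count_wg_syllables):
    """Return True if any listed subfield holds 2 or more Wade-Giles syllables.

    fields: iterable of (tag, [(code, value), ...]) tuples (hypothetical shape).
    count_wg_syllables: caller-supplied function that counts Wade-Giles
    syllables in a string (hypothetical; the real test is not described here).
    """
    for tag, subfields in fields:
        codes = ELIGIBLE_SUBFIELDS.get(tag, set())
        for code, value in subfields:
            # One qualifying subfield is enough to make the record eligible.
            if code in codes and count_wg_syllables(value) >= 2:
                return True
    return False
```
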

  3. Extent of Conversion on Individual Records


  4. Marker:
    'c' in the 008/07 field signifies one of several conditions: the record was
    1. fully converted by the machine program
    2. fully converted by the machine program, and awaiting manual review
    3. partially converted by the machine program, and awaiting manual review
    4. reviewed and updated manually
    'n' in the 008/07 field indicates that the record was considered for conversion, but was not converted because the heading was not romanized according to Wade-Giles
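
A minimal sketch of reading the marker, assuming the 008 field is available as a plain string and positions are counted from 0. The function names are illustrative only, and values of 008/07 other than 'c' and 'n' are not described in this outline.

```python
# Meanings taken from the description above; other values are not covered here.
MARKER_MEANINGS = {
    "c": "converted (by machine, fully or partially, or reviewed manually)",
    "n": "considered for conversion but not converted (not Wade-Giles)",
}


def conversion_marker(field_008):
    """Return the character at position 07 of an authority 008 field."""
    if len(field_008) < 8:
        raise ValueError("008 field too short to contain position 07")
    return field_008[7]


def describe_marker(field_008):
    """Translate the 008/07 value into the meaning given in this outline."""
    return MARKER_MEANINGS.get(
        conversion_marker(field_008),
        "no pinyin conversion marker described here",
    )
```

Note that 'c' alone does not say whether manual review is still pending; the four conditions listed above share the same value.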
    Headings were converted as fully as possible:

    100 fields, personal names:

    Wang, Tan == Wang, Dan
    Hsü, Kuo-ching == Xu, Guojing
    Yang, Yen-keng == Yang, Yan'geng
    Ch'en, Shan-ang == Chen, Shan'ang
    Su, Wen-chung kung == Su, Wenzhong gong
    Ssu-ma, Ch'ien == Sima, Qian
    Fang Jen, Li-sha == Fang Ren, Lisha
    T'ang T'ai-tsung == Tang Taizong
    Wu-ming-shih == Wumingshi
    110 fields, corporate bodies:
    Chung-kuo kung ch'an tang == Zhongguo gong chan dang

    Fu tan ta hsüeh == Fu dan da xue

    Ho-nan sheng she hui k'o hsüeh yüan == Henan Sheng she hui ke xue yuan

    Chinese University of Hong Kong. $bShe hui kung tso hsüeh hsi == Chinese University of Hong Kong. $bShe hui gong zuo xue xi

    111 fields, meetings:
    Liang an ching mao kuan hsi chih chien t'ao yü chan wang hsüeh shu yen t'ao hui$d(1992 :$cShih fan ta hsüeh, Taipei, Taiwan) == Liang an jing mao guan xi zhi jian tao yu zhan wang xue shu yan tao hui $d (1992 :$cShi fan da xue, Taipei, Taiwan)

    Fo hsüeh yü k'o hsüeh yen t'ao hui == Fo xue yu ke xue yan tao hui

    130 fields, uniform titles
    Chung yang yen chiu yüan chih wu yen chiu so chuan k'an == Zhong yang yan jiu yuan zhi wu yan jiu suo zhuan kan

    Hung lou meng yen chiu tzu liao ts'ung shu.$pChia pien == Hong lou meng yan jiu zi liao cong shu. $pJia bian

    Headings for geographic locations:
    Single-syllable generic terms: three single-syllable generic terms were capitalized and separated from the preceding syllable by the machine program when they were used as part of a place name in headings for geographic locations (in 151 fields and 110 fields with first indicator 1): shih = shi, hsien = xian, sheng = sheng. The program usually capitalized those syllables when they appeared as part of the name of a corporate body appearing in straight form; however, it sometimes did not. Because the syllables can have meanings other than city, county and province, there are some instances of incorrect capitalization. The program did not capitalize these three terms when they appeared as part of a uniform title.
    Examples:
    151 -0 $a Fang-shan hsien (China) == 151 -0 $a Fangshan Xian (China)

    151 -0 $a Kung-i shih (China) == 151 -0 $a Gongyi Shi (China)

    110 2 $a Shang-hai shih hsiang chiao kung yeh kung ssu == 110 2 $a Shanghai Shi xiang jiao gong ye gong si
    [Cleanup task: change Shi to shi]

    110 2 $a Chi-lin shih fan ta hsüeh == 110 2 $a Jilin Shi fan da xue

    110 2 $a T'ai-wan sheng li po wu kuan == 110 2 $a Taiwan Sheng li bo wu guan
    [Cleanup task: change Sheng to sheng]

    130 -0 $a T'ai-wan sheng li po wu kuan ts'ung shu == 130 -0 $a Taiwan sheng li bo wu guan cong shu

    130 -0 $a Chung-kuo hsien tai shih tzu liao ts'ung k'an == 130 -0 $a Zhongguo xian dai shi zi liao cong kan
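
The capitalization behavior shown in these examples can be imitated with a short Python sketch. It works on text that is already in pinyin and assumes, as a simplification that is not taken from the conversion specifications, that the place name is the first word of the heading. It is meant for 151 and 110 headings only, not uniform titles.

```python
GENERIC_TERMS = {"shi", "xian", "sheng"}  # city, county, province


def capitalize_generic(heading):
    """Capitalize a generic term that directly follows the leading place name.

    Simplifying assumption: the place name is the first word of the heading.
    """
    words = heading.split(" ")
    if len(words) > 1 and words[1] in GENERIC_TERMS:
        words[1] = words[1].capitalize()
    return " ".join(words)
```

A rule this blind to meaning turns "Jilin shi fan da xue" into "Jilin Shi fan da xue", which is the kind of incorrect capitalization that the cleanup tasks above address.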

    In order to avoid putting letters into upper case incorrectly, other single-syllable generic terms for place names (such as chou = zhou, chen = zhen) were not capitalized by the machine program.
    Examples:
    151 -0 $a Mei-ts'un hsiang (Jiangsu Sheng, China) == 151 -0 $a Meicun xiang (Jiangsu Sheng, China)
    [Cleanup task: change xiang to Xiang]

    151 -0 $a Yung-le chen (Shanxi Sheng, China) == 151 -0 $a Yongle zhen (Shanxi Sheng, China)
    [Cleanup task: change zhen to Zhen]

    When romanizing Chinese after pinyin Day 1, insofar as possible, headings should conform to the conventions applied by the US Board on Geographic Names (BGN). Generic terms for place names should be capitalized, following BGN practice.
    Examples:
    110 2 Fuzhou Shi ren min yi yuan
    110 2 $a Tianjin (China). $bShi zheng gong cheng ju
    410 2 $a Tianjin shi zheng gong cheng ju
    410 2 $a Tianjin Shi shi zheng gong cheng ju
    110 2 $a Jilin shi fan da xue

    110 2 $a Taiwan sheng li bo wu guan

    Multi-syllable generic terms: the individual syllables of a multi-syllabic term, as defined by BGN, were capitalized and joined together without spaces when used as part of a place name. When the single syllable tsu preceded any of the multi-syllabic terms in the name of a place, it was converted to zu and joined to the syllable that preceded it. The following terms were converted by the conversion program to conform to BGN guidelines when they appeared as part of proper names, in names of corporate bodies, and in uniform titles:
    ti ch'ü == Diqu
    tzu chih ch'i == Zizhiqi
    tzu chih chou == Zizhizhou
    tzu chih hsien == Zizhixian
    chuan ch'ü == Zhuanqu
    hsing cheng ch'ü == Xingzhengqu
    tzu jan pao hu ch'ü == Ziran Baohuqu
    tu chia ch'ü == Dujiaqu
    t'e ch'ü == Tequ
    Examples:
    151 -0 $a Lu-ch'üan I tsu Miao tsu tzu chih hsien == 151 -0 $a Luquan Yizu Miaozu Zizhixian (China)

    151 -0 $a Sang-hsiung ti ch'ü == 151 -0 $a Sangxiong Diqu

    110 2 $a Lien-nan Yao tsu tzu chih hsien ti fang chih pien tsuan wei yüan hui == 110 2 $a Liannan Yaozu Zizhixian di fang zhi bian zuan wei yuan hui

    110 2 $a Fu-yang chuan ch'ü wen hsüeh i shu kung tso che lien ho hui == 110 2 $a Fuyang Zhuanqu wen xue yi shu gong zuo zhe lian he hui
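
The joining of multi-syllable generic terms can be sketched in Python as follows. The sketch assumes the text has already been converted to space-separated pinyin syllables; the pinyin keys are derived from the joined forms listed above, and the function name is illustrative rather than part of the conversion program.

```python
import re

# Space-separated pinyin syllables mapped to the joined, capitalized term.
JOINED_TERMS = {
    "di qu": "Diqu",
    "zi zhi qi": "Zizhiqi",
    "zi zhi zhou": "Zizhizhou",
    "zi zhi xian": "Zizhixian",
    "zhuan qu": "Zhuanqu",
    "xing zheng qu": "Xingzhengqu",
    "zi ran bao hu qu": "Ziran Baohuqu",
    "du jia qu": "Dujiaqu",
    "te qu": "Tequ",
}

# Longest keys first, with word boundaries so "te quan" is left alone.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(JOINED_TERMS, key=len, reverse=True))
    + r")\b"
)


def join_generic_terms(text):
    """Replace each listed syllable string with its joined form."""
    return _PATTERN.sub(lambda m: JOINED_TERMS[m.group(1)], text)
```

The sketch joins the syllables wherever they occur, so it cannot tell a place name from an ordinary phrase; that decision depends on context.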

    Two-syllable place names: the individual syllables of a two-syllable place name have been capitalized and separated from each other:
    151 -0 $a Ling-hsien (China) == 151 -0 $a Ling Xian (China)

    151 -0 $a Ch'i-hsien (Henan Sheng, China) == 151 -0 $a Qi Xian (Henan Sheng, China)

    Conventional place names: any heading in the NAF which was not consistent with the recently changed conventional form has been converted by the conversion program.

    People's Republic of China: the name of the People's Republic of China converts in the same manner wherever it appears, in accordance with BGN practice:

    130 -0 $a Chung hua jen min kung ho kuo ti fang chih ts'ung shu == 130 -0 $a Zhonghua Renmin Gongheguo di fang zhi cong shu

    110 2 $a Chung-hua jen min kung ho kuo Hsia-men hai kuan == 110 2 $a Zhonghua Renmin Gongheguo Xiamen hai guan

    Taiwan: following current BGN guidelines, headings for place names in Taiwan were not converted to pinyin, but remain in Wade-Giles romanization. Qualifiers that included headings for Taiwan place names in established heading form were not converted:
    151 -0 $a Kao-hsiung shih (Taiwan)
    151 -0 $a Wu-feng hsiang (T'ai-chung hsien, Taiwan)
    451 -0 $a T'ai-chung hsien Wu-feng hsiang (Taiwan)
    130 -0 $a T'ai-wan wen i ts'ung shu (Kao-hsiung shih, Taiwan)
    451 -0 $a Taiwan wen yi cong shu (Kao-hsiung shih, Taiwan)
    but:
    130 -0 $a Chuan t'i pao kao (Taiwan. Min cheng t'ing) == 130 -0 $a Zhuan ti bao gao (Taiwan. Min zheng ting)
    References
    References for personal names (400) and geographic locations (451) were converted, and the Wade-Giles forms were retained. For example,
    100 1 $a Liu, Gongmian, $d1824-1883
    400 1 $a $wnne Liu, Kung-mien, $d1824-1883
    400 1 $a Liu, Shumian, $d1824-1883
    400 1 $a Liu, Mianzhai, $d1824-1883
    400 1 $a Liu, Shu-mien, $d1824-1883
    400 1 $a Liu, Mien-chai, $d1824-1883
    151 -0 $a Tongren Diqu (China)
    451 -0 $a $wnne T'ung-jen ti ch'ü (China)
    451 -0 $a Guizhou Sheng Tongren Diqu (China)
    451 -0 $a Kuei-chou sheng T'ung-jen ti ch'ü (China)
    References for corporate bodies (410), meetings (411), and uniform titles (430) were converted, but the Wade-Giles forms were not retained. For example,
    110 2 $a Mei tan ke xue yan jiu yuan
    410 2 $a $wnne Mei t'an k'e hsüeh yen chiu yüan
    410 2 $a Mei tan ke xue yuan (China)
    410 2 Coal Scientific Research Inst. (China)
    410 1 China. $bMei tan gong ye bu. $bMei tan ke xue yan jiu yuan
    410 1 China. $bMei tan gong ye bu. $bCoal Scientific Research Inst.
    If any portion of the heading converts, and the heading only includes a subfield $a, the Wade-Giles 1xx field was retained as 4xx $wnne. Headings that include other subfields (e.g. $b, $t) were not retained. The exception: headings for personal names in Wade-Giles form with $c and $d subfields were retained. Examples:
    110 1 China. $bZui gao fa yuan [former Wade-Giles heading not retained as a reference]

    100 1 $a Wu, Jingzi, $d1701-1754. $tRu lin wai shi [former Wade-Giles heading not retained as a reference]

    100 10 $a Zhou, Enlai, $d 1898-1976
    400 10 $a $wnne Chou, En-lai, $d 1898-1976

    100 0 $a Zhao Wuling wang, $cEmperor of China, $d340 B.C.-295 B.C.
    400 0 $a $wnne Chao Wu-ling wang, $cEmperor of China, $d340 B.C.-295 B.C.
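
The retention rule for former Wade-Giles headings can be written as a small Python predicate. This is a sketch that assumes some portion of the heading has converted; the function name and the representation of a heading as a tag plus a list of subfield codes are illustrative.

```python
def retains_wade_giles_reference(tag, subfield_codes):
    """Return True if the former 1xx heading is kept as a 4xx $wnne reference.

    Assumes that some portion of the heading converted to pinyin.
    """
    codes = set(subfield_codes)
    # Headings consisting only of subfield $a were retained.
    if codes == {"a"}:
        return True
    # Exception: personal names whose extra subfields are only $c and/or $d.
    if tag == "100" and "a" in codes and codes <= {"a", "c", "d"}:
        return True
    # Headings with other subfields (e.g. $b, $t) were not retained.
    return False
```
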

    Pre-AACR2 Headings
    Pre-AACR2 headings for personal names have been converted, but these headings will remain coded for pre-AACR2 rules. Other pre-AACR2 headings for corporate bodies, meetings, uniform titles and geographic locations were not converted; they may be converted at a later date, as they are encountered by catalogers.
    Mixed Text
    When Wade-Giles syllables and other syllables appear together in a subfield, the record will be sent to a manual review file. Some of these subfields have been converted by the machine program, others have not. LC will review these and make corrections as appropriate.
  5. Excluded From Conversion
  6. A great effort has been made to prevent the conversion of headings which appear to be romanized according to Wade-Giles, but in fact are not. OCLC provided Library of Congress staff with several lists of headings for personal names that contained Wade-Giles syllables but were used as access points on a significant number or percentage of non-Chinese bibliographic records. From these lists, LC compiled a list of headings (ca. 2400 names) which were blocked from conversion. For example:
    Chiang, Ching-kuo, 1910-1988
    OCLC also provided extensive lists of headings that included 2 or 3 Wade-Giles syllables. LC staff researched the headings and compiled a list of several thousand which could have been mistaken for Wade-Giles headings, but were in fact not romanized according to Wade-Giles. For example:
    Wang Chung (Musical Group)

    Yuan, Wen Lin, $d 1928-

    The records on these two lists were marked 'n' in the 008/07 field, but were otherwise unchanged in the conversion process. They must remain unconverted.
  7. Manual Review and Cleanup
  8. Procedure: OCLC converted authority records fully, or to the extent possible, and marked them 'c' in the 008/07 field. Those records which were not fully converted, and those with certain characteristics, were sent to LC in one of several files for manual review.

    These are the manual review and cleanup projects with the highest priority:

    General manual review -- ambiguous subfields; x00$c subfields; questionable personal names; headings in languages other than Chinese; at issue = did they convert or not? should they have converted or not? Priority: very high; target date: October 31

    110 10 $a China. $b Nei cheng pu
    [$b was not converted because it is ambiguous, that is, it could be either Wade-Giles or pinyin]

    100 00 $a Miaozhou, $cShih [$c should have converted to Shi]

    100 10 Wang, Wei-Ko [not a Wade-Giles personal name; it should not convert]
    Review of headings categorized as 'fully converted' to identify headings in languages other than Chinese, and a very small number of headings in Chinese which may not have converted; at issue = did they convert or not? should they have converted or not? Priority: very high; target date: October 31

    Subfields that consist of a single syllable Wade-Giles term; at issue = did that term convert correctly? Priority: high; target date: January 1, 2001

    100 10 $a Ba, Jin, $d1905- $t Chun. $lKorean

    110 10 $a China. $b Lu jun. $bShi, 200

    x30 fields in which the machine program converted a single-syllable generic term for a place name, or a 2-syllable place name; at issue: was the term converted in the heading or in a qualifier? Did the term convert correctly? Priority: high; target date: January 1, 2001

    Undifferentiated (non-unique) personal names (ca. 8400); issue: many of these records will be brought into conformance with NACO guidelines by LC catalogers and a number of NACO volunteers; Priority: high; target date: January 1, 2001

    Two categories of uniform titles and headings for meetings which are qualified by Taiwan place names; at issue = the possible presence of a corporate body in a qualifier along with a Taiwan place name (the corporate body should be converted, while the place name should not); Priority: high; target date: January 1, 2001

    Headings rejected by OCLC's secondary filtering mechanism; at issue = there may be a small number of records which need to be converted; Priority: high; target date: January 1, 2001

    Single-syllable generic terms for place names; at issue = capitalization. Priority: low (filing and access not affected)

  9. Records Converted Incorrectly
  10. Note: As might be expected in a project of this size and scope, a few records among those marked 'c' in the 008/07 field have been converted incorrectly. Included are a small number of records in many languages. We have identified most of these errors. Cleanup projects #1 and #2 above are under way. We intend to correct these errors by the end of October.

    If a NACO library encounters a conversion error before it has been corrected by LC or another NACO member, please make the correction promptly. If you are not sure how to undertake the correction, please contact Cathy Yang (cyan@loc.gov) on LC's Cooperative Cataloging Team. Non-NACO participants encountering erroneous conversions should bring them to the attention of CPSO or Philip Melzer (pmel@loc.gov).

  11. Caution: Do Not Use Converted Pinyin Name Headings on Unconverted Bibliographic Records!!
  12. If you use a converted pinyin heading on an unconverted bibliographic record, you risk its being converted to another form by a bibliographic conversion program. If you're not sure, look at the marker in the 008/07 field to see if the heading has been converted (or is subject to manual review). It is safe to use a converted heading on a bib record that uses a marker in the 987 field to indicate that it is in pinyin form. Conversion programs for bib records will not further change any record with a 987 field marked "PINYIN" in the $a subfield.
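
The check described above can be sketched in Python, assuming a bibliographic record is available as a list of (tag, subfields) tuples. That record shape and the function name are hypothetical; only the 987 field and the "PINYIN" value in $a come from the text.

```python
def is_marked_pinyin(fields):
    """Return True if the record carries a 987 field with $a PINYIN.

    fields: iterable of (tag, [(code, value), ...]) tuples (hypothetical shape).
    """
    for tag, subfields in fields:
        if tag != "987":
            continue
        for code, value in subfields:
            if code == "a" and value.strip() == "PINYIN":
                return True
    return False
```
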

Bibliographic Records (May 30, 2001)

We intend to expand and clarify this description of how bibliographic records are being converted in the coming months. In the meantime, if you have a question about any aspect of the conversion of bib records, or need further clarification, please contact Philip Melzer at pmel@loc.gov.

  1. Scope of Conversion
  2. RLG is converting all bibliographic records in the RLIN database which are coded Chinese (CHI) in the 008/35-37 field, beginning with clusters containing Library of Congress records having library identifiers DCLP or DCLC. OCLC has converted Chinese serial records, and is now converting bibliographic records coded Chinese (CHI) in the 008/35-37 field, beginning with the most recent records; then proceeding to convert other records in WorldCat.

    The conversion program either converts what it finds in a subfield, or analyzes the subfield to determine if it should proceed to convert the syllables therein. All variable fields will be subject to conversion, following Library of Congress specifications, with the following exceptions: 210, 222, 246$i, 520, 546, 561$a, 60030$a (family names), and 653.
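
The scope test and the list of exceptions can be expressed as a short Python sketch. It assumes the 008 field and the indicators are plain strings, and it reads "60030$a (family names)" as field 600 with first indicator 3; that reading and the function names are assumptions, not part of the specifications.

```python
# Whole fields excluded from conversion.
EXCLUDED_TAGS = {"210", "222", "520", "546", "653"}
# Individual subfields excluded from conversion.
EXCLUDED_SUBFIELDS = {("246", "i"), ("561", "a")}


def is_chinese_record(field_008):
    """Return True if 008/35-37 is coded Chinese (positions counted from 0)."""
    return field_008[35:38].lower() == "chi"


def subject_to_conversion(tag, code, indicators=""):
    """Return True unless the subfield is on the exception list above."""
    if tag in EXCLUDED_TAGS:
        return False
    if (tag, code) in EXCLUDED_SUBFIELDS:
        return False
    # Assumption: family names are 600 fields with first indicator 3.
    if tag == "600" and code == "a" and indicators[:1] == "3":
        return False
    return True
```
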



  3. Extent of Conversion of Individual Records
  4. Chinese romanization will be consistent on the vast majority of bibliographic records after they have been converted. However, because data dictionaries and conversion sequences were not applied in the same manner to all subfields, certain syllable strings will convert differently in different subfields. For example, if mixed text procedures have been applied to subfield A but not to subfield B, a certain string of syllables in subfield A may not convert, while the same string in subfield B would. Because a data dictionary is applied to a subject heading but not a descriptive string, the subject heading may convert correctly but the same term appearing in a title may not. Most of the bibliographic records containing such inconsistencies will be marked for review. Some other problems may require special searching after conversion.

    We have tried to explain these inconsistencies in the following section of the pinyin home page. If you encounter an inconsistency in conversion that you don't understand, or have difficulty in constructing searches to find all instances of a certain kind of conversion, please contact the Library of Congress or your users group for assistance.

    A. Marker

    A local MARC field 987 has been added to all bibliographic records being converted to pinyin by machine program at RLG and OCLC. After October 1, 2000, libraries should add the 987 field to all records that are coded Chinese in the 008/35-37 language field, or any other bib record containing romanized Chinese characters. The 987 field shows that a record has been created in pinyin, so the utilities' conversion programs will not convert it. Adding field 987 to a record therefore prevents erroneous conversion.

    For example, the marker would be added to bibliographic records under these circumstances:

    a fully romanized Chinese record

    a romanized name appears in a statement of responsibility, with a corresponding romanized added entry

    a romanized name appears in a statement of responsibility, but the corresponding added entry is not given in romanized form

    a note includes romanized Chinese text, but there are no romanized Chinese access points on the record

    Updated instructions, March 20, 2002

    Continue adding a pinyin marker to all records that are coded Chinese in the language code.

    Insofar as possible, the Library of Congress will add the pinyin marker to new non-Chinese records on which romanized Chinese is present, as well as to non-Chinese records that are converted to pinyin, until completion of the conversion project. We have found that this application of the marker makes it possible for the Library to eliminate converted non-Chinese records from being processed again.

    OCLC users are being urged to add a pinyin marker to all non-Chinese records on which romanized Chinese is present.

    RLIN users are not being asked to add a pinyin marker to non-Chinese records by RLG.

    RLG and OCLC have substantially completed the machine conversion of bibliographic records in the RLG Union Catalog and WorldCat. Nevertheless, they urge the continued use of the 987 field until such time as there are no more Wade-Giles records being loaded into the two databases, or being created as a result of retrospective conversion projects.

    B. Headings:

    Headings in access points should convert in the same manner as authority records: see Authority Records, 2. Extent of Conversion on Individual Records, for a description of how headings on authority records converted.

    The forms of heading for several place names have changed since the data dictionaries in the conversion program were written. A list of these changed headings will also be posted in the near future, giving the heading as it will appear on the converted record, along with the more recent, correct forms. The list can be used to manually update those subject headings on bibliographic records so that they correspond with the subject authority records.

    EXAMPLE:

    Headings on bib records were converted to:
    Shenzhen jing ji Tequ (Shenzhen, Guangdong Sheng, China)
    But these headings now need to be changed to the current form:
    Shenzhen Jingji Tequ (Shenzhen, Guangdong Sheng, China)


    C. Pre-AACR2 Headings:

    Subfields in headings on bibliographic records were converted if they appeared to the conversion program to be in Wade-Giles romanization. Therefore, if a subfield in a pre-AACR2 heading was used on a bibliographic record that is converted by the machine program, it probably was converted to pinyin.

    D. Subject headings:

    D1. Personal Names (600 fields), Meetings (611 fields)

    Subject headings for personal names and meeting names converted in the same manner as they did in other access points. OCLC flagged 611$c names containing "Taiwan" for manual review, and "Taiwan" is noted in the 987$f subfield.

    D2. Corporate headings (610 fields), Uniform titles (630 fields), Topical Subject Headings (650 fields), Geographics (651 fields)

    Topical subject headings (650 fields) are converted by data dictionary only: if a topical subject heading in a 650 field exactly matches an entry in the data dictionary, the conversion is made; if there is not an exact match, no conversion takes place. Examples:
    650 -0 $a Ju-I (Scepters) == 650 -0 $a Ru yi (Scepters)
    650 -0 $a Feng-shui == 650 -0 $a Feng shui

    In fields 610 (corporate bodies) and 651 (geographics), the machine program first checks data dictionaries: if there is an exact match, a conversion is made; if there is not an exact match, then the machine program tries other routines to convert a string of syllables.

    EXAMPLES:

    651 -0 $a Wei River Valley (Kansu Province and Shensi Province, China) == 651 -0 $a Wei River Valley (Gansu Sheng and Shaanxi Sheng, China)
    [exact subject match]


    610 20 $a Yüan tung fang chih Group == Yuan dong fang zhi Group
    [exact subject match]


    610 20 $a Yen-ch'ing ta hsüeh == Yanjing da xue
    [no exact subject match; standard conversion]


    651 -0 $a Chen-ching shih (China) == 651 -0 $a Zhenjiang (Jiangsu Sheng, China)
    [no exact subject match, but there is an exact match in the data dictionary of conventional place names]


    651 -0 $a Fang-shan hsien (China) == 651 -0 $a Fangshan Xian (China)
    [no exact subject match; Xian is capitalized (generic place name); then standard conversion]
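
The difference between the 650 routine (data dictionary only) and the 610/651 routine (data dictionary first, then other routines) can be sketched in Python. The toy dictionary holds only examples quoted above, and `standard_convert` is a caller-supplied placeholder for the other conversion routines, which are not reproduced here.

```python
# A toy data dictionary built from examples in this section.
TOY_DICTIONARY = {
    "Ju-I (Scepters)": "Ru yi (Scepters)",
    "Feng-shui": "Feng shui",
    "Chen-ching shih (China)": "Zhenjiang (Jiangsu Sheng, China)",
}


def convert_subject(tag, heading, dictionary, standard_convert):
    """Convert a subject heading according to the routine for its field."""
    if tag == "650":
        # Dictionary only: without an exact match nothing is converted.
        return dictionary.get(heading, heading)
    if tag in ("610", "651"):
        # Dictionary first; otherwise fall back to the other routines.
        if heading in dictionary:
            return dictionary[heading]
        return standard_convert(heading)
    raise ValueError("tag %s is not covered by this sketch" % tag)
```
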


    The forms of heading for a number of subject headings have changed since the data dictionaries in the conversion program were written. Also, several subject headings were not converted. Lists of these changed and unconverted headings, as well as lists of all Chinese subject headings that converted in the years 1999 and 2000, and the Chinese subject headings that were changed to descriptive headings, are posted on the pinyin home page. These lists should be used to manually correct subject headings on bibliographic records so that they correspond with headings on subject authority records. [updated November 5, 2001]
    x30 00 $a Tun-huang manuscripts
    -- did not convert; it should convert to:
    x30 00 $a Dunhuang manuscripts

    651 -0 $a Mo-li miao Reservoir (China)
    -- has been converted to:
    651 -0 $a Moli Miao Reservoir (China)
    -- the heading now needs to be changed to the current form:
    651 -0 $a Muruin Sum Reservoir (China)
    D3. Headings for Regions

    Subject headings for regions that included certain multi-syllable generic terms were converted by data dictionary (see below, section E3). Most converted correctly when they appeared in subject headings. It appears at present that fewer than 100 Chinese bib records need to be corrected. At least that many non-Chinese records will also have to be located and corrected.

    Conversion failed to occur for several different reasons. Because of an error in the specifications (which has since been corrected), the heading Sinkiang Uigur Autonomous Region usually did not correctly convert to Xinjiang Uygur Zizhiqu in 651$a or 650$z subfields. Some headings for regions did not convert because of typographical errors; others because they included terms for former (non-Wade-Giles) conventional names.
    EXAMPLES:

    651 -0 $a Canton Region (China)… [change manually to Guangzhou Region (China)]

    651 -0 $a Taiyuan Shi Region (China) [change manually to Taiyuan Region (Shanxi Sheng, China)]

    650 -0 … $z Xingyi Shi Region (Guizhou Sheng) [change manually to Xingyi Region (Guizhou Sheng)]
    A few regions that should have converted as 'conventional place names' may also not have converted.
    EXAMPLE:

    650 -0 … $z Sinkiang Uighur Autonomous Region [change manually to Xinjiang Uygur Zizhiqu]
    Reviewing and cleaning up headings for regions also provides us with the opportunity to correct some existing errors in word order.
    EXAMPLES:

    650 -0 … $z Tangshan (Hebei Sheng) Region [change manually to Tangshan Region (Hebei Sheng)]

    650 -0 … $z Luoyang (Henan Sheng) Region [change manually to Luoyang Region (Henan Sheng)]

    651 -0 $a Luoyang Shi Region (China) [change manually to Luoyang Region (Henan Sheng, China)]
    E. Descriptive fields:

    E1. 2-syllable place names.

    The conversion programs almost always identified 2-syllable place names in corporate headings. Because testing revealed that 2-syllable place names could not reliably be identified and converted in descriptive subfields, conversion sequence G2 (Two-syllable place names, in which the second syllable is a generic term) was applied to headings for place names, but not to descriptive subfields (such as the 245$a, 260$b, 500$a and 740$a subfields).

    To compensate for this necessary shortcoming, records in which these sequences fired have been marked for review in the 987$f subfield. At least in these records, the descriptive strings in which these names are most likely to occur can be scrutinized and corrected if necessary. LC will correct these errors on a priority basis. When reviewing one of these records,
    1) check to make sure that the access point converted correctly, and
    2) check descriptive fields (245, 260, 440, etc.): if the 2-syllable place name appears there, correct the romanization of the name

    EXAMPLE 1:

    245 00 $a Lixian zhi… [change manually to Li Xian zhi…]

    651 -0 $a Li Xian (Sichuan sheng, China)… [change manually to Li Xian (Sichuan Sheng, China)…]
    651 -4 $a (Sichuan sheng, China)… [change manually to (Sichuan Sheng, China)…]

    987 $a PINYIN $b… $c… $d r $e… $f see descriptive cataloging for 2-syllable place name

    EXAMPLE 2:

    245 00 $a Huaxian zhi… [change manually to Hua Xian zhi…]

    651 -0 $a Hua Xian (Henan Sheng, China)
    651 -4 $a (Henan Sheng, China)

    987 $a PINYIN $b… $c… $d r $e… $f see descriptive cataloging for 2-syllable place name
    E2. Single syllable generic terms for jurisdictions

    Single syllable generic terms for jurisdictions have not been capitalized in descriptive strings because the conversion program could not identify them reliably. Conversion sequence G1 (Single syllable generic terms for jurisdictions used in proper names) has been applied to headings for place names, but not to descriptive subfields. Records with these terms have not been marked for review in the 987 field. LC will correct these sorts of capitalization problems on an as-encountered basis.
    EXAMPLE 1 from a single record:

    245 00 $a Xinmi shi zhi, 1986-1995 ... [change manually to Xinmi Shi zhi, 1986-1995...]
    245 00 $a , 1986-1995 ...

    651 -0 $a Xinmi Shi (China)
    651 -4 $a (China)

    987 $a PINYIN $b… $c… $d r $e… $f G1

    EXAMPLE 2:

    245 00 $a Chengdu shi zhi… [change manually to Chengdu Shi zhi…]

    710 2- $a Chengdu Shi di fang zhi bian zuan wei yuan hui.

    987 $a PINYIN $b… $c… $d r $e… $f G1
    E3. Multi-syllable generic terms may have converted incorrectly

    E3A. How these terms should have converted

    Conversion specifications called for the identification and conversion of 10 multi-syllable generic terms for jurisdictions (see list in specification G3). Wade-Giles romanization separated these syllables, both in descriptive text and access points. The conversion programs were to identify these 10 terms when they were used as part of place names, connect the previously unconnected syllables, and capitalize the resulting term.

    E3B. 3 categories of errors, with examples

    These terms converted correctly in most instances. However, several kinds of errors did occur in the conversion of these multi-syllable terms on LC's Chinese bibliographic records by RLG. These errors were not marked for review in the 987 field.

    1. Several strings of syllables which were not listed among the terms given in the conversion specifications were joined together erroneously. As far as we are aware today, these are the terms which have been created incorrectly, along with the number of LC Chinese records on which they occurred:
    diquan 78
    diqueh 5
    dujiaqu 1
    minzu 65
    tequan 7
    xingzhengquan 0
    zizhiquan 1
    zhuanquan 4
    The individual syllables in these terms should be separated, not joined. The LC bib records that had these errors have been corrected.
    EXAMPLES:

    245 00 $a Bi qiu ni Zhuanquan ji [change manually to Bi qiu ni zhuan quan ji]

    240 10 $a Zuo zhuan
    245 10 $a Zuo Zhuanquan yi [change manually to Zuo zhuan quan yi]

    245 10 $a Qian yi shi Diquan shi [change manually to Qian yi shi de quan shi]
    2. In certain instances, strings of syllables given in the specifications were joined in parts of the bibliographic record where they should have been left separate. We have just begun to identify the scope of the problem. We have searched for all of the terms given in the specification. The following chart shows the 10 multi-syllable terms, the number of LC Chinese records on which these converted terms appear, and, insofar as we can estimate at this time, the approximate number that will have to be corrected.
    TERM          HITS    NEED CORRECTION
    Diqu          1670    ca. 1100
    Dujiaqu       1       0
    Tequ          88      ca. 55
    Xingzhengqu   75      ca. 40
    Zhuanqu       11      2
    ziran         33      0
    Zizhiqi       11      0
    Zizhiqu       11      0
    Zizhiqi       1300    0
    Zizhixian     253     0
    Zizhizhou     356     0


    EXAMPLES:

    245 00 $a Guan yu fa hui Diqu you shi di yan jiu [change manually to Guan yu fa hui di qu you shi de yan jiu]

    245 00 … $c Hua dong Diqu da xue… [change manually to Hua dong di qu da xue…]

    245 00 $a Zhongguo li dai Xingzhengqu hua [change manually to Zhongguo li dai xing zheng qu hua]

    3. Following the specifications, the conversion programs occasionally converted a multi-syllable term in the wrong context. In other words, even though the terms were converted accurately according to the specifications, correction is still needed.
    EXAMPLE:

    245 10 $a Zhongguo Xingzhengqu hua gai lun [change manually to Zhongguo xing zheng qu hua gai lun]
    245 10 $a
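    The checks described in items 1 to 3 above can be approximated with a simple scan of a romanized string. The following is a minimal sketch, not the conversion program itself: the function name find_joined_terms and the two labels are hypothetical, and the term lists are copied from the lists above. It only flags candidates; choosing the correct separated form (for example, di quan versus de quan) still requires a cataloger.

```python
import re

# Terms that were joined in error (item 1 above).
ERRONEOUS_TERMS = {
    "diquan", "diqueh", "dujiaqu", "minzu", "tequan",
    "xingzhengquan", "zizhiquan", "zhuanquan",
}

# Terms from the specifications that may have been joined in the wrong
# place or context (items 2 and 3 above). "dujiaqu" appears in both
# source lists; the erroneous list is checked first.
SPEC_TERMS = {
    "diqu", "dujiaqu", "tequ", "xingzhengqu", "zhuanqu",
    "ziran", "zizhiqi", "zizhiqu", "zizhixian", "zizhizhou",
}

def find_joined_terms(text):
    """Return (word, label) pairs for words that need a manual look."""
    hits = []
    for word in re.findall(r"[\w']+", text):
        key = word.lower()
        if key in ERRONEOUS_TERMS:
            hits.append((word, "joined in error"))
        elif key in SPEC_TERMS:
            hits.append((word, "check context"))
    return hits

print(find_joined_terms("Bi qiu ni Zhuanquan ji"))
print(find_joined_terms("Zhongguo li dai Xingzhengqu hua"))
```

    A "check context" hit is not necessarily an error: the same term is correct when it forms part of a jurisdiction name.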
    E4. Zu, the generic term for People

    The term zu ("people") is normally separated from the "name of racial, linguistic, or tribal (grouping) of mankind", as stated in the pinyin romanization guidelines ("Separation of Character Romanizations", section 1; example given in section 2c). The exception to this rule occurs when the name of the people, along with the term zu, is included as part of a place name (section 2b). The exception is made to conform with BGN romanization practice.

    The pinyin conversion specifications make this distinction in conversion sequence G3 (Multi-syllabic generic terms for jurisdictions used as part of a proper name). Consider the following examples:
    EXAMPLE 1:

    245 00 $a Maowen Qiangzu Zizhixian Heihuxiang she hui diao cha bao gao
    245 00 $a 0

    440 -0 $a Qiang zu diao cha cai liao
    440 -0 $a

    The term zu is connected to the name of the people, Qiang, in the 245$a subfield by conversion sequence G3 because this term is part of a place name (Maowen Qiang People's Autonomous County). However, it is separated in the series statement because the title reads "Qiang People research material". Here, "Qiang People" is not part of a place name. [The conversion specifications did not call for 'xiang' to be connected to the preceding place name; this change will have to be made manually.]

    EXAMPLE 2:

    245 00 $a Yunnan Sheng Dehong Daizu Jingpozu Zizhizhou she hui gai kuang. $p Jingpo zu diao cha cai liao
    245 00 $a 0. $p 0

    The term zu is connected to the name of the people, Jingpo, in the 245$a subfield because it is part of a place name (Yunnan Province Dehong Dai People and Jingpo People's Autonomous District). The term zu is separated in the 245$p subfield because it is not part of a place name ("Jingpo People research material").
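
    The distinction drawn in the two examples above can be illustrated with a small heuristic. This is a sketch under stated assumptions, not a reproduction of conversion sequence G3: it assumes the input has zu written separately, treats only the four Zizhi... terms listed earlier in this document as jurisdiction terms, and uses a hypothetical function name, join_zu. It joins zu to the preceding capitalized name only when the next word is a jurisdiction term or another name that has just been joined.

```python
# Jurisdiction terms taken from the chart of multi-syllable terms above.
JURISDICTION_TERMS = {"Zizhiqi", "Zizhiqu", "Zizhixian", "Zizhizhou"}

def join_zu(text):
    """Join 'Name zu' to 'Namezu' when it is part of a place name."""
    tokens = text.split()
    out = []          # built right to left, reversed at the end
    joined = set()    # names already joined in this string
    i = len(tokens) - 1
    while i >= 0:
        tok = tokens[i]
        # out[-1] is the word immediately to the right of the current one.
        follows_place = bool(out) and (
            out[-1] in JURISDICTION_TERMS or out[-1] in joined
        )
        if tok == "zu" and i > 0 and tokens[i - 1][0].isupper() and follows_place:
            name = tokens[i - 1] + "zu"
            out.append(name)
            joined.add(name)
            i -= 2
        else:
            out.append(tok)
            i -= 1
    return " ".join(reversed(out))

print(join_zu("Maowen Qiang zu Zizhixian Heihuxiang she hui diao cha bao gao"))
print(join_zu("Qiang zu diao cha cai liao"))
```

    Working from right to left lets a chain such as Dai zu Jingpo zu Zizhizhou be joined in one pass, since each joined name licenses the one before it.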
    E5. de/di. See http://lcweb.loc.gov/catdir/pinyin/di2.html

    E6. A data dictionary may have been applied in one subfield but not another


    Data dictionaries were used in certain situations by the conversion program to replace one string of syllables with another. This made it possible to accomplish non-standard conversions. For example, topical subject headings (field 650$a) were compared with terms in a data dictionary; when a match occurred, the Wade-Giles form on the bibliographic record was replaced with the pinyin form from the data dictionary. If an exact match did not occur, no conversion was carried out. Data dictionaries were also used to convert conventional place names appearing in access points such as the 110$a and 651$a subfields. The basic conversion of syllables was accomplished through use of a data dictionary.

    However, it was not practicable to use data dictionaries other than the standard syllable table in descriptive subfields. For that reason, the same term appearing in a heading and in a descriptive string may have converted differently. Sometimes the results are correct, as shown in example 1; in other cases, one or more descriptive subfields may need to be corrected (as in example 2).
    EXAMPLE 1:

    245 00 $a Taiyuan bao wei zhan...
    245 00 $a 0...

    651 -0 $a Taiyuan (Shanxi Sheng, China)...
    651 -4 $a (Shanxi Sheng, China)

    987 $a PINYIN $b… $c… $d c

    EXAMPLE 2:

    245 00 $a Dongzhai niao lei zi ran bao hu qu ke xue kao cha ji… [change manually to Dongzhai Niaolei Ziran Baohuqu ke xue kao cha ji...]
    245 00 $a ...

    651 -0 $a Dongzhai Niaolei Ziran Baohuqu (China)
    651 -4 $a (China)

    987 $a PINYIN $b… $c… $d c
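
    The exact-match behavior of a data dictionary described in this section can be sketched as a plain lookup. This is an illustration only: the function name apply_dictionary is hypothetical, and the two entries are taken from the place-name examples given elsewhere in this document, not from the actual data dictionaries, which are not reproduced here.

```python
# Illustrative entries only; the real data dictionaries are defined in
# the conversion specifications.
PLACE_NAMES = {
    "Hai-k'ou shih (China)": "Haikou (Hainan Sheng, China)",
    "An-yang shih (China)": "Anyang (Henan Sheng, China)",
}

def apply_dictionary(value, dictionary=PLACE_NAMES):
    """Replace value only on an exact match; otherwise leave it unchanged.

    Returns (result, matched).
    """
    if value in dictionary:
        return dictionary[value], True
    return value, False

print(apply_dictionary("Hai-k'ou shih (China)"))
print(apply_dictionary("Hai-k'ou (China)"))  # no exact match, so no conversion
```

    Because the lookup is all-or-nothing, a descriptive subfield that contains the same term inside a longer string never matches, which is one way a heading and a title can end up in different forms.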
    E7. Ambiguous subfields

    The specifications direct the conversion program to process the 245$a subfield as mixed text, but to simply convert what is found in the 260$b subfield. In the example below, the 245$a subfield consists only of the syllables pei and tou, which were identified as being "common" to both Wade-Giles and pinyin but representing different pronunciations (see the mixed text portion of the conversion specifications). The subfield could therefore represent either Wade-Giles or pinyin romanization. To be on the safe side, the conversion program did not convert the subfield, but marked it for review. The same syllables in the 260$b subfield, however, were converted to bei dou.
    EXAMPLE:

    245 00 $a Pei tou. [change manually to Bei dou]
    245 00 $a 0.

    260 $a Xianggang : $b Bei dou chu ban wei yuan hui,
    260 $a : $b 0 ,

    987 $a PINYIN $b… $c… $d r $e… $f 245$a
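
    The ambiguity test described in this section can be sketched as follows. This is a simplified illustration with a hypothetical function name, is_ambiguous; the set contains only the two syllables named above, whereas the specifications define the full list of syllables common to both systems.

```python
# Only the two syllables named in this section; the full list is in the
# mixed text portion of the conversion specifications.
AMBIGUOUS_SYLLABLES = {"pei", "tou"}

def is_ambiguous(subfield):
    """True when every syllable could be either Wade-Giles or pinyin."""
    syllables = [s.strip(".,;:").lower() for s in subfield.split()]
    syllables = [s for s in syllables if s]
    return bool(syllables) and all(s in AMBIGUOUS_SYLLABLES for s in syllables)

print(is_ambiguous("Pei tou."))
print(is_ambiguous("Pei tou chu pan wei yuan hui"))
```

    A subfield for which this test is true would be left unconverted and marked for review rather than guessed at.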
    E8. Guangzhouese

    On Chinese bib records converted in RLIN, the word Cantonese was converted to Guangzhouese when it appeared in subject headings. This term has been manually corrected on all LC records.
    EXAMPLES:

    650 -0 $a Guangzhouese dialects – change manually to
    650 -0 $a Cantonese dialects

    650 -0 $a Cookery, Chinese $x Guangzhouese style – change manually to
    650 -0 $a Cookery, Chinese $x Cantonese style
    F. Mixed text:

    The conversion program analyzes the descriptive subfields that are most likely to contain a combination of Wade-Giles and non-Wade-Giles syllables. When mixed text is identified, the program does the following:

    STEP 1 it tries to break the subfield down into smaller subsections;
    STEP 2 it analyzes the syllables in each of the subsections;
    STEP 3 it converts any subsections that contain purely Wade-Giles text;
    STEP 4 it then analyzes the resulting subsections again;
    STEP 5 if every subsection consists either purely of Wade-Giles syllables (now converted) or entirely of non-Wade-Giles syllables, the program considers the subfield to have been converted;
    STEP 6 if mixed text remains in any of the subsections, the record is marked 'r' for review.
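
    The steps above can be sketched in a few lines. This is a minimal illustration under several assumptions, not the actual conversion program: the syllable table is a tiny hypothetical subset, the set of delimiters used to split a subfield is a guess, the function names are invented, hyphenated names such as Kuo-chen are not handled, and steps 2 to 5 are collapsed into a single pass.

```python
import re
import unicodedata

# Tiny illustrative subset of a Wade-Giles -> pinyin syllable table
# (assumption; the real table is part of the conversion specifications).
WG_TO_PINYIN = {
    "fu": "fu", "lu": "lu", "wu": "wu", "sa": "sa", "shih": "shi",
    "ch'i": "qi", "wen": "wen", "hsüeh": "xue", "ta": "da", "chi": "ji",
    "hu": "hu", "p'eng": "peng", "k'o": "ke",
}

# A word is a run of letters, optionally with internal apostrophes.
WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")

def classify(unit):
    """Label a unit 'wade-giles', 'other', or 'mixed'."""
    known = [w.lower() in WG_TO_PINYIN for w in WORD.findall(unit)]
    if known and all(known):
        return "wade-giles"
    if not any(known):
        return "other"
    return "mixed"

def convert_unit(unit):
    """Convert a purely Wade-Giles unit, keeping initial capitals."""
    def repl(match):
        word = match.group(0)
        pinyin = WG_TO_PINYIN[word.lower()]
        return pinyin.capitalize() if word[0].isupper() else pinyin
    return WORD.sub(repl, unit)

def convert_subfield(text):
    """Return (converted_text, marker): 'c' if done, 'r' if review is needed."""
    text = unicodedata.normalize("NFC", text)
    parts = re.split(r"(\s*[:;/=]\s*)", text)  # STEP 1: split, keep delimiters
    out, needs_review = [], False
    for part in parts:
        kind = classify(part)                  # STEP 2
        if kind == "wade-giles":
            out.append(convert_unit(part))     # STEP 3
        else:
            out.append(part)
            needs_review = needs_review or kind == "mixed"  # STEPS 4 to 6
    return "".join(out), ("r" if needs_review else "c")

print(convert_subfield("Fu lu: wu sa shih ch'i wen hsüeh ta shih chi: p. 284-289."))
```

    Units that cannot be split any further, such as an English sentence containing Wade-Giles names, stay untouched and cause the 'r' marker to be returned.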

    EXAMPLES:

    500 $a Fu lu: wu sa shih ch'i wen hsüeh ta shih chi: p. 284-289. == 500 $a Fu lu: wu sa shi qi wen xue da shi ji: p. 284-289.
    [Wade-Giles text was separated from non-Wade-Giles text and converted]


    245$a Ta T'ang pi hua / $c Li Kuo-chen pien chuan ; [lin mo T'ang Ch'ang-tung] = Magnificent frescos from the great Tang dynasty / compiled by Li Guozhen ; [reproduction Tang Changdong]. == 245 $a Da Tang bi hua / $c Li Guozhen bian zhuan ; [lin mo Tang Changdong] = Magnificent frescos from the great Tang dynasty / compiled by Li Guozhen ; [reproduction Tang Changdong].
    [the $c subfield was identified as having mixed text; the conversion program then subdivided the subfield into smaller units; the first two units (up to the semicolon that follows Wade-Giles chuan) were converted; the units that followed could not be converted, so the record was marked; do not convert those portions, since they were transcribed from the item]

    500 $a A collection of essays published in various journals written by Hu P'eng and Hu K'o.
    [Wade-Giles text could not be separated from the non-Wade-Giles text, so the field was not converted and the record was marked; convert manually]


    500 $a Reprint of the 1940 ed. published by Kuo hsüeh cheng li she, Shanghai.
    [Wade-Giles text could not be separated from the non-Wade-Giles text, so the field was not converted and the record was marked; convert manually]


    245 $b A study of Chang-sha ware in Tang dynasty
    [Wade-Giles text could not be separated from the non-Wade-Giles text, so the field was not converted and the record was marked; review but do not convert because the text was not romanized but transcribed]


    830 -0 $a Shan-tung sheng chih (Series)
    [Wade-Giles text could not be separated from the non-Wade-Giles text, so the field was not converted and the record was marked; convert manually]


    246 14 Reminiscences of Mr. Wei Yung-ning
    [Wade-Giles text could not be separated from the non-Wade-Giles text, so the field was not converted and the record was marked; review but do not convert because the text was not romanized but transcribed]
    The mixed text procedure will identify certain typographical errors as constituting a combination of Wade-Giles and other syllables, and will mark them for review.
    EXAMPLES:

    245 10 … /$c Hsü Hüeh-lu pien chu.
    [typo Hüeh identified by conversion program as non-Wade-Giles syllable, so the field was not converted and the record was marked; convert manually]


    246 30 $a Chung-kuo ching chi she hui fa chan chan lu¨eh
    [typo lu¨eh identified by conversion program as non-Wade-Giles syllable, so the field was not converted and the record was marked; convert manually]
    In the example below, mixed text processing was applied to the 245$a and 580$a subfields. The 245$a subfield did not convert because the syllables were identified by the conversion program as being ambiguous (see section E7, Ambiguous subfields, above). The personal names and corporate body in the 580$a subfield were not converted because the program could not separate a convertible string of syllables from those that are not convertible. However, Wade-Giles personal names in access points, such as the 700 fields, were always converted to pinyin (but see section 3, Excluded from Conversion, below).
    EXAMPLE:

    245 00 $a Kuo feng pao.

    580 $a Photoreprint of the periodical (3 no. a month; some numbers called also: Kouk fong po) published by Kuo feng pao kuan, Shanghai and edited by Ho Kuo-chen and Liang Ch'i-ch'ao.

    700 1 $a He, Guozhen.

    700 1 $a Liang, Qichao, $d 1873-1929.

    987 $a PINYIN $b… $c… $d r $e… $f 245, 580
    G. Non-roman (880) fields: Certain qualifiers for headings in non-roman (880) fields may not convert to the same form found in their matching roman fields. The conversion program identifies conventional place names from a list (data dictionary) and converts them to a specific form. When the geographic qualifier for one of these conventional place names changes on other than a one-to-one basis, the qualifier in the parallel field may not have converted in the same manner.
    EXAMPLES:

    651 -0 $a Hai-k'ou shih (China)
    651 -0 $a 0 (China) converted to
    651 -0 $a Haikou (Hainan Sheng, China)
    651 -4 $a (China)

    651 -0 $a An-yang shih (China)
    651 -0 $a (China) converted to
    651 -0 $a Anyang (Henan Sheng, China)
    651 -0 $a (China)

  5. Excluded from Conversion


  6. Headings for the personal names on the Exclusion List, numbering approximately 2400, were not converted to pinyin (see Authority Records, 3. Excluded from Conversion, for a description of the Exclusion List). The Exclusion List itself may be viewed in the conversion specifications. Certain pre-AACR2 forms of names of jurisdictions were also excluded from conversion.

  7. Caution: Do not use Converted Pinyin Name Headings on Unconverted Bibliographic Records!!
  8. If you use a converted pinyin heading on an unconverted bibliographic record, you risk its being converted to another form by a bibliographic conversion program. If you're not sure, look at the marker in the 008/07 field to see if the heading has been converted (or is subject to manual review). It is safe to use a converted heading on a bib record that uses a marker in the 987 field to indicate that it is in pinyin form. Conversion programs for bib records will not further change any record with a 987 field marked "PINYIN" in the $a subfield.
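
    The caution above amounts to a two-part check, sketched below. The function name and its parameters are hypothetical; the sketch assumes the 008/07 value of the authority record and the 987 $a value of the bibliographic record have already been extracted as strings (or None when the 987 field is absent).

```python
def risky_heading_use(heading_008_07, bib_987_a):
    """True when a converted heading would go on an unconverted bib record."""
    heading_converted = heading_008_07 == "c"
    bib_converted = (bib_987_a or "").strip() == "PINYIN"
    return heading_converted and not bib_converted

print(risky_heading_use("c", None))      # converted heading, unconverted bib
print(risky_heading_use("c", "PINYIN"))  # bib already marked as pinyin
```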



Library of Congress
Library of Congress Help Desk (11/07/01)