Instructions for Concordancing Chinese E-Texts
using Wenlin
The Wenlin program can search for all instances of a word or string in one or more e-texts. For best results, first place the file or files to be concordanced in their own subdirectory. Then open Wenlin and proceed as instructed below. These instructions use Wenlin 3.1 and a single source e-text: a GB-encoded, non-spaced e-text in simplified Chinese characters. The selection is chapter 2 of the Honglou Meng (红楼梦) that is included with the Wenlin program.
- As the first step, in the Wenlin program, make sure that the character form selected for input (traditional or simplified) matches the e-text. To check, go to Options in the menu bar, and see whether Simple form characters is checked/ticked. It should be unchecked if the e-text is in traditional Chinese characters, and checked/ticked if the e-text is in simplified Chinese characters. In the screenshot shown below, Simple form characters is not checked/ticked. Since the source file contains simplified characters, click on Simple form characters to select input of simplified Chinese characters.
![]()
- Next, from the menu bar, select Search, and then Search files ....
![]()
- The character to be searched for is jiāng 将, as displayed in the screenshot below. Type its Pinyin romanization, with or without a tone number (1, 2, 3, 4), click Convert (or type a slash (/) after the romanization), and then select the correct character from the options displayed in the conversion bar. Then click OK.
![]()
- Next, choose the folder containing the file(s) to be searched. In this case, the folder is "Ch 2," containing one file, namely, 002.GB. Observe that Search next file after 3 matches is checked/ticked. Uncheck/untick that default option, because we want to search for all instances of 将, and not just the first three occurrences. Now, click Open, to open the file. With the default selection of "*.*", all files (including just one, as in this case) in the folder are selected.
(Note: The default option is useful for multiple files for determining whether and which files contain at least three instances of the searched item.)
![]()
- A pop-up window prompts for the selection of encoding. That is, you need to select which encoding system was used in the source e-text(s). In this case, the e-text was GB-encoded, as shown in the selection below. After the correct encoding is selected, click OK.
![]()
- As shown in the screenshot below, the search yields 15 occurrences (or "hits"). The source for each occurrence is also displayed. Clicking on any triangle button will display the full context where that particular token came from in that e-text.
(Note: Be sure that the "hand" icon in the toolbar (on the left or the right side of the screen, based on user selection) is highlighted before clicking on the triangle button.)
![]()
- The last screenshot below shows the result of clicking on the uppermost triangle button in the search results above. That token of 将 is displayed within a red square here (added to the screenshot for illustrative purposes), while that token in the full context is highlighted in yellow. (Highlighting, with user-selected color, is available in version 3.1 of the Wenlin program.) The search results can be copied and pasted to a new file and saved for further study later.
![]()
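The search performed in the steps above can also be approximated outside Wenlin. The following is a minimal Python sketch, not Wenlin's own method: it decodes each file in a folder, finds every occurrence of a search string, and keeps a little surrounding context. The function names (`find_hits`, `search_folder`), the `max_per_file` parameter (which mimics the "Search next file after 3 matches" option), and the default codec `gb18030` (a superset of GB2312, assumed here to be suitable for GB-encoded e-texts) are illustrative choices only.

```python
from pathlib import Path


def find_hits(text, term, width=10):
    """Return (offset, context) pairs for every occurrence of term in text."""
    hits = []
    start = text.find(term)
    while start != -1:
        left = max(0, start - width)
        right = start + len(term) + width
        # Flatten line breaks so each context fits on one line.
        context = text[left:right].replace("\n", " ")
        hits.append((start, context))
        start = text.find(term, start + 1)
    return hits


def search_folder(folder, term, encoding="gb18030",
                  max_per_file=None, pattern="*.*"):
    """Search every matching file in folder for term.

    encoding is an assumption about the source files; max_per_file=None
    keeps all hits, while max_per_file=3 stops after three per file.
    """
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding=encoding, errors="replace")
        hits = find_hits(text, term)
        if max_per_file is not None:
            hits = hits[:max_per_file]
        if hits:
            results[path.name] = hits
    return results
```

With the folder used above, a call such as `search_folder("Ch 2", "将")` returns a dictionary mapping each file name to its list of hits. Passing `max_per_file=3` gives a quick check of which files contain the searched item at all.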
From the above, you can see that the Wenlin software program can be used for concordancing. Equally important, the concordancing results are accurate. However, Wenlin was designed for language-learning purposes and was not intended to be used as a full-fledged concordancer. Hence, Wenlin has several limitations with respect to concordancing needs: there is no KWIC (keyword-in-context) display, for instance, and no sorting capability. To make concordances with KWIC-display results for Chinese e-texts and to sort the results, one would need to use dedicated concordancing software, such as R.J.C. Watt's Concordance program[1] or Michael Barlow's MonoConc Pro program.[2]
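To make the idea of a sorted KWIC display concrete, here is a small Python sketch. It illustrates the concept only and does not reproduce how any of the programs named above work. The function names `kwic` and `format_kwic`, the `width` and `sort_by` parameters, and the sample sentence are all made up for this illustration. Sorting is by Unicode code point, which is an assumption of convenience; it is not Pinyin or stroke order.

```python
def kwic(text, term, width=8):
    """Return (left, keyword, right) triples for every occurrence of term."""
    rows = []
    start = text.find(term)
    while start != -1:
        end = start + len(term)
        left = text[max(0, start - width):start].replace("\n", " ")
        right = text[end:end + width].replace("\n", " ")
        rows.append((left, term, right))
        start = text.find(term, end)
    return rows


def format_kwic(rows, width=8, sort_by="right"):
    """Line up the keyword in one column, optionally sorting by context."""
    if sort_by == "right":
        rows = sorted(rows, key=lambda row: row[2])
    elif sort_by == "left":
        # Compare left contexts reading outward from the keyword.
        rows = sorted(rows, key=lambda row: row[0][::-1])
    # Pad with the ideographic space (U+3000) so CJK columns line up.
    pad = "\u3000"
    return [f"{left.rjust(width, pad)} [{kw}] {right}"
            for left, kw, right in rows]


# Hypothetical sample text, not taken from the Honglou Meng.
sample = "他将书拿来，又将茶放下。将来再说。"
for line in format_kwic(kwic(sample, "将", width=4), width=4):
    print(line)
```

Each output line shows the keyword in brackets with a fixed amount of context on either side, which is the essential layout of a KWIC concordance. Changing `sort_by` to `"left"` groups the hits by what precedes the keyword instead of what follows it.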
Notes:
[1] See my online Instructions for Concordancing East Asian E-Texts using Concordance.
[2] See my online Instructions for Concordancing Chinese E-Texts using Monoconc Pro 2.2.
Created by Marjorie K.M. Chan on 9 April 2003 for a concordancing workshop at Kenyon College.
Last update: 25 March 2005 (with additional information and links since April 2003).
Copyright © 200x Marjorie K.M. Chan. All rights reserved.
URL: http://people.cohums.ohio-state.edu/chan9/conc/wenlin_conc.htm