There are various ways to add a space between the double-byte characters of GB- or Big5-encoded e-texts. This webpage describes one method: a set of instructions for adding spaces between double-byte characters using NJStar Communicator 2.2x. A demo version can be downloaded from NJStar's website at <www.njstar.com>.
- Start NJStar Communicator and select Universal Code Converter from the menu under the NJStar logo, as shown in the screenshot below.
- For a file that already exists on your computer, select Text in File. The program will prompt you to select your input source file. Next, select Input Code, Output Code, and Options. In the screenshot below, the Input Code selected is Chinese Auto-Detect, and the Output Code selected is Chinese GBK. That is, the Universal Code Converter can convert files from one encoding to another. In addition, there are various Options that one can select, among them Add Spaces, which is selected in the screenshot. (Other options include Stripping HTML Code, for example, which would strip away the HTML formatting in webpages.) Once the selections are made, the file is converted, and the converted file is placed in a subdirectory of the folder containing the source file. This GBK-encoded e-text file is then ready for full concordancing using R.J.C. Watt's Concordance software program.1
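For readers who want to see what the Add Spaces option amounts to, the following is a minimal Python sketch of the same idea. It is not NJStar's implementation, and its exact spacing rules may differ from the program's. The function names `add_spaces` and `convert_file` are hypothetical, and the sketch assumes that "double-byte character" can be approximated as any non-ASCII character, and that the input encoding is known in advance (there is no auto-detection here).

```python
def add_spaces(text: str) -> str:
    """Put a space between adjacent characters when either one is non-ASCII.

    Assumption: every non-ASCII character is treated as a double-byte
    (e.g. Chinese) character. Runs of ASCII text are left intact, and
    existing whitespace is never doubled.
    """
    out = []
    prev_wide = False
    for ch in text:
        wide = ord(ch) > 0x7F
        needs_gap = wide or prev_wide
        if out and needs_gap and not ch.isspace() and not out[-1].isspace():
            out.append(" ")
        out.append(ch)
        prev_wide = wide
    return "".join(out)


def convert_file(src: str, dst: str, in_code: str = "gbk", out_code: str = "gbk") -> None:
    """Read src in in_code, add spaces, and write the result to dst in out_code.

    Hypothetical helper: the caller must name the input encoding
    (for example "gbk" or "big5"); nothing is detected automatically.
    """
    with open(src, "r", encoding=in_code) as f:
        text = f.read()
    with open(dst, "w", encoding=out_code) as f:
        f.write(add_spaces(text))
```

Calling `convert_file("input.txt", "output.txt", in_code="big5", out_code="gbk")` would, under these assumptions, re-encode a Big5 file as GBK with a space between the Chinese characters; the file names here are placeholders.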
1 See my Instructions for Concordancing East Asian E-Texts using Concordance.
Created by Marjorie K.M. Chan on 8 April 2003. Last update: 24 March 2005.
(This webpage is based on an earlier version that was prepared for my Chinese linguistics classes.)
Copyright © 200x Marjorie K.M. Chan. All rights reserved.