A. Stat/Transfer stores strings internally in Unicode, which can represent the characters of all languages, plus many, many other symbols. Most older character sets are much more limited in scope. For instance, the most common encoding, ASCII, can store only a handful of symbols, letters and numbers, since it has only 128 positions (codes 0 through 127) for characters and control codes. Other single-byte character sets double the number of positions to 256, which makes room for accented characters and other useful symbols. There are a number of such single-byte character sets; for instance, one is suitable for the Cyrillic alphabet and another for modern Greek.
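The difference in scope between these character sets can be checked with a short Python sketch. It uses Python's built-in codecs purely as an illustration and says nothing about how Stat/Transfer is implemented; the helper name `fits` is made up for this example.

```python
# Illustration with Python's standard codecs, not Stat/Transfer's internals.
# `fits` is a hypothetical helper name used only in this sketch.

def fits(text, encoding):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

plain_in_ascii = fits("e", "ascii")             # plain letter: fits in ASCII
accent_in_ascii = fits("é", "ascii")            # accented letter: outside ASCII
accent_in_latin1 = fits("é", "latin-1")         # present in a Western European set
greek_in_latin1 = fits("α", "latin-1")          # Greek letter: not in latin-1
greek_in_greek_set = fits("α", "iso8859-7")     # present in the Greek set
cyrillic_in_cyr_set = fits("Ж", "iso8859-5")    # present in the Cyrillic set
```

Each single-byte set covers its own alphabet and little else, which is why the same text cannot always move between them.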
When Stat/Transfer reads data, it converts it to Unicode based either on the character set settings in the encoding options or on information written in the input file. If your file does not carry information on its encoding, and it is in a character set other than the default encoding used on your computer, you must tell Stat/Transfer which encoding to use. For instance, if a Greek colleague sends you a Stata dataset, you may need to select a Greek character set in order to read it properly and translate it to a Unicode-based system such as Excel. If the dataset contains non-ASCII strings and you do not set the encoding properly, you will get nonsense on output.
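A minimal sketch of why the wrong input encoding produces nonsense, again using Python's codecs as a stand-in. The sample word and the choice of ISO-8859-7 as the sender's character set are assumptions made for the illustration.

```python
# Assumption: the sender's file stores Greek text in ISO-8859-7.
raw = "Αθήνα".encode("iso8859-7")    # the bytes as they would sit in the file

# Reading with a Western European default does not fail, it silently
# yields the wrong characters.
wrong = raw.decode("latin-1")

# Reading with the character set the file was written in recovers the text.
right = raw.decode("iso8859-7")

print(repr(wrong))
print(repr(right))
```

Note that the wrong decode raises no error at all, which is why the problem only becomes visible when you look at the output.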
Because all single-byte characters can be mapped to Unicode, there are seldom errors on input. However, you might encounter them if you are reading multi-byte characters, such as those used for Japanese.
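The contrast between single-byte and multi-byte input can be sketched in Python. The choice of latin-1 and Shift_JIS, and the deliberately truncated byte string, are assumptions for the example and are not taken from Stat/Transfer.

```python
# Every possible byte value has a Unicode mapping in a full single-byte
# set such as latin-1, so decoding cannot fail.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")

# A multi-byte encoding can contain invalid or incomplete sequences.
# Here the last byte of a two-byte Japanese character is cut off.
japanese = "日本語".encode("shift_jis")
damaged = japanese[:-1]

try:
    damaged.decode("shift_jis")
    input_error = False
except UnicodeDecodeError:
    input_error = True
```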
The most common problems occur on output, when a character that was read on input has no mapping in the output character set. For instance, if you read your Greek dataset and attempted to write it to SAS using your Western European machine default, there would be many encoding errors, because Greek characters in Unicode cannot be mapped to a character set such as latin1.
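As a rough illustration of an output-side failure, the following Python sketch tries to write Greek text to latin-1. The `errors="replace"` behaviour shown is Python's, used here only to make the unmappable characters countable; how Stat/Transfer itself reports or counts such errors is not shown.

```python
greek = "Αθήνα"

# Strict encoding refuses the text outright.
try:
    greek.encode("latin-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With replacement, each unmappable character becomes "?", so counting
# the question marks gives the number of encoding errors for this string.
replaced = greek.encode("latin-1", errors="replace")
error_count = replaced.count(b"?")
```

Every character of the Greek word is lost, so a whole column of such strings would produce one error per character.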
Some problems are more surprising, because it looks as if you are dealing with ASCII, but your file contains some characters that cannot be represented in the output. For instance, all Microsoft applications use Unicode, and characters such as the left apostrophe cannot be mapped to common non-Unicode character sets. The same is true of the Euro sign, which is not present in ISO-8859-1 but is present in its more modern replacement, ISO-8859-15. If any of these characters are present, they create a potential for encoding errors.
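One way to spot such hidden characters before a transfer is to scan the strings yourself. The sketch below is a generic Python check, not a Stat/Transfer feature; the function name `find_unmappable` and the sample string are invented for the example, and the "left apostrophe" is assumed to be U+2018.

```python
# Hypothetical helper: list the characters that a target encoding lacks.
def find_unmappable(text, encoding):
    """Return (index, character) pairs that cannot be encoded."""
    hits = []
    for index, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            hits.append((index, ch))
    return hits

# Looks like plain ASCII at a glance, but contains curly quotes and a Euro sign.
sample = "\u2018quoted\u2019 costs 5 \u20ac"

in_8859_1 = find_unmappable(sample, "iso8859-1")    # quotes and Euro sign fail
in_8859_15 = find_unmappable(sample, "iso8859-15")  # only the quotes fail
```

Running a check like this on a few suspect columns shows which output character set, if any, can hold the data without loss.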
In order to simplify matters, the most current builds of Stat/Transfer substitute characters very freely; for instance, an é will go to e if the accented character is not present and an
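Hello OP, after converting a few tens of thousands of records I got the error "Optimization error: The number of encoding errors executed the user-defined limit, execution will be stopped." When I then tried to change the setting under Data viewer options, I could not raise this limit any further; it only goes up to a few tens of thousands. The main problem is that I have several million records, so is there any way to solve this?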