2010-12-28
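GAUSS can easily read and write text files, and it can also use GAUSS data sets stored in its own binary format. Data files may have extensions such as .txt, .dat, .asc, and .fmt. Excel files can be read as well.
The load statement reads a text file whose data items are separated by spaces or commas. It takes two forms:
load vector_name[] = filename;
load matrix_name[rows,cols] = filename;
The first form reads all the data into a single vector and suits cases where the number of data items is not known in advance. The second form reads the data into a matrix with the given number of rows and columns. For example, to read expendure.txt from the current directory, use:
load x[1003,1]=expendure.txt;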

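Writing a text file is just as simple. The statement

output file = filename reset;

opens a file and writes to it from the beginning. From then on, anything produced by print statements is shown on screen and also written to the specified file. If reset is replaced by on, the output is appended to the end of the specified file instead. To turn off screen output, use screen off; and turn it back on with screen on;. File output can also be suspended with output off; and resumed with output on;. For beginners it is best to stick to plain print statements: they are simple and clear, and the results appear only in the Command Input-Output window of GAUSS. Using too many file-output statements makes a program harder to follow.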
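A minimal sketch of the difference between reset and on, assuming a hypothetical output file named results.out in the current working directory:

```gauss
x = { 1 2, 3 4 };

output file = results.out reset;   /* create results.out, or overwrite it */
print "first run";
print x;
output off;                        /* stop writing to the file */

output file = results.out on;      /* reopen it: new output is appended */
print "second run";
print x;
output off;
```

After this runs, results.out should hold both printed blocks, the second one following the first.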


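Note: when loading a data file, the path must match the folder where the file actually resides.

For example, the statement output file=gmm.out reset; has no explicit path after the equals sign because the working directory has already been set to D:\gauss7.0\ex from the menu bar. The same holds for the statement load d[172,9]=habit.dat; that follows it. If the menu bar sets the path only to D:\gauss7.0, the statements that read data and write output must spell out the path, for example

load d[172,9]=D:\gauss7.0\ex\habit.dat;

otherwise GAUSS reports that it cannot find the file.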
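One way to make a program independent of the menu-bar setting is to keep the full path in a string and pass it to load with the ^ operator. This is only a sketch: the path is the one from the example above and is an assumption about where habit.dat lives on your machine.

```gauss
/* Path taken from the example in the text; adjust it to your installation. */
/* Backslashes are doubled because \ is the escape character in strings.    */
fname = "D:\\gauss7.0\\ex\\habit.dat";

print cdir(0);              /* show the current working directory */

load d[172,9] = ^fname;     /* ^ makes load use the contents of fname */
print rows(d) cols(d);      /* dimensions of the matrix just loaded */
```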


The files of a GAUSS data set have the extensions .dat and .dht. To open a GAUSS data set, use:



open x = data;




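For some packages, such as fanpac, it is best to read and understand the .src files in the package first, and then use its estimation routines to write your own programs.
That way you are not restricted to its calling conventions.
Some packages are poorly designed, especially in how they expect data to be loaded, which is not very intuitive.
Loading data in the .fmt matrix format is one example.
Here is an example: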
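new;cls;
load cpi[216,1]=tzl.txt;   /* load your own data from a text file */
save cpi;                  /* save it in .fmt format (cpi.fmt) */
open fx=cpi.fmt;           /* open the .fmt file */
y1=readr(fx,220);          /* read rows from the .fmt file */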
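This read, however, shows nothing in the input/output window.
To see the data you have to enter y1 in the Matrix Editor window under the Tools menu, and even there it cannot be copied; some other features may be missing as well.
So code that relies on the .fmt format for its data is, in a sense, too flexible, which is not a good thing either.

Readers should make sure they fully understand the material under the File I/O heading in the GAUSS User Guide.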

1、load
The load statement can be used to load the data directly into a GAUSS matrix. The resulting GAUSS matrix must be no larger than the limit for a single matrix.

For example,

load x[] = dat1.asc;

will load the data in the file dat1.asc into an Nx1 matrix x. This method is preferred because rows(x) can be used to determine how many elements were actually loaded, and the matrix can be reshape’d to the desired form:

load x[] = dat1.asc;
if rows(x) eq 500;
    x = reshape(x,100,5);
else;
    errorlog "Read Error";
    end;
endif;

For quick interactive loading without error checking, use

load x[100,5] = dat1.asc;

This will load the data into a 100x5 matrix. If there are more or fewer than 500 numbers in the data set, the matrix will automatically be reshaped to 100x5.

2、Writing

To write data to an ASCII file, the print or printfm command is used to print to the auxiliary output. The resulting files are standard ASCII files and can be edited with GAUSS’s editor or another text editor.

The output and outwidth commands are used to control the auxiliary output. The print or printfm command is used to control what is sent to the output file.

The window can be turned on and off using screen. When printing a large amount of data to the auxiliary output, the window can be turned off using the command

screen off;

This will make the process much faster, especially if the auxiliary output is a disk file.

It is easy to forget to turn the window on again. Use the end statement to terminate your programs; end will automatically perform screen on and output off.

The following commands can be used to control printing to the auxiliary output:

format Specify format for printing a matrix.

output Open, close, rename auxiliary output file or device.

outwidth Auxiliary output width.

printfm Formatted matrix print.

print Print matrix or string.

screen Turn printing to the window on and off.

This example illustrates printing a matrix to a file:

format /rd 8,2;
outwidth 132;
output file = myfile.asc reset;
screen off;
print x;
output off;
screen on;

The numbers in the matrix x will be printed with a field width of 8 spaces per number, and with 2 places beyond the decimal point. The resulting file will be an ASCII data file. It will have 132 column lines maximum.

A more extended example follows. This program will write the contents of the GAUSS file mydata.dat into an ASCII file called mydata.asc. If there is an existing file by the name of mydata.asc, it will be overwritten:

output file = mydata.asc reset;
screen off;
format /rd 1,8;
open fp = mydata;
do until eof(fp);
    print readr(fp,200);;
endo;
fp = close(fp);
end;

The output ... reset command will create an auxiliary output file called mydata.asc to receive the output. The window is turned off to speed up the process. The GAUSS data file mydata.dat is opened for reading, and 200 rows will be read per iteration until the end of the file is reached. The data read will be printed to the auxiliary output mydata.asc only, because the window is off.


3、String input and output

getf will read a file and return it in a string variable. Any kind of file can be read in this way as long as it will fit into a single string variable.
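As a rough sketch, reading a whole file with getf might look like the following. The file name dat1.asc is borrowed from the earlier examples, and the empty-string check assumes that getf returns a null string when the file cannot be read.

```gauss
s = getf("dat1.asc", 0);     /* second argument: 0 = text mode, 1 = binary */

if s $== "";
    errorlog "dat1.asc is empty or could not be read";
else;
    print strlen(s);         /* number of characters read */
endif;
```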

To read files sequentially, use fopen to open the file and use fgets, fputs, and associated functions to read and write the file. The current position in a file can be determined with ftell. The following example uses these functions to copy an ASCII text file:

proc copy(src, dest);
    local fin, fout, str;

    fin = fopen(src, "rb");
    if not fin;
        retp(1);                /* could not open the source file */
    endif;

    fout = fopen(dest, "wb");
    if not fout;
        call close(fin);
        retp(2);                /* could not open the destination file */
    endif;

    do until eof(fin);
        str = fgets(fin, 1024);
        if fputs(fout, str) /= 1;
            call close(fin);
            call close(fout);
            retp(3);            /* write failed */
        endif;
    endo;

    call close(fin);
    call close(fout);
    retp(0);
endp;
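A short usage sketch for the copy procedure defined above. The file names notes.txt and notes.bak are hypothetical; the return codes follow the retp calls in the procedure (0 on success, 1 or 2 if a file could not be opened, 3 if a write failed).

```gauss
/* Assumes the copy procedure above has been defined in the same program. */
rc = copy("notes.txt", "notes.bak");

if rc == 0;
    print "copy succeeded";
else;
    print "copy failed with code " rc;
endif;
```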


All replies
2010-12-28 20:39:07
4、open and readr
open f1 = dat1;

x = readr(f1,100);

The readr function in the example will read in 100 rows from dat1.dat. The data will be assigned to a matrix x.

loadd and saved can be used for loading and saving small data sets.
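For a small data set, a round trip with loadd and saved could be sketched as follows. The source data set dat1 is the one used above; the output name dat1copy is a hypothetical choice.

```gauss
x = loadd("dat1");                    /* read the whole data set into a matrix */
vnames = getname("dat1");             /* variable names stored in dat1 */

call saved(x, "dat1copy", vnames);    /* write a new data set, dat1copy.dat */
```

Because the whole data set is held in memory at once, this suits small files; for large ones the open and readr loop is the safer choice.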

The following example illustrates the creation of a GAUSS data file by merging (horizontally concatenating) two existing data sets:

file1 = "dat1";
file2 = "dat2";
outfile = "daty";

open fin1 = ^file1 for read;
open fin2 = ^file2 for read;

varnames = getname(file1)|getname(file2);
otyp = maxc(typef(fin1)|typef(fin2));
create fout = ^outfile with ^varnames,0,otyp;

nr = 400;
do until eof(fin1) or eof(fin2);
    y1 = readr(fin1,nr);
    y2 = readr(fin2,nr);
    r = minc(rows(y1)|rows(y2));    /* rows available in both blocks */
    y = y1[1:r,.] ~ y2[1:r,.];
    call writer(fout,y);
endo;

closeall fin1,fin2,fout;

In the previous example, data sets dat1.dat and dat2.dat are opened for reading. The variable names from each data set are read using getname, and combined in a single vector called varnames. A variable called otyp is created that will be equal to the larger of the two data types of the input files. This will ensure the output is not rounded to less precision than the input files. A new data set daty.dat is created using the create ... with ... command. Then, on every iteration of the loop, 400 rows are read in from each of the two input data sets, horizontally concatenated, and written out to daty.dat. When the end of one of the input files is reached, reading and writing will stop. The closeall command is used to close all files.

2010-12-28 20:39:40
5、Distinguishing Character and Numeric Data

Although GAUSS itself does not distinguish between numeric and character columns in a matrix or data set, some of the GAUSS Applications programs do. When creating a data set, it is important to indicate the type of data in the various columns. The following discusses two ways of doing this.

Using Type Vectors

The v89 data set format distinguishes between character and numeric data in data sets by the case of the variable names associated with the columns. The v96 data set format, however, stores this type of information separately, resulting in a much cleaner and more robust method of tracking variable types, and greater freedom in the naming of data set variables.

When you create a data set, you can supply a vector indicating the type of data in each column of the data set. For example:

data = { M 32 21500,
         F 27 36000,
         F 28 19500,
         M 25 32000 };

vnames = { "Sex" "Age" "Pay" };
vtypes = { 0 1 1 };

create f = mydata with ^vnames, 3, 8, vtypes;
call writer(f,data);
f = close(f);

To retrieve the type vector, use vartypef:

open f = mydata for read;
vn = getnamef(f);
vt = vartypef(f);
print vn';
print vt';

Sex   Age   Pay

0     1     1

The function getnamef in the previous example returns a string array rather than a character vector, so you can print it without the ‘$’ prefix.

Using the Uppercase/Lowercase Convention (v89 Data Sets)

This method is obsolete. To remain compatible with future versions, use vartypef and v96 data sets, that is, the Type Vectors method described earlier.

To distinguish numeric variables from character variables in GAUSS data sets, some GAUSS application programs recognize an “uppercase/lowercase” convention: if the variable name is uppercase, the variable is assumed to be numeric; if the variable name is lowercase, the variable is assumed to be character. The ATOG utility program implements this convention when you use the # and $ operators to toggle between character and numeric variable names listed in the invar statement, and you have specified nopreservecase.

GAUSS does not make this distinction internally. It is up to the program to keep track of and make use of the information recorded in the case of the variable names in a data set.

When creating a data set using the saved command, this convention can be established as follows:

data = { M 32 21500,
         F 27 36000,
         F 28 19500,
         M 25 32000 };

dataset = "mydata";
vnames = { "sex" AGE PAY };

call saved(data,dataset,vnames);

It is necessary to put “sex” in quotes in order to prevent it from being forced to uppercase.

The procedure getname can be used to retrieve the variable names:

print $getname("mydata");

The names are:

sex

AGE

PAY

When writing or creating a data set, the case of the variable names is important. This is especially true if the GAUSS applications programs will be used on the data set.

2010-12-28 20:39:58
6、Matrix Files

GAUSS matrix files are files created by the save command.

The save command takes a matrix in memory, adds a header that contains information on the number of rows and columns in the matrix, and stores it on disk. Numbers are stored in double precision just as they are in matrices in memory. These files have the extension .fmt.

Matrix files can be no larger than a single matrix. No variable names are associated with matrix files.

GAUSS matrix files can be load’ed into memory using the load or loadm command, or they can be opened with the open command and read with the readr command. With the readr command, a subset of the rows can be read. With the load command, the entire matrix is load’ed.

GAUSS matrix files can be open’ed for read, but not for append or for update.

If a matrix file has been opened and assigned a file handle, rowsf and colsf can be used to determine how many rows and columns it has without actually reading it into memory. seekr and readr can be used to jump to particular rows and to read them into memory. This is useful when only a subset of rows is needed at any time. This procedure will save memory and be much faster than load’ing the entire matrix into memory.
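The steps just described can be sketched roughly as follows. The matrix x and the row numbers are made up for illustration, and the sketch assumes that save writes x.fmt to the same directory that open searches.

```gauss
x = rndn(1000,4);
save x;                       /* writes the matrix file x.fmt */

open fh = x.fmt;              /* matrix files can be opened for read only */
if fh == -1;
    errorlog "cannot open x.fmt";
    end;
endif;

print rowsf(fh) colsf(fh);    /* dimensions taken from the file header */

call seekr(fh, 501);          /* move to row 501 */
y = readr(fh, 10);            /* read rows 501 through 510 only */

fh = close(fh);
```

Only the 10 requested rows are brought into memory, which is the point of using open and readr instead of load here.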

2010-12-28 20:42:34
7、In many packages the data format was written by the package author. For this part, see the Compiling Programs topic in the help User Guide.
Compiling Programs

Programs are compiled with the compile command.

Compiling a File

Source code program files that can be run with the run command can be compiled to .gcg files with the compile command:

compile qxy.e;

All procedures, global matrices and strings, and the main program segment will be saved in the compiled file. The compiled file can be run later using the run command. Any libraries used in the program must be present and active during the compile, but not when the program is run. If the program uses the dlibrary command, the .dll files must be present when the program is run and the dlibrary path must be set to the correct subdirectory. This will be handled automatically in your configuration file. If the program is run on a different computer than it was compiled on, the .dll files must be present in the correct location. sysstate (case 24) can be used to set the dlibrary path at run-time.

If you do not import your data in the format the author wrote, GAUSS will not be able to find it.

2010-12-28 20:42:57
8、Saving the Current Workspace

The simplest way to create a compiled file containing a set of frequently used procedures is to use saveall and an external statement:

library pgraph;

external proc xy,logx,logy,loglog,hist;

saveall pgraph;

List the procedures you will be using in an external statement and follow it with a saveall statement. It is not necessary to list procedures you do not explicitly call, but are called from another procedure, because the autoloader will automatically find them before the saveall command is executed. Nor is it necessary to list every procedure you will be calling, unless the source will not be available when the compiled file is use’d.

Remember, the list of active libraries is NOT saved in the compiled file so you may still need a library statement in a program that is use’ing a compiled file.
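A program that relies on the compiled file created above might start like this. It is a sketch only: it assumes pgraph.gcg was produced by the saveall statement shown earlier, and the data passed to xy are made up.

```gauss
use pgraph;                /* load the compiled file; use must come first */
library pgraph;            /* the library list is not stored in the file */

x = seqa(0, 0.1, 100);     /* 100 points: 0, 0.1, 0.2, ... */
y = sin(x);
xy(x, y);                  /* one of the procedures listed in external */
```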