Chapter 6

Using discs for information storage

The facilities that have been described so far cover most programming needs but are not entirely suited to a major class of tasks that BASIC can be used for: the storage and processing of large amounts of information, such as personnel records, stock lists, address books etc.

So far, discs have only been described as a way of storing programs for later use, but in fact, they can also be used to store information, as a kind of extension of the computer's memory. Information is stored on discs by writing it from variables to named files called 'data files'.

Using data files to store information has a number of advantages and disadvantages compared to using variables:

Files	Variables
Information preserved until deliberately erased or file deleted	Information easily lost, e.g. when the computer is turned off
Limited by disc capacity	Limited by computer memory
Easily shared between programs	Not easily shared between programs
Access time varies between nearly instantaneous and several seconds	Nearly instantaneous access
Information must be transferred to variables before it can be used in instructions	Information can be used directly in instructions

To summarise, data files provide a way of preserving information so it can be easily used later with the same or other programs, but take a little more effort and time to use than variables.

BASIC provides three types of data files, to suit different applications: 'sequential', 'random access' and 'keyed access'.

A sequential file is used rather like a DATA statement, containing information in the sequence it was written to the file, which must be read in that sequence.

A random access file is used more like an array, with each element numbered; to read or change a particular element, you simply specify its number.

Keyed access files are rather like random access files, but let you choose the element you want much more conveniently, by using text 'keys' such as a surname or stock description.

Sequential and random access files are described in this chapter; keyed files are described in Chapter 7.

6.1 General disc commands

BASIC provides a number of commands to help you use discs that, although useful when manipulating data files, also have a wider use.

While using the computer's operating system you have access to commands for listing, deleting, renaming and typing files. BASIC provides similar facilities through keywords with the same names and arguments:

To list files on a disc: DIR constant, eg. DIR *.BAS
To delete files on a disc: ERA constant, eg. ERA TEST.*
To rename a file: REN constant=constant, eg. REN NEW=OLD
To display the contents of a file: TYPE constant, eg. TYPE INFO.DOC

For details of these commands and the significance of their arguments, see Part I of this User guide.

The above BASIC commands are followed by constant text without double quotes, just like the operating system commands. BASIC provides equivalent commands that take string expressions as their arguments; the meaning of the argument is the same, but you can use the command just like a normal BASIC command like LOAD. These commands are FILES, KILL, NAME AS and DISPLAY:

To list files on a disc: FILES string, eg. FILES file.list$
To delete files on a disc: KILL string, eg. KILL "*.TMP"
To rename a file: NAME string AS string, eg. NAME oldname$ AS newname$
To display the contents of a file: DISPLAY string, eg. DISPLAY name$+type$

BASIC provides another command, RESET, which you should use before changing discs. This tells the operating system that the disc is about to be changed and 'closes' any open files (Section 6.2).

There is also a function, FIND$, which you can use to check files on a disc. It returns a null string if the file is not there, or a string containing information on the file if found. For example, to check if a file is on the disc:

        IF FIND$(file$) = "" THEN PRINT file$;" not found"
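As an illustration of how these commands combine, the following minimal sketch uses FIND$ to guard a NAME AS instruction, so that a file is only renamed if it exists and the new name is free. The two file names are hypothetical, chosen only for this example.

```basic
10 REM sketch: rename a file only if the new name is free
20 REM (both file names are hypothetical)
30 oldname$ = "INFO.DOC": newname$ = "INFO.BAK"
40 IF FIND$(oldname$) = "" THEN PRINT oldname$;" not found": END
50 IF FIND$(newname$) <> "" THEN PRINT newname$;" already exists": END
60 NAME oldname$ AS newname$
```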

6.2 Sequential access files

Sequential access files are the simplest form of data file and so will be dealt with first. Each of the keywords needed to use sequential files will be described first, followed by an example program to illustrate how they work together.

Sequential files can be created and read, but not changed (directly).

6.2.1 Creating a sequential file

Creating a sequential file involves three stages:

opening the file
writing information to the file
closing the file

The file cannot be read until it has been closed.

Opening the file: Before you can write to a sequential file, you must tell BASIC about it by opening the file using the keyword OPEN:

    OPEN "O", #file-number,file-name$

The "O" is the 'file access mode' and stands for output. This tells BASIC to create an entirely new file, deleting any existing file with the same name. Other modes ("I", "R" and "K") are described elsewhere.

Note that you cannot change parts of an existing sequential file.

The file-number is an integer in the range 1-3, which you will use in all subsequent instructions to refer to this file. This number must not be the same as the file number for any other file which is open. If you need to open more, you will have to close some of the files already open (see CLOSE, below). (It is possible to change this upper limit of three files: for details, see 'Mallard BASIC: Introduction and Reference'.)

The file-name$ is the complete name, including file type, that the operating system will use for the new file. This file must not be already open. If it already exists, it will be erased.

The file stays open until you close it (using CLOSE, RUN, BUFFERS, RESET or SYSTEM). You must not remove the disc on which the file has been opened until it has been closed.

Writing information to the file: Once the file has been opened for output, you can place information in it using the keywords PRINT # and WRITE #.

Each of these keywords is followed by the file number, a comma, then a list of data items (variables, constants or expressions) which will be written to the file. For example:

  PRINT #addressfile%,name$(i),address$(i),phone%(i)

PRINT # behaves almost exactly as PRINT, allowing the use of USING, SPC and TAB to format the output to the file. The only important difference is that PRINT # does not expand tab control codes (internal value 9) to spaces.

WRITE # is somewhat similar to PRINT #, but writes the data items to the file separated by commas, ignores print zones and writes strings surrounded by double quotes. WRITE # does not allow the use of the TAB, SPC or USING keywords.

The difference may be more obvious after the following examples, in which constants have been used for clarity:

    PRINT "Name","Address";"Telephone";1234

would be displayed as:

    Name                 AddressTelephone 1234

    PRINT #file%,"Name","Address";"Telephone";1234

would be stored exactly the same, as:

    Name                 AddressTelephone 1234

while

    WRITE #file%,"Name","Address";"Telephone";1234

would be stored as:

    "Name","Address","Telephone",1234

The need for these two different formats of output will only become apparent when you understand the way that information is read from a sequential file.

You are strongly advised to use WRITE # initially, as this automatically produces files which are easy to read correctly using INPUT #.

Each time a WRITE # or PRINT # instruction is executed for the file, the additional information is written to the end of the sequential file, until the file is closed.

(The information is not written to the file immediately, but initially stored in a small area of memory called a 'buffer'. The contents of the buffer are only written to disc when the buffer becomes full, or when you close the file. This buffering process makes programs more efficient, by reducing the number of times the disc drive has to be accessed.)

Closing the file: When you have finished writing to the file, you should close it using the keyword CLOSE. This has four main effects:

ensures that all information written to the file is actually transferred to it from the buffer
allows the file to be opened for reading
frees that file number for re-use
frees resources assigned to that file so another can be opened

To close all files that are open, use CLOSE on its own. To close particular files, use CLOSE followed by a list of file numbers, separated by commas. For example:

    CLOSE client.file,stock.file

All open files are closed automatically when any of the following commands are executed: RESET, RUN, BUFFERS and SYSTEM.
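The three stages described above can be put together as in the following minimal sketch. The file name and the data are hypothetical, chosen only for illustration.

```basic
10 REM sketch: create a small sequential file
20 REM (file name and data are hypothetical)
30 file% = 1
40 OPEN "O",#file%,"clients.seq"
50 WRITE #file%,"Smith","12 High Street",1234
60 WRITE #file%,"Jones","3 Mill Lane",5678
70 CLOSE file%
```

Going by the WRITE # example earlier in this section, the first entry would be stored as "Smith","12 High Street",1234. The file cannot be read until line 70 has closed it.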

6.2.2 Reading from a sequential file

Reading from a sequential file involves three stages: opening the file, reading the information and closing the file.

Opening the file: Before you can read from a sequential file, you must tell BASIC about it by opening the file using the keyword OPEN:

    OPEN "I", #file-number,file-name$

The "I" is the 'file access mode' and stands for input. Other modes ("O", "R" and "K") are described elsewhere.

The file-name$ is the complete name, including file type, of the existing file that you want to read. This file must not be already open. Opening a file for input does not erase or change it.

The file stays open until you close it (using CLOSE, RUN, BUFFERS, RESET or SYSTEM). You must not remove the disc on which the file has been opened until it has been closed.

Reading information from a sequential file: After opening the file for input, you can read information from it using the keyword INPUT #. This is followed by the file number, a comma, and a list of variables to which the information read will be assigned. For example:

    INPUT #file%,name$(i),address$(i),count(i)

Information is read from the file in strictly sequential order (hence 'sequential file'): the first variable in the first INPUT # instruction will be assigned from the first value in the file, the second variable from the second value, and so on. It is not possible to re-read an item; to do so, you must open the file for input again, then read items until you read the required item again.

The information in a sequential file can be used in two main ways:

by reading all the data items from the file into suitable arrays, then ignoring the file and using the information in these arrays. This technique can only be used if all the information will fit into the computer memory at the same time, but is generally worthwhile as the particular item of interest can then be reached very quickly, in any order. It is particularly suited to tasks such as 'looking up' information in a file or reorganising information before printing or changing a file.
by reading data items from the file sequentially, using each item or sequence of related items in turn before reading the next. This technique can be used with files of any size, as only the items just read need to be stored, but limits processing to those 'current' items. It is particularly suited to tasks like totalling figures, copying or printing an entire file.

The type and order of variables in an INPUT # instruction is largely dictated by the structure of the file, which is in turn determined by the instructions used to create it.

The simplest way to use sequential files is to create them with WRITE #, then read the information back with INPUT #, using the same variable types in the same order when creating and reading the file. For example, if the file was created using:

    FOR i = 1 TO 10
      WRITE #file%,name$(i),address$(i),telephone%(i)
    NEXT

then the information could be read back using:

    FOR i = 1 TO 10
      INPUT #file%,client$(i),address$(i)
      INPUT #file%,phone%(i)
    NEXT

Note that you need not use the same names for the variables and the variables need not be grouped exactly the same in the WRITE # and INPUT # instructions, just be written and read in the same order.

It is not even necessary to use exactly the same variable types as long as they are compatible (ie. both numeric or both string), and you are prepared to accept the rounding that may occur.

(If you want, you can even create files so that numeric data can be read back into numeric OR string variables. This is done by using the PRINT # command: see 'Mallard BASIC: Introduction and Reference'.)

If you try to read beyond the last data item in a file, this will generate an error. If you know how many items there are in the file when you write the program to read it, you can easily ensure that this limit is not exceeded, as in the example above. Failing this, it is entirely feasible to adopt a convention of always writing a unique value as the last entry in the file, then testing each item read against this.
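A minimal sketch of this end-marker convention follows. It assumes that the program which created the file wrote the string "*END*" as its final entry; the marker text and the file name are hypothetical, chosen only for this illustration.

```basic
10 REM sketch: read until an agreed end marker is found
20 REM (marker text and file name are assumptions)
30 file% = 1: marker$ = "*END*"
40 OPEN "I",#file%,"names.seq"
50 INPUT #file%,item$
60 WHILE item$ <> marker$
70   PRINT item$
80   INPUT #file%,item$
90 WEND
100 CLOSE file%
```

The marker must be a value that can never occur as genuine data, otherwise reading will stop early.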

However, BASIC provides a more elegant solution, in the form of the function EOF. EOF(file%) returns the value 0 while there are more items to be read from the file and -1 when the last item has been read. For example, to print all the entries in a file named "testfile.seq":

    file% = 1
    OPEN "I",#file%,"testfile.seq"
    WHILE NOT EOF(file%)
      INPUT #file%,item$
      PRINT item$
    WEND

(The keyword NOT is used here to invert the result of EOF(file%) so that the WHILE loop will continue until the end of the file is reached.)

Closing the file: When you have finished reading from the file, you should close it using the keyword CLOSE. This is not as important as closing a file which has been opened for output, but has three main effects:

allows the file to be opened for output
frees that file number for re-use
frees resources assigned to that file so another can be opened

To close all files that are open, use CLOSE on its own. To close particular files, use CLOSE followed by a list of file numbers, separated by commas. For example:

    CLOSE client.file,stock.file

All open files are closed automatically when any of the following commands are executed: RESET, RUN, BUFFERS and SYSTEM.

6.2.3 Changing a sequential file

BASIC does not provide you with facilities to change a sequential file directly. If you want to change a sequential file (ie. add to, delete or alter information in it), you must do so by creating an entirely new version, from information read from the old version and the changes required.

If the file is small enough, read the entire file into memory, change the information then write the revised version back to the disc, as illustrated in the phone book example below. Alternatively, you can read information from the file in batches, creating a new version with a different name in stages.
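The second approach might look something like the sketch below, which copies a file of name and phone number pairs an entry at a time, altering one entry on the way, then replaces the old version with the new one. The file names, the name being matched and the new number are all hypothetical.

```basic
10 REM sketch: change one entry by copying to a new file
20 REM (file names and data are hypothetical)
30 oldfile$ = "phone.seq": newfile$ = "phone.new"
40 OPEN "I",#1,oldfile$
50 OPEN "O",#2,newfile$
60 WHILE NOT EOF(1)
70   INPUT #1,name$,phone$
80   IF name$ = "Smith" THEN phone$ = "01 234 5678"
90   WRITE #2,name$,phone$
100 WEND
110 CLOSE 1,2
120 KILL oldfile$
130 NAME newfile$ AS oldfile$
```

Only one entry is held in memory at a time, so this works for files of any size, provided the disc has room for both versions until line 120 deletes the old one.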

Certain types of changes can however be made easily using the random access facilities described in Section 6.3.

6.2.4 Example programs

The following programs illustrate typical uses for sequential files. Don't be put off by their apparent triviality; they have been kept simple deliberately to avoid obscuring the salient points.

Example 1: simple statistical analysis

(This example illustrates the simplest way to use a sequential file - reading it an item at a time, using each data item immediately.)

The task: The task is to total, count and average all the numerical data entries in a sequential file, then print this information together with the maximum and minimum values read, when the whole file has been read. The average should reflect the average magnitude, ie. ignore the signs of the numbers, but all other results should be based on the signs.

The data in the file might be financial transactions, numbers of people at football matches, temperatures; it doesn't matter what. The data items will all be numbers, positive or negative, integer or floating point, in the range -9999.999 to +9999.999. The file may have between zero and twenty thousand entries. The name of the data file will be supplied by the user when the program is run.

If you want to test your understanding of the facilities described so far, attempt to design and write the program now, without reading any further.

The design: Since none of the processing requires access to previously read data, there is no need to try to store the data from the file in an array, which is just as well as larger files would not fit!

The program will consist of the following stages:

set up variables and screen
choose disc
choose file
read and process entries
finish calculations and print results
tidy up

The program: The following is one way of coding the functions required by the design:

10 REM example 1 - simple statistical analysis
20 REM *** set up variables ***
30 '
40 file% = 1
45 minimum = 9999.999: maximum = -9999.999: REM start at the limits of the specified range
50 '
60 PRINT cls$;"Statistical analysis program"
70 '
80 REM *** choose disc ***
90 '
100 FILES
110 PRINT "If the file you want to analyse is not on this disc,"
120 PRINT "change discs then press C."
130 PRINT "If this is the right disc, press any other key to continue"
140 i$ = ""
150 WHILE i$ = ""
160 i$ = INKEY$
170 WEND
180 i$= UPPER$(i$): IF i$ = "C" THEN RESET: GOTO 100
190 '
200 REM *** choose file ***
210 '
220 INPUT "Type the data file name, followed by RETURN";filename$
230 IF  FIND$(filename$) = "" THEN PRINT filename$; " IS NOT ON THE CURRENT DISC - TRY AGAIN": GOTO 220
240 '
250 REM *** read and process entries ***
260 '
270 PRINT "Analysing ";filename$;" - PLEASE WAIT": PRINT
280 OPEN "I",file%,filename$
290 WHILE NOT(EOF(file%))
300   count% = count% + 1
310   INPUT #file%,number
320   total = total + number
330   total.absolute = total.absolute + ABS(number)
340   minimum = MIN(number,minimum)
350   maximum = MAX(number,maximum)
360 WEND
370 '
380 REM *** finish calculations and print results ***
390 '
400 IF count% = 0 THEN PRINT filename$;" is empty":GOTO 500
410 average = total.absolute/count%
420 PRINT "RESULTS FOR FILE ";filename$:PRINT
430 PRINT "Final total:";TAB(40);total
440 PRINT "Average absolute value of an entry:";TAB(40);average
450 PRINT "Minimum value was:";TAB(40);minimum
460 PRINT "Maximum value was:";TAB(40);maximum
470 '
480 REM *** tidy up ***
490 '
500 CLOSE file%
510 END

You should find that this program is largely self-explanatory. The only part that is slightly complex is lines 140-170. These use the INKEY$ function to check the keyboard over and over again until a key has been pressed, and then pass that key, stored in i$, to line 180.

While this program is a reasonable solution, it is by no means perfect. For example, the user would have to press +C to quit, the variables should perhaps be double precision, the results could be formatted more neatly. Try enhancing the program along these lines.

Testing: The final stage in developing the program should be to test it, as realistically as possible. As you haven't got any of the files that the program is designed to work with, you must generate them. In keeping with the simple task, a simple program to generate a simple test file:

1  REM test1
10 OPEN "O",1,"test1.seq"
20 FOR a = 1 TO 10
30   WRITE #1,a,-a
40 NEXT
50 CLOSE 1

Run this program to generate the test file, then analyse it using the analysis program.

The ideal test produces results that you can easily check by independent means while pushing the program to its designed limits. A single test will rarely satisfy both aims. This simple test produces results that you can easily check by doing a little arithmetic, but does not include the number nor the wide range of numeric values that the specification requires the program to be able to cope with.

To save you the trouble of the arithmetic, the results should be:

Final total: 0

Average absolute value of an entry: 5.5

Minimum value: -10

Maximum value: 10
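To push the program closer to its designed limits, a second test file could be generated along the following lines. This is only a sketch: the file name is hypothetical, and the choice of values is just one way of reaching the twenty thousand entries and the extreme values allowed by the specification.

```basic
1  REM test2 - sketch of a larger test file (file name is hypothetical)
10 OPEN "O",1,"test2.seq"
20 REM the two extreme values allowed by the specification
30 WRITE #1,-9999.999,9999.999
40 REM 9999 pairs bring the file up to twenty thousand entries
50 FOR a = 1 TO 9999
60   WRITE #1,a,-a
70 NEXT
80 CLOSE 1
```

As every value is followed by its negative, the final total should again be zero, and the minimum and maximum should be the two extreme values written by line 30.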

Example 2: phone book

(This example shows how to use a sequential file to set up an array for processing in memory, reading and writing the file when needed.)

The task: The task is to provide a 'phone book', which will give the phone number corresponding to a name typed by the user, and allow the user to add names and phone numbers to the 'book'. To keep the program simple, other useful facilities such as editing existing entries and printing a sorted phone list will not be provided.

When specifying a name to search for, upper and lower case are to be considered equivalent, the whole name need not be specified and all entries matching the name will be displayed.

The maximum number of entries will be 100. The longest phone number (including the STD code) will be 12 digits. The STD code can be stored as an integral part of the number. The longest name will be 30 characters.

The phone book file will have a fixed name.

If you want to test your understanding of the facilities described so far, attempt to design and write the program now, without reading any further.

The design: As the number of entries is fairly small, the whole file can be read into an array in memory for ease and speed of processing. For the same reason, there is no need to sort the entries into a particular order; the computer can find any particular name quickly by checking each element in turn.

Part of the design is deciding the way that the information will be stored. There are three types of information in the file: the names and phone numbers (obvious) and the relationship between a name and a phone number. This could be stored in a number of ways:

as a list of names, list of phone numbers (the third name has the third number)
as a list of entries consisting of a name followed by its phone number
as a list of entries consisting of a name plus a list of indices to the phone numbers in a subsequent list (eg. "AdAstra Inc",4,7 means that AdAstra Inc has phone numbers 4 and 7 in the phone number list)

There is not much to choose between the first two methods. The third, while more complex, can cater better for more complex situations (such as one name having several phone numbers, or several names having a shared phone number).

The program will consist of the following stages:

set up variables and screen
read file and store entries
ask for and carry out instructions (search by name, add an entry, finish)
write file and tidy up

The program: The following is one way of coding the functions required by the design. Note in particular that the program cannot change the phone book file as such; it must change the information held in memory, then write the file all over again!

10 REM example 2 - phone book
20 '
30 REM **** set up variables ****
40 DIM name$(101),phone$(101): REM one spare element, as line 520 examines name$(entries + 1)
50 phone.file$ = "phone.seq": file% = 1
60 false = 0: true = -1
70 file.changed = false
80 PRINT "Phone book"
90 '
100  REM **** get data ****
110 GOSUB 290
120 '
130 REM **** get commands ****
140 command = 0
150 WHILE command < 5
160   PRINT "Phone book ";entries;" entries"
170   PRINT:PRINT "Search for number, Add a number or Finish (S/A/F)"
180   match$ = "SsAaFf": GOSUB 850: command = answer
190   ON command GOSUB 460,460,580,580:GOSUB 940
200 WEND
210 '
220 REM **** finish ****
230 IF file.changed = true THEN GOSUB 710
240 END
250 '
260 REM * * * * * *  SUBROUTINES * * * * * *
270 REM *** read file ***
280 REM ** check if there first! **
290 IF FIND$(phone.file$) <> "" THEN GOTO 370
300 PRINT "No phone book file on this disc."
310 PRINT "Change disc or start New phone book on this disc (C/N)?"
320 match$ = "CcNn":  GOSUB 850
330 IF answer = 3 OR answer = 4 THEN entries = 0: RETURN
340 GOSUB 800: GOTO 290
350 '
360 REM ** read entries **
370 PRINT: PRINT "Reading phone book: please wait"
380 OPEN "I",file%,phone.file$
390 entries = 0
400 WHILE NOT(EOF(file%))
410   entries = entries + 1
420   INPUT #file%,name$(entries),phone$(entries)
430 WEND
440 CLOSE file%
450 RETURN
460 REM *** search ***
470 IF entries = 0 THEN PRINT "Phone book empty!!!": RETURN
480 PRINT "Type name to search for followed by RETURN,"
490 INPUT "or just RETURN to skip search",search$
500 IF search$ = "" THEN RETURN
510 count = 1
520 WHILE count <= entries AND name$(count) <> search$
530   count = count + 1
540 WEND
550 IF count > entries THEN PRINT search$;" not found":RETURN
560 PRINT "Phone number:";phone$(count):RETURN
570 '
580 REM *** add entry ***
590 IF entries >= 100 THEN PRINT "No room!!!": RETURN
600 PRINT "Type name to add followed by RETURN,"
610 INPUT "or just RETURN to skip entry";name$
620 IF name$ = "" THEN RETURN
630 PRINT "Type phone number to add followed by RETURN,"
640 INPUT "or just RETURN to skip entry";phone$
650 IF phone$ = "" THEN RETURN
660 entries = entries + 1
670 name$(entries) = name$
680 phone$(entries) = phone$
690 file.changed = true: RETURN
700 '
710 REM *** write file ***
720 PRINT: PRINT "Writing phone book: please wait"
730 OPEN "O",file%,phone.file$
740 FOR count = 1 TO entries
750   WRITE #file%,name$(count),phone$(count)
760 NEXT
770 CLOSE file%
780 RETURN
790 '
800 REM * change disc *
810 RESET
820 INPUT "Insert disc with phone book on and press RETURN",a$
830 RETURN
840 '
850 REM * get key *
860 answer$ = INKEY$
870 WHILE answer$ = ""
880   answer$ = INKEY$
890 WEND
900 answer = INSTR(match$,answer$)
910 IF answer = 0 THEN GOTO 860
920 RETURN
930 '
940 REM * wait for key *
950 PRINT: PRINT "Press any key to continue"
960 WHILE INKEY$ = ""
970 WEND
980 RETURN

Again, this program is by no means perfect. For example it will not find a name if you type it in capitals when the phone book has it in small letters, or if you do not type the complete name. Similarly, when you add an entry, it does not check if there is already an entry with the same name, it simply places it at the end of the file.

Try changing the program to correct these deficiencies (HINTS: Use UPPER$ or LOWER$ to convert the entry being compared and the search string to the same case. Use LEN and LEFT$ to allow incomplete names to be found. Search through the array each time when adding and replace an existing entry that matches, only adding new ones to the end, but be careful to adjust the 'entries' count accordingly.)
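One possible way of following the first two hints is sketched below, as replacement lines 510 to 560 for the search subroutine of Example 2. It assumes the rest of the program is unchanged; the variables found and wanted$ are new names introduced only for this sketch.

```basic
510 found = 0: wanted$ = UPPER$(search$)
520 FOR count = 1 TO entries
530   IF UPPER$(LEFT$(name$(count),LEN(wanted$))) = wanted$ THEN PRINT name$(count);": ";phone$(count): found = found + 1
540 NEXT
550 IF found = 0 THEN PRINT search$;" not found"
560 RETURN
```

Line 530 compares only as many characters of each name as were typed, with both sides converted to capitals, and the FOR loop carries on after a match so that every matching entry is displayed.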

Testing: To test the program, run it to create a new phone book and add a few entries. Run it again, to check that it finds the existing phone book. Try all three commands to see if they work. If you have the patience, check that it will handle the limit of 100 entries correctly!

6.3 Random access files

Random access files are quite similar to sequential files: they also have names, must be opened before they are used, can have information written to or read from them and should be closed after use.

Random access files have two major advantages: the information in them can be read in any order (hence random rather than sequential) and that information can be changed, also in any order. (Information in sequential files can only be read in the order that it was written and cannot be changed directly - you must read the file into memory (in one go, or in chunks), changing any information that must be altered while it is in memory, then write it back to a new file on the disc.)

These differences make random access files more convenient and easier to use than sequential files for many applications. Against these advantages, random access files are a little more difficult to understand and use properly, and tend to take up more space on the disc for the same amount of information.

Information is stored in a random access file rather differently, too. In a sequential file, each data item is written or read separately and the different items can have different lengths. The items read from or written to a random access file are called 'records'. All the records in a file have the same fixed length. Each record can consist of a single data item (just like a sequential file) or a number of data items that are handled together.

Records provide a neat way of storing related pieces of data. For example, in a personnel file, there could be a record per person, containing the following data items:

Name (30 characters)
Address (50 characters)
Phone (10 characters)
NI number (9 characters)
Date of birth (6 characters)
Pay scale (1 character)
Insurance scale (1 character)

Keeping all this related information together makes a lot of sense!

To indicate the particular record that you want to read or write, you use its position in the file - its 'record number'. The first record is number 1, the second number 2, and so on. This is rather like the way you use array variables, by specifying the index (eg. name$(2)). In fact, you can use random access files rather like arrays of variables, which are no longer limited in size by the computer memory and can be shared easily between programs!

Random access files are often used to hold information on individual items, such as people, companies or products with one or a fixed number of records being used per item. Programs can then be designed to allow the users quickly and easily to select record(s) to view, change, print etc. The best way to design a file for this type of application is to position records in the file according to some numeric information that users associate with the item, such as an employee's code or an item's stock number. If this is not possible, you will usually have to write down the record number for each item, and have users specify the information required by this number, so that 'ACME Motor Insurance' becomes company 1, 'ACME National' company 2, etc. (Keyed files provide a much more elegant solution however - see Chapter 7!)

6.3.1 Creating a random access file

A random access file is created in five separate stages:

opening the file
defining the record layouts
assigning data to the record
writing the record to the file
closing the file

Opening the file: Before you can write information to a random access file, you must open it using the keyword OPEN:

  OPEN "R",#file-number,file-name

For example:

  
  OPEN "R",#3,"person2.fil"

The "R" indicates that the file is open for Random access (rather than "O" - sequential Output, "I" - sequential Input, or "K" - Keyed access). The significance of the file-number and file-name is as described for opening a sequential file, above.

You can also include another parameter at the end of the instruction to specify the record size, if the usual 128 characters is not acceptable, ie:

  OPEN "R",#file-number,file-name,record-size

This allows you to use disc space more efficiently when the data in each record is significantly less than 128 characters. It also allows you to use larger record sizes when 128 characters is not enough, but you must first change the maximum buffer size. How you do this is described in 'Mallard BASIC: Introduction and Reference'.

Opening a file for random access creates it, if it does not exist already, like opening it for sequential output. However, if the file already exists, it is not deleted by opening it for random access. This allows you to add or change information in a random access file, as described below. It does mean that if you want to be sure of creating an entirely new random access file, you should check if there is a file with the same name on the disc first and delete it if found (using FIND$ and KILL as described above).
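A minimal sketch of this check follows, using the file name and file number from the example above.

```basic
10 REM sketch: make sure a random access file starts empty
20 file$ = "person2.fil"
30 IF FIND$(file$) <> "" THEN KILL file$
40 OPEN "R",#3,file$
```

Line 30 deletes any existing file of that name, so the OPEN in line 40 always creates a new one.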

Opening a random access file makes BASIC reserve a space in memory for it, called a 'record buffer'. This is the same size as the record length (usually 128 characters). It is here that BASIC will assemble the information that you want to write for each record. Each file opened for random access has its own record buffer.

Although it is possible to open a random access file which is already open, this is not recommended.

Defining the record layouts: Information is placed in the record buffer (to prepare the record for writing to the file) by assigning it to special variables, called 'field variables'. These are a special form of string variable, set up using the FIELD instruction. This defines the record layout by dividing the record buffer up into individual areas (fields), each of which is defined by a field variable.

The FIELD instruction takes the form:

  FIELD #file-number,field-size AS field-variable[,field-size AS field-variable]

For example:

    FIELD #3, 10 AS name$, 30 AS address$, 10 AS phone$

The file-number is the number that you used to open the random access file; note that the record layout is only usable with this one file; it is not automatically available for all files.

The field-size specifies the number of characters that the field variable will have reserved for it in the record. Field variables must be string variables. The total number of characters (ie. the sum of all the field sizes in the instruction) must not exceed the record size (usually 128).

Note that the FIELD instruction does not write any information to the file; it simply labels parts (fields) of the record with the field variable names, to make it easy for you to set up records for writing to the file (see below). Any previous information stored in these field variables is lost; they immediately take the current contents of their field in the record.

You can define as many record layouts as you like for the file, so that you can write records with quite different layouts to the same file. For example, in the personnel file described above, you might occasionally want to use a different record type - a 'continuation record' - for an employee with a complex address. You would then need to define two different record layouts.

The main record type:

    FIELD #perfile%, 30 AS name$, 50 AS address$, 10 AS phone$,
                     9 AS ni.number$, 6 AS birth.date$, 1 AS pay.scale$,
                     1 AS ins.scale$, 1 AS cont.record$

The 'continuation record' type:

    FIELD #perfile%, 108 AS address.cont$

Field variables are rather special because of their fixed length (as defined in the FIELD instruction) and should only ever be assigned as described below.

Assigning information to a record

Information is placed in a record by assigning it to one of the string variables that have been associated with the file by a previous FIELD instruction.

This assignment must always take one of the following three forms:

  LSET field-variable = string-expression
  RSET field-variable = string-expression
  MID$ (field-variable,start,length) = string-expression

Assigning to a field variable using any other method breaks its association with the file, and uses it as a normal string variable again. The value assigned to the variable is not placed in the record buffer.
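The difference can be sketched as follows. The file name "demo.rnd" and the field name are illustrative, and running the sketch creates that file if it does not already exist.

```basic
10 OPEN "R",#1,"demo.rnd"
20 FIELD #1, 20 AS name$
30 LSET name$ = "Anne"     : REM placed in the record buffer
40 name$ = "Catherine"     : REM ordinary assignment: name$ is no longer a field variable
50 FIELD #1, 20 AS name$   : REM re-establish the association before using LSET again
60 CLOSE #1
```

After line 40, the record buffer still holds "Anne" padded with spaces; the value "Catherine" exists only in the ordinary string variable.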

The LSET command assigns characters from the string expression to the field variable, left justified. If the string expression is shorter than the field variable, spaces will be added to its (righthand) end to fill the field. If the string expression is longer than the field variable, the surplus characters (at the righthand end) will be discarded without error.

The RSET command behaves similarly, except that spaces will be inserted at the lefthand end if the string expression is too short.

The MID$ command copies the first length characters of the string expression into the field variable, starting at character position start; the rest of the field is left unchanged.

For example:

   FIELD file%,20 AS name$

After:

   LSET name$ =        "*****************************"
   name$ is:           "********************"

After:

   LSET name$ =        "Anne Elizabeth"
   name$ is:           "Anne Elizabeth      "

After:

   RSET name$ =        "Catherine Louise"
   name$ is:           "    Catherine Louise"

And then after:

   MID$(name$,2,3) =   "********************"
   name$ is:           " ***Catherine Louise"

Numeric information can be assigned to a field by first converting it to the equivalent string:

  LSET number$ = STR$(count)

However, this can be a very inefficient way to store numbers; for example, an integer such as -30000 that BASIC stores in two bytes would take up 6 characters in a file using this method. Also, some accuracy may be lost when converting from the internal binary representation to decimal. BASIC solves both potential problems by providing three functions for converting numeric information to a more compact 'string' form: MKD$, MKS$ and MKI$:

To convert a double precision number to an eight-character string, use MKD$(number).
To convert a single precision number to a four-character string, use MKS$(number).
To convert an integer to a two-character string, use MKI$(number).

For example:

    LSET number$ = MKI$(total)

These 'strings' will not display or print correctly.

Complementary functions are provided to convert these strings back to the numeric form (CVD, CVS and CVI) and are described in more detail below.
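The saving in space can be seen without using a file at all. The following sketch (the variable name packed$ is illustrative) compares the two forms of the integer -30000 and then converts the compact form back again:

```basic
10 REM compare the compact and decimal string forms of an integer
20 packed$ = MKI$(-30000)
30 PRINT LEN(packed$)          : REM length of the compact form (two characters)
40 PRINT LEN(STR$(-30000))     : REM length of the decimal form (six characters)
50 PRINT CVI(packed$)          : REM converts back to the number -30000
```

Printing packed$ itself would show two meaningless characters, which is why these 'strings' should only be stored and converted, never displayed.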

(Information can also be assigned to the record using PRINT # and WRITE #, but that will not be described further here; see 'Mallard BASIC: Introduction and Reference' for details.)

Writing the record

Having opened a random access file, defined a record layout for it and assigned data to that record, you are now ready to write the record to the file.

Records are written using the keyword PUT, which takes the form:

    PUT #file-number

    PUT #file-number,record-number

The file-number is the file number you used when opening the file.

Use the first form to write the file rather like a sequential file; each record is written after the last one written using PUT (or read using GET - see below). This is the quickest way to write a random access file, as the disc drive does not have to keep finding different parts of the file.

The second form will mainly be used when changing an existing file or creating a file with a complex structure. The record number is the position that you want the record to have in the file. Note that the length of the file is dictated by the highest record number written to it; if you write a single record using PUT #file%,1000 then there will be 1000 records in the file - 999 unused and one used!

Writing a record to the file does not change the record buffer; this is only changed by assigning to the field variables or reading the file using the same file number.
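The two forms of PUT might be combined as in the sketch below, which writes ten records one after another and then overwrites the third. The file name and record contents are illustrative assumptions.

```basic
10 REM write ten records in order, then overwrite record 3
20 REM (file name and contents are illustrative)
30 OPEN "R",#1,"demo.rnd"
40 FIELD #1, 20 AS name$
50 FOR n = 1 TO 10
60   LSET name$ = "Record" + STR$(n)
70   PUT #1
80 NEXT
90 LSET name$ = "Changed"
100 PUT #1,3
110 CLOSE #1
```

Because the record buffer is not changed by PUT, line 60 must set the field afresh on each pass of the loop; otherwise the same contents would be written ten times.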

Closing the file

A random access file should be closed when you have finished writing it, just like a sequential file, to ensure that all the information has been written to the disc.

To close all files that are open, use CLOSE on its own. To close particular files, use CLOSE followed by a list of file numbers, separated by commas. For example:

    CLOSE #client.fil,#stock.fil

6.3.2 Reading a random access file

A random access file is read in five separate stages:

opening the file
defining the record layouts
reading records from the file
using information from the record
closing the file

Opening the file

Before you can read information from a random access file, you must open it using the keyword OPEN:

  OPEN "R",#file-number,file-name

For example:

  OPEN "R",#3,"person2.fil"

You can also include another parameter at the end of the instruction to specify the record size, if the usual 128 characters is not acceptable, ie:

  OPEN "R",#file-number,file-name,record-size

This will usually be used when reading a file that was created with a record size of other than 128 characters, as it is usual (although not essential) to use the same record sizes when reading and when writing.

Defining the record layout

The simplest way to read a file is to define the record length and record layouts to be the same as those used to write it. In this way, you can be sure to read the information back the way it was written. There are good reasons for reading a file with a different record length, or with a different record layout, but they are beyond the scope of this introduction.

Record layouts are defined using FIELD, exactly as described for creating a file, above.

Reading records from the file

To read data from a random access file, you must first read the record containing it, using the command GET:

    GET #file-number

    GET #file-number,record-number

Using GET without a record number reads the next record after the last one read (using GET) or written (using PUT) using this file number. If the file has only just been opened, it reads the first record (record number 1). Using GET in this way you can treat a random access file a little like a sequential file.

Using GET with a record number you can read any record in the file.
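As a sketch of both forms, the following writes five illustrative records to a file called "demo.rnd" (an assumed name), then reads record 3 directly and the two records that follow it:

```basic
10 OPEN "R",#1,"demo.rnd"
20 FIELD #1, 20 AS name$
30 FOR n = 1 TO 5: LSET name$ = "Record" + STR$(n): PUT #1,n: NEXT
40 GET #1,3: PRINT name$      : REM record 3
50 GET #1: PRINT name$        : REM the next record, number 4
60 GET #1: PRINT name$        : REM the next record, number 5
70 CLOSE #1
```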

Detecting the end of a random access file is a little more difficult than for a sequential file. You can still use the EOF function, but must read the file at least once before doing so. Also, EOF may return a 'true' value even when you have not attempted to read beyond the end of the file, if the record you have just read was never written to (ie. is an empty record).

In practice, this means that if a program needs to be able to find the end of a random access file, that file must either be of a fixed length (which the program knows), or each record in the file must have been written to when it is created (using a suitable null value for empty records, such as spaces or binary zeros).
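One way of following this advice is sketched below: every record is written with spaces when the file is created, so a later scan of the known, fixed number of records never has to rely on EOF. The file name, record size and the limit of ten records are illustrative assumptions.

```basic
10 REM count the used records in a file of known, fixed length
20 REM (blank, all-space records mark unused entries)
30 limit = 10
40 OPEN "R",#1,"demo2.rnd",20
50 FIELD #1, 20 AS name$
60 FOR n = 1 TO limit: LSET name$ = "": PUT #1,n: NEXT
70 LSET name$ = "Anne": PUT #1,4
80 used = 0
90 FOR n = 1 TO limit
100   GET #1,n
110   IF name$ <> STRING$(20," ") THEN used = used + 1
120 NEXT
130 PRINT used;"of";limit;"records in use"
140 CLOSE #1
```

The example program later in this section uses the same idea, treating a name field full of spaces as an empty record.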

Using information from the current record

Information from the last record read for a particular file number is used by using the field variables as normal string variables. Note that the contents of these variables will change as soon as another record is read using this file number.

If there is more than one record layout defined for this file number, you can use the field variables from the different layouts in any combination.

Numeric information written to the file in the compact string form by using the MKD$, MKS$ and MKI$ functions should be converted back to the equivalent numbers using the complementary functions CVD, CVS and CVI:

To convert a double precision string to a double precision number, use CVD(string).
To convert a single precision string to a single precision number, use CVS(string).
To convert an integer string to an integer, use CVI(string).
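A complete round trip might look like the sketch below, which packs three numbers into one record, writes it, reads it back and converts each field with the matching function. The file name, field names and values are illustrative; the field sizes of 2, 4 and 8 characters follow the string lengths given for MKI$, MKS$ and MKD$.

```basic
10 REM store three numbers in compact form and read them back
20 OPEN "R",#1,"numbers.rnd",14
30 FIELD #1, 2 AS count$, 4 AS weight$, 8 AS balance$
40 LSET count$ = MKI$(250)
50 LSET weight$ = MKS$(12.5)
60 LSET balance$ = MKD$(1234.75)
70 PUT #1,1
80 GET #1,1
90 PRINT CVI(count$), CVS(weight$), CVD(balance$)
100 CLOSE #1
```

Each field must be converted with the function that matches the one used to write it; using CVS on a field written with MKI$, for example, would not give back the original number.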

(Information can also be read from the record with INPUT #, but this will not be described further here.)

Closing the file

When a program has finished reading a random access file, it should close it using the keyword CLOSE.

To close all files that are open, use CLOSE on its own. To close particular files, use CLOSE followed by a list of file numbers, separated by commas. For example:

    CLOSE #personnel%,#paye%

Although not as essential as when the file has been written or changed by the program, closing the file releases its record buffer and other system resources for use with other files.

6.3.3 Reading, writing and changing a random access file

As remarked above, the instructions used to open a random access file and define record layouts are exactly the same, whether the file is going to be read from or written to. In fact, you can easily read, change, then rewrite records when working with a random access file, by simply combining the keywords introduced above. This is illustrated by the following example program.
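Before the full example, the read-change-rewrite cycle can be sketched in a few lines. The file name "demo3.rnd", the record layout and the contents are illustrative assumptions.

```basic
10 REM read record 2, change one field, and write it back
20 OPEN "R",#1,"demo3.rnd",30
30 FIELD #1, 20 AS name$, 10 AS phone$
40 FOR n = 1 TO 3: LSET name$ = "Person" + STR$(n): LSET phone$ = "": PUT #1,n: NEXT
50 GET #1,2
60 LSET phone$ = "01234 5678"
70 PUT #1,2
80 GET #1,2: PRINT name$;phone$
90 CLOSE #1
```

Note the record number in line 70. A PUT without a record number writes the record after the last one read or written, so it would overwrite record 3 rather than replace record 2.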

6.3.4 Examples

Example 3 - simple personnel file

The task: To create, maintain and interrogate a simple personnel file, able to handle up to 500 personnel. The information to be stored is:

Employee number (3 digits)
Name (30 characters)
Address (50 characters)
Phone (10 characters)
NI number (9 characters)
Date of birth (6 characters)
Pay scale (1 character)
Insurance scale (1 character)

The facilities required are:

add a record
delete a record
replace a record
display a record

Records to be selected by employee number.

If you want to test your understanding of the facilities described so far, attempt to design and write the program now, without reading any further.

The design: The maximum amount of information to be handled (approximately 50,000 characters) precludes storing the data in a sequential file and processing it in memory, as does the need to be able to update the records easily.

Using a random access file to store the data, the record format must be decided next. Since records will be requested by employee number, the obvious order in which to put records is by that number. Since the employee number is 3 digits, this would produce a file of 999 records, which would be half full if the design limit of 500 employees is reached. This seems acceptable, so there is no need for more complex code: the record number for an employee will be the employee number. Note that this means that the employee number does not need to be stored in the record as it is given by the record number!

The overall design, then, is:

set up variables and screen
open file and set up record layouts
ask for and carry out instructions
close file and tidy up

The program: The example program below has been kept deliberately simple so as not to obscure the use of the random access keywords.

10 REM example 3 - personnel file program
20 '
30 REM ****set up variables****
40 file% = 1: limit = 500
50 '
60 REM ****user-defined functions****
70 '
80 DEF FNhead$(title$) = STRING$((74-LEN(title$))/2,"*") + "  " + title$ + "  " + STRING$((76-LEN(title$))/2,"*")
90 '
100 REM ****set up screen****
110 '
120 PRINT FNhead$("Personnel files")
130 '
140 REM ****set up file****
150 '
160 IF FIND$("person.rnd") = "" THEN GOSUB 1100 ELSE OPEN "R",file%,"person.rnd"
170 FIELD file%, 30 AS rec.name$, 50 AS rec.address$, 10 AS rec.phone$,9 AS rec.ni$, 6 AS rec.birth$, 1 AS rec.pay$, 1 AS rec.ins$
180 FIELD file%,128 AS record$
190 '
200 REM ****get a record****
210 '
220 PRINT: PRINT
230 INPUT"Type personnel number (1 to 500) followed by RETURN";person
240 IF person <1 OR person >limit THEN GOTO 230
250 '
260 PRINT FNhead$("Personnel number"+STR$(person))
270 PRINT
280 GET file%,person
290 '
300 REM ****get and obey an instruction****
310 '
320 IF rec.name$ = STRING$(30," ") THEN GOSUB 430 ELSE GOSUB 520: PRINT: PRINT
330 prompt$ = "Press F to finish or C to continue"
340 match$  =  "FfCc":GOSUB 1220
350 IF answer >2 THEN GOTO 220
360 '
370 REM ****tidy up****
380 '
390 CLOSE
400 PRINT cls$
410 END
420 '
430 REM ****subroutines****
440 '
450 REM ***empty record options***
460 '
470 PRINT "This employee not known"
480 match$ = "YyNn":prompt$ = "Add new record (Y/N)":GOSUB 1220
490 IF answer = 1 OR answer = 2 THEN GOSUB 800
500 RETURN
510 '
520 REM ***existing record options***
530 '
540 REM **print record**
550 '
560 PRINT "Name: ";rec.name$
570 PRINT "Address: ";rec.address$
580 PRINT "Phone number:   ";rec.phone$;TAB(40);"Date of birth:  ";LEFT$(rec.birth$,2);"/";MID$(rec.birth$,3,2);"/";RIGHT$(rec.birth$,2)
590 PRINT
600 PRINT "Nat. Ins No.: ";rec.ni$;TAB(30);"Pay scale: ";rec.pay$;TAB(60);"Ins. scale: ";rec.ins$
610 '
620 REM **offer and obey options**
630 '
640 match$ = "CcDdPpSs"
650 prompt$ = "Change, Delete, Print or Skip (C/D/P/S)":GOSUB 1220
660 ON answer GOSUB 800,800,690,690,1320,1320
670 RETURN
680 '
690 REM ***delete current record***
700 '
710 match$ ="YyNn"
720 prompt$ = "About to delete record. Press Y to delete, N not to"
730 GOSUB 1220
740 IF answer >2 THEN RETURN
750 LSET record$ = ""
760 PUT file%,person
770 PRINT "Record deleted"
780 RETURN
790 '
800 REM ***add/change record***
810 '
820 INPUT "Name";name$
830 INPUT "Address";address$
840 INPUT "Phone number";phone$
850 INPUT "Date of birth: day in month";day$
860 INPUT "               month in year";month$
870 INPUT "               year (last 2 digits)";year$
880 INPUT "Nat. Ins No";ni$
890 INPUT "Pay scale";pay$
900 INPUT "Ins. scale";ins$
910 '
920 match$ = "YyNn"
930 prompt$ = "OK?  Press N to retype, Y to continue"
940 GOSUB 1220
950 IF answer > 2 THEN 820
960 '
970 LSET rec.name$ = name$
980 LSET rec.address$ = address$
990 LSET rec.ni$ = ni$
1000 LSET rec.pay$ = pay$
1010 LSET rec.ins$ = ins$
1020 MID$(rec.birth$,1,2) = RIGHT$("0"+day$,2)
1030 MID$(rec.birth$,3,2) = RIGHT$("0"+month$,2)
1040 MID$(rec.birth$,5,2) = RIGHT$("0"+year$,2)
1050 LSET rec.phone$ = phone$
1060 PUT file%,person
1070 PRINT "New/changed details written"
1080 RETURN
1090 '
1100 REM ***create empty file***
1110 '
1120 PRINT "Creating empty file - please wait"
1130 OPEN "R",file%,"person.rnd"
1140 FIELD file%,128 AS record$
1150 LSET record$ = ""
1160 FOR a = 1 TO 500
1170   PUT file%
1180 NEXT
1190 PRINT "File created"
1200 RETURN
1210 '
1220 REM **get key**
1230 '
1240 PRINT: PRINT prompt$
1250 answer$ = INKEY$
1260 WHILE answer$ = ""
1270   answer$ = INKEY$
1280 WEND
1290 answer = INSTR(match$,answer$): IF answer = 0 THEN GOTO 1250
1300 RETURN
1310 '
1320 REM **print record**
1330 '
1340 LPRINT "Personnel record for employee number";person
1350 LPRINT "****************************************":LPRINT
1360 LPRINT "Name: ";rec.name$
1370 LPRINT "Address: ";rec.address$
1380 LPRINT "Phone number: ";rec.phone$;TAB(40);"Date of birth: ";LEFT$(rec.birth$,2);"/";MID$(rec.birth$,3,2);"/";RIGHT$(rec.birth$,2)
1390 LPRINT
1400 LPRINT "Nat. Ins No.: ";rec.ni$;TAB(30);"Pay scale: ";rec.pay$;TAB(60);"Ins. scale: ";rec.ins$
1410 RETURN

Many desirable facilities have been omitted, such as checking the validity of dates being typed, being able to print lists of personnel records or find personnel by information other than the employee number. You may care to design and implement these enhancements yourself.
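As a starting point for one of these enhancements, the sketch below lists the number, name and phone number of every non-empty record. It is a separate illustrative program rather than part of the example above; it assumes that "person.rnd" was created by that program and reuses the first three fields of its record layout.

```basic
10 REM list every non-empty record in the personnel file
20 file% = 1: limit = 500
30 IF FIND$("person.rnd") = "" THEN PRINT "No personnel file found": END
40 OPEN "R",file%,"person.rnd"
50 FIELD file%, 30 AS rec.name$, 50 AS rec.address$, 10 AS rec.phone$
60 FOR person = 1 TO limit
70   GET file%,person
80   IF rec.name$ <> STRING$(30," ") THEN PRINT person;rec.name$;rec.phone$
90 NEXT
100 CLOSE
```

Reading the records in order with GET file%,person keeps the loop variable and the employee number in step, since the record number is the employee number.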

Testing: Devise tests to check the basic functions of the program; that it creates a personnel file when there isn't one on the disc, doesn't when there is, and that the record display, add and delete facilities work correctly.
