VCE IT Lecture Notes by Mark Kelly, McKinnon Secondary College

Serial, sequential and random access
data files

2011 -SD KK4.1.03-methods of organising files to suit particular software needs, including serial, sequential and random access

 

SERIAL FILE

A serial file consists of sets of data of the same record type (but usually the records are of different lengths) that is stored in the same order in which the records were created. They have not been sorted. Examples are transaction files (in which sales are added to the file as they occur) or shopping lists, in which items are added to the end of the list as you think of them.

A search of a serial file is performed by sequential comparison of the query with identifiers of all stored items.

CSV (comma-separated values) is a common serial/sequential file format in which text fields are delimited by commas.

Sample serial file -SALES.DAT

"Smith", "Fred", "2", "Blue widgets", "$12.22", "23/8/2010"
"Adams", "Alexander", "9", "Green grommets", "$45.33", "25/8/2010"
"Stipe", "Michael", "4", "Bones of steel", "$1.20","12/5/2009"

 

SEQUENTIAL FILE

A sequential file is one in which the differently-sized records are stored in sorted order by one or more key fields. A class list is an example of a non-computerised sequential file sorted on surname, then first name.

Searching is a little easier in a sequential file because at least you can tell that you have gone past the item you are seeking.

Sample sequential file -SALESSORTED.DAT (sorted by name)

"Adams", "Alexander", "9", "Green grommets", "$45.33", "25/8/2010"
"Smith", "Fred", "2", "Blue widgets", "$12.22", "23/8/2010"
"Stipe", "Michael", "4", "Bones of steel", "$1.20","12/5/2009"

 

PROs and CONs of SERIAL and SEQUENTIAL FILES

In both serial and sequential files, to read item N, all items from 1 to N-1 must be read first. It is impossible to jump directly to a particular record. This makes serial/sequential files simple and fast to create, but slow and awkward to use later.

Think of a library where books are stacked in the order in which they are bought, and to find a particular book, you start at the first one and keep searching until you eventually find the one you want.

Unlike random access files in which you can instantly access any record without having to traverse all the records between where you are and where you want to be.

Think of serial & sequential files as cassette tape: if you want to listen to song 13, you have to fast-forward or rewind from your current position to the start of song 13. Compare it to a CD where you can lift the read head and drop it exactly where it needs to read next.

 

RANDOM ACCESS

Random access files have a predictable, strict structure that lets you directly seek and go to the start of a desired record. This strucure is only possible because each record has a fixed length.

A random file may be define as follows:

STRUCTURE SalesRecord

Surname AS TEXT * 15
Firstname AS TEXT * 10
Quantity AS INT
ItemName AS TEXT * 20
SaleTotal AS CURRENCY
SaleDate AS DATE

END STRUCTURE

Note that variable-length fields must have strict maximum lengths defined. The other field types have lengths defined by the programming language.

Because each record has exactly the same size, it makes to easy to find the beginning of a record in a file with some simple arithmetic.

Let's assume we create many records with the structure defined above. Each record contains:

Surname = 15 bytes
Firstname = 10 bytes
Quantity = 2 bytes
Itemname = 20 bytes
Saletotal = 4 bytes
SaleDate = 6 bytes

Total record length = 57 bytes.

So, when a series of such records are stored to disk or memory they look like this:

RECORD
1
2
3
4
5
Starts at byte #
1
58
115
172
229

So, to find the starting point of record N, you seek byte (N-1)*RecordLength+1. In this way, it makes it easy to randomly access any numbered record. You don't even need to do that bit of maths because the programming language calculates it for you. All you have to do is nominate a record number.

Of course, you need a way to find out which record number you're after, and that's where a separate index file comes in handy which is very quick to search and retrieve the corresponding record number in the random file.

 

PROs and CONs RANDOM ACCESS FILES

While random files are faster to search and access, they can both waste storage space and potentially cut off data. e.g. if a name field in a random file's record is defined as being 15 characters long, the name "JONES" would have 10 bytes excess. On the other hand, the famous name

Gambolputty de von Ausfern-schplenden-schlitter-crasscrenbon-fried-digger-dingle-dangle-dongle-dungle-burstein-von-knacker-thrasher-apple-banger-horowitz-ticolensic-grander-knotty-spelltinkle-grandlich-grumblemeyer-spelterwasser-kurstlich-himbleeisen-bahnwagen-gutenabend-bitte-ein-nürnburger-bratwustle-gerspurten-mitz-weimache-luber-hundsfut-gumberaber-shönedanker-kalbsfleisch-mittler-aucher von Hautkopft of Ulm

would have his name cut off after "Gambolputty de ", leading to inaccurate data.

When design random file records, considerable thought needs to go into choosing the maximum lengths of fields to prevent problems later.

 

A Random File Example (written in VB6)

Private Type RecordType ' Define user-defined type. The vb.net equivalent is 'Structure'
ID As Integer
Name As String * 20
End Type

Private Sub Form_Load()

Dim MyRecord As RecordType, RecordNumber ' Declare variables.
Dim LastRecordNumber As Integer
Dim foundit As Boolean
Dim i As Integer

' Create a sample file for random access.

Open "c:\TESTFILE" For Random Shared As #1 Len = Len(MyRecord)

' Len = Len(MyRecord) specifies the length of a record - let VB calculate this.

'create 5 records

For i = 1 To 5
MyRecord.ID = 1000 + i 'fake an ID number
MyRecord.Name = "name" & CStr(i) & " Surname" & CStr(i) 'fake a name
Put #1, i, MyRecord ' PUT writes the record in MyRecord as record number i
Next

RecordNumber = 4 ' Choose a record number.
Lock #1, RecordNumber ' Lock record so other users can change it while I'm working on it
Get #1, RecordNumber, MyRecord ' Read record into a variable of the same structure as the record

MsgBox "Changing record 4"
MyRecord.ID = 234 ' Modify a field within the record
MyRecord.Name = "John Smith" 'Change the name data too!
Put #1, RecordNumber, MyRecord ' Write modified record to replace existing record 4

Unlock #1, RecordNumber ' Unlock record so other people can access it

' add a record
' use a private function to get the last record number

LastRecordNumber = FindLastRecord(1, Len(MyRecord))

'parameter 1 = 1, the file handle
'paramter 2 = length of the type (this lets the function be re-used in any program with any TYPEs)

Seek #1, LastRecordNumber ' go to end of file
MyRecord.ID = 999 ' create new record
MyRecord.Name = "New person"
Put #1, , MyRecord 'no record number is given, so the next one is automatically used

MsgBox "Last record accessed in random file was number " & Loc(1) 'Loc() function returns position of the read head.

' read all records starting from the last.

For RecordNumber = LastRecordNumber To 1 Step -1
Seek #1, RecordNumber ' Set position.
Get #1, , MyRecord ' Read record. Note the record number is omitted!
MsgBox "Person in record " & RecordNumber & " is " & MyRecord.Name
Next RecordNumber

'find a record
Seek #1, 1 ' go to start
foundit = False ' flag when record is found

While Not EOF(1) And Not foundit ' keep searching until file ends OR record is found
Get #1, , MyRecord ' get next record
RecordNumber = Loc(1) 'find current recordnumber
MsgBox "Looking for john smith in record " & RecordNumber

'need to trim trailing spaces because myrecord.name is padded with spaces up until its defined maximum length!

If Trim(UCase(MyRecord.Name)) = "JOHN SMITH" Then 'found it
MsgBox "Found John Smith in record " & RecordNumber
foundit = True 'set flag to show find was made so loop should end
MyRecord.Name = "DELETED" ' mark for deletion
Put #1, RecordNumber, MyRecord 'update record
Else
MsgBox "Name in record " & RecordNumber & " (" & MyRecord.Name & ") is not john smith"
End If
Wend

End Sub

Private Function FindLastRecord(filehandle As Integer, RecordLength As Integer)
'file handle is passed as a parameter
' \ is integer divide
'put return value into variable name the same as the function

FindLastRecord = LOF(filehandle) \ RecordLength ' Get number of records in file.

End Function

 

Back to the IT Lecture Notes index

Back to the last page you visited

Created 13 Sep 2010

Last changed: September 13, 2010 1:25 PM

VCE IT Lecture notes copyright © Mark Kelly 2001-