File Size

 BBS: Inland Empire Archive
Date: 04-23-92 (21:40)             Number: 125
From: PAUL LEONARD                 Refer#: NONE
  To: ZACK JONES                    Recvd: NO  
Subj: File Size                      Conf: (2) Quik_Bas
 > Hmmmm...i dunno.  Played around with the buffer sizes, too, and
 > got some unexpected results.  I've always assumed that the
 > larger the buffer, the faster the execution, due to fewer
 > executions of GET and thus fewer disk accesses.  But here's what
 > i got reading a 200K+ text file with 5865 lines (82 columns or
 > less per line)...

 >         time   time
 >         w/     w/o
 > buffer  cache  cache
 > ------  -----  -----
 > 64      4.7    8.7
 > 82      4.2    8.7 <--- my line limit, like yours of 100
 > 128     3.4    8.7
 > 256     2.9    8.7
 > 512     2.7    8.6
 > 1024    3.0    4.5 <--- here's why i picked 1K
 > 2048    4.0    4.7
 > 4096    6.0    6.3
 > 8192    10.3   10.2
 > 16384   18.9   18.6

I asked a knowledgeable friend about this - he said that
the problem is using the "DIM rec, GET rec$" method, the
reason being that the data is read into the buffer, then
has to be moved to a string.  The larger the buffer, the
bigger the string to move, so the longer it takes.

What he suggested was to use a FIELD statement and a plain GET statement...
==========================================================================
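delimiter$ = CHR$(13)                  ' count carriage returns as line ends

INPUT "Recsize"; recsize%              ' record (buffer) size to test
in$ = "test.txt"
OPEN in$ FOR RANDOM AS #1 LEN = recsize%

' Map rec$ onto the file buffer so GET needs no separate string move
FIELD #1, recsize% AS rec$

t1 = TIMER
DO UNTIL EOF(1)
  GET #1                               ' plain GET; the old ",,rec$" argument is dropped
  DO
    ' find the next delimiter in this record; 0 means none are left
    delimiter% = INSTR(delimiter% + 1, rec$, delimiter$)
    IF delimiter% THEN lines% = lines% + 1
  LOOP WHILE delimiter%
LOOP
PRINT TIMER - t1                       ' elapsed seconds

CLOSE #1
END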
=========================================================================
...which just reads the data into the buffer without the
string move (and also allows me to input the recsize%
variable for testing :,> ).  The results are much more in
line with my original assumption - as the buffer gets
larger, the program executes faster.

        time   time
        w/     w/o
buffer  cache  cache
------  -----  -----
64      4.0    8.8
82      3.3    8.8
128     2.8    8.7
256     2.3    8.7
512     1.8    8.6
1024    1.4    4.6 <-- here's why i wouldn't use 1K anymore :)
2048    1.4    2.6
4096    1.4    1.6
8192    1.4    1.5
16384   1.3    1.1
31744   1.1    1.1

These numbers are for the same file that was used for the
first set of tests.
_Much_ better.
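
For anyone who wants to rerun a table like the one above without
retyping the record size each time, here is a rough sketch that
loops the same FIELD/GET test over a list of sizes. The size list,
the 0 end marker, the extra "GETs" column and the names ngets& and
elapsed! are my own additions for illustration, not part of the
program above. The GET count is just the file length divided by the
record size, rounded up, which puts the "fewer GETs" assumption in
numbers.
==========================================================================
' Sketch only: times the FIELD/GET line count at several record sizes.
' "test.txt" is the file name from the post; the sizes are a sample.
delimiter$ = CHR$(13)
in$ = "test.txt"

DATA 64, 512, 4096, 31744, 0           ' 0 marks the end of the list

PRINT "buffer", "GETs", "lines", "seconds"
DO
  READ recsize%
  IF recsize% = 0 THEN EXIT DO

  OPEN in$ FOR RANDOM AS #1 LEN = recsize%
  FIELD #1, recsize% AS rec$

  ' records needed to cover the file = length / recsize, rounded up
  ngets& = (LOF(1) + recsize% - 1) \ recsize%

  lines% = 0                           ' INTEGER: fine below 32768 lines
  t1! = TIMER
  DO UNTIL EOF(1)
    GET #1
    delimiter% = 0                     ' restart the scan for each record
    DO
      delimiter% = INSTR(delimiter% + 1, rec$, delimiter$)
      IF delimiter% THEN lines% = lines% + 1
    LOOP WHILE delimiter%
  LOOP
  elapsed! = TIMER - t1!
  CLOSE #1

  PRINT recsize%, ngets&, lines%, elapsed!
LOOP
END
=========================================================================
The file length is rarely an exact multiple of the record size, so
the last record is usually only partly filled. I have not checked
what the rest of the buffer holds in that case, so treat the line
count as approximate and compare the times rather than the counts.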

ptl


--- msged 2.07
 * Origin: PTL Pointwork (1:105/48.111)