
Zerocopy



Hello,

I installed Mandriva 2008.1 Free (32-bit) on a single-CPU AMD 1.8 GHz (?) machine with 2 GB RAM and a Gigabyte motherboard, plus an older 20 GB EIDE Seagate drive, and I have been trying to speed up disk reads from the Seagate using standard aio_read() and pthreads in my C application program (no GUI involved). Well... the fastest I could get is about 30 MBytes/second.

 

Since the same hardware setup with a similar Seagate under Windows XP SP2 gives me close to 55 MBytes/sec, there must be a way to speed things up in Linux. I suspect there is heavy kernel caching going on which needs to be bypassed for my app. So I tried to open the file with O_DIRECT and align the buffers on a 512-byte boundary, but the program would not compile: the #ifdef guarding O_DIRECT in fcntl.h is not satisfied (glibc only defines O_DIRECT when _GNU_SOURCE is set). I have 64-bit file offsets enabled to handle huge files and, thinking that could have some effect, I just copied the octal constant for O_DIRECT into my code and got past the compilation. But during execution, the very first aio_read() fails with error code 22 (EINVAL, Invalid argument).
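Stripped down, what I am trying to do looks like this (an illustrative sketch, not my actual program; the buffer size is a placeholder). As I understand it, with O_DIRECT the buffer address, the file offset and the transfer length must all be multiples of the sector size, or the read fails with EINVAL:

/* Sketch only. Build (assuming glibc POSIX AIO):
   gcc -D_FILE_OFFSET_BITS=64 odirect.c -lrt */
#define _GNU_SOURCE            /* must precede the includes, or fcntl.h hides O_DIRECT */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define ALIGN 512                    /* sector size */
#define BUFSZ (4 * 1024 * 1024)      /* multiple of ALIGN */

int main(int argc, char **argv)
{
    if (argc < 2) return 1;

    int fd = open(argv[1], O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* posix_memalign() guarantees the alignment; a plain malloc()
       buffer is the classic cause of EINVAL with O_DIRECT. */
    void *buf;
    if (posix_memalign(&buf, ALIGN, BUFSZ) != 0) return 1;

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = BUFSZ;           /* aligned length */
    cb.aio_offset = 0;               /* aligned offset */

    if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }

    const struct aiocb *list[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);  /* block until the request completes */

    printf("read %zd bytes\n", aio_return(&cb));
    free(buf);
    close(fd);
    return 0;
}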

 

On Windows I used CreateFile() with FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED, plus VirtualAlloc(), ReadFileScatter() and WaitForSingleObject(), to get that 55 MBytes/sec. Shouldn't O_DIRECT and the aio_read/write calls be sufficient to produce equivalent performance?
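The Windows read loop boils down to the following (a boiled-down sketch, not my real program; the real one uses ReadFileScatter(), but a single overlapped ReadFile() shows the same pattern, and the file name is made up):

/* Simplified unbuffered, overlapped sequential read on Win32. */
#include <windows.h>
#include <stdio.h>

#define CHUNK (4 * 1024 * 1024)      /* multiple of the sector size */

int main(void)
{
    HANDLE h = CreateFileA("bigfile.dat", GENERIC_READ, FILE_SHARE_READ,
                           NULL, OPEN_EXISTING,
                           FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED,
                           NULL);
    if (h == INVALID_HANDLE_VALUE) return 1;

    /* VirtualAlloc() returns page-aligned memory, which satisfies the
       sector-alignment requirement of FILE_FLAG_NO_BUFFERING. */
    void *buf = VirtualAlloc(NULL, CHUNK, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);

    OVERLAPPED ov = {0};
    ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
    LONGLONG off = 0;
    DWORD got;

    for (;;) {
        ov.Offset     = (DWORD)(off & 0xFFFFFFFF);
        ov.OffsetHigh = (DWORD)(off >> 32);
        if (!ReadFile(h, buf, CHUNK, NULL, &ov) &&
            GetLastError() != ERROR_IO_PENDING)
            break;                   /* EOF or hard error */
        WaitForSingleObject(ov.hEvent, INFINITE);
        if (!GetOverlappedResult(h, &ov, &got, FALSE) || got == 0)
            break;
        /* ... process got bytes here ... */
        off += got;
    }

    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(ov.hEvent);
    CloseHandle(h);
    return 0;
}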

 

Regards.


Welcome on board (though a little late, sorry)

 

Actually, I had already seen your post, but could not understand what you wanted to achieve. I still can't.

In Linux, contrary to Windows, there are so many different possible filesystems, and so many different ways to use each, that it seems to me a bit unwise, or inefficient, to try to perform optimisations at the application level.

 

Were I you, provided I actually understood your goal, I would rather use the most standard file operations possible, with no attempt at optimisation other than the obvious, and then try to see what can be done with tools such as “hdparm” or the “/etc/sysctl.conf” file.

 

Yves.


Thanks for responding, Yves. Within a couple of days I shall try exactly what you suggested and post the timing values.

 

My object is to read huge preexisting files (7-8 GB and larger) from the local hard disk and process them. The processing requires that a file be read sequentially, from start to end. The processing algorithm works with almost any practical buffer size (such as a 4 MByte buffer) and is fast enough that the time to read a buffer into application memory is much greater than the time to process it.

 

So, to me, the problem is simply how fast I can sequentially slide that application buffer window (or windows, in general) across the entire file. Something like the sketch below is what I have in mind.
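A bare-bones illustration (hypothetical; process() and the file argument are placeholders): while one buffer is being processed, aio_read() fills the other, so the disk never sits idle.

/* Double-buffered sequential reader (sketch). Build: gcc dbuf.c -lrt */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFSZ (4 * 1024 * 1024)      /* the 4 MByte window */

static void process(const char *p, ssize_t n)
{
    (void)p; (void)n;               /* placeholder for the real algorithm */
}

int main(int argc, char **argv)
{
    if (argc < 2) return 1;

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    char *buf[2] = { malloc(BUFSZ), malloc(BUFSZ) };

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_nbytes = BUFSZ;
    cb.aio_offset = 0;
    cb.aio_buf    = buf[0];
    if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }

    int cur = 0;
    for (;;) {
        /* Wait for the read that is filling buf[cur]. */
        const struct aiocb *list[1] = { &cb };
        while (aio_error(&cb) == EINPROGRESS)
            aio_suspend(list, 1, NULL);
        ssize_t n = aio_return(&cb);
        if (n <= 0) break;           /* EOF or error */

        /* Start filling the other buffer in the background... */
        cb.aio_offset += n;
        cb.aio_buf = buf[1 - cur];
        int pending = (aio_read(&cb) == 0);

        /* ...while we process the one just read. */
        process(buf[cur], n);
        cur = 1 - cur;
        if (!pending) break;
    }

    free(buf[0]); free(buf[1]);
    close(fd);
    return 0;
}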

 

By the way, my filesystem is a freshly installed 2008.1 Free using ReiserFS. It has not seen any serious use yet and is only a month old. No updates, no internet connection.

 

Regards.


Hello,

I ran my tests again over the weekend, this time more systematically, and have better results to share. I guess I was not comparing apples to apples in my initial post.

 

My test program is still the same, using aio_read/write in Linux and ReadFileScatter() in Windows. I shall try to tweak the hdparm settings when I get some more time, but for now I am happy to see that Linux has performance similar to Win XP for sequential disk reads.

 

Here are the results, using three PCs and various EIDE/SATA drives, without tweaking hard disk parameters. All numbers are in MegaBytes/second:

 

_______________________________________________________________________

PC-Gigabyte, Model Year 2005:

Gigabyte PCI motherboard, 2 GB RAM, AMD Sempron 1.8 GHz single-processor CPU.

_______________________________________________________________________

 

1. With Seagate ST3250410AS, 16 MB cache, 7200 RPM, 250 GB SATA, which shows 85 MB/sec on buffered disk reads using "hdparm -tT" under Mandriva.

 

1.1 Mandriva 2008.1, ReiserFS: 81, 76, 61.

1.2 Fedora 9, Ext3: 78, 82, 78, 83, 82.

1.3 WinXP SP2, NTFS: 84, 85, 87, 86.

 

2. With Caviar WD400BB, 8 MB cache, 7200 RPM, 40 GB EIDE, which shows 47 MB/sec on buffered disk reads using "hdparm -tT" under Mandriva.

 

2.1 Mandriva 2008.1, ReiserFS: 40, 37, 34, 42, 39.

2.2 Fedora 9, Ext3: 45, 45, 45, 48.

2.3 WinXP SP2, NTFS: 46, 46, 45, 44.

 

3. With Maxtor 541DX (year 2002), 2 MB cache, 5400 RPM, 20 GB EIDE, which shows 35 MB/sec on buffered disk reads using "hdparm -tT" under Suse 10.0.

 

3.1 Suse 10.0, ReiserFS: 28, 26, 37.

 

4. With Western Digital WD400BB-22JHCO, 8 MB cache, 40 GB EIDE.

 

4.1 Win XP SP2, NTFS: 40, 43.

 

_______________________________________________________________________

PC-Pent4, Model Year 2005:

Dell PCI Motherboard (?), 1 GB RAM, Pentium 4 Dual Core 3.4 GHz.

_______________________________________________________________________

 

5. With Seagate ST380012AS, 8 MB cache, 80 GB SATA, which shows 55 MB/sec on buffered disk reads using "hdparm -tT" under Suse 10.0.

 

5.1 Mandriva 2008.1, Ext3: 52, 52, 52.

5.2 Suse 10.0, ReiserFS: 41.

5.3 Win XP SP2: 54, 55, 42, 55.

 

_______________________________________________________________________

PC-Latest, Model Year 2008:

Dell PCI Express motherboard (?), 4 GB RAM, Intel Core 2 Duo E6850 @ 3 GHz.

_______________________________________________________________________

 

6. With Seagate ST3250310AS, 7200 RPM, 8 MB cache, 250 GB SATA.

 

6.1 Win XP SP2, NTFS: 102, 103, 103.

_______________________________________________________________________

 

Note:

1. The tests were done as follows: create a few files of 2 to 5 GB each, reboot the system, and then immediately run the tests on those files just once.

 

2. The Suse 10.0 filesystem in #3 (with the Maxtor 541DX) has been heavily used for over three years. All the other systems were newly installed within the past 20 days.

 

3. In 5.3, the number 42 looked suspicious, so I ran the test back to back to watch the reads closely. It looks as if some failing sectors on the disk are to blame: performance should have increased on back-to-back reads, but that did not happen here; the number stayed at 42 in 3 successive tests.

 

Thanks.
