the efficient way

Status
Not open for further replies.

yusf

Structural
May 9, 2006
58
Hi all,
I am currently dealing with programming and want to learn something about speed, in other words how fast data can be read. But before that I would like to explain my problem a little further:

I have a text file which contains hundreds of thousands of data items, and I would like to write a simple program that reads that data efficiently. Time is the problem for me, and I am looking for the fastest way...

From what I have found, I can read the data as text or in binary format (maybe I am wrong; please correct me on this and any other point), or there may be some other way. Can you tell me which way I should follow and which language is best for this kind of purpose?

I hope I am clear.
Thanks in advance
 

Unless you intend to become a professional programmer, use a relational database. There are several open source products which would easily handle such a small amount of data. I use MySQL for databases with several million items and expect search results in well under 1 second.

Try MySQL for a start:

Good Luck
johnwm
________________________________________________________
To get the best from these forums read faq731-376 before posting

Steam Engine enthusiasts:
 
Read it to do what? The speed of the read is irrelevant if you're dumping it into a slow application.

Are you planning on being a programmer for the rest of your life or doing structural?

Why not just dump it into Matlab?

TTFN



 
Hi

As I am a structural engineer, I am trying to write a computer program for design purposes. The structural design I am currently working on requires too much computer time, such as 3 hours. So I am trying to find ways to decrease the calculation time. Even reading the data takes too much time: for example, I have a 1 GB data file, and I will use that data in my design approach.

That is why I am concerned with speed.

You know, being an engineer is a much broader thing than we might expect.
I know we can't learn everything, but we have to.
 
If you read data from a disk one byte at a time, the process will be pretty slow; because of OS overhead and latency issues, you only get one character per disk revolution.

Instead, set up a ram buffer big enough to read an entire track at once, ask the OS to load the buffer when it empties, and let your application read data from the buffer.

If you are using a high level language, there may be a buffered read call already implemented; you just have to learn how to use it.

Similarly, you need to buffer your results into big chunks and write the chunks to disk, instead of writing one byte at a time.
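A minimal Python sketch of the buffered read and buffered write idea described above. The function name `copy_in_chunks`, the file names, and the chunk sizes are assumptions made for illustration only; the right buffer size depends on the disk and the operating system.

```python
import os
import tempfile

CHUNK_SIZE = 1024 * 1024  # 1 MiB per request; an assumed value, tune for your system


def copy_in_chunks(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src to dst using large reads and writes; return the bytes copied."""
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)  # one large request instead of many tiny ones
            if not chunk:                 # an empty bytes object means end of file
                break
            dst.write(chunk)              # results also go out in large pieces
            total += len(chunk)
    return total


def demo():
    """Create a small sample file, copy it in chunks, and verify the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "input.dat")    # hypothetical file names
        dst = os.path.join(tmp, "output.dat")
        payload = b"0123456789" * 50_000        # 500,000 bytes of sample data
        with open(src, "wb") as f:
            f.write(payload)
        copied = copy_in_chunks(src, dst, chunk_size=64 * 1024)
        with open(dst, "rb") as f:
            same = f.read() == payload
    return copied, same


if __name__ == "__main__":
    print(demo())
```

The application only ever asks for data in large blocks, so the per-request overhead is paid once per chunk rather than once per byte.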



Mike Halloran
Pembroke Pines, FL, USA
 
Hi
I would like to open this topic again, because I would like to learn:

what I can do
what possible solutions I have
which solution I must learn from scratch

Since this is a big, challenging, and time-consuming thing from beginning to end, I really must first decide which way to follow.

johnwm suggested that I use MySQL.

Maybe I could use it, but I would first have to create a database from my data file (it is a text file; for those who use MSC NASTRAN, it is simply an f06 file). Since my program will use this text file as input, with different content each time I run it, I can't recreate the database from scratch again and again.
So I think this solution is not suitable for me.

MikeHalloran suggested that I use a high-level language. I want to ask whether I can do this job (reading data) using FORTRAN.

What I really want to do is first construct an addressable data set and then access it, but I would like to hear about some possible solutions.
MikeHalloran :

"If you are using a high level language, there may be a buffered read call already implemented; you just have to learn how to use it"

Could you please give me some more detail about what you mentioned above, or point me to related sites about ways of sorting and reading data?

thanks in advance





 
What language are you using now?


Mike Halloran
Pembroke Pines, FL, USA
 
I suggest you use Matlab. Otherwise, you're going to burn calories getting algorithms and data visualization to work, which are already provided in Matlab.

TTFN



 
Now I am trying to improve my skills in FORTRAN. From what I have found, I can read and write data from any text file, so I am thinking of reading the relevant data from my text file, making it addressable, and then calling up that data whenever I need it. That is my plan.

To visualize the data, after writing some Fortran code I am trying to plot it using the spreadsheet program Excel. This is, as you said, a very tiring process.

So I need to hear some logical ways to read, write, and allocate memory. In short, I want to hear a solution that lets me work with the data as I wish.
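One way to make the contents of a large text file "addressable" is to scan it once, remember the byte offset of every line, and then seek directly to the line that is needed. The sketch below shows the idea in Python rather than Fortran. The names `build_line_index` and `read_line` are hypothetical, and the "NODE id value" sample layout is invented for the demonstration; it is not the real f06 format.

```python
import os
import tempfile


def build_line_index(path):
    """Scan the file once and return the byte offset where each line starts."""
    offsets = []
    with open(path, "rb") as f:
        pos = 0
        for line in f:           # the file object reads through an internal buffer
            offsets.append(pos)
            pos += len(line)
    return offsets


def read_line(path, offsets, n):
    """Jump straight to line n (0-based) using the index and return its text."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().decode("ascii").rstrip("\r\n")


def demo():
    """Write an invented sample file, index it, and fetch one line by number."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "results.txt")   # hypothetical file name
        with open(path, "w", newline="\n") as f:
            for i in range(1000):
                f.write(f"NODE {i} {i * 0.5:.3f}\n")
        offsets = build_line_index(path)
        return len(offsets), read_line(path, offsets, 742)


if __name__ == "__main__":
    print(demo())
```

The index holds one integer per line, so it stays small compared with the file itself, and each later lookup costs one seek instead of a rescan from the top.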

 
Hi yusf...

Do you know anything about Visual Basic 2005?
I've used it for my structural programs, and it runs fast when reading large databases in Access or text files.

If you want to try it, let me know.
 
Hiya-

Well, to get more to the point (and I'm sorry, it's been too long since I did any Fortran).

As pointed out earlier, most languages (and Fortran is no exception) have "buffered" reads.

You have mentioned that you have 1 gigabyte files. So be it; not a problem. However, we usually don't read an entire file of that size into one buffer.

We do, however, make use of buffers of many megabytes while reading in data. Also, we sometimes do not use the data formatting offered by some of the programming languages. Indeed, when convenient, we just read in the "raw" binary data.
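To illustrate the difference between formatted text and "raw" binary data, here is a small Python sketch using the standard `array` module. The helper names and file names are made up for the example, and it assumes the numbers can be stored as 8-byte floats in the machine's native byte order, so a file written this way is not portable between machines with different byte orders.

```python
import os
import tempfile
from array import array


def write_binary(path, values):
    """Store the values as raw native floats, with no text formatting."""
    with open(path, "wb") as f:
        array("d", values).tofile(f)


def read_binary(path):
    """Load every float from the file in one call, with no parsing step."""
    data = array("d")
    count = os.path.getsize(path) // data.itemsize
    with open(path, "rb") as f:
        data.fromfile(f, count)
    return data


def write_text(path, values):
    """Store the same values as one formatted number per line."""
    with open(path, "w") as f:
        for v in values:
            f.write(repr(v) + "\n")


def read_text(path):
    """Parse each line back into a float; this conversion is the extra work."""
    with open(path) as f:
        return [float(line) for line in f]


def demo():
    values = [i * 0.25 for i in range(10_000)]
    with tempfile.TemporaryDirectory() as tmp:
        bin_path = os.path.join(tmp, "values.bin")   # hypothetical file names
        txt_path = os.path.join(tmp, "values.txt")
        write_binary(bin_path, values)
        write_text(txt_path, values)
        same_bin = list(read_binary(bin_path)) == values
        same_txt = read_text(txt_path) == values
        bin_size = os.path.getsize(bin_path)
    return same_bin, same_txt, bin_size


if __name__ == "__main__":
    print(demo())
```

The binary file has a fixed size per value and needs no number parsing on the way in, which is where the time saving comes from when the same data is read many times.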

One of the "tricks" of the business is that you can have more than one program doing the task. Using shared memory, in some operating systems, we can have one program (or task) read data into "common or shared memory" while another program reads and interprets the data. This gives the system an asynchronous nature. The reader program can schedule or buffer many reads at a time. The reader program then blocks while the secondary storage is being accessed. The "calculator" task can look at a "semaphore" from the reader program to see whether any of several buffers has been filled with data. It can then read that filled buffer while the reader task is still blocked. It processes the data and then puts it where it needs to go. That could also be a "writer" task that is similar to the reader task.

When the "reader" task is unblocked (allowed to run, because the operating system has gotten around to finishing the read request), the reader program can then get another "chunk" started, pointing to a different buffer.
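The reader and calculator arrangement can be sketched in Python. Note that this sketch swaps in a simpler technique than the one described: two threads inside one program and a bounded `queue.Queue` take the place of separate programs, shared memory, and a semaphore. The task names, buffer size, and the line-counting "calculation" are placeholders chosen for illustration.

```python
import io
import queue
import threading

CHUNK_SIZE = 4 * 1024 * 1024  # a "many megabyte" buffer; the exact size is an assumption
MAX_PENDING = 4               # how many filled buffers may wait at once


def reader_task(stream, filled, chunk_size):
    """Reader: block on I/O, then hand each filled buffer to the calculator."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        filled.put(chunk)     # blocks when MAX_PENDING buffers are already waiting
    filled.put(None)          # sentinel: no more data is coming


def calculator_task(filled, result):
    """Calculator: take filled buffers as they arrive and process them."""
    total = 0
    while True:
        chunk = filled.get()  # waits until the reader has a buffer ready
        if chunk is None:
            break
        total += chunk.count(b"\n")   # stand-in for the real number crunching
    result.append(total)


def count_lines_pipelined(stream, chunk_size=CHUNK_SIZE):
    """Run the reader and the calculator side by side and return the line count."""
    filled = queue.Queue(maxsize=MAX_PENDING)
    result = []
    reader = threading.Thread(target=reader_task, args=(stream, filled, chunk_size))
    calc = threading.Thread(target=calculator_task, args=(filled, result))
    reader.start()
    calc.start()
    reader.join()
    calc.join()
    return result[0]


if __name__ == "__main__":
    sample = io.BytesIO(b"1.0 2.0\n" * 1000)   # in-memory stand-in for a large file
    print(count_lines_pipelined(sample, chunk_size=512))
```

The bounded queue plays the role of the set of buffers: the reader can run ahead by at most `MAX_PENDING` chunks, and the calculator works on one chunk while the reader is waiting on the next.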

Whew! That is quite a task that I've set out for you! Probably too much. Sorry. BUT you were asking about speed.

You were also asking about the programming languages that are used. O.K. here's the second part of the lecture (again sorry!)

C is a very popular programming language these days. Fortran, although useful and well suited to doing formula translations, is not used much in the computer geek area any more. The "structured" programming styles allowed by C, C++, Pascal, etc. make for programs that are easier to follow and hence to debug.

There are other programming languages that are available, of course. But they have their disadvantages.

Assembly language- The fastest, but nowadays not by much. This is very close to the "raw" binary bits the computer understands. It is NOT portable. It relies on "tricks" the programmer can play. Tricks are not good things. They make the code hard to understand, and so it is harder to maintain. It also requires a thorough understanding of the underlying architecture of the system, the instruction set, and the associated hardware. Definitely not for the faint of heart. It's good for geeks like me who do embedded processing systems, not for general purpose computers.

Java and some "older" BASIC- These are "interpreter" languages. They incur an overhead as they are not running "native" or compiled code. They either run the raw source code through an interpreter, or they run an intermediate code through one. Either way, the code (or partially translated code) is run through another program that interprets it. The problem here is that there is a program running a program. Lots of overhead. I'm not sure how Visual Basic does it; the older BASICs worked this way. Nor am I sure about Matlab and the like.

C++ and other "object oriented programming languages". I can only answer for C++, not ".Net" and the others. In most cases, however, there is additional overhead imposed on the language's operation to "protect" and "objectify" the operation. A notable example in C++ is the constructor and destructor code that is enforced. This tends to slow down the operation, which is why C++ is not used as much as C in operating system design and drivers.

Other programming languages? Who knows how many there are. I have only hit the "high" points of the stuff out there today. I haven't touched the "legacy" languages like Algol or Forth. But they fill many books on the subject.

Whew! More crap. O.K. if you really want to get into the design side of things, C is a good choice. Now, we haven't even talked about the operating system of choice. That depends on the delivery of your software. Is it going to a big 'nix box or is it designed to be run on Windows? Is there a choice of doing client-server? I.e. there is a lot of data that has to be crunched, but does it yield a small amount of data out? Have a big machine do the number crunching, and serve a small "gui" ("graphical user interface") client machine running Windows?

If you are running on just a single Windows machine, then your best bet is to use large data buffers with binary reads, then do the formatting (interpreting the ASCII data) while in memory.
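A rough Python sketch of that last suggestion: read the file as raw bytes in large blocks and convert the ASCII numbers only once they are in memory. The function name `sum_numbers` and the buffer size are assumptions, and it presumes the file holds whitespace-separated numbers, which is a simplification of a real results file.

```python
import io

CHUNK_SIZE = 8 * 1024 * 1024  # assumed buffer size; adjust to the memory available


def sum_numbers(stream, chunk_size=CHUNK_SIZE):
    """Read raw bytes in large chunks, parse the numbers in memory, return (count, total)."""
    total = 0.0
    count = 0
    leftover = b""                    # text after the last newline seen so far
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = leftover + chunk
        cut = data.rfind(b"\n")
        if cut == -1:                 # no complete line yet, keep accumulating
            leftover = data
            continue
        complete, leftover = data[:cut + 1], data[cut + 1:]
        for token in complete.decode("ascii").split():
            total += float(token)     # the ASCII-to-number conversion happens in memory
            count += 1
    for token in leftover.decode("ascii").split():   # last line without a newline
        total += float(token)
        count += 1
    return count, total


if __name__ == "__main__":
    sample = io.BytesIO(b"1.5 2.5\n3.0\n4.0 5.0")    # in-memory stand-in for a file
    print(sum_numbers(sample, chunk_size=4))
```

Holding back the text after the last newline keeps a number from being split in two when it straddles a chunk boundary.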

I've sure gone into a long winded answer, but well, you asked.

Hope that this helps and doesn't confuse the issue more.

Cheers,

Rich S.
 