Eng-Tips is the largest engineering community on the Internet

subroutines in fortran 95

Status
Not open for further replies.

rk_19

Structural
Aug 7, 2012
Hi, I have a subroutine in my F95 program. This subroutine is called 100 times (by 100 call statements). I understand that the call statements are executed one after the other. Is there any way I can call and execute all 100 call statements in one go?
Thanks
 
It's very rare to have so many routines running sequentially that don't require data from each other. If they are truly independent, and you have 100 processors, then yes. Otherwise, they can all be started as parallel processes and the processor multiplexes between them. If you have a single processor, then there is no time advantage to doing that, and the context switching might actually cost throughput.

And, unless your compiler is designed for parallel processing, it's unlikely that your program can be modified to run in parallel at all.

TTFN (ta ta for now)
I can do absolutely anything. I'm an expert!
 
@IRSTUFF - the 100 subroutine calls are independent in my case, but all use the same subroutine, so I have to wait for all of them to be executed one by one even though they don't have to wait for each other (technically, as per the algorithm).
 
@IRSTUFF - I think my last post was not clear. I meant I have 100 call statements which are independent (i.e., the execution of one call statement does not require anything from the other call statements). However, all call statements refer to the same subroutine.
 
Is the calculation in the lowest-level subroutine, the one that ends up being called by all these call statements, either

1 dependent on data which varies for each call,

or

2 basically returning the same value each time it is called, independent of where it is called from?

If 2, call it once at the start and remember the data it returns!

Otherwise, it needs to be called every time as its results are dependent on each call.
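
As a rough illustration of case 2, here is a minimal sketch in which the result is computed once and then reused. The subroutine name `expensive` and its body are hypothetical stand-ins, since the original code was never posted.

```fortran
program cache_demo
  implicit none
  real :: cached, total
  integer :: i

  ! Case 2: the result does not depend on the caller, so compute it once.
  call expensive(cached)

  total = 0.0
  do i = 1, 100
     ! Reuse the stored value instead of issuing another call.
     total = total + cached
  end do
  print *, 'total =', total

contains

  ! Hypothetical subroutine with no inputs that vary between calls.
  subroutine expensive(res)
    real, intent(out) :: res
    res = 4.0 * atan(1.0)
  end subroutine expensive

end program cache_demo
```

This only helps if the subroutine really has no inputs that change between calls; in case 1 every call still has to be made.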
 
If your compiler's preprocessor supports macro expansion, you could replace each subroutine call with a macro that inserts a complete copy of the subroutine core, with no overhead for calls and returns, at some expense in the size of the binary code, which seems to rarely be a problem these days.



Mike Halloran
Pembroke Pines, FL, USA
 
thanks @IRStuff

@MikeHalloran, I understand that the restriction is the capability of the compiler's preprocessor. I am using gfortran, which came with fortrantools. I don't know about the compiler's capability, but if it is capable, is there any guideline on how to use this potential? I am not an expert with Fortran, so any guidance is appreciated.
 
I learned Fortran the Elbonian way in 1962, and have not used it since, so I can't help with questions that are specific to your toolchain.

However, macros are a powerful construct available in many languages, which allow you to do many interesting things. In this instance, you would define the body of the subroutine as a sequence of instructions in the macro definition, and replace every call to that subroutine with an invocation of the macro. The preprocessor expands each invocation into the defined sequence of instructions, all compiled as inline code. No call or return instructions are needed, because the runtime binary just runs through the sequence from top to bottom with no call overhead.

It's a common technique for speeding up program execution.

... which I'm not sure is your problem,
because you haven't told us what you are trying to do,
you haven't told us what you tried that didn't work,
and you haven't included even a single line of code.
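
For what it's worth, the macro approach described above might look roughly like this with gfortran, which can run a C-style preprocessor over Fortran source when given `-cpp` (or when the file uses an upper-case extension such as `.F90`). The macro name `ADD_SQUARE` and the arithmetic are made up for illustration; they are not taken from the original program.

```fortran
! Build with preprocessing enabled, e.g.: gfortran -cpp inline_demo.f90
! (or name the file inline_demo.F90, which gfortran preprocesses by default)

! Hypothetical macro: the core of a small subroutine, expanded in place.
#define ADD_SQUARE(x, total) total = total + (x) * (x)

program inline_demo
  implicit none
  real :: total
  integer :: i

  total = 0.0
  do i = 1, 100
     ! Expands to: total = total + (real(i)) * (real(i))
     ADD_SQUARE(real(i), total)
  end do
  print *, 'sum of squares =', total
end program inline_demo
```

Note that this removes call overhead only; the 100 expansions still run one after the other. gfortran runs the preprocessor in traditional mode, so it is safest to keep such macros to a single short statement.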



Mike Halloran
Pembroke Pines, FL, USA
 
The issue is not the routine, it's everything else. Unless you have a multicore processor and a compiler that can make use of it, you'll get nothing. All of your calls have to finish before the program can complete, so their total execution time can only be reduced by running them on separate processors or cores. This requires a special compiler, most likely the Pro version:
TTFN (ta ta for now)
I can do absolutely anything. I'm an expert!
 
If your calls are truly independent and therefore parallelizable, take a look at the OpenMP options that your compiler provides. Most compilers have them. You sprinkle compiler directives around your code to tell the compiler where to split the work among threads and where to rejoin.
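
A minimal sketch of what that can look like, assuming the 100 call statements can be rewritten as a loop over arrays of inputs and outputs. The subroutine `work` and the array names are hypothetical placeholders for the real routine and data.

```fortran
! Build with OpenMP enabled, e.g.: gfortran -fopenmp omp_demo.f90
program omp_demo
  implicit none
  integer, parameter :: n = 100
  real :: inputs(n), results(n)
  integer :: i

  do i = 1, n
     inputs(i) = real(i)
  end do

  ! Each iteration is independent, so the loop can be split among threads.
  !$omp parallel do default(none) shared(inputs, results) private(i)
  do i = 1, n
     call work(inputs(i), results(i))
  end do
  !$omp end parallel do

  print *, 'sum of results =', sum(results)

contains

  ! Hypothetical stand-in for the real subroutine. It must not modify
  ! shared state (module variables, SAVE variables, common blocks).
  subroutine work(x, y)
    real, intent(in)  :: x
    real, intent(out) :: y
    y = sqrt(x) + x * x
  end subroutine work

end program omp_demo
```

Without `-fopenmp` the directives are treated as ordinary comments and the program runs serially, which makes it easy to compare the two builds.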

Don't expect linear speedup though, as there are fixed and per-thread overheads. And your computation will be throttled by the slowest thread if you're using the normal/simplest method of sharing (i.e. divide a DO loop and give each thread a fixed section of it to work through). Of course you can do fine-grained and custom thread control when you've found bottlenecks, but it rapidly leads to diminishing returns.

A data point that I can offer for a typical FORTRAN compute engine is up to 1.8x speedup for 2 cores, and roughly 3x-4x for 6 or more cores. It's all down to the proportion of sharable code and inherent sequencing requirements.
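
One common way to reason about figures like these is Amdahl's law. Applying it here is an assumption on my part, not something stated above, and the symbols $p$ (parallelizable fraction of the run time) and $N$ (number of cores) are introduced only for this sketch.

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}

% Fitting the 2-core data point, S(2) = 1.8:
\frac{1}{1.8} = 1 - \frac{p}{2}
\quad\Longrightarrow\quad
p = 2\left(1 - \frac{1}{1.8}\right) \approx 0.89

% Predicted speedup on 6 cores with the same p:
S(6) = \frac{1}{0.11 + \dfrac{0.89}{6}} \approx 3.9
```

So a code that is about 89% parallelizable would give 1.8x on 2 cores and a little under 4x on 6, which is in line with the numbers quoted. The model ignores thread overheads, so real results will tend to be somewhat lower.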

And... OpenMP implementations I've used will cause all N threads to start and be 100% CPU intensive even when not doing any useful work. This can be confusing for newcomers.

Steve
 