Subroutines in Fortran 95

rk_19 · Feb 6, 2017

Hi, I have a subroutine in my F95 program. This subroutine is called 100 times (by 100 call statements). I understand that the call statements are executed one after the other. Is there any way I can execute all 100 call statements in one go?
thanks

dik · Feb 6, 2017

Can you revise the subroutine to behave as a function?

Dik

IRstuff · Feb 6, 2017

It's very rare to have so many routines running sequentially that don't require data from each other. If they are truly independent and you have 100 processors, then yes. Otherwise, they can all be started as parallel processes and the processor multiplexes between them. If you have a single processor, there is no time advantage to doing that, and the context switching might actually cost throughput.

And unless your compiler supports parallel processing, it's unlikely that your program can be modified to run this way at all.

TTFN (ta ta for now)
I can do absolutely anything. I'm an expert!

https://www.youtube.com/watch?v=BKorP55Aqvg

faq731-376 forum1529 Entire Forum list

http://www.eng-tips.com/forumlist.cfm

jhardy1 · Feb 7, 2017

Can you send a 100-element vector as the function argument, and have it return a vector with 100 results?
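A minimal sketch of that idea, assuming the per-case calculation can be written as a pure function of its input. The names `batch_demo`, `compute` and `compute_all` and the formula inside `compute` are hypothetical placeholders, not the original poster's code. It relies on the ELEMENTAL attribute from Fortran 95, which lets one function be applied to a whole array in a single statement.

```fortran
module batch_demo
  implicit none
contains

  ! Hypothetical per-case calculation (placeholder formula).
  ! ELEMENTAL lets the same function take a scalar or an array.
  elemental function compute(x) result(y)
    real, intent(in) :: x
    real :: y
    y = 2.0 * x + 1.0
  end function compute

  ! One call handles every case: n inputs in, n results out.
  subroutine compute_all(inputs, results)
    real, intent(in)  :: inputs(:)
    real, intent(out) :: results(size(inputs))
    results = compute(inputs)   ! applied element by element
  end subroutine compute_all

end module batch_demo
```

With `real :: x(100), y(100)` declared in the main program, a single `call compute_all(x, y)` would replace the 100 separate call statements. This tidies the source, but by itself it does not make the cases run at the same time.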

http://julianh72.blogspot.com

rk_19 · Feb 7, 2017

@IRSTUFF - the 100 subroutines are independent in my case, but all use the same subroutine... so I have to wait for all of them to be executed one by one, even though technically (as per the algorithm) they don't have to wait.

IRstuff · Feb 7, 2017

It doesn't work that way; called subroutines should have no interaction unless they are programmed to.

rk_19 · Feb 7, 2017

@IRSTUFF - I think my last post was not clear. I meant that I have 100 call statements which are independent (i.e., the execution of one call statement does not require anything from the other call statements). However, all the call statements refer to the same subroutine.

IRstuff · Feb 7, 2017

Yes, that's what I said.

rapt · Feb 7, 2017

Is the calculation in the lowest level subroutine that is called when these other statements are called either

1 dependent on data which varies for each call,

or

2 does the lower level subroutine basically return the same value each time it is called independent of where it is called from?

If 2, call it once at the start and remember the data it returns!

Otherwise, it needs to be called every time as its results are dependent on each call.
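A minimal sketch of case 2, assuming the lower-level routine really does return the same value on every call. The names `cache_demo`, `expensive`, `scale_all` and the counter `n_calls` are hypothetical and exist only to show that the routine runs once.

```fortran
module cache_demo
  implicit none
  integer :: n_calls = 0   ! counts how often the expensive routine runs
contains

  ! Hypothetical expensive routine; its result does not depend on the caller.
  subroutine expensive(val)
    real, intent(out) :: val
    n_calls = n_calls + 1
    val = 42.0              ! placeholder result
  end subroutine expensive

  ! Case 2: call once, remember the result, reuse it for every case.
  subroutine scale_all(inputs, results)
    real, intent(in)  :: inputs(:)
    real, intent(out) :: results(size(inputs))
    real :: cached
    call expensive(cached)      ! single call, made before any per-case work
    results = cached * inputs   ! the remembered value serves all cases
  end subroutine scale_all

end module cache_demo
```

In case 1, where the data varies per call, no such shortcut applies and each case still needs its own call.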

MikeHalloran · Feb 7, 2017

If your compiler's preprocessor supports macro expansion, you could replace each subroutine call with a macro that inserts a complete copy of the subroutine core, with no overhead for calls and returns, at some expense in the size of the binary code, which seems to rarely be a problem these days.

Mike Halloran
Pembroke Pines, FL, USA

rk_19 · Feb 8, 2017

thanks @IRStuff

@MikeHalloran, I understand that the restriction is the compiler's preprocessor capability. I am using gfortran, which came with fortrantools (http://www.fortran.com/). I don't know about the compiler's capability, but if it is capable, is there any guideline on how to use this potential? I am not an expert with Fortran, so any guidance is appreciated.

MikeHalloran · Feb 8, 2017

I learned Fortran the Elbonian way in 1962, and have not used it since, so I can't help with questions that are specific to your toolchain.

However, macros are a powerful construct available in many languages, and they allow you to do many interesting things. In this instance, you would define the body of the subroutine as a sequence of instructions in the macro definition, and replace every call to that subroutine with an invocation of the macro. The preprocessor expands each invocation into the defined sequence of instructions, all compiled as inline code. No call or return instructions are needed, because the runtime binary just runs through the sequence from top to bottom without the call overhead.

It's a common technique for speeding up program execution.

... which I'm not sure is your problem,
because you haven't told us what you are trying to do,
you haven't told us what you tried that didn't work,
and you haven't included even a single line of code.

Mike Halloran
Pembroke Pines, FL, USA

IRstuff · Feb 8, 2017

The issue is not the routine, it's everything else. Unless you have a multicore processor and a compiler that can make use of it, you'll get nothing. All of your processes have to complete to complete the program, so their total execution time can only be reduced by running on separate processors or cores. This requires a special compiler, most likely the Pro version:

http://www.fortran.com/products-page/compilers/absoft-pro-fortran-compiler-suite-commercial/

SomptingGuy · Feb 14, 2017

If your calls are truly independent and therefore parallelizable, take a look at the OpenMP options that your compiler provides. Most compilers have them. You sprinkle compiler directives around your code to tell the compiler where to split the work among threads and where to rejoin.

Don't expect linear speedup though, as there are fixed and per-thread overheads. And your computation will be throttled by the slowest thread if you're using the normal/simplest method of sharing (i.e. divide a DO loop and give each thread a fixed section of it to work through). Of course you can do fine-grained and custom thread control once you've found the bottlenecks, but it rapidly leads to diminishing returns.
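A minimal sketch of the divide-a-DO-loop approach, assuming the 100 calls can be rewritten as one loop over arrays of inputs and results. The names `omp_demo`, `work` and `run_all` and the formula inside `work` are hypothetical placeholders. With gfortran, OpenMP is switched on by the `-fopenmp` flag; without it, the `!$omp` lines are treated as ordinary comments and the loop runs sequentially.

```fortran
module omp_demo
  implicit none
contains

  ! Hypothetical stand-in for the real subroutine. It touches only its
  ! own arguments, so separate calls cannot interfere with each other.
  subroutine work(x, y)
    real, intent(in)  :: x
    real, intent(out) :: y
    y = sqrt(x) + 1.0
  end subroutine work

  ! Runs one call per element; iterations are shared among the threads.
  subroutine run_all(inputs, results)
    real, intent(in)  :: inputs(:)
    real, intent(out) :: results(size(inputs))
    integer :: i
    !$omp parallel do
    do i = 1, size(inputs)
      call work(inputs(i), results(i))   ! each iteration writes its own slot
    end do
    !$omp end parallel do
  end subroutine run_all

end module omp_demo
```

If some cases take much longer than others, adding a `schedule(dynamic)` clause to the directive hands out iterations as threads become free, which may ease the slowest-thread effect at the cost of some extra scheduling overhead.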

A data point that I can offer for a typical FORTRAN compute engine is up to 1.8x speedup for 2 cores. ~3x-4x for 6 or more cores. It's all down to the proportion of sharable code and inherent sequencing requirements.

And... OpenMP implementations I've used will cause all N threads to start and be 100% CPU intensive even when not doing any useful work. This can be confusing for newcomers.

Steve
