Continue to Site

Eng-Tips is the largest engineering community on the Internet

Intelligent Work Forums for Engineering Professionals

  • Congratulations waross on being selected by the Eng-Tips community for having the most helpful posts in the forums last week. Way to Go!

Manipulating Weather Data

Status
Not open for further replies.

ahramos

Mechanical
Nov 12, 2002
12
I've got 8760 lines of weather data that I'm trying to manipulate into three different tables. Currently the data is in the following format:

Date Hour Air Temp (°F)
08/01/01 100 60.6
08/01/01 200 58.4
08/01/01 300 57.3
...
07/31/02 2200 64.3
07/31/02 2300 63.3
07/31/02 2400 62.9


I need three different tables in the following format:
01 to 08 Jan Feb Mar...
102 0 0 0
97 0 0 0
92 0 0 0
87 0 0 0
82 0 0 0
...
7 0 0 0
2 0 0 0

09 to 16 Jan Feb Mar...
102 0 0 0
97 0 0 0
92 0 0 0
87 0 0 0
82 0 0 0
...
7 0 0 0
2 0 0 0

16 to 24 Jan Feb Mar...
102 0 0 0
97 0 0 0
92 0 0 0
87 0 0 0
82 0 0 0
...
7 0 0 0
2 0 0 0

I can do it manually, but it's very tedious. The first column of the three tables represent 5°F temperature bins such that 102 => 100 < temperature <= 105 and so forth. The numbers underneath the months represent number of hours within that month that the temperature falls within the 'bin'. The three tables represent the first 8 hours, second 8 hours and final 8 hours of the day. Thanks in advance.
 
Replies continue below

Recommended for you

mkay... let's see... one way to do this is to assign a &quot;bin code&quot; to each data point, then use countif() within your summary table to count the cells that fit in each bin.

you have:
12 months
3 time bins
a whole bunch of temperature bins, but certainly less than 60

So if you use two digits for the temperature bin, two digits for a month bin, and one digit for a time bin, you should be covered.

set up a function, or a group of functions, to assemble &quot;bin codes&quot; from the supplied data. Try the format MMHTT (month/hour/temp) for your bin codes, as an example.

If you start your bins at -125F, then your bin digits would be int((YourTemp-(-125))/5)

your hour digit is int((YourHour)/800) (note that this lumps 00:00 to 07:59 in a single bin, and 08:00 to 15:59 in the next bin. You can add an offset if you want to move the bin boundaries.

Your month digit is month(YourDate)

So your bin code, in my proposed format, would be
=month(YourDate)*1000 + int((YourHour)/800)*100 + int((YourTemp-(-125))/5)

For 08/01/01 100 60.6 this gives 8000 + 0 + 37 = 8037
for 09/01/01 2200 2 this gives 9225

use that formula alongside your table of values to generate bin codes for each value

*** note that I'm changing my temp offset from -125 to -122 for the following examples, so that 97.1 and 102.0 are in the same bin ***

Then set up your summary tables:
next to your temp bins, use the above function to generate the bin digits for temp
above your month bins, type in the bin digit for the month
for each table, type in the bin digit for the hours considered

your table above would then look something like this:
1 (blank) 1 2 3
(blank) 00-08 Jan Feb Mar
126 102 0 0 0
121 97 0 0 0
116 92 0 0 0
111 87 0 0 0
106 82 0 0 0

then write out the countif() formula to compare your list values to the value that goes in whichever bin you're trying to fill.

example for the cell corresponding to 00-08 january 102, assuming the 1 at the top left of the table is in cell F1, so 00-0759 january 97-102 value goes in cell H3
(00-08 january 102 means a bin code of 1000+0+44 =1044):

=countif($A$1:$A$8760,&quot;=1044&quot;)

blah.
 
Thanks ivymike for pointing me in the right direction. The manual method involves a lot of manual highlighting and the frequency function.

I've tried out your method, but am a bit confused. I understand the month and the hour portion of the bin code, but don't understand the temperature portion. How's that -125 fit in the picture?

Also, I manually verified the highest temperature bin, and I'm quite off when compared to this method. I'm pretty sure it has something to do with the offset.
 
the -125 ensures that you don't have any negative-numbered bins. If you omit the -125, then bin 0 starts at 0degF, and -1degF falls in bin -1. If you include the -125, then -125degF falls in bin 0. Using -122 instead of -125 gives you a 97-102 bin.


 
heheh... corrections section...

I messed up when I typed in my countif example. I meant to use a formula instead of &quot;=1044&quot;. The formula should have been
=COUNTIF($D$1:$D$3,H$1*1000+$F$1*100+$F3)
assuming that F1 is the top left cell, and contains the value 0. D1:D3 are the cells with the sample data converted to a bin code. H1 is the month code (1 for jan) and F3 is the temp code.

Oh, I noticed that I put some incorrect info in the earlier sample table - I must have left out a set of parentheses in my temperature bin number calculation. 97 to 102 should be bin 44, using the -122 offset. If I use -125 and leave out the parentheses, I get int(g4-(-125)/5)= 127.

more corrections to come, I'm sure.

 
I would rather use Dynamic Tables, which is the underestimated and almost unknown tool, that Excel Provides to create cross references tables from raw data as abramos mentioned.
Of course, some previous preparation, as described by ivymike, is required:
- Add three new columns yo your sheet (Month, Hour Range & temp Range)
- Extract Month Number from Date , function Month
- Convert Hour to Range with a Nested IF (or an INT division)
- Convert Temp Value to the Temp range. I used a separate sheet for ranges and a search function
- Then build the dynamic table. This can be refreshed any time you get new raw data, so you need to solve the problem once!!

Regards
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor