Skip to content
Advertisement

Displaying Unique Temperature Values for the US

I am looking to do something similar to what I did in R (below) in SQL:

clim_data %>%
select(Year, AverageTemperature, State) %>%
group_by(Year,State) %>%
summarize(value = mean(AverageTemperature), .groups = 'drop') -> clim_data3

colnames(clim_data3)[2] <- "region" 
clim_data3$region<-tolower(clim_data3$region)

clim_data3 %>%
filter(Year==1900) -> clim_data1900
clim_data1900<-clim_data1900[,2:3]

The output of this code looks like this:

region  value
<chr>   <dbl>
alabama 17.059167
alaska  -5.146500
arizona 15.742917
arkansas    15.893417
california  14.51575

So far in SQL I have managed to output a single state using the following code:

select distinct year, round(avg(AverageTemperature) over (partition by year),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state = 'Alabama' and
    year = 1900

The output looks like this:

year    avgTemp   Country        State
1900    17.06   United States   Alabama

However, I am unable to get the unique avgTemp for each state. When I take it to more than one state, I get the same avgTemp for all states in that query. So if I run a query like this:

select distinct year, round(avg(AverageTemperature) over (partition by year),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state like 'A%' and
    year = 1900

I get the average for those states.

year    avgTemp   Country        State
1900    15.15   United States   Alaska
1900    15.15   United States   Arizona
1900    15.15   United States   Arkansas
1900    15.15   United States   Alabama

I am assuming I need to write a subquery of some sort to iterate through each state and give the avgTemp. I tried to partition over the state as well and it didn’t give me what I wanted. My overall goal is to print out each states avgTemp for the given year.

The dataset contains multiple values for each year and state:

select year, round(avg(AverageTemperature) over (partition by year),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state like 'A%' and
    year = 1900

output: 
year    avgTemp  Country         State
1900    15.15   United States   Alabama
1900    15.15   United States   Alabama
1900    15.15   United States   Alabama
1900    15.15   United States   Alabama

With more Alabama entries and the other ‘A’ states below it, so I think using distinct year is imperative. I am just stuck and expanding the single state out into multiple without then taking the average of all the states in the query. The data set can be found here: https://www.kaggle.com/berkeleyearth/climate-change-earth-surface-temperature-data?select=GlobalLandTemperaturesByState.csv

Thanks for the help!

Advertisement

Answer

Since you are looking for the average temperature for each state, you should add the State column to the partition by clause.

Try:

select year, round(avg(AverageTemperature) over (partition by year,State),2) as avgTemp, Country, State
from dbo.landTemps
where AverageTemperature is not null and 
    Country = 'United States' and 
    state like 'A%' and
    year = 1900
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement