Skip to content
Advertisement

Is there an other way to add calculated information to a table?

I have a table on BigQuery with some information and need to create another table with aggregated information by name with columns that have values with some condition.

Here is an example of a table:

CREATE TABLE EMP (
ID INT,
NAME CHAR,
ORDER_ID INT,
VALUE INT
);

INSERT INTO EMP VALUES (7369,'SMITH',1,5);
INSERT INTO EMP VALUES (7499,'ALLEN',2,10);
INSERT INTO EMP VALUES (7521,'JONES',3,15);
INSERT INTO EMP VALUES (7566,'JONES',4,5);
INSERT INTO EMP VALUES (7568,'JONES',5,10);

Here is simple aggregate by name:

SELECT name as client_name, min(order_id) as f_order,
max(order_id) as l_order, sum(VALUE) as total_order_value
FROM emp
GROUP BY name

Output:

client_name|f_order|l_order|total_order_value    
ALLEN      |2      |2      |10
JONES      |3      |5      |30
SMITH      |1      |1      |5

And needs add one more column “f_order_value” which has value from column “VALUE”, when f_order=order_id:

client_name|f_order|l_order|total_order_value|f_order_value    
ALLEN      |2      |2      |10               |10
JONES      |3      |5      |30               |15
SMITH      |1      |1      |5                |5

So try create virtual table and works with it, but it doesnt work due to i don’t use any aggragations in code, also I do not fully understand how to use the virtual table:

with first_table as (SELECT name as client_name,
min(order_id) as f_order, max(order_id) as l_order,
sum(VALUE) as total_order_value
FROM emp
GROUP BY name)
select first_table.*, IF(f.f_order=e.order_id, o.VALUE,null) as
order_value from first_table f
join EMP e on f.client_name=e.name group by name

Error:

Star expansion expression references column site which is neither grouped nor aggregated at

Advertisement

Answer

I would go with (for BigQuery Standard SQL)

#standardSQL
SELECT name AS client_name, 
  ARRAY_AGG(STRUCT(order_id AS f_order, value AS f_order_value ) ORDER BY order_id LIMIT 1)[OFFSET(0)].*,
  MAX(order_id) AS l_order, 
  SUM(VALUE) AS total_order_value
FROM `project.dataset.emp`
GROUP BY name

if to apply to sample data from your question – result is

Row client_name f_order f_order_value   l_order total_order_value    
1   ALLEN       2       10              2       10   
2   JONES       3       15              5       30   
3   SMITH       1       5               1       5    
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement