Skip to content
Advertisement

How can I create a column which computes only the change of other column on redshift?

I have this dataset:

The dataset has the product the company sells and the customers who saw the product. A customer can see more than one product, but the combination product + customer doesn’t have any repetition. I want to get how many people bought the product before the customer sees it.

This would be the perfect output:

As you can see, when the customer 122225 saw the product he wanted, two people have already bought it. In the case of customer 323455, two people have already bought the product A.

I think I should use some window function, like lag(). But lag() function won’t get this “cumulative” information. So I’m kind of lost here.

Advertisement

Answer

This looks like a window count of non-null values of buyer_position over the preceding rows:

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement