Supposed I have the collections that contain
Event type | Product |
---|---|
View | water_heater |
View | water_heater |
Purchase | water_heater |
View | television |
View | television |
View | television |
Purchase | television |
There are two types of fields in the first column, (views and purchase) The second column contains different kinds of products. The above data has been imported as a collection in MongoDB database. I want to perform the data analysis, by calculating the highest conversion rate of product ( 100 customers view 1 customer buy = 1%). By using the aggregation, it needs to perform matching for view and purchase. Then calculate the number of purchases and view the group based on the product. Lastly, divide the number of purchases by the number of views. As a beginner of MongoDB, I found out that it is quite hard and would ask if there is any ways to do it?
Advertisement
Answer
You can achieve this with a pretty standard pipeline, our strategy will be to $group
documents of the same product, while using the $cond
operator to sum up views and purchases. Lastly we just need to project the relevant ratio as required.
like so:
db.collection.aggregate([ { $group: { _id: "$product", purchases: { $sum: { $cond: [ { $eq: [ "$event_type", "purchase" ] }, 1, 0 ] } }, views: { $sum: { $cond: [ { $eq: [ "$event_type", "view" ] }, 1, 0 ] } } } }, { $project: { _id: 0, product: "$_id", ratio: { $divide: [ "$purchases", "$views" // can we have a purchase without a view? can views be 0? if so need to handle this edge case. ] } } }, { $sort: { ratio: -1 } } ])