Hello everyone, please post any questions about the Hackathon 2 problem statement in this thread.
A few queries about the dataset:
There are duplicate rows in the training data and the test data (302 rows are duplicates, i.e. they have the same user_id, aov and category). What is the context? Or can they be dropped?
Is there a hidden test dataset that will be used to calculate the metrics and the leaderboard?
Can external data or domain knowledge be used to make predictions?
I am unable to post a topic or reply to posts in the GHF hackathon 2 Category, hence I am posting this on Site-Feedback.
I am also looking for a way to calculate the given metrics, but the target feature contains only one category.
Based on that one category you have to calculate the three different metrics. We have already explained the metrics in the problem statement.
But to calculate the metric, we need three true values.
You have to predict the top three categories from which a person is most likely to buy.
Yes, but sir, to calculate the metrics we need three true values.
You predict the top 3 categories in order of decreasing probability. The metrics check three things:
1. Did you make a prediction?
2. Is one of your predictions actually the correct category?
3. What is the rank of the correct category among your predictions, if it is there at all?
For example, you have a user_id = 123456. The correct category for it is “Fashion”. You predicted, “Phones, Cars, Fashion” (in this particular order).
1. The recall is 1 (because you made a prediction).
2. The precision is also 1 (because one of your predicted categories is the correct category).
3. The MRR is 1/3 (because the correct category was the third recommendation you gave).
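The three checks above can be sketched in a few lines of Python. This is only an illustration of the explanation in this thread, not the official scoring code; the function name `score_row` and the returned dictionary keys are made up for the example.

```python
from typing import Sequence


def score_row(true_category: str, predicted: Sequence[str]) -> dict:
    """Score one user_id as described in this thread (assumed, not the official scorer)."""
    top3 = list(predicted)[:3]

    # Recall: 1 if any prediction was made for this user_id.
    recall = 1.0 if top3 else 0.0

    # Rank (1-based) of the correct category among the predictions, if present.
    rank = top3.index(true_category) + 1 if true_category in top3 else None

    # Precision: 1 if one of the predicted categories is the correct one.
    precision = 1.0 if rank is not None else 0.0

    # MRR: reciprocal of the rank of the correct category, 0 if it is missing.
    mrr = 1.0 / rank if rank is not None else 0.0

    return {"recall": recall, "precision": precision, "mrr": mrr}


# The example from this post: the correct category is "Fashion",
# and it was predicted in third place.
example = score_row("Fashion", ["Phones", "Cars", "Fashion"])
print(example)
```

For the example user this gives recall 1, precision 1 and MRR 1/3, matching the numbers listed above.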
I hope it helps.
Thanks. It clarifies all my doubts.
My precision and MRR scores are the same number when I make a submission.
It is a coincidence.
I made 8 submissions, and all of them give the same value for precision and MRR.
Have you given the top 3 predictions for each user_id or just the top 1 prediction?
I predicted the top 3 categories.
I think that is possible only when every correct prediction is the first of your top 3. In that case, for any given example, your precision and MRR are either both 1 or both 0.
This is happening because of your approach. No need to worry.
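A small sketch of why the two numbers can coincide, assuming the leaderboard averages per-row scores (an assumption, the backend code is not shown in this thread). The category lists below are hypothetical.

```python
def row_scores(true_category, top3):
    """Return (precision, mrr) for one row, per the definitions in this thread."""
    rank = top3.index(true_category) + 1 if true_category in top3 else None
    precision = 1.0 if rank else 0.0
    mrr = 1.0 / rank if rank else 0.0
    return precision, mrr


def averages(pairs):
    """Average precision and MRR over (true category, predictions) pairs."""
    scores = [row_scores(true, preds) for true, preds in pairs]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


# Hypothetical data: every hit is at rank 1, every other row is a miss.
rank1_only = [
    ("Fashion", ["Fashion", "Cars", "Phones"]),
    ("Cars", ["Phones", "Fashion", "Toys"]),
]

# Hypothetical data: one hit is at rank 2, so MRR drops below precision.
mixed = [
    ("Fashion", ["Fashion", "Cars", "Phones"]),
    ("Cars", ["Phones", "Cars", "Toys"]),
]

eq_precision, eq_mrr = averages(rank1_only)
ne_precision, ne_mrr = averages(mixed)
print(eq_precision, eq_mrr)
print(ne_precision, ne_mrr)
```

Per row, MRR can never exceed precision, and the two are equal only when the correct category is in first place or is missing entirely. So identical averages suggest that the second and third predictions are never the correct ones.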
Hi @Gaikwad_Sangram_Dash, Thanks for raising the query.
- I did understand the duplicate part. When I inner-join the training and test data, I do not get any duplicates.
- For leaderboard calculation, we have the backend code running; it matches your answer with the correct answers.
- Yes, you can use external data and domain knowledge to make predictions.
- This issue has been resolved.
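Note that an inner join between the training and test data checks for overlap across the two files, while the original query was about repeated rows with the same user_id, aov and category. A minimal sketch of checking for such repeats with the standard library follows. The sample rows and the helper names `find_duplicates` and `drop_duplicates` are hypothetical, and whether dropping duplicates is appropriate was not confirmed in this thread.

```python
from collections import Counter

# Hypothetical sample rows; the real rows would come from the hackathon data files.
rows = [
    {"user_id": 1, "aov": 250.0, "category": "Fashion"},
    {"user_id": 1, "aov": 250.0, "category": "Fashion"},
    {"user_id": 2, "aov": 90.0, "category": "Phones"},
]

# Columns named in the query as defining a duplicate.
KEY = ("user_id", "aov", "category")


def find_duplicates(rows, key=KEY):
    """Return {key tuple: count} for every key that occurs more than once."""
    counts = Counter(tuple(row[col] for col in key) for row in rows)
    return {k: n for k, n in counts.items() if n > 1}


def drop_duplicates(rows, key=KEY):
    """Keep the first occurrence of each key, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        k = tuple(row[col] for col in key)
        if k not in seen:
            seen.add(k)
            unique.append(row)
    return unique


duplicates = find_duplicates(rows)
deduped = drop_duplicates(rows)
print(duplicates)
print(len(rows), "->", len(deduped))
```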
When can we expect the opening of the leaderboard?
I am getting an “Unable to compute score.” error message when making submissions. I have also tried submitting the sample submission file (without altering it); however, the error remains the same.
Hi @Prabhnoor_Singh, We have resolved the issue. Please try making a submission now.
The leaderboard will be opened in a couple of days!