What Is a Data Science Tournament?

Daniel Morales
May 21, 2021


Unlike web or mobile development, data science has a relatively easy way to measure outcomes: evaluation metrics. In software development, the outcome stakeholders expect depends on a number of subjective things, such as user onboarding, UX/UI or security.

In software, that evaluation does not come down to a single objective metric. In data science the opposite is true: we can choose an evaluation metric for a given problem, which makes a model easy to evaluate, because the result is always an objective value that reflects its quality. Choosing the right metric is a problem of its own, but a separate one.
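As a minimal sketch of what "an objective value" means here, the snippet below computes two common metrics by hand. The article does not say which metrics the tournaments use, so `accuracy` and `rmse` are only illustrative choices, and the sample labels are made up.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (higher is better)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error for numeric targets (lower is better)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up labels and predictions, purely for illustration
labels = [1, 0, 1, 1]
predictions = [1, 0, 0, 1]
print(accuracy(labels, predictions))  # 3 of 4 correct -> 0.75
print(rmse([3.0, 5.0], [2.0, 7.0]))
```

Note that the two metrics point in opposite directions (higher accuracy is better, lower RMSE is better), which is one reason the choice of metric has to be fixed before models can be compared.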

Those evaluation metrics make it easy to measure the skills of data scientists and to evaluate the effectiveness of their models. That is why data science tournaments were born.

[Image: tournament brackets]


See tournaments here: https://www.datasource.ai/en/home/data-science-tournaments

In our case, these are competitions funded by the competitors themselves. The competition is decided through playoffs, the best players win the prize pool, and everybody else receives the winners' ML models. Test your competitive spirit, experience an adrenaline rush and prove your data skills.

Community-funded


Unlike other competitions, our tournaments are community-funded. If you love competitions as much as we do and have a competitive spirit, why not contribute to enter the game and then share the total money raised among the winners? That's our philosophy: we want to change the way competitions are funded and run, and we want everyone to win something.


Entry Fee


You can join the tournament with an entry fee ranging from USD 10 to USD 300. You choose the amount with which you want to support the tournament and the community. Joining has many benefits: you learn applied machine learning, sharpen your competitive spirit, compare your skills with other data scientists, receive the winners' machine learning models, and of course have a chance to win the final prize.


Timeline


The tournaments take place over a short period of time. We usually have 1 month to raise the funding goal we set for starting the tournament. Then come 2 to 3 weeks of regular season, followed by the most exciting stage of all: the playoffs! Each playoff stage (quarterfinals, semifinals and finals) lasts one week.


Regular Season


During this stage, all competitors submit their models and our platform dynamically displays the top 8 positions; the competitors in those positions automatically move on to the next round (quarterfinals). Your goal in this stage is to be among the 8 best scores. Registration remains open during this stage, so you can invite and challenge your friends and colleagues.
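The top-8 cut described above can be sketched as a simple sort. This is only an illustration, not the platform's code: the function name `top_qualifiers`, the competitor names, the scores, and the assumption that a higher score is better are all hypothetical.

```python
def top_qualifiers(best_scores, n=8, higher_is_better=True):
    """Return the n best (name, score) pairs, best first.

    best_scores maps each competitor to their best regular-season score.
    Ties keep the dictionary's insertion order, because sorted() is stable.
    """
    ranked = sorted(
        best_scores.items(),
        key=lambda item: item[1],
        reverse=higher_is_better,
    )
    return ranked[:n]

# Made-up regular-season leaderboard with ten competitors
best_scores = {
    "ana": 0.91, "ben": 0.88, "cho": 0.93, "dev": 0.79, "eli": 0.85,
    "fay": 0.90, "gus": 0.82, "hal": 0.87, "ivy": 0.80, "jon": 0.86,
}

qualifiers = top_qualifiers(best_scores)
print([name for name, _ in qualifiers])
```

For an error-style metric where lower is better, the same function can be called with `higher_is_better=False`.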

Playoffs


This is the start of an exciting and completely new stage in data science competitions. The system will assign you an opponent with whom you will compete head-to-head for one week. Your goal in this stage is to beat your direct opponent in order to advance to the next stage, all the way to the grand final. The system picks the better score of the two competitors in each match (not the best score across all competitors).
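A head-to-head round could be modeled roughly as follows. The article does not say how opponents are assigned, so the best-versus-worst seeding in `pair_by_seed` is purely an assumption, as are the names, the scores, and the tie rule.

```python
def pair_by_seed(seeds):
    """Pair best-vs-worst: [s1, ..., s8] -> [(s1, s8), (s2, s7), ...].

    ASSUMPTION: the real platform may assign opponents differently.
    """
    if len(seeds) % 2:
        raise ValueError("need an even number of competitors")
    return [(seeds[i], seeds[-1 - i]) for i in range(len(seeds) // 2)]

def match_winner(pair, stage_scores, higher_is_better=True):
    """Compare only the two competitors in the match, not the whole field.

    On a tie, the first name in the pair advances (an arbitrary choice here).
    """
    pick = max if higher_is_better else min
    return pick(pair, key=lambda name: stage_scores[name])

# Made-up seeds (best first) and quarterfinal scores
seeds = ["cho", "ana", "fay", "ben", "hal", "jon", "eli", "gus"]
quarterfinal_scores = {
    "cho": 0.90, "gus": 0.92, "ana": 0.89, "eli": 0.84,
    "fay": 0.88, "jon": 0.88, "ben": 0.81, "hal": 0.86,
}

matches = pair_by_seed(seeds)
semifinalists = [match_winner(m, quarterfinal_scores) for m in matches]
print(matches)
print(semifinalists)
```

In this sketch "ben" is eliminated with 0.81 while "eli" is eliminated with 0.84, and "gus" knocks out the top seed: what matters is beating your direct opponent, not your rank across the field.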

Podium


At the end we will have a podium of 3 winners. In the grand final, the two best competitors of the whole tournament face each other for a week; the one with the best score wins and takes 1st place, and the opponent takes 2nd. 3rd place goes to the semifinalist with the best score among those who did not advance to the final. All this is done automatically by our platform.
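The podium rule can be written down in a few lines. This is a hedged sketch rather than the platform's implementation: the function `podium`, the competitor names, and the scores are hypothetical, and a higher score is assumed to be better.

```python
def podium(final_scores, semifinal_loser_scores, higher_is_better=True):
    """Return (first, second, third) following the rule described above.

    final_scores: scores of the two finalists in the grand final.
    semifinal_loser_scores: semifinal scores of the two competitors
    who did not advance to the final.
    """
    if len(final_scores) != 2:
        raise ValueError("the grand final has exactly two competitors")
    pick = max if higher_is_better else min
    first = pick(final_scores, key=final_scores.get)
    second = next(name for name in final_scores if name != first)
    third = pick(semifinal_loser_scores, key=semifinal_loser_scores.get)
    return first, second, third

# Made-up scores for illustration
final_scores = {"gus": 0.94, "ana": 0.95}
semifinal_loser_scores = {"fay": 0.87, "hal": 0.90}

print(podium(final_scores, semifinal_loser_scores))
```

Note that third place is decided from semifinal scores, so no extra match is needed between the two eliminated semifinalists.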


Datasets


One of the peculiarities of our tournaments is that each stage has new datasets (new observations). The data science problem stays the same, but we release new observations, so competitors must re-train their models on them. This faithfully simulates the arrival of new real-world data and the improvement of a model based on it. It also keeps the excitement high, as no one's position is ever completely secure!

Release Datasets


Depending on the stage of the tournament, we will provide a TrainQuarterfinals.csv or a TrainSemifinal.csv file, and we will also share a TestRelease.csv with the true labels of the immediately preceding stage. The purpose is to let competitors use that data to re-train their models with new observations. It also keeps each stage transparent, since every competitor can test their model against the true labels.
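One way to use the released files is to stack the new training rows with the previous stage's labeled test rows before re-training. The sketch below uses in-memory stand-ins for TrainQuarterfinals.csv and TestRelease.csv; the column names `id`, `feature` and `label` are assumptions, since the article does not describe the file layout.

```python
import csv
import io

def read_rows(file_obj):
    """Read a CSV file object into a list of dict rows."""
    return list(csv.DictReader(file_obj))

def combine_for_retraining(stage_train, released_test):
    """Stack both row lists after checking they share the same columns."""
    if stage_train and released_test:
        if stage_train[0].keys() != released_test[0].keys():
            raise ValueError("column mismatch between the two files")
    return stage_train + released_test

# In-memory stand-ins for the real files (made-up contents and columns)
train_quarterfinals = io.StringIO(
    "id,feature,label\n101,0.4,1\n102,0.9,0\n"
)
test_release = io.StringIO(
    "id,feature,label\n1,0.2,0\n2,0.7,1\n3,0.5,1\n"
)

rows = combine_for_retraining(
    read_rows(train_quarterfinals), read_rows(test_release)
)
print(len(rows))  # 2 new training rows + 3 released rows -> 5
```

With the real files, the `io.StringIO` objects would be replaced by `open("TrainQuarterfinals.csv", newline="")` and `open("TestRelease.csv", newline="")`. The released true labels can also be compared against your earlier predictions to verify the score of the previous stage.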

Transparency


At the end of the tournament, once the 3 winners have been determined, all participants will receive the finalists' models in notebook format. This serves two purposes: first, to ensure transparency about the winners (since it is a community-funded tournament), and second, to let participants learn from the best!

Challenge your Friends


You can share a link to invite and challenge your friends and colleagues to join the tournament. This will motivate you even more: you will be able to learn as a group, share knowledge and comments, and help each other compete for the final prize. However, everyone must submit their own models individually.

Join our tournaments here: https://www.datasource.ai/en/home/data-science-tournaments

We hope you can join the tournaments and continue your learning path in data science!
Join our private community on Discord

Keep up to date by participating in our global community of data scientists and AI enthusiasts. We discuss the latest developments in data science competitions, new techniques for solving complex challenges, AI and machine learning models, and much more!