Community-funded

Our tournaments, like no other, are community-funded. If you love competitions as much as we do, and you have a competitive spirit, why not enter the game and then share the total money raised among the winners? We want to change the way competitions are funded and executed, and we want everyone to have fun, and win something meaningful in the end.

Entry Fee

You can join the tournament for an entry fee of just USD $10. With your entry fee, you directly support the tournament and the community. The benefits of joining are many: learn applied Machine Learning, stoke your competitive spirit, compare your skills with other data scientists, receive the winners' Machine Learning models, and of course, have a good time as you compete to win the final prize!

Timeline

The tournaments take place over a short period of time. We allocate 1 month to raise the amount we set as the goal to proceed with the tournament. Then we have 2 to 3 weeks of regular season, followed by the most exciting stage of all: the playoffs! Each playoff stage (quarterfinals, semifinals and finals) runs for one week.

Regular Season

During this stage all competitors participate by submitting their models, and our platform dynamically displays the top 8 positions. These are the competitors who would automatically move to the next round (quarterfinals). Your goal in this stage is to wind up with one of the top 8 scores. Registration remains open during this stage, so you can invite and challenge your friends and colleagues.

Playoffs

This is the start of an exciting and completely new stage in data science competitions. The system will assign you an opponent with whom you will compete head-to-head for one week. Your goal in this stage is to beat your direct competitor in order to advance to the next stage, all the way to the grand final. The winner of each pairing is the competitor with the better score of the two (scores are not ranked across all competitors).

Podium

At the end we will have a podium of 3 winners. The grand final is the stage where the two best competitors of the whole tournament face off against each other for a week. The best score takes 1st place, while the losing opponent takes 2nd. 3rd place is awarded to the owner of the top score among the semifinalists who did not advance to the final. All this is done automatically by our platform.

Challenge your Friends

You have the opportunity to share a link to invite and challenge your friends and colleagues to join the tournament. This will encourage you even more, as you will be able to learn as a group, share your knowledge and comments, and ultimately help each other gain maximum benefit from your participation - as well as win the final prize. Please note, however, that everyone must submit their own models individually.

Release Datasets

We will have a TrainQuarterfinals.csv or a TrainSemifinal.csv file depending on the stage of the tournament, and we'll also share a TestRelease.csv with the true labels of the immediately preceding stage. The purpose is to let competitors re-train their models with the new observations. It also allows us to maintain the transparency of every stage, since each competitor can test their model against the true labels.

Transparency

At the end of the tournament, once we have declared the 3 winners, all participants will receive the finalists' models in Notebook format. This has a dual purpose: the first is to ensure the transparency of the results (since it is a community-funded tournament), and the second is so that participants can learn from the best!

Datasets

One of the unique features of our tournaments is that each stage features a release of additional/expanded datasets (new observations). This means that the data science problem stays the same, but we release a set of new observations. This resets the playing field, so the competitors must re-train the models based on the newly supplied data.

Individually

In our tournaments you can only participate individually (at least for now). This keeps things simple and reduces the amount of coding required, allowing us to launch the tournament sooner! We will certainly consider adding team play to future iterations of our tournaments, based on your feedback.

Leagues

Soon we will have different leagues, based on the complexity of the problems and the skills needed to solve them. These will be divided into Expert and Beginner leagues. Look for the announcement of this new feature soon!

Frequently Asked Questions

What is the minimum contribution to participate?

The minimum contribution is USD $10 and you can contribute up to USD $300. The funds raised will go to the winners. There will always be transparency throughout the whole process, both about who passed to the next round and how they did it (with the Release dataset), and in the final, since the winning models will be made public. In fact, these models will be shared with those who participated, which is very valuable in itself: even if you are not among the winners, you will still have a chance to learn a great deal from them and their winning models!

What do I get as compensation for my money?

At the end of the tournament we will share with you the winning Machine Learning models in Notebook format. This is for educational purposes, as you will be able to study them and learn from the best! You will also learn about the process of participating in a tournament, do your best to advance to the next stage, and enjoy an awesome adrenaline rush once you are competing in the playoffs!

How does DataSource.ai choose the problems for the tournament?

We try to make the data science problems fun, educational and valuable, but above all, we try to make them simulate real life situations. That is why we have decided to do them in different stages, so we can simulate new observations, and re-train the models based on those new observations. For now, we will be launching tournaments with classical Machine Learning (tabular data), but we hope to soon make them more complex with deep learning problems.

How complex are the tournament problems?

They have intermediate complexity, since they are tabular problems. That is our goal at least for the first few tournaments. Once we have run several of them successfully, we will make the move to tournaments with more complex problems that can be solved with deep learning tools.

How often are tournaments run?

We constantly have tournaments open to join and support. We try to have at least one active tournament per month. Of course, for more data science and ML practice, as well as more opportunities to earn cash, we encourage you to join our regular, sponsored competitions.

What is the length of each tournament?

Each tournament lasts between 1 and 1.5 months. The regular season lasts between 2 and 3 weeks. The playoffs, which consist of the quarterfinals, semifinals and final, last one week each.
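To see how those durations add up, here is a minimal sketch that lays out the stage dates for one tournament. It assumes the stages run back-to-back with no gaps and that the funding period is not counted; the function name `build_schedule` and the start date are hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta


def build_schedule(start, regular_season_weeks=2):
    """Return (stage, first_day, last_day) tuples for one tournament.

    Assumption: stages follow each other immediately, without breaks.
    """
    if regular_season_weeks not in (2, 3):
        raise ValueError("the regular season lasts 2 or 3 weeks")
    stages = [
        ("regular season", regular_season_weeks),
        ("quarterfinals", 1),
        ("semifinals", 1),
        ("final", 1),
    ]
    schedule = []
    current = start
    for name, weeks in stages:
        last_day = current + timedelta(weeks=weeks) - timedelta(days=1)
        schedule.append((name, current, last_day))
        current = last_day + timedelta(days=1)
    return schedule


for stage, first_day, last_day in build_schedule(date(2024, 1, 1), 3):
    print(f"{stage}: {first_day} to {last_day}")
```

Under these assumptions a tournament spans 5 weeks with a 2-week regular season and 6 weeks with a 3-week one, which matches the 1 to 1.5 month range.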

What is the purpose of having the tournament in different stages?

Running in stages has several purposes and benefits. The first one is to simulate the input of new real data as time goes by. This new data would be to evaluate the generalizations of the completed models and in turn, will serve to re-train the models for the next phase. Another benefit is to get a model that generalizes particularly well for all stages, so it will be tested in different scenarios, and will eliminate the overfitting that we try so hard to avoid in real life. On the other hand, there is the thrill of being able to advance to higher levels, and go head-to-head with other data scientists. Enjoy the adrenaline rush!

How does the funding stage work?

During this period we will be promoting participation in and funding of the tournament. We will also have a prize pool goal, which could range from USD $500 to USD $5,000 or more, according to demand. In order to run a tournament we must have a minimum of 8 participants. This is needed to run the playoff stages and to match the competitors correctly. The amount raised could be larger or smaller than the prize pool goal. The important thing is to reach a minimum of 8 participants to be able to run the tournament. If the target number of participants is not reached by the given deadline, we could postpone the starting date of the tournament. If we still do not reach 8 participants after an acceptable period of time, the money raised will be returned in full to those who supported it.
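The go or refund rule can be sketched in a few lines. This is only an illustration, assuming the check is made once the final (possibly postponed) deadline has passed; the function name `funding_outcome` and the result keys are hypothetical, not part of our platform.

```python
MIN_PARTICIPANTS = 8  # minimum needed to fill the quarterfinal brackets


def funding_outcome(contributions):
    """Decide what happens at the end of the funding stage.

    contributions maps a participant name to the amount paid in USD.
    Assumption: this runs after the final deadline, so no further
    postponement is considered here.
    """
    total = sum(contributions.values())
    if len(contributions) >= MIN_PARTICIPANTS:
        # Enough players: the tournament runs, whatever the amount raised.
        return {"status": "run", "total_raised": total, "refunds": {}}
    # Too few players: everyone gets their contribution back in full.
    return {"status": "refund", "total_raised": total,
            "refunds": dict(contributions)}


backers = {f"player{i}": 10.0 for i in range(1, 9)}
print(funding_outcome(backers)["status"])
```

Note that the decision depends only on the number of participants, not on whether the prize pool goal was met.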

How does the regular season work?

In the regular season stage of the tournament, you will be playing against all the tournament participants. You will be able to track the action through a public leaderboard, sorted in ascending or descending order depending on the evaluation metric, which will clearly indicate who would be advancing to the next round (the top 8). Your goal in this stage is to wind up in one of the top 8 positions. During this stage it is still possible to accept new competitors and add money to the prize pool, because everyone is still competing against everyone else. So, we encourage you to invite and challenge your friends and colleagues to participate and win!
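As a rough illustration of how such a leaderboard could pick the qualifiers, the sketch below sorts the best score per participant and keeps the first 8. The function name `top_eight` and the `higher_is_better` flag are hypothetical, and tie-breaking is an assumption (ties keep their input order), since it is not specified here.

```python
def top_eight(scores, higher_is_better=True):
    """Return up to 8 (name, score) pairs in leaderboard order.

    scores maps a participant name to their best score so far.
    higher_is_better is True for metrics such as accuracy and False
    for error metrics such as RMSE.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1],
                    reverse=higher_is_better)
    return ranked[:8]


leaderboard = {f"player{i}": round(0.70 + i * 0.02, 2) for i in range(10)}
for name, score in top_eight(leaderboard):
    print(name, score)
```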

How does the quarter final season work?

This is where the real excitement begins! The system will randomly assign pairs of competitors, in 4 groups of 2 people. At this point you start competing directly with your opponent, not with everyone else. This allows you to increase your adrenaline every time you submit a new solution, as you want to prove that you are the best in that bracket! A new dataset called QuarterfinalsTest.csv will be immediately released, with the samples you must now predict in the new round. This file will contain the original data plus the true labels from the regular season. This is where we "reset" the competition, as you must re-train your model with more data, and make predictions on the newly expanded data set, all associated with the same data science problem. Our system will highlight the best score between each two competitors in yellow. Here you must take into account the timing, as the quarter finals will only last 1 week! At this point, no new participants or new amounts of money will be admitted, and we will announce the final prize pool to be distributed among the winners.
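The random draw and the head-to-head comparison can be sketched as follows. This is a simplified illustration, not our platform's code: the names `draw_brackets` and `bracket_winner` are hypothetical, and the optional `seed` is there only to make the example reproducible.

```python
import random


def draw_brackets(competitors, seed=None):
    """Randomly pair competitors into head-to-head brackets."""
    if len(competitors) % 2 != 0:
        raise ValueError("need an even number of competitors")
    rng = random.Random(seed)
    shuffled = list(competitors)  # copy so the input is left untouched
    rng.shuffle(shuffled)
    return [(shuffled[i], shuffled[i + 1])
            for i in range(0, len(shuffled), 2)]


def bracket_winner(pair, scores, higher_is_better=True):
    """Return the competitor with the better score within one bracket."""
    pick = max if higher_is_better else min
    return pick(pair, key=lambda name: scores[name])


qualifiers = [f"player{i}" for i in range(1, 9)]
for bracket in draw_brackets(qualifiers, seed=42):
    print(bracket)
```

Each score is compared only against the direct opponent's score, so a competitor can advance even with a lower score than someone eliminated in another bracket.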

How does the semifinal season work?

When the quarter finals are over, the system will automatically recognize the best of each bracket, and will randomly assign a new pair of competitors for this new round! You will continue to compete head-to-head with your new opponent. Get ready for your excitement level to step up every time you submit a new solution, as you want to prove that you are the best in that bracket! A new dataset called SemifinalsTest.csv will be immediately released, with the samples you must now predict. This file will contain the original data plus the true labels of regular season and quarterfinals. Here we "reset" the competition again, as you must re-train your model with more data, and make predictions on the newly expanded data set - once again, all focused on the same data science problem. Our system will highlight the best score between the two. Here you must take into account the timing, as the semi-finals will only last 1 week!

How does the final season work?

The final is where the top two competitors go head-to-head! Get ready for adrenaline-fueled, knock-down, drag-out data science action like you’ve never seen before! Our two finalists will compete for a week, which is divided into two parts: the Final and the Final Shot. In the Final you have the opportunity to re-train your model on the final datasets and send multiple solutions to the system to get your scores. In the Final Shot, you have only ONE opportunity to submit your final solution. This guarantees that the model has correctly generated the predictions, and takes into account all the previous stages. With this last submission you must also send your Notebook through the same submission form. It will be used to validate the results, and these are the notebooks that will be delivered to all the competitors who participated. At the end you will stand on the (virtual) Podium and bask in the glory as the best!

How do we ensure the transparency of the results?

At the end of each stage we will release the true values for these datasets. This allows each competitor to evaluate their own model and check that our scores were correct. But the most important thing will happen at the end of the competition, where the winning models, in Notebook format, will be shared with ALL the competitors, thus maintaining transparency and ensuring the good practices and ethics of the winners. Of course, this also offers the benefit of a great learning opportunity, by allowing all the competitors to study what the winning players have done.
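Once the true values are released, checking a score yourself could look like the sketch below. Everything specific in it is an assumption made for illustration: the column names `id`, `label` and `prediction`, the use of accuracy as the metric, and the tiny in-memory files that stand in for a released dataset and a submission.

```python
import csv
import io


def accuracy_from_release(release_csv, predictions_csv):
    """Fraction of released rows whose true label matches the prediction.

    Both arguments are open text files (or file-like objects).
    Rows missing from the predictions count as wrong.
    """
    truth = {row["id"]: row["label"]
             for row in csv.DictReader(release_csv)}
    preds = {row["id"]: row["prediction"]
             for row in csv.DictReader(predictions_csv)}
    if not truth:
        raise ValueError("release file has no rows")
    correct = sum(1 for row_id, label in truth.items()
                  if preds.get(row_id) == label)
    return correct / len(truth)


# Tiny in-memory stand-ins for a released file and a submission.
release = io.StringIO("id,label\n1,0\n2,1\n3,1\n4,0\n")
submission = io.StringIO("id,prediction\n1,0\n2,1\n3,0\n4,0\n")
print(accuracy_from_release(release, submission))  # 3 of 4 rows match
```

For a real tournament you would swap in the evaluation metric announced for that problem and compare the result with the score shown on the leaderboard.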

What rights do I have over the winning models?

The IP on the winning models is intended for purely academic and educational purposes, not for commercial use. This is in order to encourage participation, as well as to facilitate learning in real life scenarios. If a commercial application is of interest, please contact our team to discuss our options, including a regular data science competition that we can conduct on our platform.

How will the money be divided among the winners?

The prize pool will be distributed as follows: 50% to the winner, 30% to the second place and 20% to the third place.
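As a quick worked example of that split, the sketch below divides a prize pool among the three podium places. The function name `split_prize_pool` and the rounding to cents are assumptions made for illustration.

```python
def split_prize_pool(prize_pool):
    """Split the prize pool 50% / 30% / 20% among the podium places.

    Assumption: amounts are rounded to whole cents; how a leftover cent
    from rounding would be handled is not specified here.
    """
    shares = {"first": 50, "second": 30, "third": 20}
    return {place: round(prize_pool * percent / 100, 2)
            for place, percent in shares.items()}


print(split_prize_pool(1000))
```

For a prize pool of USD $1,000 this gives $500 to first place, $300 to second and $200 to third.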

How is third place chosen?

Third place goes to the competitor with the best semifinal score among those who did not advance to the grand final.
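Putting the podium rules together, a minimal sketch could look like this. The function name `podium`, its arguments and the example names and scores are all hypothetical; the only rules it encodes are that the final decides 1st and 2nd place and that 3rd place is the better semifinal score among the two eliminated semifinalists.

```python
def podium(final_pair, semifinal_losers, final_scores, semifinal_scores,
           higher_is_better=True):
    """Return (first, second, third) for one tournament."""
    pick = max if higher_is_better else min
    first = pick(final_pair, key=lambda name: final_scores[name])
    # The other finalist takes second place.
    second = final_pair[0] if first == final_pair[1] else final_pair[1]
    # Third place compares only the semifinal scores of the two losers.
    third = pick(semifinal_losers, key=lambda name: semifinal_scores[name])
    return first, second, third


print(podium(
    final_pair=("ana", "ben"),
    semifinal_losers=("cy", "dee"),
    final_scores={"ana": 0.91, "ben": 0.93},
    semifinal_scores={"ana": 0.95, "ben": 0.92, "cy": 0.88, "dee": 0.90},
))
```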

How does DataSource.ai make money?

In order to support the platform and the promotion of the platform to keep bringing more participants, we collect a portion of the total amount raised in the competition. Our fee is 20% of the total collected.
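Here is a small worked example of the fee. It assumes, for illustration only, that the 20% is taken from the total raised before the remainder becomes the prize pool; the function name `apply_platform_fee` is hypothetical.

```python
PLATFORM_FEE_RATE = 0.20  # 20% of the total collected


def apply_platform_fee(total_raised):
    """Return the platform fee and what is left of the amount raised.

    Assumption: the fee is deducted first and the remainder forms the
    prize pool. Amounts are rounded to whole cents.
    """
    fee = round(total_raised * PLATFORM_FEE_RATE, 2)
    return {"fee": fee, "prize_pool": round(total_raised - fee, 2)}


print(apply_platform_fee(1000))
```

Under that assumption, USD $1,000 raised would mean a $200 fee and an $800 prize pool.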

Still have questions? Let's talk! Email us at: support@datasource.ai
