How to Build a Fantasy Premier League Team with Data Science

Introduction

Fantasy Premier League (FPL) is a strategy game in which we build a football squad from players in the English Premier League. The goal is to pick the players who will contribute the most in their real matches.

To measure a player’s contribution, we look at the points they earn. Points are calculated from several statistics, such as goals, assists, minutes played, and more. The more points a player earns, the bigger their contribution to a match.

When picking players, we have to respect several constraints:

  • A team consists of 15 players: two goalkeepers, five defenders, five midfielders, and three forwards.
  • The budget for a team is £100 million.
  • We can pick at most three players from any one Premier League club.

How can we build a team within those constraints? Rather than relying on intuition alone, we can get help from data science and mathematics!

This article shows how to build a Fantasy Premier League team with a mathematical technique called linear programming.

Don’t worry if you don’t understand the concept deeply. We will use a Python library called PuLP.

Without further ado, let’s get started!

Optimization and Linear Programming

Before we get into the implementation, let me explain optimization and why we should use linear programming.

Optimization is the process of finding the best outcome for a problem while respecting its constraints. In general, the process is divided into several steps:

  • Get the problem description.
  • Formulate the mathematical program.
  • Solve the mathematical program.
  • Evaluate the result.
  • Finalize the result.

In our case, the problem is to build a football team that scores as many points as possible while staying within the budget.

To formulate the problem mathematically, we need some information:

  • The variables that we care about. For our problem, we need each player’s price, points, team, and position.
  • The objective function. This is the function we want to optimize. In this case, we want the players we pick to score as many points as possible.
  • The constraints. Because FPL has rules on the budget and the number of players, we express those rules as constraints.
  • Lastly, the data that tracks the players’ statistics.

After we have all the information that we need, the next step is to solve the problem. For that, we can use linear programming.

We can use linear programming because both the objective and the constraints are linear expressions of the decision variables. Given those expressions, a solver searches the combinations that satisfy every constraint for the one with the best objective value. Therefore, we get an optimal result.
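To make “linear” concrete, here is one way to write the team selection problem. The notation is introduced here for illustration only: x_i is 1 if player i is picked and 0 otherwise, p_i and c_i are the player’s points and price, P_k is the set of players in position k with required count n_k (2, 5, 5, and 3), T_t is the set of players from club t, and B is the £100 million budget. Because each x_i is restricted to 0 or 1, this is strictly an integer linear program, which linear programming libraries such as PuLP can also model.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{i=1}^{N} p_i x_i \\
\text{s.t.} \quad & \sum_{i=1}^{N} c_i x_i \le B, \\
& \sum_{i \in P_k} x_i = n_k \quad \text{for each position } k, \\
& \sum_{i \in T_t} x_i \le 3 \quad \text{for each club } t, \\
& x_i \in \{0, 1\} \quad \text{for } i = 1, \dots, N.
\end{aligned}
```

The position counts add up to 15, so the squad size rule follows from the position constraints without a separate equation.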

I will not explain the details behind linear programming. If you are interested, you can explore the topic further online. For now, we focus on how to solve the optimization problem using Python.

Implementation

Data source

For the data source, FPL provides an API for accessing its data. We can retrieve data like the players’ statistics, the results of each game week, each team’s performance, and many more. The data itself is in JSON format, so we have to reformat it first.

If you cannot use the API or preprocess the JSON, don’t worry: we can use the preprocessed data from a GitHub repository by vaastav. To download the data, clone that repository with git clone from your terminal.

Load the libraries

After we get the data, the next step is to import the libraries to work on our dataset. We need pandas for analyzing and wrangling the data and PuLP for applying linear programming.

If you don’t have these libraries yet, you can install them with pip. On the terminal, write these commands:

pip install pandas
pip install pulp

In your Jupyter notebook, write these lines of code:
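A plausible import cell, assuming we only need the PuLP names used in the later steps (the exact list is a guess, so trim or extend it as needed):

```python
import pandas as pd  # loading and wrangling the player data

# PuLP building blocks: the problem container, the optimization sense,
# decision variables, a summation helper, and helpers for reading results.
from pulp import LpMaximize, LpProblem, LpStatus, LpVariable, lpSum, value
```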

Load the data

After we load the libraries, the next step is to load the data. We will use the data that contains the players’ statistics for each game week of the current season (2021–22). Let’s write these lines of code:
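A minimal sketch of the loading step. The file path in the comment and the column names (name, position, team, total_points, value, GW) are assumptions about the layout of the cloned repository, so check them against your copy. A tiny inline CSV with made-up rows stands in for the real file so that the snippet runs on its own.

```python
import io

import pandas as pd

# With the cloned repository you would pass a path instead, for example
# "Fantasy-Premier-League/data/2021-22/gws/merged_gw.csv" (assumed location).
sample_csv = io.StringIO(
    "name,position,team,total_points,value,GW\n"
    "Player A,GK,Club 1,2,50,4\n"
    "Player B,FWD,Club 2,9,125,4\n"
    "Player A,GK,Club 1,6,50,5\n"
    "Player B,FWD,Club 2,1,125,5\n"
)

# One row per player per game week.
df = pd.read_csv(sample_csv)
print(df.head())
```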

In this case, we only take the previous game week. At the time of writing, the current game week is game week 6, so we take the data from game week 5. Let’s write these lines of code:

The resulting table has many columns, so we keep only the columns that we actually need.
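Filtering to one game week could look like this, assuming the game week number lives in a column called GW (an assumption about the dataset). The rows are made up.

```python
import pandas as pd

# Made-up rows: two players, each with a game week 4 and a game week 5 entry.
df = pd.DataFrame({
    "name": ["Player A", "Player B", "Player A", "Player B"],
    "GW": [4, 4, 5, 5],
    "total_points": [2, 9, 6, 1],
})

# Keep only game week 5 and renumber the rows from zero.
gw5 = df[df["GW"] == 5].reset_index(drop=True)
print(gw5)
```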

Those columns are the player’s name, club, total points, price (value), and position. The details about these columns will be explained in the next section.

Now let’s write these lines of code:
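A sketch of the column selection on made-up data. The column names are assumptions about the dataset, as is the unit of value: here it is taken to be tenths of £1 million, so 125 would mean £12.5 million.

```python
import pandas as pd

# Made-up game week table with a few extra statistics we do not need.
gw5 = pd.DataFrame({
    "name": ["Player A", "Player B"],
    "position": ["GK", "FWD"],
    "team": ["Club 1", "Club 2"],
    "minutes": [90, 75],
    "goals_scored": [0, 1],
    "total_points": [6, 8],
    "value": [50, 125],
})

# Keep only what the objective and the constraints will use.
columns = ["name", "team", "position", "total_points", "value"]
players_df = gw5[columns].copy()
print(players_df)
```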

Initialize variables

Now we have the data that we need. The next step is to initialize several variables. As I’ve mentioned before, we took several columns from the table. We did that for two reasons: the objective and the constraints.

We take the total points because that is the quantity we want to maximize, while at the same time adhering to the constraints that we have. The team’s name, the position, and the price are the quantities that the constraints are built from.

Now let’s write these lines of code for initializing variables that we need:
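One way to set this up, as a sketch on made-up data: pull each column into a plain list and create one binary decision variable per player. The variable prefix "player" and the list names are arbitrary choices for illustration.

```python
import pandas as pd
from pulp import LpVariable

# Made-up players; value is assumed to be in tenths of £1 million.
players_df = pd.DataFrame({
    "name": ["Player A", "Player B", "Player C"],
    "team": ["Club 1", "Club 2", "Club 1"],
    "position": ["GK", "FWD", "MID"],
    "total_points": [6, 8, 2],
    "value": [50, 125, 60],
})

names = players_df["name"].tolist()
teams = players_df["team"].tolist()
positions = players_df["position"].tolist()
points = players_df["total_points"].tolist()
prices = players_df["value"].tolist()

# One binary decision variable per player: 1 = picked, 0 = left out.
x = LpVariable.dicts("player", range(len(players_df)), cat="Binary")
```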

Initialize the problem

Now we have the variables that we need. The next step is to initialize the LpProblem object that contains our objective and constraints.

We set two parameters: the name of the problem and the sense of the optimization, which tells the solver whether to minimize or maximize. Because we want to maximize the number of points, we pass LpMaximize as the sense.

Here is the code for doing that:
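A minimal sketch; the problem name is an arbitrary label chosen for illustration (PuLP prefers names without spaces).

```python
from pulp import LpMaximize, LpProblem

# The first argument is a label, the second the optimization sense.
prob = LpProblem("fpl_team_selection", LpMaximize)
```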

Define the objective

After we initialize the problem, the next step is to define the objective. The objective of our problem is to maximize the number of points.

Let’s write these lines of code:
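As a sketch with three made-up players, the objective is the sum of each player’s points multiplied by that player’s decision variable. In PuLP, adding a bare expression (one without a comparison) to the problem sets the objective.

```python
from pulp import LpMaximize, LpProblem, LpVariable, lpSum

points = [6, 8, 2]  # made-up points for three players
x = LpVariable.dicts("player", range(len(points)), cat="Binary")
prob = LpProblem("fpl_team_selection", LpMaximize)

# Total points of the selected players; unpicked players contribute zero.
prob += lpSum(points[i] * x[i] for i in x)
```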

Define the constraints

Let’s recall the constraints from FPL:

  • A team consists of 15 players: two goalkeepers, five defenders, five midfielders, and three forwards.
  • The budget for a team is £100 million.
  • We can pick at most three players from any one Premier League club.

Based on the constraints above, we create expressions that describe them mathematically.

Let’s write these lines of code:
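A sketch of the three rules on a synthetic pool of 30 players. The points, prices, club labels, and constraint names are all invented for illustration, and the budget of 1000 assumes prices in tenths of £1 million.

```python
from pulp import LpMaximize, LpProblem, LpVariable, lpSum

# Synthetic pool: 30 made-up players spread over 6 hypothetical clubs.
quota = {"GK": 2, "DEF": 5, "MID": 5, "FWD": 3}  # required count per position
positions = ["GK"] * 4 + ["DEF"] * 10 + ["MID"] * 10 + ["FWD"] * 6
teams = [f"club_{i % 6}" for i in range(len(positions))]
prices = [40 + (i * 7) % 50 for i in range(len(positions))]
points = [1 + (i * 3) % 11 for i in range(len(positions))]
budget = 1000  # £100 million expressed in tenths of a million

x = LpVariable.dicts("player", range(len(positions)), cat="Binary")
prob = LpProblem("fpl_team_selection", LpMaximize)
prob += lpSum(points[i] * x[i] for i in x)  # objective: total points

# Rule 1: the total price of the picked players must fit the budget.
prob += lpSum(prices[i] * x[i] for i in x) <= budget, "budget"

# Rule 2: exact number of players per position (the counts sum to 15).
for pos, required in quota.items():
    prob += lpSum(x[i] for i in x if positions[i] == pos) == required, f"quota_{pos}"

# Rule 3: at most three players from any single club.
for team in sorted(set(teams)):
    prob += lpSum(x[i] for i in x if teams[i] == team) <= 3, f"max_three_{team}"
```

Each `prob += expression, "name"` line adds one named constraint, so this model ends up with one budget constraint, four position constraints, and one constraint per club.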

Solve the problem

We have the variables and mathematical expressions that we need. Now let’s solve the problem by using this line of code:
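To keep the example short, the sketch below solves a scaled-down toy problem (pick exactly two of four made-up players within a budget of 100) rather than the full FPL model. The solve call is the same either way.

```python
from pulp import LpMaximize, LpProblem, LpStatus, LpVariable, lpSum

points = [6, 8, 2, 5]      # made-up points
prices = [55, 60, 40, 45]  # made-up prices

x = LpVariable.dicts("player", range(len(points)), cat="Binary")
prob = LpProblem("toy_selection", LpMaximize)
prob += lpSum(points[i] * x[i] for i in x)
prob += lpSum(prices[i] * x[i] for i in x) <= 100, "budget"
prob += lpSum(x.values()) == 2, "squad_size"

# Run PuLP's default solver and report how the solve ended.
prob.solve()
print(LpStatus[prob.status])
```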

Retrieve the list of players

After the program solves the problem, we can retrieve the names of the players it selected. Let’s write these lines of code:
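A sketch on the same kind of toy problem (two of four made-up players): after solving, each variable’s varValue is 1 for a picked player and 0 otherwise, so we keep the names whose variable is switched on.

```python
from pulp import LpMaximize, LpProblem, LpVariable, lpSum

names = ["Player A", "Player B", "Player C", "Player D"]  # made-up players
points = [6, 8, 2, 5]
prices = [55, 60, 40, 45]

x = LpVariable.dicts("player", range(len(names)), cat="Binary")
prob = LpProblem("toy_selection", LpMaximize)
prob += lpSum(points[i] * x[i] for i in x)
prob += lpSum(prices[i] * x[i] for i in x) <= 100, "budget"
prob += lpSum(x.values()) == 2, "squad_size"
prob.solve()

# Compare against 0.5 rather than 1 to tolerate floating-point noise.
picked = [names[i] for i in x if x[i].varValue > 0.5]
print(picked)
```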

Retrieve the expected points and the total costs

Now you have a squad that satisfies the FPL constraints. What if we want to know the expected points and the budget used?

From the prob object, we can retrieve the objective and the constraint expressions. We can calculate the total points and the total price by evaluating those expressions.

Here is the code for doing that:
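A sketch on a toy problem with made-up numbers. The total points come from evaluating the objective with PuLP’s value helper. For the cost, this sketch takes a slightly different route from the one described above and recomputes it directly from the solved variables, which keeps the arithmetic explicit.

```python
from pulp import LpMaximize, LpProblem, LpVariable, lpSum, value

points = [6, 8, 2, 5]      # made-up points
prices = [55, 60, 40, 45]  # made-up prices

x = LpVariable.dicts("player", range(len(points)), cat="Binary")
prob = LpProblem("toy_selection", LpMaximize)
prob += lpSum(points[i] * x[i] for i in x)
prob += lpSum(prices[i] * x[i] for i in x) <= 100, "budget"
prob += lpSum(x.values()) == 2, "squad_size"
prob.solve()

# Objective evaluated at the solution found by the solver.
total_points = value(prob.objective)

# Price of every picked player (varValue is 1 if picked, 0 otherwise).
total_cost = sum(prices[i] * x[i].varValue for i in x)

print(total_points, total_cost)
```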

Conclusion

Well done! You have now used linear programming in Python to build an optimal team for Fantasy Premier League.

I hope this helps you build an FPL team that scores well while staying within a limited budget.

In case you are interested in my article, you can follow me on Medium. If you have any questions, you can connect with me on LinkedIn.

Thank you for reading my article.

Original post: https://towardsdatascience.com/how-to-build-a-fantasy-premier-league-team-with-data-science-f01283281236
