Complete Example of Machine Learning

For further information on artificial intelligence, see my first AI story.

In this story, I will walk you through a complete coding example of a machine learning application. This application will be for an online music store that needs a reliable way to predict what kind of music its users are interested in.

The steps to follow in developing an artificial intelligence application using Python were listed in my first story. I repeat them here.

  1. Import the data
  2. Clean the data: remove duplicate data. If the data is text-based, convert it to numerical values.
  3. Split the data into training and test sets: the test set is used to check that the model produces correct results.
  4. Create a model: select an algorithm to analyze the data, such as decision trees or neural networks. Each algorithm has pros and cons. Part of what makes Python such a popular language in AI is that libraries already exist that implement many of these algorithms. The library I will use is scikit-learn.
  5. Train the model.
  6. Make predictions: when you start, your predictions are likely to be inaccurate.
  7. Evaluate and improve.
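
Steps 3 through 7 in the list above can be sketched in a few lines. This is a minimal sketch and not the code used later in this story: the inline data copies the music table shown further down, and the 20% test size and the fixed random_state are assumptions chosen only for illustration.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Inline copy of the music data from this story (gender: 1 = male, 0 = female).
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 20, 31, 33, 37, 20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "genre": ["Hip-hop"] * 3 + ["jazz"] * 3 + ["classical"] * 3
             + ["dance"] * 3 + ["acoustic"] * 3 + ["classical"] * 3,
})

X = music_data[["age", "gender"]]  # input columns
y = music_data["genre"]            # output column

# Step 3: hold back part of the data for testing (test_size is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 4 and 5: create the model and train it on the training rows only.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Step 6: predict the genre for the rows the model has not seen.
predictions = model.predict(X_test)

# Step 7: compare the predictions with the known answers.
score = accuracy_score(y_test, predictions)
print(f"accuracy on {len(X_test)} held-out rows: {score:.2f}")
```

With only 18 rows, the score can change a lot from one split to another, so it is best read as a rough indication rather than a precise measurement.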

My initial story contains the link to download the Python programming language.

The initial data will be derived from the music store's current users. The main purpose of this sample application is to increase music sales, so it should reliably predict what kind of music each new user likes.

The final data will come in the form of an Excel spreadsheet or CSV file. This is a popular format for storing initial data. One location that provides data like this is

I will use the Python programming language and two libraries commonly used in AI:

Pandas: a data analysis library built around the data frame. A data frame is a two-dimensional object, similar to an Excel spreadsheet.
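
To make the spreadsheet comparison concrete, here is a small sketch of a data frame built by hand. The three rows are taken from the music data used later in this story; the variable name frame is only illustrative.

```python
import pandas as pd

# A data frame with three columns; each dictionary key becomes a column name.
frame = pd.DataFrame({
    "age": [20, 26, 31],
    "gender": [1, 1, 0],
    "genre": ["Hip-hop", "jazz", "classical"],
})

print(frame.shape)                  # (number of rows, number of columns)
print(frame["age"].mean())          # a column is selected by its name
print(frame[frame["gender"] == 1])  # rows can be filtered, like a spreadsheet filter
```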

Scikit-learn: provides algorithms such as decision trees and neural networks.

I will use Jupyter, a good code editor for Python and machine learning projects. Jupyter makes inspecting data much easier.

It’s best to use Anaconda to install Jupyter for application development like this. Anaconda is available here.

I created three initial data files with the free, open-source program LibreOffice. A CSV file is just a text file, viewable and editable with Notepad, vi, or WordPad. The data in these CSV files is loaded into computer memory with the Python library pandas. The input .csv file and the output .csv file are simply the user profile data, called music.csv, split into input and output sections.

There are three columns of data. The first two, the user's age and gender, were split off into the input. The third column, the genre, was split off into the output.
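
The same split can also be done in pandas instead of in a spreadsheet program. The sketch below is one possible way, not the method used for the files in this story: io.StringIO stands in for music.csv on disk, only a few rows are included, and the names input_data and output_data are illustrative.

```python
import io

import pandas as pd

# A few rows of music.csv as plain text; StringIO lets pandas read it like a file.
csv_text = """age,gender,genre
20,1,Hip-hop
23,1,Hip-hop
26,1,jazz
31,0,classical
"""
music_data = pd.read_csv(io.StringIO(csv_text))

# Input: every column except the genre. Output: only the genre.
input_data = music_data.drop(columns=["genre"])
output_data = music_data["genre"]

# With no file name, to_csv returns the CSV text, which shows what an
# input file would contain.
print(input_data.to_csv(index=False))
print(output_data.to_csv(index=False))
```

Passing a file name to to_csv would write the same text to disk instead of returning it.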

The three CSV files used in the program are based on several assumptions. I split the data into an input portion and an output portion to feed the decision tree algorithm.

Because Python is an interpreted language, I can display the code together with the contents of the original CSV file and of the split CSV files.

The assumptions made were that males younger than 26 years old preferred hip-hop, males from 26 to 30 liked jazz, and males over 30 liked classical. Females younger than 26 liked dance, females from 26 to 30 liked acoustic, and females over 30 liked classical.

For gender, a 1 means male, and a 0 means female.
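
The assumptions above can be written down as a small function, which makes the age boundaries explicit. This is only a sketch: assumed_genre is a hypothetical helper that is not part of the program, and the boundaries (25 counts as younger than 26, 30 is the top of the middle group) are read from the data table in this story.

```python
def assumed_genre(age: int, gender: int) -> str:
    """Return the genre that the assumptions assign to a user (hypothetical helper)."""
    if age > 30:
        return "classical"  # both genders over 30
    if gender == 1:  # male
        return "Hip-hop" if age < 26 else "jazz"
    return "dance" if age < 26 else "acoustic"  # female


print(assumed_genre(21, 1))
print(assumed_genre(22, 0))
```

A trained model is expected to rediscover rules like these from the data alone, without ever being given the function.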


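age,gender,genre
20,1,Hip-hop
23,1,Hip-hop
25,1,Hip-hop
26,1,jazz
29,1,jazz
20,0,jazz
31,1,classical
33,1,classical
37,1,classical
20,0,dance
21,0,dance
25,0,dance
26,0,acoustic
27,0,acoustic
30,0,acoustic
31,0,classical
34,0,classical
35,0,classical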

For the model, the example uses the decision tree algorithm. It comes from the DecisionTreeClassifier class in the tree module of the scikit-learn library. The algorithm takes input and output data to learn from before it can form predictions. If the results look bad, you can try a neural network algorithm instead.
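
One way to see what the decision tree has learned is to print its rules as text with scikit-learn's export_text function. The sketch below is an addition to the story's program, not part of it: it builds the data inline instead of reading the CSV files, and the fixed random_state is an assumption made so that repeated runs give the same tree.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Inline copy of the music data from this story (gender: 1 = male, 0 = female).
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 20, 31, 33, 37, 20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "genre": ["Hip-hop"] * 3 + ["jazz"] * 3 + ["classical"] * 3
             + ["dance"] * 3 + ["acoustic"] * 3 + ["classical"] * 3,
})

X = music_data[["age", "gender"]]
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# export_text renders the fitted tree as nested "if" conditions on the features.
rules = export_text(model, feature_names=["age", "gender"])
print(rules)
```

Each indented line is a test on age or gender, and each "class:" line is the genre predicted once those tests have been passed.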

Note that there is no data for a twenty-one-year-old male or a twenty-two-year-old female. The model was asked for predictions for these ages and genders. The machine thus predicted the music preferences of these new users from what it had learned, and their data could later be added to the training data.

predictions = model.predict([[21, 1], [22, 0]])

['Hip-hop' 'dance']
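
Adding the new users to the data and training again could look like the sketch below. It is an illustration under stated assumptions: the data is built inline, the genres recorded for the two new users are hypothetical and simply repeat the predictions above (in practice they would come from what the users actually buy), and the names new_users and combined are illustrative.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Inline copy of the music data from this story (gender: 1 = male, 0 = female).
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 20, 31, 33, 37, 20, 21, 25, 26, 27, 30, 31, 34, 35],
    "gender": [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "genre": ["Hip-hop"] * 3 + ["jazz"] * 3 + ["classical"] * 3
             + ["dance"] * 3 + ["acoustic"] * 3 + ["classical"] * 3,
})

# Hypothetical confirmed preferences of the two new users.
new_users = pd.DataFrame({
    "age": [21, 22],
    "gender": [1, 0],
    "genre": ["Hip-hop", "dance"],
})

# Append the new rows and renumber the index from 0.
combined = pd.concat([music_data, new_users], ignore_index=True)

# Train a fresh model on the enlarged data set.
model = DecisionTreeClassifier(random_state=0)
model.fit(combined[["age", "gender"]], combined["genre"])

# Predict with a data frame so the column names match the training data.
check = pd.DataFrame({"age": [21, 22], "gender": [1, 0]})
print(model.predict(check))
```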

Below is the entire code along with the displayed output listed in the interpreter.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Load the complete data set and the input (age, gender) portion.
music_data = pd.read_csv("music.csv")

input = pd.read_csv('InputMusic.csv')
print(input)
    age  gender
0    20       1
1    23       1
2    25       1
3    26       1
4    29       1
5    20       0
6    31       1
7    33       1
8    37       1
9    20       0
10   21       0
11   25       0
12   26       0
13   27       0
14   30       0
15   31       0
16   34       0
17   35       0

In [43]:

output = pd.read_csv('outputMusic.csv')
print(output)
        genre
0     Hip-hop
1     Hip-hop
2     Hip-hop
3        jazz
4        jazz
5        jazz
6   classical
7   classical
8   classical
9       dance
10      dance
11      dance
12   acoustic
13   acoustic
14   acoustic
15  classical
16  classical
17  classical

In [44]:

model = DecisionTreeClassifier()
# Train the model on the input (age, gender) and output (genre) data.
model.fit(input, output)
predictions = model.predict([[21, 1], [22, 0]])
print(predictions)
['Hip-hop' 'dance']

In [45]:

print(music_data)
    age  gender      genre
0    20       1    Hip-hop
1    23       1    Hip-hop
2    25       1    Hip-hop
3    26       1       jazz
4    29       1       jazz
5    20       0       jazz
6    31       1  classical
7    33       1  classical
8    37       1  classical
9    20       0      dance
10   21       0      dance
11   25       0      dance
12   26       0   acoustic
13   27       0   acoustic
14   30       0   acoustic
15   31       0  classical
16   34       0  classical
17   35       0  classical
