DATA VISUALIZATION WITH MATPLOTLIB — an Introduction #70DayMLStudy with Data Science Nigeria

Aminah Mardiyyah Rufai
Jul 12, 2020
Matplotlib

INTRODUCTION

A FEW THINGS YOU NEED TO KNOW

  1. It is Python’s primary Visualization Library.
  2. Its original author is John D. Hunter; it has since been developed by Michael Droettboom et al.
  3. First released in 2003, version 0.1.
  4. The latest release at the time of writing is version 3.2.2 (June 2020).
  5. It is mainly used for 2D plots, though it is versatile: it can also be used for geographical plots such as maps, for animations, et cetera.
  6. It has two main styles (**MATLAB** style and **Object-Oriented** style).
  7. It integrates well with other libraries such as Pandas, NumPy, Seaborn et cetera.

USING MATPLOTLIB

  1. Install the Library: Just like every other library, it needs to be installed. There are several ways to achieve this.

pip install matplotlib

or

conda install matplotlib

The Anaconda distribution comes with this library and several other basic Python libraries pre-installed. Occasionally, though, a hitch or slight error during installation means one has to install it manually.

2. Import the Library along with other wrappers/necessary libraries (Numpy, Pandas et cetera)

Importing the Library

HINT: The last line of code, “%matplotlib inline”, is an IPython (interactive Python) magic command. It is used primarily in Jupyter Notebook/Lab, where it renders plots directly below the cell that produces them.
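The screenshot for this step is not reproduced here, so the following is only a minimal sketch of what a typical import cell might look like. It assumes NumPy is installed (Pandas would be imported the same way, as `import pandas as pd`). The magic command is written as a comment because it is IPython syntax, not Python, and would fail in a plain script.

```python
# A typical import cell (a sketch; the original notebook cell may differ).
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

# In a Jupyter notebook, uncomment the next line so plots render inline.
# It is IPython magic, not Python syntax, so it must stay commented in a .py file.
# %matplotlib inline
```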

There are multiple ways of importing the Matplotlib library.

i. “from matplotlib import * ”: Imports the public names of the top-level package straight into the current namespace. Wildcard imports like this are generally discouraged, because they can silently overwrite names you already use.

ii. import matplotlib as mpl : Imports the top-level package under an alias. The alias simply shortens the name of the library for ease of typing.

iii. from matplotlib import pyplot as plt: Imports only the required sub-module and gives it an alias. The pyplot module is the key module for basic plots/visualization.

iv. import matplotlib.pyplot as plt: This is the same as above; the only difference is that it uses Python's dot notation. I have included some links at the end of the article for a better understanding of this.

3. Check for version:

This is not always necessary, though it is recommended when using the library for the first time. Functionality often changes between releases: a method may have been deprecated or improved, and you may occasionally notice slight differences in plots. Knowing the version currently in use makes debugging easier when an error is encountered, or when one of the library’s dependencies does not yet support a particular version.

To check the version, note that you need to have imported the top-level package itself (import matplotlib), not just one of its sub-modules.

The following syntax can be used to check the version.

import matplotlib
print('Matplotlib:', matplotlib.__version__)

4. Create/Import the data:

Now everything is in place, you can finally get into making some cool plots.

BASIC PLOTS IN MATPLOTLIB

  • Histograms
  • Bar charts
  • Pie charts
  • Line Graphs
  • Scatter Plots et cetera

DIFFERENT FUNCTIONS USED FOR CREATING PLOTS IN MATPLOTLIB

  • plot() — for line plots
  • scatter() — for scatter plots
  • pie() — for piechart
  • hist() — for Histogram
  • bar() — for BarCharts
  • barh() — for horizontal bar charts
  • et cetera
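As a rough illustration of how these functions are called, here is a minimal sketch of three of them that are not demonstrated later in the article. The categories and numbers are made up purely for the example.

```python
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
languages = ["Python", "R", "Julia"]
users = [50, 30, 20]

plt.figure()
vertical_bars = plt.bar(languages, users)      # one vertical bar per category
plt.title("bar()")

plt.figure()
horizontal_bars = plt.barh(languages, users)   # the same data, drawn horizontally
plt.title("barh()")

plt.figure()
plt.pie(users, labels=languages)               # one wedge per value
plt.title("pie()")
pie_axes = plt.gca()
```

Each call to plt.figure() starts a new, empty figure, so the three charts do not draw on top of one another.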

BASIC PLOTS USING THE MATLAB STYLE IN MATPLOTLIB

If you are familiar with MATLAB, you will notice a lot of similarities in syntax with this style. The examples below show a very simple line graph and a scatter plot using the MATLAB style.

  1. A simple Line graph

The screenshot below shows a simple plot of two variables using the MATLAB style. Notice that because the library has been imported with the alias “plt”, every function from the library is called through that alias.

The .plot() function takes the variables or relationships one wishes to visualize as its arguments. Optional arguments for marker style, preferred color, preferred line style/width et cetera can also be included. Check the documentation for more details; the link has been included below.

“plt.xticks” and “plt.yticks” control the tick marks on the x- and y-axis respectively. Matplotlib provides options for adjusting their positions and label sizes, and a rotation option as desired.

“plt.show” is used for displaying the plots. However, if a Jupyter notebook is being used, including the IPython magic code (%matplotlib inline) shown earlier makes this line of code unnecessary.
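Since the screenshot is not reproduced here, the following is a minimal sketch of a MATLAB-style line graph along the lines described above. The data, labels and styling values are assumptions chosen for illustration, not the ones from the original notebook.

```python
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

plt.figure(figsize=(8, 5))
# marker, color, linestyle and linewidth are optional styling arguments.
plt.plot(x, y, marker="o", color="blue", linestyle="--", linewidth=2,
         label="y = x squared")
plt.title("A simple line graph")
plt.xlabel("x")
plt.ylabel("y")
plt.xticks(x, fontsize=12, rotation=45)  # tick positions, label size and rotation
plt.yticks(fontsize=12)                  # keep default positions, change label size
plt.legend()
```

In a plain script, add plt.show() as the final line to open the figure window; in Jupyter with the inline magic, the figure appears without it.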

A line graph with Matplotlib

“plt.legend” is used for labelling the relationship(s) being plotted. Matplotlib provides options for positioning a legend within a given plot. If a location is not chosen, the library picks a default position automatically.

Moving the position of a legend

The size of the legend can also be adjusted.
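A minimal sketch of both adjustments, using made-up data: the `loc` argument chooses the position and `fontsize` scales the legend text (and with it the legend box). The specific values here are assumptions for illustration.

```python
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
x = [1, 2, 3, 4, 5]

plt.figure()
plt.plot(x, [v * 2 for v in x], label="double")
plt.plot(x, [v * 3 for v in x], label="triple")

# loc moves the legend; fontsize changes the size of its text.
legend = plt.legend(loc="upper left", fontsize=14)
```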

You will find more useful information on this in the documentation.

2. Scatter Plots: A scatter plot is very useful in statistics and data analysis. It gives insight into the relationship between two variables and how strongly they are correlated. For scatter plots, the .plot() function is swapped for .scatter(), with the necessary variables passed as arguments in the parentheses.

A simple scatter plot with Matplotlib

Just like the line graph, one may decide to choose a color of preference, or size of points et cetera.
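As a minimal sketch, assuming NumPy is available for generating synthetic data, a scatter plot with a chosen color and point size could look like this. The variable names and the data are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic, roughly linear data; the seed only makes the sketch reproducible.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=50)
scores = 5 * hours + rng.normal(0, 5, size=50)

plt.figure()
# s sets the point size, c the color and alpha the transparency.
points = plt.scatter(hours, scores, s=40, c="green", alpha=0.7)
plt.xlabel("hours studied")
plt.ylabel("score")
```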

For changing colors of plots in Matplotlib, there are three common ways:

  • The Hex-strings
  • The C-strings
  • The RGB tuple method.

For the c-string method, the preferred color is referenced either by its full name, such as red, blue or yellow, or by a single-letter shorthand in lower case, such as ‘m’ for magenta, ‘c’ for cyan, ‘r’ for red et cetera.

Using the c-strings.

For the hex-strings, the colors are referenced using the following syntax:

  • #e5ae38 for a golden yellow
  • #444444 for a dark grey (close to black)
  • #008fd5 for a lighter shade of blue

More options in the documentation

Using the hex-strings

For RGB (red, green, blue) tuples, each channel is given as a value between 0 and 1, using the following syntax.

  • (0, 0, 0) for Black
  • (0, 0, 1) for Blue
  • (0, 1, 0) for Green
  • (1, 0, 0) for Red
  • (1, 1, 0) for Yellow
  • Et cetera

Using RGB tuples
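The three ways of naming a color can be sketched side by side as follows. The data are made up, and `to_rgb` from `matplotlib.colors` is used only to check that different notations resolve to the same color.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgb

# Made-up data, purely for illustration.
x = [1, 2, 3]

plt.figure()
plt.plot(x, [1, 2, 3], color="red")       # c-string: full color name
plt.plot(x, [2, 3, 4], color="m")         # c-string: single letter (magenta)
plt.plot(x, [3, 4, 5], color="#008fd5")   # hex string
plt.plot(x, [4, 5, 6], color=(0, 1, 0))   # RGB tuple, each channel in 0..1

# All four notations below describe the same pure red.
same_red = to_rgb("red") == to_rgb("r") == to_rgb("#ff0000") == to_rgb((1, 0, 0))
```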

3. Histograms: A histogram is a summary of the variation in a measured variable. It shows the number of samples that fall into each bin (interval of values). A histogram is a type of frequency distribution, useful in univariate analysis during Exploratory Data Analysis. Histograms can be plotted using “plt.hist()”. Since histograms work by binning, options for specifying the ‘bins’ and for adding grids are also available.

By default, the grid is turned off for plots in Matplotlib. To turn it on, call plt.grid() and pass True.

Setting grid=True
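A minimal sketch of a histogram with an explicit number of bins and the grid switched on, assuming NumPy for the synthetic data. The variable name and the distribution are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data; the seed only makes the sketch reproducible.
rng = np.random.default_rng(0)
ages = rng.normal(loc=35, scale=10, size=200)

plt.figure()
# bins=10 splits the range of the data into 10 equal-width intervals.
counts, bin_edges, _ = plt.hist(ages, bins=10, edgecolor="black")
plt.grid(True)  # the grid is off by default
plt.xlabel("age")
plt.ylabel("frequency")
```

plt.hist() also returns the count per bin and the bin edges, which is handy when the numbers behind the picture are needed.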

Look up how to make other plots in the documentation

OBJECT-ORIENTED STYLE PLOTTING

This style is more often used when you want more control over your plots. Two useful advantages are that it can take fewer lines of code than the MATLAB style, and that MATLAB-style syntax can still be mixed in when required.

Object-oriented style plotting

The object-oriented style represents plots as ‘objects’, and works with two kinds of object:

  • Figure Object (the overall canvas)
  • Axes Object (an individual plot drawn on the Figure)

A Simple line Graph using Object-Oriented Style

Notice the switch in syntax and the fewer lines of code as compared to the MATLAB style. Also, notice how the two syntaxes can be blended when required. Most often, you will encounter a mix of both in code.
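For readers without the screenshot, here is a minimal sketch of the same kind of line graph in the object-oriented style. The data and labels are assumptions for illustration, not those from the original notebook.

```python
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# plt.subplots() returns a Figure (the canvas) and an Axes (one plot on it).
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x, y, marker="o", label="y = x squared")

# ax.set() groups settings that each need their own call in the MATLAB style.
ax.set(title="A simple line graph", xlabel="x", ylabel="y")
ax.legend()

# pyplot calls can still be mixed in; they act on the current Axes.
plt.xticks(rotation=45)
```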

I have included links to useful resources for a better intuition on this.

ADDING STYLES AND THEMES TO PLOTS

Another interesting feature of this library is being able to switch themes/plotting styles. It includes styles modelled on other visualization libraries such as Seaborn and ggplot, as well as on tools such as Tableau.

To check out available styles use the following syntax.

plt.style.available

Checking for Available styles in Matplotlib
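A minimal sketch of listing the styles and applying one of them. The “ggplot” style is used because it ships with Matplotlib; the exact contents of the list depend on the installed version. The data are made up.

```python
import matplotlib.pyplot as plt

# A list of the style names bundled with the installed version.
print(plt.style.available)

# style.context applies the style only inside the with-block;
# plt.style.use("ggplot") would apply it to every plot that follows.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 8])
    ax.set_title("ggplot style")
```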

SAVING PLOTS TO LOCAL COMPUTER FOR FURTHER REFERENCES OR EXTERNAL USE

Sometimes, for the purpose(s) of non-technical presentations, or future reference(s), you might want to save code-based plots for ease of access. Plots created using the Matplotlib library can be saved using the syntax as shown in the screenshot below:

Saving Plots

Options for choosing a resolution, transparency et cetera can also be included as arguments.
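Since the screenshot is not reproduced here, the following is a minimal sketch of saving a plot with plt.savefig(). The file name and the temporary-directory location are hypothetical; any writable path works.

```python
import os
import tempfile
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
plt.figure()
plt.plot([1, 2, 3], [1, 4, 9])

# Hypothetical output location; replace with any path on your machine.
out_path = os.path.join(tempfile.gettempdir(), "line_plot.png")

# dpi sets the resolution; transparent=True drops the background color.
plt.savefig(out_path, dpi=300, transparent=True, bbox_inches="tight")
```

The file format is inferred from the extension, so changing the name to end in .pdf or .svg saves a vector version instead.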

CONCLUSION

The aim of this article was to share a brief insight on the usage and features of this library. For in-depth knowledge, please make reference to the links I have shared below. If you found this useful please share and give a clap.

Also, find below the link to the Github repository for the Notebook used in this article.

Thank you for reading!

FIND ME ON SOCIAL MEDIA

Twitter: @diyyah92

LinkedIn: https://www.Linkedin.com/in/aminah-mardiyyah-rufa-i/

Aminah Mardiyyah Rufai

Machine Learning Researcher | Machine Intelligence student at African Institute for Mathematical Sciences and Machine Intelligence | PHD Candidate