How do you add labels to a scatter plot in Python?

A step-by-step guide to adding text labels to a scatter plot in Python when using the seaborn or matplotlib libraries

Python is great for data visualization! Matplotlib is fast and robust, but its default styling is plain. Seaborn, built on top of matplotlib, improves the aesthetics and offers more sophisticated plots. However, when it comes to scatter plots, neither library has a straightforward option to display labels for the data points. Other data visualization tools such as Tableau and Power BI offer this with a few clicks or by hovering the pointer over the data points.

In this article, I will explain how to add text labels to scatter plots made in seaborn or any other library that is built on the matplotlib framework.

The Data

The dataset is the English Premier League table. We are interested in three columns:
i. Team : Team Name
ii. G : Goals Scored
iii. GA : Goals Conceded
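
The original table is not reproduced in this article, so here is a minimal sketch of a DataFrame with the same three columns. Only the column names and the team codes TOT and LIV come from the article; the other team codes and all the numbers are made-up placeholders, not real league statistics.

```python
import pandas as pd

# Placeholder values for illustration only; these are NOT real league figures.
# "AAA" to "DDD" are hypothetical team codes.
df = pd.DataFrame(
    {
        "Team": ["TOT", "LIV", "AAA", "BBB", "CCC", "DDD"],
        "G": [60, 68, 55, 72, 58, 64],    # Goals Scored
        "GA": [40, 35, 45, 30, 42, 38],   # Goals Conceded
    }
)

print(df[["Team", "G", "GA"]])
```

Any frame with these three columns should work with the snippets in the rest of the article.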

Scatter Plot : Goals Scored vs Goals Conceded

A simple scatter plot can be plotted with Goals Scored on the x-axis and Goals Conceded on the y-axis as follows.

import matplotlib.pyplot as plt
import seaborn as sns

# df is the league table DataFrame with the columns Team, G and GA
plt.figure(figsize=(8,5))
sns.scatterplot(data=df,x='G',y='GA')
plt.title("Goals Scored vs Conceded- Top 6 Teams") #title
plt.xlabel("Goals Scored") #x label
plt.ylabel("Goals Conceded") #y label
plt.show()

Basic scatter plot

Label Specific Items

Scatter plots often contain a large number of data points, and we may only be interested in how a few specific items fare against the rest. Labelling every data point can make the plot cluttered and difficult to read.
For example, if we are examining a socio-economic statistic for the USA against all other countries, it makes little sense to label every country in the scatter plot. It is more useful to label the USA and a few selected competitors, so that we can see how these countries perform relative to each other and to the rest of the world.
Coming to our dataset, I am a Tottenham Hotspur (TOT) fan and am interested only in the performance of TOT against the other teams.
I can add the label using plt.text().

Syntax: 
plt.text(x=x coordinate, y=y coordinate, s=string to be displayed)

Here x and y are the Goals Scored and Goals Conceded by TOT respectively. The string to be displayed is "TOT".
x, y and s can be passed positionally, so their names can be omitted as long as the order is kept.
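
As a quick check of the two calling styles, the sketch below places the same label once with keywords and once positionally. The coordinates (60, 40) are placeholder values, and `ax.text` is the Axes-level counterpart of `plt.text`.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([60], [40])  # a single placeholder data point

# Keyword form: the argument names make the call self-documenting.
label_kw = ax.text(x=60, y=40, s="TOT")

# Positional form: the same call, relying on the (x, y, s) order.
label_pos = ax.text(60, 40, "TOT")

print(label_kw.get_position(), label_kw.get_text())
print(label_pos.get_position(), label_pos.get_text())
```

Both calls return a Text object at the same position, so the choice is purely about readability.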

# .iloc[0] turns the one-row selection into a plain number
plt.text(df.G[df.Team=='TOT'].iloc[0],df.GA[df.Team=='TOT'].iloc[0],"TOT", color='red')

Additional arguments like color, size, alpha (transparency) etc. can be used to change the text format. They can also be grouped within fontdict to make your code easier to read and understand.

plt.text(df.G[df.Team=='LIV'].iloc[0],df.GA[df.Team=='LIV'].iloc[0],"LIV",
         fontdict=dict(color='black', alpha=0.5, size=16))

Scatter Plot with specific label (Image by author)
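
One practical use of fontdict is to define a style once and reuse it for several labels. The sketch below assumes placeholder coordinates and a hypothetical `label_style` dictionary; it also shows that passing the same properties as plain keyword arguments gives the same result.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([60, 68], [40, 35])  # placeholder data points

# Hypothetical shared style, defined once and reused for every label.
label_style = dict(color="black", alpha=0.5, size=16)

tot = ax.text(60, 40, "TOT", fontdict=label_style)
liv = ax.text(68, 35, "LIV", fontdict=label_style)

# The same properties passed directly as keyword arguments.
liv_kwargs = ax.text(68, 35, "LIV", color="black", alpha=0.5, size=16)

print(tot.get_color(), tot.get_alpha(), tot.get_fontsize())
```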

Adding Background Box

The bbox parameter can be used to highlight the text.

sns.scatterplot(data=df,x='G',y='GA')
plt.text(x=df.G[df.Team=='TOT'].iloc[0]+0.3,
         y=df.GA[df.Team=='TOT'].iloc[0]+0.3,
         s="TOT",
         fontdict=dict(color='red',size=10),
         bbox=dict(facecolor='yellow',alpha=0.5))

Note that an offset of 0.3 is added to the x and y coordinates so that the text and its background box do not overlap the data point.
It is optional but can improve the aesthetics of the chart.

Scatter Plot with Text Box (Image by author)
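
An offset of 0.3 is expressed in data units, so how far the label sits from the point depends on the scale of the axes. As a possible alternative (not the method used in this article), matplotlib's annotate() can place the text a fixed number of points away from the data point. The coordinates and the 6-point offset below are placeholder choices.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([60], [40])  # a single placeholder data point

label = ax.annotate(
    "TOT",
    xy=(60, 40),                  # the data point being labelled
    xytext=(6, 6),                # shift the text 6 points right and 6 points up
    textcoords="offset points",   # interpret xytext as an offset from xy
    color="red",
    size=10,
    bbox=dict(facecolor="yellow", alpha=0.5),
)

print(label.get_text(), label.xy)
```

Because the offset is in points rather than data units, the gap between marker and label should look the same whatever the axis ranges are.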

Labelling All Points

Some situations demand labelling all the data points in the scatter plot, especially when there are only a few of them.
This can be done with a simple for loop that goes through the data set and adds the x-coordinate, y-coordinate and string from each row.

sns.scatterplot(data=df,x='G',y='GA')
for i in range(df.shape[0]):
    # df.G[i] looks rows up by label, so this assumes the default 0..n-1 index
    plt.text(x=df.G[i]+0.3,y=df.GA[i]+0.3,s=df.Team[i],
             fontdict=dict(color='red',size=10),
             bbox=dict(facecolor='yellow',alpha=0.5))

Scatter Plot with all labels (Image by author)
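
Indexing with df.G[i] looks rows up by index label, so it only works when the DataFrame has the default 0..n-1 index. A variant that does not depend on the index is to iterate over the rows with itertuples(). The sketch below uses placeholder data with a deliberately non-default index; "AAA" is a hypothetical team code.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values, NOT real league figures; note the non-default index.
df = pd.DataFrame(
    {"Team": ["TOT", "LIV", "AAA"], "G": [60, 68, 55], "GA": [40, 35, 45]},
    index=[10, 11, 12],
)

fig, ax = plt.subplots()
ax.scatter(df["G"], df["GA"])

# Each row arrives as a namedtuple, so columns are available as attributes.
labels = [
    ax.text(
        row.G + 0.3,
        row.GA + 0.3,
        row.Team,
        fontdict=dict(color="red", size=10),
        bbox=dict(facecolor="yellow", alpha=0.5),
    )
    for row in df.itertuples(index=False)
]

print([t.get_text() for t in labels])
```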

Final Touch

We have finished constructing a labelled scatter plot. However, we can observe that a few text boxes jut out of the figure area.
It would be aesthetically more pleasing if the text stayed within the plot's canvas. This can be done by changing the position, size etc. of the text.
I generally achieve this by enlarging the plot area with the xlim() and ylim() functions in matplotlib.
In the code below you can see how I have applied a padding of 1 unit around the plot while setting the x and y limits.

plt.figure(figsize=(8,5))
sns.scatterplot(data=df,x='G',y='GA')
for i in range(df.shape[0]):
    plt.text(x=df.G[i]+0.3,y=df.GA[i]+0.3,s=df.Team[i],
             fontdict=dict(color='red',size=10),
             bbox=dict(facecolor='yellow',alpha=0.5))
plt.xlim(df.G.min()-1,df.G.max()+1) #set x limit
plt.ylim(df.GA.min()-1,df.GA.max()+1) #set y limit
plt.title("Goals Scored vs Conceded- Top 6 Teams") #title
plt.xlabel("Goals Scored") #x label
plt.ylabel("Goals Conceded") #y label

plt.show()

Padded Scatter Plot with labels (Image by author)
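
A fixed padding of 1 unit suits this data range but may be too small or too large for other data. One possible alternative, offered here as a sketch rather than as the method used in this article, is Axes.margins(), which pads the autoscaled limits by a fraction of the data range. The data values and the 10% margin are placeholder choices. As far as I know, text labels themselves are not considered when matplotlib autoscales, so the margin still has to be large enough to leave room for them.

```python
import matplotlib.pyplot as plt

goals_scored = [55, 60, 72]    # placeholder values
goals_conceded = [45, 40, 30]  # placeholder values

fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(goals_scored, goals_conceded)

# Pad each axis by 10% of its data range instead of a fixed number of units.
ax.margins(x=0.1, y=0.1)

x_low, x_high = ax.get_xlim()
y_low, y_high = ax.get_ylim()
print(x_low, x_high, y_low, y_high)
```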

If you know a better way of keeping the plot elements within the canvas area, please let me know in the comments.

Resources:

You can check out the notebook for this article on GitHub.

Photo by Michael Dziedzic on Unsplash

How do you add a label to a scatter plot?

To add data labels to the scatter chart, select the chart, click on the plus icon on the right, and then check the data labels option. This adds data labels that show the Y-axis value for each data point in the scatter graph.

How do you add labels to graphs in Python?

With Pyplot, you can use the xlabel() and ylabel() functions to set a label for the x- and y-axis.
Add labels to the x- and y-axis: import numpy as np ...
Add a plot title and labels for the x- and y-axis: import numpy as np ...
Set font properties for the title and labels: import numpy as np ...
Position the title to the left:
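
The snippets above are truncated, so here is a minimal sketch covering the same steps: axis labels, a title, font properties and a left-aligned title. The data values and the `label_font` dictionary are placeholders chosen for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([55, 60, 72], [45, 40, 30])  # placeholder data

# Hypothetical font settings shared by both axis labels.
label_font = {"family": "serif", "color": "darkred", "size": 12}

plt.xlabel("Goals Scored", fontdict=label_font)
plt.ylabel("Goals Conceded", fontdict=label_font)

# loc="left" positions the title at the left edge of the axes.
plt.title("Goals Scored vs Conceded", loc="left")

print(ax.get_xlabel(), "|", ax.get_ylabel(), "|", ax.get_title(loc="left"))
```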

How do I add data labels in Matplotlib?

Add Value Labels on Matplotlib Bar Chart Using pyplot.
The parameter text is the label that will be added to the graph.
The parameter xy accepts a tuple (x,y) where x and y are the coordinates where the label will be added to the graph.
The function accepts many different arguments.
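
The snippet above does not name the function, but the text and xy parameters it describes match matplotlib's annotate(). Assuming that is the function meant, a value label on each bar could be sketched as follows; the team codes and values are placeholders, and "AAA" is a hypothetical team.

```python
import matplotlib.pyplot as plt

teams = ["TOT", "LIV", "AAA"]  # placeholder categories
goals = [60, 68, 55]           # placeholder values

fig, ax = plt.subplots()
bars = ax.bar(teams, goals)

# Put each value just above the top centre of its bar.
labels = [
    ax.annotate(
        str(value),
        xy=(bar.get_x() + bar.get_width() / 2, value),
        ha="center",
        va="bottom",
    )
    for bar, value in zip(bars, goals)
]

print([t.get_text() for t in labels])
```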

How do you add text to a scatter plot?

Create Text Scatter Plot (MATLAB)
Plot a string array of numbers at random points on a text scatter plot.
x = rand(50,1);
y = rand(50,1);
str = string(1:50);
figure
textscatter(x,y,str);
Alternatively, you can pass the coordinates x and y as a matrix xy, where x and y are the columns of xy.