Seaborn plot two data sets on the same scatter plot







I have two data sets in pandas DataFrames and I want to visualize them on the same scatter plot, so I tried:


import matplotlib.pyplot as plt
import seaborn as sns

sns.pairplot(x_vars=['Std'], y_vars=['ATR'], data=set1, hue='Asset Subclass')
sns.pairplot(x_vars=['Std'], y_vars=['ATR'], data=set2, hue='Asset Subclass')
plt.show()



But every time I get two separate charts instead of a single one.
How can I visualize both data sets on the same plot? Also, can I have the same legend for both data sets but different colors for the second data set?





For the first question, can you concatenate the datasets?
– Charlie
Aug 7 at 18:05





@Charlie I can, but then I have to add another column to distinguish between the data sets?
– Michael Dz
Aug 7 at 18:11





Can you post the sample of set1 and set2?
– harvpan
Aug 7 at 18:14





What version of seaborn are you using? '0.9.0' has a scatter plot function that may make this easier
– johnchase
Aug 7 at 18:18




1 Answer



The following should work in seaborn 0.9.0 or later:




import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns



First we concatenate the two datasets into one and add a dataset column that records which dataset each row came from.




concatenated = pd.concat([set1.assign(dataset='set1'), set2.assign(dataset='set2')])
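To make the effect of that line concrete, here is a minimal sketch with made-up data. The values in set1 and set2 below are hypothetical stand-ins for the question's DataFrames, and ignore_index=True is an optional extra that is not in the line above; it only keeps the row labels unique after stacking.

```python
import pandas as pd

# Hypothetical stand-ins for the question's set1 and set2 (values are made up).
set1 = pd.DataFrame({
    "Std": [0.10, 0.20, 0.30],
    "ATR": [1.0, 1.5, 2.0],
    "Asset Subclass": ["Equity", "Bond", "Equity"],
})
set2 = pd.DataFrame({
    "Std": [0.15, 0.25],
    "ATR": [1.2, 1.8],
    "Asset Subclass": ["Bond", "Equity"],
})

# assign() returns a copy with the extra column, so set1 and set2 stay unchanged.
concatenated = pd.concat(
    [set1.assign(dataset="set1"), set2.assign(dataset="set2")],
    ignore_index=True,  # optional assumption: gives the stacked rows a fresh 0..n-1 index
)

# Each row now carries a label saying which DataFrame it came from.
print(concatenated)
```

Because the label lives in an ordinary column, it can be passed to any seaborn keyword that takes a column name, such as hue or style.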



Then we use the sns.scatterplot function (added in seaborn 0.9.0) and pass the dataset column to the style keyword argument, so that the marker shape shows which dataset each point came from:




sns.scatterplot(x='Std', y='ATR', data=concatenated,
                hue='Asset Subclass', style='dataset')
plt.show()





Perfect! That's what I was looking for.
– Michael Dz
Aug 7 at 18:33





This is not always required. You can easily plot different DataFrames on the same axes with different colors and styles. @MichaelDz
– harvpan
Aug 7 at 18:34





Glad it helped you out! @harvpan, not quite sure what you mean. Do you mean the pd.concat call is unnecessary and one could instead just write two calls to sns.scatterplot or plt.scatter?
– tobsecret
Aug 7 at 18:48








Yes @tobsecret, that's what I meant. Concatenating can be computationally expensive and use a lot of memory.
– harvpan
Aug 7 at 18:54





Fair point, though if your dataset is large enough to get you into computationally sensitive plotting territory, then you would likely have to opt for something like datashader anyway due to overplotting concerns. In the example plot given in the question, the number of points appears to be in the hundreds, so concatenation should not be a limiting factor.
– tobsecret
Aug 7 at 19:00
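For reference, the alternative discussed in the comments above (no pd.concat, just two calls to sns.scatterplot drawing onto the same Axes) might look roughly like the sketch below. The data is made up, and the shared hue_order, the marker choices and legend=False on the second call are assumptions added here so that the colors agree between the two calls; they are not something the commenters spelled out.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-ins for the question's set1 and set2 (values are made up).
set1 = pd.DataFrame({
    "Std": [0.10, 0.20, 0.30],
    "ATR": [1.0, 1.5, 2.0],
    "Asset Subclass": ["Equity", "Bond", "Equity"],
})
set2 = pd.DataFrame({
    "Std": [0.15, 0.25],
    "ATR": [1.2, 1.8],
    "Asset Subclass": ["Bond", "Equity"],
})

# Use one category order for both calls so a subclass maps to the same color each time.
hue_order = sorted(set(set1["Asset Subclass"]) | set(set2["Asset Subclass"]))

# Create a single Axes and pass it to both calls via ax=.
fig, ax = plt.subplots()
sns.scatterplot(x="Std", y="ATR", data=set1, hue="Asset Subclass",
                hue_order=hue_order, marker="o", ax=ax)
sns.scatterplot(x="Std", y="ATR", data=set2, hue="Asset Subclass",
                hue_order=hue_order, marker="X", legend=False, ax=ax)
```

Call plt.show() afterwards to display the figure. Unlike the style='dataset' approach, the legend here does not say which marker belongs to which dataset, so that would have to be added by hand.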





