Seaborn plot two data sets on the same scatter plot
I have two datasets in pandas DataFrames and I want to visualize them on the same scatter plot, so I tried:
import matplotlib.pyplot as plt
import seaborn as sns
sns.pairplot(x_vars=['Std'], y_vars=['ATR'], data=set1, hue='Asset Subclass')
sns.pairplot(x_vars=['Std'], y_vars=['ATR'], data=set2, hue='Asset Subclass')
plt.show()
But I always get two separate charts instead of a single one.
How can I visualize both datasets on the same plot? Also, can I have the same legend for both datasets but different colors for the second dataset?
@Charlie I can, but then I have to add another column to distinguish between the datasets?
– Michael Dz
Aug 7 at 18:11
Can you post the sample of set1 and set2?
– harvpan
Aug 7 at 18:14
What version of seaborn are you using? '0.9.0' has a scatter plot function that may make this easier
– johnchase
Aug 7 at 18:18
1 Answer
The following should work in the latest version of seaborn (0.9.0):
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
First we concatenate the two datasets into one and assign a dataset column, which preserves the information about which row came from which dataset.
concatenated = pd.concat([set1.assign(dataset='set1'), set2.assign(dataset='set2')])
Then we use the sns.scatterplot function from the latest seaborn version (0.9.0) and, via the style keyword argument, set the markers to be based on the dataset column:
sns.scatterplot(x='Std', y='ATR', data=concatenated,
                hue='Asset Subclass', style='dataset')
plt.show()
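For reference, sns.pairplot is a figure-level function that creates a new figure on every call, which is why the two calls in the question produce two charts. A self-contained sketch of the approach above might look like the following. The values in set1 and set2 are made up for illustration (only the column names come from the question), and ignore_index=True is an optional addition that avoids duplicate index labels after stacking.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-ins for set1 and set2; only the column names come from the question.
set1 = pd.DataFrame({
    "Std": [0.10, 0.20, 0.30],
    "ATR": [1.0, 1.5, 2.1],
    "Asset Subclass": ["Equity", "Bond", "Equity"],
})
set2 = pd.DataFrame({
    "Std": [0.15, 0.25, 0.35],
    "ATR": [1.2, 1.7, 2.4],
    "Asset Subclass": ["Bond", "Equity", "Bond"],
})

# Tag each row with its origin, then stack the two frames into one.
concatenated = pd.concat(
    [set1.assign(dataset="set1"), set2.assign(dataset="set2")],
    ignore_index=True,  # assumption: the original index labels are not needed
)

# Colour encodes the asset subclass; marker shape encodes the source dataset.
ax = sns.scatterplot(x="Std", y="ATR", data=concatenated,
                     hue="Asset Subclass", style="dataset")
```

Call plt.show() afterwards in an interactive session. Because both datasets share one hue mapping, a single legend covers them, with the marker shape telling the datasets apart.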
Perfect! That's what I was looking for.
– Michael Dz
Aug 7 at 18:33
This is not always required. You can easily plot different dataframes on the same axes with different colors and styles. @MichaelDz
– harvpan
Aug 7 at 18:34
Glad it helped you out! @harvpan, not quite sure what you mean. Do you mean the pd.concat call is unnecessary and one could instead just write two calls to sns.scatterplot or plt.scatter?
– tobsecret
Aug 7 at 18:48
Yes @tobsecret, that's what I meant. Concatenating can be computationally expensive and can use a lot of memory.
– harvpan
Aug 7 at 18:54
Fair point, though if your dataset is large enough to get you into computationally sensitive plotting territory, then you would likely have to opt for something like datashader anyway due to overplotting concerns. In the example plot given in the question, the number of points appears to be in the hundreds, so concatenation should not be a limiting factor.
– tobsecret
Aug 7 at 19:00
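The alternative raised in this thread, two plotting calls on one set of axes without pd.concat, could be sketched roughly as follows. The data is again hypothetical, and the colors and legend title are illustrative choices rather than anything from the original posts. Note that this version colors by dataset, so the per-subclass hue from the answer is not reproduced here.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-ins for set1 and set2; only the column names come from the question.
set1 = pd.DataFrame({"Std": [0.10, 0.20, 0.30], "ATR": [1.0, 1.5, 2.1]})
set2 = pd.DataFrame({"Std": [0.15, 0.25, 0.35], "ATR": [1.2, 1.7, 2.4]})

# Create one Axes up front and pass it to both calls so they share a plot.
fig, ax = plt.subplots()
sns.scatterplot(x="Std", y="ATR", data=set1, ax=ax,
                color="tab:blue", label="set1")
sns.scatterplot(x="Std", y="ATR", data=set2, ax=ax,
                color="tab:orange", label="set2")

# Rebuild the legend from the labelled artists of both calls.
ax.legend(title="dataset")
```

This works with sns.scatterplot because it is an axes-level function that accepts an ax argument, unlike sns.pairplot, which always builds its own figure.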
For the first question, can you concatenate the datasets?
– Charlie
Aug 7 at 18:05