Visualizing Data in One Dimension with Pandas
In this edition of my Substack newsletter, I want to delve into a fundamental aspect of data analysis and visualization: representing data in one dimension using Pandas. It might sound basic, but mastering this skill is crucial for any data scientist, analyst, or anyone working with data.
Why One-Dimensional Visualization?
Before we dive into visualization, let's briefly discuss why visualizing data in one dimension is so important. One-dimensional data, often called univariate data, typically consists of a single variable, making it the simplest form of data. It's a starting point for exploring data and understanding its characteristics. We use one-dimensional visualizations to gain insights, identify patterns, and spot outliers or anomalies.
One of the quickest and most effective ways to see how every numeric column in a dataset is distributed is to plot histograms directly from Pandas.
Here, we will draw a grid of histograms, one for each numeric attribute in the dataset.
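If the idea of a histogram is new to you, here is a minimal sketch of the counting behind one: the value range is split into equal-width bins, and each bar's height is the number of values that fall into its bin. The numbers are made up for illustration (they are not from the BTC dataset), and using NumPy's histogram function here is my own choice for showing the counts.

```python
import numpy as np

# Made-up toy values for illustration, not taken from the BTC dataset
values = np.array([1, 2, 2, 3, 5, 8, 9, 10])

# Split the range [1, 10] into 3 equal-width bins and count values per bin
counts, edges = np.histogram(values, bins=3)

print("bin edges:", edges)
print("counts per bin:", counts)
```

Each count becomes the height of one bar, and the edges mark where each bar starts and ends on the x-axis.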
Step 1: Load Data
First, load your data using Pandas. Think of it as gathering all your ingredients before cooking.
We are using a BTC cryptocurrency dataset.
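import pandas as pd
import matplotlib.pyplot as plt

# Adjust the path to wherever your copy of BTC.csv lives
btc_crypto = pd.read_csv('/content/BTC.csv')
btc_crypto  # in a notebook, a bare variable name displays the DataFrame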
Step 2: Peek at Data
Take a quick look at your data to understand what's inside.
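A quick peek usually means checking the first rows, the column types, and some summary statistics. Here is a rough sketch of that. To keep it runnable on its own, it uses a tiny stand-in DataFrame; the column names (Date, Close, Volume) and the values are assumptions for illustration, so swap in btc_crypto and your real columns.

```python
import pandas as pd

# Hypothetical stand-in for the BTC data; real column names and values may differ
btc_sample = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03"],
    "Close": [100.0, 110.0, 105.0],
    "Volume": [1000, 1500, 1200],
})

print(btc_sample.head())      # first rows of the table
print(btc_sample.dtypes)      # data type of each column
print(btc_sample.describe())  # summary statistics for the numeric columns
```

The describe() output is a useful preview of the histograms: only numeric columns show up in it, and those are the same columns that get plotted.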
Step 3: Visualize Data
Now, let's make it visual!
To see how the values are spread, create a simple histogram with the code below. It's like painting a small picture of your data.
# Draw one histogram per numeric column, 15 bins each
btc_crypto.hist(bins=15, color='steelblue', edgecolor='black', linewidth=1.0,
                xlabelsize=8, ylabelsize=8, grid=False)
plt.tight_layout(rect=(0, 0, 1.2, 1.2))
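If you only care about one attribute, you can plot a single column instead of the whole grid. This is a minimal sketch under a few assumptions: the "Close" column name and its values are hypothetical stand-ins, and the Agg backend line is only there so the snippet runs without a display (leave it out in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in; replace with btc_crypto and one of its real column names
df = pd.DataFrame({"Close": [100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 107.0, 103.0]})

# Histogram of a single column; plot.hist returns the matplotlib Axes
ax = df["Close"].plot.hist(bins=4, color="steelblue", edgecolor="black")
ax.set_xlabel("Close")
ax.set_title("Distribution of Close")
plt.tight_layout()
```

Working with the returned Axes makes it easy to add a label and a title, which the all-columns grid above does not give you much room for.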
Relevant Link: GitHub