Variables
So far, you’ve been using Python and pandas commands without diving too deeply into how they work. To advance further in your analytics journey, it's crucial to grasp some foundational Python concepts behind these commands. Let's start by dissecting a common piece of code you've been using to import datasets:
dataset_name = pd.read_csv('filename.csv')
There are three key components to this code:
- Right of the = sign: the pandas command that imports the dataset: pd.read_csv('filename.csv').
- Left of the = sign: the name you've assigned to the dataset: dataset_name.
- The = sign: connects the name dataset_name to the imported dataset.
This demonstrates a broader concept in Python known as variable assignment.
Variables in Python
In Python, a variable is essentially a name that refers to an object. For instance, dataset_name refers to the imported dataset. The object referred to by a variable is known as the value of the variable.
To assign a value to a variable, you use the = sign like this:
variable = value
Here:
- variable is the name you're assigning to an object, always placed on the left.
- value is the object being named, always placed on the right.
For example, the following code assigns the imported repair dataset to the variable repair:
repair = pd.read_csv('repair.csv')
The term "variables" is used because the value of a variable can change. Consider this example:
repair = pd.read_csv('repair.csv')
repair = pd.read_csv('laptops.csv')
In the first line, the repair dataset is imported and assigned to the variable repair. In the second line, the laptop dataset is imported and assigned to the same variable repair.
Python executes code from top to bottom. After both lines are executed, the variable repair refers to the laptop dataset, not the repair dataset.
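The same reassignment can be tried without any CSV files. The snippet below is a minimal sketch that uses short strings as stand-ins for the two datasets (the strings are placeholders, not real data):

```python
# Stand-in values; in the lesson these would be imported datasets.
repair = "repair data"
print(repair)   # repair data

# Assigning again makes the same name refer to a new value.
repair = "laptop data"
print(repair)   # laptop data
```

After the second assignment, nothing refers to the first value through the name repair anymore.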
Naming Variables
When naming variables in Python, keep the following rules in mind:
- They must start with a letter or underscore, not a digit.
- They can contain only letters, digits, and underscores (no spaces or other special characters).
- They are case-sensitive (e.g., Repair and repair are different variables).
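One way to try these rules out is Python's built-in string method isidentifier(), which reports whether a piece of text follows the spelling rules for names. The sketch below uses made-up example names:

```python
# isidentifier() checks whether text is spelled like a valid name.
print("repair_2023".isidentifier())   # True
print("_repair".isidentifier())       # True
print("2023_repair".isidentifier())   # False (starts with a digit)
print("repair data".isidentifier())   # False (contains a space)
print("repair-data".isidentifier())   # False (contains a hyphen)

# Names are case-sensitive: these are two separate variables.
Repair = 1
repair = 2
print(Repair, repair)   # 1 2
```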
Variable Types
Just like data, variables in Python can have different types:
- int: Numbers without decimals.
- float: Numbers with decimals (both 4.0 and 4.1 are floats).
- str: Text, which must always be enclosed in quotes.
- bool: Boolean values, either True or False (first letter capitalized, no quotes).
For example:
three = 3
pi = 3.14
text = "Hello!"
python_is_cool = True
Notice that "Hello!" has quotes around it, while True does not. Strings require quotes, but Booleans do not.
Working with DataFrames
Your datasets typically aren't just numbers or text; they are tables containing both. These tables have a special variable type provided by pandas: pandas.core.frame.DataFrame, or simply DataFrame. From now on, we'll refer to datasets in pandas as DataFrames.
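As a rough illustration, the sketch below builds a tiny DataFrame from CSV text held in memory rather than from a file, then asks Python for its type with the type() function covered in the next section. The column names and values are invented for the example:

```python
import io

import pandas as pd

# Hypothetical CSV contents, held in memory instead of a file on disk.
csv_text = "device,cost\nlaptop,120.5\nphone,80.0\n"

# read_csv accepts a file-like object as well as a filename.
repair = pd.read_csv(io.StringIO(csv_text))

print(type(repair))   # <class 'pandas.core.frame.DataFrame'>
```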
Checking Variable Types
Sometimes, you may want to verify what type Python thinks a variable is. You can do this with the type() function:
type(variable_name)
For instance, to check the type of the variable three, you'd write:
type(three)
which would output int.
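The sketch below applies type() to one value of each basic type. The variable names are only examples. Note that print() shows a type as, for instance, <class 'int'>, whereas a bare type(three) at the end of a Jupyter cell is displayed as int.

```python
three = 3        # no decimal point, so an int
four = 4.0       # has a decimal point, so a float
text = "3"       # quotes make this a str, even though it looks like a number
flag = True      # capitalized and unquoted, so a bool

print(type(three))   # <class 'int'>
print(type(four))    # <class 'float'>
print(type(text))    # <class 'str'>
print(type(flag))    # <class 'bool'>
```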
In various checkpoints, you might be asked to assign all your code to a variable for automatic verification. For example, you might need to assign type(three) to a specific variable datatype:
datatype = type(three)
To display the output, you can include a line like this in the code cell:
# show output
datatype
In these lines:
- The # show output line is a comment explaining the code that follows, indicated by the #.
- The variable name datatype at the end of the cell tells Jupyter to display its value. Sometimes, you may also see the variable name inside a print statement like print(datatype). Because a Jupyter cell displays only the value of its last expression, print is needed to show more than one output from the same cell.
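To make the difference concrete, here is a small sketch with two results in one cell. The names datatype_three and datatype_pi are hypothetical, chosen only for this example:

```python
three = 3
pi = 3.14

datatype_three = type(three)
datatype_pi = type(pi)

# A bare variable name is displayed only if it is the last line of a cell,
# so print() is used to show both results.
print(datatype_three)   # <class 'int'>
print(datatype_pi)      # <class 'float'>
```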
Conclusion
Understanding variables in Python is fundamental to mastering the language and advancing in data analytics. Variables allow you to store and manipulate data efficiently. By adhering to proper naming conventions and understanding variable types, you can write more readable and error-free code. Keep these concepts in mind as you continue exploring more complex topics in Python and data analysis.