Python Basics — I (for data science)

Bharadwaj Narayanam
Published in AlmaBetter
5 min read · May 28, 2021

What comes to mind when you think of programming? Complex code, confusing logic, unintuitive syntax? Well, Python isn’t like that. In this article we’ll look at Python’s basic data types, see how to define variables, and learn about some of its data structures.

Don’t worry if these terms are new to you; you’ll be well versed in them by the end of this article!

Photo by Hitesh Choudhary on Unsplash

Basic Data types in Python

It’s not just the syntax: even the data types are intuitive in Python. Any whole number you enter has the “integer” data type, which is represented as “int”.

6 is treated as an integer. But is 6 the same as 6.0? Uh, nope. 6.0 is treated as a “float”. By now you must have guessed what a float is: any number written with a decimal point is treated as a float. (The two are still equal in value, so 6 == 6.0 is True; only their types differ.)

Okay, how long would you work with numbers? Quite boring, aren’t they?

I want to enter the name of the city I live in, Hyderabad. Any text we provide in quotes, whether a name or any other sequence of characters, is treated as a “string”, which is represented as “str”.

So how do I define a string? We simply put the value in quotes, like this: "Hyderabad". Single quotes work just as well: 'Hyderabad'.

Any other datatype you could think of? Ummm.. Yes?… No?

So your thinking process can have two outcomes, yes or no, which is known as a binary outcome. Don’t you think we need a data type for this kind of outcome? Yes, we do, and we have one: the “boolean” data type.

It is represented as “bool”. A boolean can take only two values, True and False (both capitalized in Python).

There is also a “datetime” type for dates and times. Unlike the types above, it is not built in but comes from the datetime module in Python’s standard library; we’ll discuss it in upcoming articles.
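
If you want to see these types for yourself, here is a small sketch using the built-in type() function on the four basic types (int, float, str, bool). The variable names and the is_raining value are my own placeholders, just for illustration.

```python
# Ask Python which data type it assigned to each value.
whole_number = 6
decimal_number = 6.0
city = "Hyderabad"
is_raining = False  # placeholder value, only for illustration

print(type(whole_number))    # <class 'int'>
print(type(decimal_number))  # <class 'float'>
print(type(city))            # <class 'str'>
print(type(is_raining))      # <class 'bool'>

# 6 and 6.0 are equal in value but have different types.
print(whole_number == decimal_number)              # True
print(type(whole_number) == type(decimal_number))  # False
```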

How do we define a variable in Python?

Firstly, what is a variable?

VARIABLES are entities which help us store information and retrieve it later.

There are some rules for naming a variable in Python.

  1. A variable name can NOT start with a number; it must start with a letter or an underscore. The rest of the name can contain numbers.
  2. A variable name cannot contain special characters such as . , or - (the underscore is the only exception).
  3. Variable names are case sensitive, i.e., a and A are two different variables.

For example:

“python_123” is a valid variable name. “123_python” isn’t, as it starts with a number.

“python.1,2,3” isn’t a valid variable name because it contains a dot and commas. Neither is “python-123”, because the hyphen is a special character too.

“_python_123” is NOT the same as “_Python_123”.

Note: I have put the variable names in quotes only to set them apart from the rest of the sentence. Quotes cannot be used in a variable name.

Data Structures in Python

There are numerous data structures in Python, but here we’ll discuss only some of the basic ones.

I want to list all 11 players of the Indian cricket team. Some players get replaced by others depending on their fitness and performance, so these 11 players keep changing from game to game. I need a data structure that lets me make such changes comfortably: something that is “mutable”, meaning its contents can be changed after it is created. In this case, the “list” comes to our rescue.

A LIST is a data structure whose elements can be of different data types, including other data structures. All the elements of a list must be enclosed in [square brackets]. An empty list can be initialized with an empty []. A list can contain another list as an element, which we call a “nested list”. Examples of lists are given below:

  1. ["Warner", "Kane Williamson", "Manish Pandey", "Jonny Bairstow", "Rashid Khan"]
  2. [1, "Hi", True, 1]
  3. ["Hello", ["this", "is", "a", "nested list"], "Orange army"]
  4. [1]

We’ll discuss how to modify the elements in a list in the upcoming articles.
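
As a small preview, here is a sketch of the example lists above in actual code. The variable names are my own, and "Replacement Player" is a made-up name used only to show that a list can be changed after it is created.

```python
players = ["Warner", "Kane Williamson", "Manish Pandey", "Jonny Bairstow", "Rashid Khan"]
mixed = [1, "Hi", True, 1]  # different data types, duplicates allowed
nested = ["Hello", ["this", "is", "a", "nested list"], "Orange army"]
empty = []

print(len(players))  # 5
print(players[0])    # Warner (positions are counted from 0)
print(nested[1][3])  # nested list
print(len(empty))    # 0

# Lists are mutable: swap out the third player.
players[2] = "Replacement Player"  # made-up name, only for illustration
print(players)
```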

Now I want to store all the door numbers of my apartment. In an apartment, no new door number gets added and no existing door number needs to be deleted. So we need to store these numbers in something that is “immutable”, meaning it cannot be changed once it is created. In this case we can use something called a “tuple”.

The major difference between a list and a tuple is that a list is mutable and a tuple is immutable.

Since we are talking about immutability, I would like to highlight that the “string” data type is immutable.

Elements of a tuple are normally enclosed in (parentheses). Like a list, a tuple can contain different data types and data structures as its elements. Given below are some examples of tuples.

  1. (1, 2, 3, 4, 5, 4, True, (1, 2), "I am a string", ["and many more"])
  2. a = 1, 2 (Here, a is a tuple: even if we don’t provide parentheses, Python treats comma-separated values as a tuple.)
  3. (1) is not a tuple; Python reads it as the integer 1. We need to put a comma after the element if we want a single-element tuple: (1,) is a tuple.
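
The tuple rules above, along with the immutability of tuples and strings, can be checked with a short sketch like this one. The door numbers are made up for illustration.

```python
door_numbers = (101, 102, 103, 104)  # made-up door numbers

a = 1, 2           # no parentheses, still a tuple
not_a_tuple = (1)  # just the integer 1
single = (1,)      # the trailing comma makes a one-element tuple

print(type(a))            # <class 'tuple'>
print(type(not_a_tuple))  # <class 'int'>
print(type(single))       # <class 'tuple'>

# Tuples are immutable: assigning to an element raises a TypeError.
try:
    door_numbers[0] = 999
except TypeError as error:
    print("Cannot change a tuple:", error)

# Strings are immutable too.
city = "Hyderabad"
try:
    city[0] = "h"
except TypeError as error:
    print("Cannot change a string:", error)
```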

You might have observed that lists and tuples can contain duplicate elements. What if I want a data structure that keeps only the unique elements even if I pass in duplicates? As we learnt in our lower grades, sets don’t have repeated elements, and the same is true here.

A “set” is a data structure that has all the properties of the sets we learnt about in math class. We can perform union, intersection, set difference, etc. on them. Elements of a set must be enclosed in {curly braces}, also known as flower brackets. Sets are mutable, like lists.

An empty set cannot be initialized with just an empty {}, as this is reserved for an empty dictionary, which we are going to learn about later in this article. We need to initialize an empty set by writing set().

Sets cannot contain mutable objects such as lists and other sets, because every element of a set must be hashable. So we cannot define a set inside a set.

  1. {1, 2, 3, (1, 2), "hi"}
  2. {"hello_world", [1, 2, 3], 1} will not create a set; it raises a TypeError because it contains a list, which is mutable.
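
Here is a minimal sketch of the set behaviour described above: duplicates disappearing, the basic set operations, the empty-set gotcha, and the error you get when a list is placed inside a set. The numbers and variable names are arbitrary examples of mine.

```python
# Duplicates are dropped automatically.
numbers = {1, 2, 2, 3, 3, 3}
print(numbers == {1, 2, 3})  # True

first = {1, 2, 3}
second = {3, 4, 5}
print(first | second)  # union: elements in either set
print(first & second)  # intersection: elements in both sets
print(first - second)  # difference: elements only in first

# {} creates an empty dictionary, not an empty set.
empty_set = set()
empty_dict = {}
print(type(empty_set))   # <class 'set'>
print(type(empty_dict))  # <class 'dict'>

# A list inside a set is rejected with a TypeError.
try:
    bad = {"hello_world", [1, 2, 3], 1}
except TypeError as error:
    print("Cannot build this set:", error)
```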

I am a big fan of Sunrisers Hyderabad, so I care a lot about the individual score of each batsman, and I want to map each player to his score. I need another data structure for this kind of mapping. “Dictionaries” are very handy in this situation. They are represented as “dict”. Dictionaries are also enclosed in {curly braces} (flower brackets).

A dictionary holds key-value pairs, so that we can access a value by its key. Dictionaries are mutable, meaning we can change the value for a key at any time just by assigning another value, and we can add a new key-value pair to a dictionary even after creating it. Let us have a look at the scores of the players.

srh_scores = {"Warner": 36, "Kane Williamson": 74, "Manish Pandey": 23, "Jonny Bairstow": 52, "Rashid Khan": 17}

So I can access Warner’s score just by writing srh_scores["Warner"]. I want to update the score of Rashid Khan as he just hit a six; I can do that with the following code.

srh_scores["Rashid Khan"] = srh_scores["Rashid Khan"] + 6
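
Putting the dictionary pieces together, a runnable sketch could look like this. The scores are the ones from the example above, while "New Batsman" and his score are made up purely to show how a new key-value pair is added.

```python
srh_scores = {
    "Warner": 36,
    "Kane Williamson": 74,
    "Manish Pandey": 23,
    "Jonny Bairstow": 52,
    "Rashid Khan": 17,
}

# Access a value by its key.
print(srh_scores["Warner"])  # 36

# Update an existing value: Rashid Khan hits a six.
srh_scores["Rashid Khan"] = srh_scores["Rashid Khan"] + 6
print(srh_scores["Rashid Khan"])  # 23

# Add a new key-value pair after the dictionary was created.
srh_scores["New Batsman"] = 12  # made-up player and score
print(len(srh_scores))  # 6

# get() returns a default instead of raising an error for a missing key.
print(srh_scores.get("Unknown Player", 0))  # 0
```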
