Pandas for data science!

Hey! How are you? Today let's take a dive into data science with Python.

What's Pandas?

Pandas is a Python library for loading, cleaning, and manipulating data so that it is easier to analyze.

It's an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Right now we will take a look at some cool pandas stuff.

Installation:

To install pandas, all you need to do is type

pip install pandas
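
To check that the install worked, a quick sketch like the one below should print the installed version. The exact version string depends on your environment.

```python
import pandas as pd

# pd.__version__ holds the installed pandas version as a string
version = pd.__version__
print(version)
```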

Introduction to Data Structures

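Pandas originally provided the following three data structures:

Series
DataFrame
Panel

These data structures are built on top of NumPy arrays, which means they are fast. Note that Panel was deprecated in pandas 0.20 and removed in 0.25, so current versions offer only Series and DataFrame.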
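
As a rough illustration of the "built on top of NumPy" point, here is a minimal sketch. The column names and values are made up for the example.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labeled array
s = pd.Series([1.5, 2.5, 3.5])

# A DataFrame is a two-dimensional table whose columns are Series
df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})

# to_numpy() exposes the numeric data as a NumPy array
values = s.to_numpy()
print(type(values))
print(df.shape)   # (rows, columns)
print(s.mean())   # average of the three values
```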

Python Pandas - Series

A series can be created using various inputs like:

Array
Dict
Scalar value or constant

Create an Empty Series

The most basic series that can be created is an empty Series.

Example

# Import the pandas library, aliased as pd

import pandas as pd
s = pd.Series(dtype='float64')  # explicit dtype; newer pandas versions no longer default to float64
print(s)

Its output is as follows:

Series([], dtype: float64)

Create a Series from ndarray

If data is an ndarray, then the index passed must be of the same length. If no index is passed, the default index is range(n), where n is the array length, i.e., [0, 1, 2, ..., len(array)-1].

# Import pandas as pd and NumPy as np

import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)

Its output is as follows:

0   a
1   b
2   c
3   d
dtype: object
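
The same constructor also accepts an explicit index, a dict, or a scalar, as listed at the start of the Series section. Here is a small sketch of those cases. The labels and values are invented for illustration.

```python
import numpy as np
import pandas as pd

data = np.array(['a', 'b', 'c', 'd'])

# Explicit index: must have the same length as the data
s_indexed = pd.Series(data, index=[100, 101, 102, 103])
print(s_indexed[101])

# A mismatched index length raises ValueError
try:
    pd.Series(data, index=[1, 2])
except ValueError as err:
    print("ValueError:", err)

# From a dict: keys become the index labels
s_dict = pd.Series({'a': 0.0, 'b': 1.0, 'c': 2.0})
print(s_dict['b'])

# From a scalar: the value is repeated for every index label
s_scalar = pd.Series(5, index=[0, 1, 2, 3])
print(s_scalar.tolist())
```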

Reading a CSV file with pandas

import pandas as pd
df = pd.read_csv('my.csv')

What this does is read the CSV file named my.csv from your current folder and load it into a DataFrame called df.

Now we can view its first few rows (five by default)

df.head()

It's as simple as that.
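
To try this without a CSV file on hand, here is a self-contained sketch that first writes a tiny file and then reads it back. The file name my.csv comes from the example above. The columns and rows are made up, and the temporary directory is just a convenient place to put the file.

```python
import os
import tempfile

import pandas as pd

# Write a small sample CSV into a temporary directory (sample data is invented)
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "my.csv")
with open(csv_path, "w") as f:
    f.write("name,age\n")
    for i in range(8):
        f.write(f"person{i},{20 + i}\n")

# Read the file back into a DataFrame
df = pd.read_csv(csv_path)

print(df.shape)     # (rows, columns)
print(df.head())    # first 5 rows by default
print(df.head(3))   # first 3 rows
print(df.tail(2))   # last 2 rows
```

Passing a number to head() or tail() controls how many rows come back, which is handy for a quick look at a large file.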