Pandas for data science!

Hey! How are you? Today let's take a dive into data science with Python.

What's Pandas?

Pandas is a Python module for working with data and reshaping it into a more useful form.

It's an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Right now we will take a look at some cool pandas features.

Installation:

To install pandas, all you need to do is run this in a terminal:

pip install pandas
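
A quick way to check that the install worked is to import the library and print its version. This is just a small sanity check; the version string you see depends on your setup.

```python
import pandas as pd

# If this import succeeds, pandas is installed.
# The exact version string varies from machine to machine.
version = pd.__version__
print(version)
```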

Introduction to Data Structures

Pandas deals with the following data structures:

Series
DataFrame
Panel (deprecated in pandas 0.20 and removed in 0.25, so current releases only offer Series and DataFrame)

These data structures are built on top of NumPy arrays, which means they are fast.
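
To make the two main structures a bit more concrete, here is a minimal sketch. The names and values (ages, people, and so on) are invented for illustration and are not from any real dataset.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labeled array.
ages = pd.Series([25, 32, 47], name="age")

# A DataFrame is a two-dimensional table; each column behaves like a Series.
people = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [25, 32, 47]})

# The values of a numeric Series can be pulled out as a NumPy array.
values = ages.to_numpy()
print(type(values))
print(people.shape)
```

Here people has three rows and two columns, and values is a plain NumPy array.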

Python Pandas - Series

A Series can be created from various inputs, such as:

Array
Dict
Scalar value or constant

Create an Empty Series

The most basic Series you can create is an empty one.

Example

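# Import the pandas library under the alias pd
import pandas as pd

# Pass dtype explicitly so the result matches the output below on any pandas version
s = pd.Series(dtype='float64')
print(s)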

Its output is as follows:

Series([], dtype: float64)

Create a Series from ndarray

If the data is an ndarray, any index you pass must have the same length as the data. If no index is passed, the default index is range(n), where n is the array length, i.e., [0, 1, 2, ..., len(array) - 1].

# Import pandas under the alias pd and NumPy under the alias np
import pandas as pd
import numpy as np

# Build a Series from a NumPy array; the index defaults to 0, 1, 2, 3
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)

Its output is as follows:

0   a
1   b
2   c
3   d
dtype: object
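
The other inputs mentioned earlier (a custom index, a dict, and a scalar) work in much the same way. Here is a rough sketch; the index labels and values are arbitrary and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Custom index: it must have the same length as the data.
data = np.array(['a', 'b', 'c', 'd'])
s_indexed = pd.Series(data, index=[100, 101, 102, 103])

# From a dict: the keys become the index labels.
s_dict = pd.Series({'a': 0.0, 'b': 1.0, 'c': 2.0})

# From a scalar: the value is repeated once per index label.
s_scalar = pd.Series(5, index=[0, 1, 2, 3])

print(s_indexed)
print(s_dict)
print(s_scalar)
```

Note that with a scalar you need to supply the index yourself, since pandas has no other way to know how long the Series should be.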

Reading a CSV file with pandas

import pandas as pd
df = pd.read_csv('my.csv')

What this does is load the CSV file you named (here my.csv, looked up relative to your current working folder) into a DataFrame.

Now we can view its first few rows:

df.head()

It's as simple as that.
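
If you don't have a CSV file handy, you can try the same steps on an in-memory string. This is only a sketch: the CSV contents below are made up, and io.StringIO stands in for a real file on disk.

```python
import io
import pandas as pd

# Stand-in for a CSV file; the contents are invented for illustration.
csv_text = "name,age\nAnn,25\nBob,32\nCy,47\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default; pass a number to change that.
first_two = df.head(2)
print(first_two)
```

The first line of the CSV is used as the column names, so df ends up with a name column and an age column.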

Read more about pandas in the Pandas docs.