Data Handling Using Pandas -1 - Mykvs.in

Upload by : Aiyana Dorn

New syllabus 2020-21
Chapter 1: Data Handling using Pandas -1
Informatics Practices, Class XII (as per CBSE Board)
Visit : python.mykvs.in for regular updates

Python Library – Matplotlib

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It can be used to:
1. Develop publication-quality plots with just a few lines of code.
2. Use interactive figures that can zoom, pan, and update.

We can customize and take full control of line styles, font properties, and axes properties, as well as export and embed to a number of file formats and interactive environments.

Python Library – Pandas

Pandas is one of the most popular Python packages for data science. It offers powerful and flexible data structures that make data analysis and manipulation easy, and it makes importing and analyzing data much easier. Pandas builds on packages like NumPy and matplotlib to give us a single, convenient place for data analysis and visualization work.

Basic Features of Pandas

1. The DataFrame object helps a lot in keeping track of our data.
2. With a pandas DataFrame, we can have different data types (float, int, string, datetime, etc.) all in one place.
3. Pandas has built-in functionality for easy grouping and easy joins of data, and for rolling windows.
4. Good I/O capabilities; for example, data can easily be pulled from a MySQL database directly into a DataFrame.
5. With pandas, you can use patsy for R-style syntax in doing regressions.
6. Tools for loading data into in-memory data objects from different file formats.
7. Data alignment and integrated handling of missing data.
8. Reshaping and pivoting of data sets.
9. Label-based slicing, indexing and subsetting of large data sets.
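
A few of the features listed above (mixed data types, grouping, rolling windows) can be sketched in a handful of lines. The sales data below is hypothetical and is only there to make the example runnable.

```python
import pandas as pd

# Hypothetical sales records; column names and values are illustrative only.
df = pd.DataFrame({
    "city": ["Delhi", "Delhi", "Mumbai", "Mumbai"],
    "units": [10, 20, 5, 15],
    "price": [2.5, 3.0, 4.0, 1.5],
})

# Different data types live side by side in one DataFrame.
print(df.dtypes)

# Grouping: total units per city.
units_per_city = df.groupby("city")["units"].sum()
print(units_per_city)

# Rolling window: mean of each pair of consecutive 'units' values.
# The first entry is NaN because its window is incomplete.
rolling_mean = df["units"].rolling(window=2).mean()
print(rolling_mean)
```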

Pandas – Installation/Environment Setup

The pandas module doesn't come bundled with standard Python. If we install the Anaconda Python package, pandas is installed by default.

Steps for Anaconda installation and use:
1. Visit the site https://www.anaconda.com/download/
2. Download the appropriate Anaconda installer.
3. After the download, install it.
4. During installation, check the options to set the path and to install for all users.
5. After installation, start the Spyder utility of Anaconda from the Start menu.
6. Type import pandas as pd in the left pane (temp.py).
7. Then run it.
8. If no error is shown, pandas is installed.
9. Like the default temp.py, we can create another .py file for a new program from the New window option of the File menu.

Pandas – Installation/Environment Setup (standard Python)

Pandas can also be installed in a standard Python distribution, using the following steps.
1. If we are using Windows, a service pack must be installed on our computer. If it is not installed, we will not be able to install pandas in the existing standard Python (which is already installed), so install it first.
2. We can check this through the Properties option of the My Computer icon.
3. Now install the latest version of Python (any version above 3.4).

Pandas – Installation/Environment Setup (continued)

4. Now move to the Scripts folder of the Python distribution in the command prompt (opened through the cmd command of Windows).
5. Execute the following commands in the command prompt one after another, waiting after each command for the installation to finish:
    pip install numpy
    pip install six
    pip install pandas
   Now we will be able to use pandas in the standard Python distribution.
6. Type import pandas as pd in the Python (IDLE) shell.
7. If it executes without error, pandas is installed on your system.

Data Structures in Pandas

Two important data structures of pandas are Series and DataFrame.

1. Series
A Series is a one-dimensional, array-like structure with homogeneous data. For example, a Series may be a collection of integers.

Basic features of Series:
- Homogeneous data
- Size immutable
- Values of data mutable
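
As a minimal sketch of "homogeneous data" and "values of data mutable", the snippet below changes one value of a small Series in place. The numbers are arbitrary example values.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# Homogeneous data: the whole Series shares a single dtype.
print(s.dtype)

# Values are mutable: an element can be overwritten in place.
s.iloc[1] = 99
print(s.tolist())

# The number of elements is unchanged by the assignment.
print(len(s))
```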

2. DataFrame
A DataFrame is like a two-dimensional array with heterogeneous data.

[Example table: student records with columns including Student Name, Class, Section, Gender and Date Of Birth]

Basic features of DataFrame:
- Heterogeneous data
- Size mutable
- Data mutable

Pandas Series

A Series is like a one-dimensional array capable of holding data of any type (integer, string, float, Python objects, etc.). A Series can be created using its constructor.

Syntax: pandas.Series(data, index, dtype, copy)

A Series can be created from:
1. An array (ndarray)
2. A dict
3. A scalar value or constant

Pandas Series: Create an Empty Series

e.g.
    import pandas as pseries
    s = pseries.Series()
    print(s)

Output
    Series([], dtype: float64)

Note: recent pandas versions (2.0 and later) report dtype: object for an empty Series.

Pandas Series: Create a Series from ndarray

Without index, e.g.
    import pandas as pd1
    import numpy as np1
    data = np1.array(['a', 'b', 'c', 'd'])
    s = pd1.Series(data)
    print(s)

Output
    0    a
    1    b
    2    c
    3    d
    dtype: object

Note: the default index starts from 0.

With index, e.g.
    import pandas as p1
    import numpy as np1
    data = np1.array(['a', 'b', 'c', 'd'])
    s = p1.Series(data, index=[100, 101, 102, 103])
    print(s)

Output
    100    a
    101    b
    102    c
    103    d
    dtype: object

Note: the index starts from 100.

Pandas Series: Create a Series from dict

Eg.1 (without index)
    import pandas as pd1
    data = {'a': 0., 'b': 1., 'c': 2.}
    s = pd1.Series(data)
    print(s)

Output
    a    0.0
    b    1.0
    c    2.0
    dtype: float64

Eg.2 (with index)
    import pandas as pd1
    data = {'a': 0., 'b': 1., 'c': 2.}
    s = pd1.Series(data, index=['b', 'c', 'd', 'a'])
    print(s)

Output
    b    1.0
    c    2.0
    d    NaN
    a    0.0
    dtype: float64

Note: 'd' is not a key of the dict, so its value is NaN (missing).

Pandas Series: Create a Series from Scalar

e.g.
    import pandas as pd1
    s = pd1.Series(5, index=[0, 1, 2, 3])
    print(s)

Output
    0    5
    1    5
    2    5
    3    5
    dtype: int64

Note: here 5 is repeated 4 times (once per index label).

Pandas Series: Maths operations with Series

e.g.
    import pandas as pd1
    s = pd1.Series([1, 2, 3])
    t = pd1.Series([1, 2, 4])
    u = s + t   # addition operation
    print(u)
    u = s * t   # multiplication operation
    print(u)

Output (addition)
    0    2
    1    4
    2    7
    dtype: int64

Output (multiplication)
    0     1
    1     4
    2    12
    dtype: int64
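
Arithmetic between two Series matches values by index label, not by position. The sketch below uses made-up labels to show what happens when the labels only partly overlap, and how the fill_value parameter of Series.add can supply a default for the missing side.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])
t = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Labels present in only one Series give NaN in the result.
u = s + t
print(u)

# fill_value treats a missing entry as 0 before adding.
v = s.add(t, fill_value=0)
print(v)
```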

Pandas Series: head function

e.g.
    import pandas as pd1
    s = pd1.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
    print(s.head(3))

Output
    a    1
    b    2
    c    3
    dtype: int64

head(3) returns the first 3 elements.

Pandas Series: tail function

e.g.
    import pandas as pd1
    s = pd1.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
    print(s.tail(3))

Output
    c    3
    d    4
    e    5
    dtype: int64

tail(3) returns the last 3 elements.

Accessing Data from Series with indexing and slicing

e.g.
    import pandas as pd1
    s = pd1.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
    print(s.iloc[0])   # value at index position 0 (s[0] in older pandas; positional [] access is deprecated in recent versions)
    print(s[:3])       # first 3 values
    print(s[-3:])      # slicing for the last 3 values

Output
    1
    a    1
    b    2
    c    3
    dtype: int64
    c    3
    d    4
    e    5
    dtype: int64

Pandas Series: Retrieve Data Using Label (Index)

e.g.
    import pandas as pd1
    s = pd1.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
    print(s[['c', 'd']])

Output
    c    3
    d    4
    dtype: int64

Pandas Series: Retrieve Data from selection

There are three methods for data selection:
- loc gets rows (or columns) with particular labels from the index.
- iloc gets rows (or columns) at particular positions in the index (so it only takes integers).
- ix usually tries to behave like loc but falls back to behaving like iloc if a label is not present in the index.

ix is deprecated (and was removed in pandas 1.0); use loc and iloc instead.

Pandas Series: Retrieve Data from selection (example)

e.g.
    >>> s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

    >>> s.iloc[:3]   # slice the first three rows
    49   NaN
    48   NaN
    47   NaN

    >>> s.loc[:3]    # slice up to and including label 3
    49   NaN
    48   NaN
    47   NaN
    46   NaN
    45   NaN
    1    NaN
    2    NaN
    3    NaN

    >>> s.ix[:3]     # the integer is in the index, so s.ix[:3] works like loc
    49   NaN
    48   NaN
    47   NaN
    46   NaN
    45   NaN
    1    NaN
    2    NaN
    3    NaN
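
The difference between position-based and label-based slicing can be checked with a runnable sketch that uses only loc and iloc, since ix is not available in current pandas. The Series here holds the numbers 0 to 9 instead of NaN, an assumption made only so the selected values are easy to tell apart.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10), index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

# iloc counts positions: the first three rows, whatever their labels are.
by_position = s.iloc[:3]
print(by_position)

# loc uses labels: everything up to AND including the row labelled 3.
by_label = s.loc[:3]
print(by_label)
```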

Pandas DataFrame

A DataFrame is a two-dimensional data structure, just like any table (with rows and columns).

Basic features of DataFrame:
- Columns may be of different types
- Size can be changed (mutable)
- Labeled axes (rows / columns)
- Arithmetic operations on rows and columns

[Figure: structure of a DataFrame, showing its rows and columns]

It can be created using the constructor:
    pandas.DataFrame(data, index, columns, dtype, copy)
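
A short sketch of the constructor parameters named above (data, index, columns, dtype), followed by a column-wise arithmetic operation. The labels r1, r2, a, b and total are placeholders chosen for this example.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[1, 2], [3, 4]],   # row-wise values
    index=["r1", "r2"],      # row labels
    columns=["a", "b"],      # column labels
    dtype="float64",         # one dtype applied to all values
)

# Arithmetic on columns; assigning the result adds a new column,
# which also shows that the size of a DataFrame can change.
df["total"] = df["a"] + df["b"]
print(df)
```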

Pandas DataFrame: Create DataFrame

A DataFrame can be created from the following:
- Lists
- dict
- Series
- NumPy ndarrays
- Another DataFrame

Create an Empty DataFrame
e.g.
    import pandas as pd1
    df1 = pd1.DataFrame()
    print(df1)

Output
    Empty DataFrame
    Columns: []
    Index: []

Pandas DataFrame: Create a DataFrame from Lists

e.g.1
    import pandas as pd1
    data1 = [1, 2, 3, 4, 5]
    df1 = pd1.DataFrame(data1)
    print(df1)

Output
       0
    0  1
    1  2
    2  3
    3  4
    4  5

e.g.2
    import pandas as pd1
    data1 = [['Freya', 10], ['Mohak', 12], ['Dwivedi', 13]]
    df1 = pd1.DataFrame(data1, columns=['Name', 'Age'])
    print(df1)

Output
          Name  Age
    0    Freya   10
    1    Mohak   12
    2  Dwivedi   13

To store the numeric values as float, the slide suggests:
    df1 = pd1.DataFrame(data1, columns=['Name', 'Age'], dtype=float)
In recent pandas versions this may raise an error because the Name column cannot be converted to float; converting only the numeric column, for example df1['Age'] = df1['Age'].astype(float), is the safer route.

Pandas DataFrame: Create a DataFrame from Dict of ndarrays / Lists

e.g.1
    import pandas as pd1
    data1 = {'Name': ['Freya', 'Mohak'], 'Age': [9, 10]}
    df1 = pd1.DataFrame(data1)
    print(df1)

Output
        Name  Age
    0  Freya    9
    1  Mohak   10

To set custom row labels, write the following as the 3rd statement of the program above. The number of labels must match the number of rows (two here):
    df1 = pd1.DataFrame(data1, index=['rank1', 'rank2'])

Pandas DataFrame: Create a DataFrame from List of Dicts

e.g.1
    import pandas as pd1
    data1 = [{'x': 1, 'y': 2}, {'x': 5, 'y': 4, 'z': 5}]
    df1 = pd1.DataFrame(data1)
    print(df1)

Output
       x  y    z
    0  1  2  NaN
    1  5  4  5.0

To set custom row labels, write the following as the 3rd statement of the program above:
    df1 = pd1.DataFrame(data1, index=['first', 'second'])

Pandas DataFrame: Create a DataFrame from Dict of Series

e.g.1
    import pandas as pd1
    d1 = {'one': pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
          'two': pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
    df1 = pd1.DataFrame(d1)
    print(df1)

Output
       one  two
    a  1.0    1
    b  2.0    2
    c  3.0    3
    d  NaN    4

Column selection:
    print(df1['one'])

Adding a new column by passing a Series:
    df1['three'] = pd1.Series([10, 20, 30], index=['a', 'b', 'c'])

Adding a new column using the values of existing columns:
    df1['four'] = df1['one'] + df1['three']

Create a DataFrame from .txt file

Suppose we have a text file './inputs/dist.txt' containing:
    1 1 12.92
    1 2 90.75
    1 3 60.90
    2 1 71.34

Pandas ships with built-in reader methods. For example, pandas.read_table is a good way to read a tabular data file (also in chunks).

    import pandas
    # delim_whitespace=True splits on runs of whitespace; newer pandas versions prefer sep=r'\s+'
    df = pandas.read_table('./inputs/dist.txt', delim_whitespace=True, names=('A', 'B', 'C'))

This creates a DataFrame object with columns named A (data of type int64), B (int64) and C (float64).
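
The same kind of whitespace-separated data can be tried without a file on disk by wrapping a string in io.StringIO. This is only a sketch: the four sample rows are an assumed layout (two integer columns and one float column), and sep=r"\s+" is used as the whitespace separator.

```python
import io

import pandas as pd

# Assumed sample content: two integer columns and one float column.
text = "1 1 12.92\n1 2 90.75\n1 3 60.90\n2 1 71.34\n"

# sep=r"\s+" splits fields on any run of whitespace;
# names supplies the column labels because the data has no header row.
df = pd.read_csv(io.StringIO(text), sep=r"\s+", names=["A", "B", "C"])

print(df)
print(df.dtypes)
```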

Create a DataFrame from a csv (comma separated value) file / import data from a csv file

e.g. Suppose the file filename.csv contains the following data (the sample is truncated in the source):
    Date,"price","factor 1","factor ... ,1.258,1.554

    import pandas as pd
    # Read data from file 'filename.csv'
    # (in the same directory as your Python program)
    # Delimiters, rows and column names can be controlled through read_csv parameters
    data = pd.read_csv("filename.csv")
    # Preview the first line of the loaded data
    data.head(1)
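
As a hedged illustration of controlling read_csv, the sketch below reads CSV text from memory, parses one column as dates, and uses it as the row index. The column names and values are invented for the example and are not the contents of filename.csv.

```python
import io

import pandas as pd

# Hypothetical CSV content, kept in a string so no file is needed.
csv_text = (
    "Date,price,factor_1\n"
    "2020-01-01,10.5,1.25\n"
    "2020-01-02,11.0,1.30\n"
)

data = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["Date"],  # convert the Date column to datetime values
    index_col="Date",      # use it as the row index
)

# Preview the first line of the loaded data.
print(data.head(1))
```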

Pandas DataFrame: Column addition, deletion and renaming

Column addition
    df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
    c = [7, 8, 9]
    df['C'] = c

Column deletion (using df1 from the Dict of Series example, with columns 'one' and 'two')
    del df1['one']    # deleting the first column using the del statement
    df1.pop('two')    # deleting another column using the pop function

Rename columns
    >>> df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
    >>> df.rename(columns={"A": "a", "B": "c"})
       a  c
    0  1  4
    1  2  5
    2  3  6
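
The three column operations can be combined into one runnable sketch on a single DataFrame. Note that pop returns the removed column, while rename returns a new DataFrame and leaves the original unchanged unless the result is assigned back.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Addition: assign a list with one value per row.
df["C"] = [7, 8, 9]

# Deletion: pop removes the column and hands it back as a Series.
removed = df.pop("B")

# Deletion: del removes the column without returning it.
del df["C"]

# Renaming: rename returns a new DataFrame; df itself keeps column "A".
renamed = df.rename(columns={"A": "a"})

print(df.columns.tolist())
print(renamed.columns.tolist())
```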

Pandas DataFrame: Row Selection, Addition, and Deletion

# Selection by label
    import pandas as pd1
    d1 = {'one': pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
          'two': pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
    df1 = pd1.DataFrame(d1)
    print(df1.loc['b'])

Output
    one    2.0
    two    2.0
    Name: b, dtype: float64

Pandas DataFrame: Row selection by integer location

# Selection by integer location
    import pandas as pd1
    d1 = {'one': pd1.Series([1, 2, 3], index=['a', 'b', 'c']),
          'two': pd1.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
    df1 = pd1.DataFrame(d1)
    print(df1.iloc[2])

Output
    one    3.0
    two    3.0
    Name: c, dtype: float64

Slice rows: multiple rows can be selected using the ':' operator.
    print(df1[2:4])   # rows at positions 2 and 3 (labels 'c' and 'd')

Pandas DataFrame: Addition and Deletion of Rows

Addition of rows
    import pandas as pd1
    df1 = pd1.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
    df2 = pd1.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
    # The slide uses df1 = df1.append(df2); DataFrame.append was removed in pandas 2.0,
    # so concat is used here and gives the same result.
    df1 = pd1.concat([df1, df2])
    print(df1)

Deletion of rows
    # Drop rows with label 0
    df1 = df1.drop(0)
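
One detail worth checking when rows are added and then dropped: both input frames carry the labels 0 and 1, so the combined frame has duplicate labels and drop(0) removes every row labelled 0. The sketch below uses pd.concat and shows the ignore_index option as one way to renumber the rows first.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# Stacking keeps the original labels: 0, 1, 0, 1.
stacked = pd.concat([df1, df2])

# drop(0) removes BOTH rows that carry label 0.
dropped = stacked.drop(0)

# ignore_index=True renumbers the rows 0..3, so drop(0) removes one row.
renumbered = pd.concat([df1, df2], ignore_index=True)
dropped_one = renumbered.drop(0)

print(stacked)
print(dropped)
print(dropped_one)
```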

Pandas DataFrame: Iterate over rows in a dataframe

e.g.
    import pandas as pd1
    raw_data1 = {'name': ['freya', 'mohak'],
                 'age': [10, 1],
                 'favorite_color': ['pink', 'blue'],
                 'grade': [88, 92]}
    df1 = pd1.DataFrame(raw_data1, columns=['name', 'age', 'favorite_color', 'grade'])
    for index, row in df1.iterrows():
        print(row["name"], row["age"])

Output
    freya 10
    mohak 1

Pandas DataFrame: Head & Tail

head() returns the first n rows (observe the index values). The default number of rows to display is five, but you may pass a custom number. tail() returns the last n rows.

e.g.
    import pandas as pd
    import numpy as np
    # Create a dictionary of Series (the definition is truncated in the source)
    d = ... 6,3.20,4.6,3.8])}
    # Create a DataFrame
    df = pd.DataFrame(d)
    print("Our data frame is:")
    print(df)
    print("The first two rows of the data frame are:")
    print(df.head(2))
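
Because the dictionary in the slide is truncated, here is a self-contained sketch of head() and tail() on hypothetical data (seven made-up names and ratings), showing the default of five rows and a custom count.

```python
import pandas as pd

# Hypothetical data; not the values used on the original slide.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F", "G"],
    "rating": [4.2, 3.2, 3.9, 2.5, 4.6, 3.8, 4.0],
})

first_default = df.head()   # no argument: first five rows
first_two = df.head(2)      # first two rows
last_two = df.tail(2)       # last two rows; index values are preserved

print(first_two)
print(last_two)
```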

Pandas DataFrame: Indexing a DataFrame using .loc[ ]

.loc[ ] selects data by the labels of the rows and columns.

    # import the pandas library, aliased as pd
    import pandas as pd
    import numpy as np
    df = pd.DataFrame(np.random.randn(8, 4),
                      index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'],
                      columns=['A', 'B', 'C', 'D'])
    # select all rows for a specific column
    print(df.loc[:, 'A'])

Pandas DataFrame: Accessing a DataFrame with a boolean index

To access a DataFrame with a boolean index, we create a DataFrame whose index contains boolean values, that is, True or False.

    # importing pandas as pd
    import pandas as pd
    # dictionary of lists (named data rather than dict, to avoid shadowing the built-in dict)
    data = {'name': ["Mohak", "Freya", "Roshni"],
            'degree': ["MBA", "BCA", "M.Tech"],
            'score': [90, 40, 80]}
    # creating a dataframe with a boolean index
    df = pd.DataFrame(data, index=[True, False, True])
    # accessing the dataframe using .loc[]
    print(df.loc[True])   # returns the rows of Mohak and Roshni only (those labelled True)
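
A closely related pattern is to compute the True/False values from a condition on the data instead of storing them as the index. The sketch below is an illustration under that assumption: it reuses the names and scores from the example, and the pass mark of 50 is arbitrary.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Mohak", "Freya", "Roshni"],
    "score": [90, 40, 80],
})

# The comparison yields one boolean per row.
mask = df["score"] > 50

# Passing the boolean Series to .loc keeps only the rows marked True.
passed = df.loc[mask]
print(passed)
```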

Pandas DataFrame: Binary operation over dataframe with series

e.g.
    import pandas as pd
    x = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]})
    y = pd.Series([1, 2, 3])
    new_x = x.add(y, axis=0)   # axis=0: y[i] is added to every value in row i
    print(new_x)

Output
       0  1   2
    0  2  5   8
    1  4  7  10
    2  6  9  12

Pandas DataFrame: Binary operation over dataframe with dataframe

    import pandas as pd
    x = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]})
    y = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]})
    new_x = x.add(y, axis=0)
    print(new_x)

Output
       0   1   2
    0  2   8  14
    1  4  10  16
    2  6  12  18

Note: similarly, we can use the sub, mul and div functions.
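
The sub, mul and div functions mentioned in the note follow the same calling pattern as add. A brief sketch with a DataFrame and a Series, where axis=0 matches each Series value to a row:

```python
import pandas as pd

x = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9]})
y = pd.Series([1, 2, 3])

# With axis=0, y[i] is combined with every value in row i of x.
diff = x.sub(y, axis=0)   # subtraction
prod = x.mul(y, axis=0)   # multiplication
quot = x.div(y, axis=0)   # true division; the result holds floats

print(diff)
print(prod)
print(quot)
```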

Pandas DataFrame: Merging/joining dataframe

e.g.
    import pandas as pd
    left = pd.DataFrame({
        'id': [1, 2],
        'Name': ['anil', 'vishal'],
        'subject_id': ['sub1', 'sub2']})
    right = pd.DataFrame({
        'id': [1, 2],
        'Name': ['sumer', 'salil'],
        'subject_id': ['sub2', 'sub4']})
    print(pd.merge(left, right, on='id'))

Output
       id  Name_x subject_id_x Name_y subject_id_y
    0   1    anil         sub1  sumer         sub2
    1   2  vishal         sub2  salil         sub4

Pandas DataFrame: Merging/combining dataframe (different styles)

    pd.merge(left, right, on='subject_id', how='left')    # left join
    pd.merge(left, right, on='subject_id', how='right')   # right join
    pd.merge(left, right, how='outer', on='subject_id')   # outer join
    pd.merge(left, right, on='subject_id', how='inner')   # inner join
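
To see how the four join styles differ, the sketch below rebuilds the left and right frames from the merging example and compares the number of rows each how value produces when joining on subject_id.

```python
import pandas as pd

left = pd.DataFrame({
    "id": [1, 2],
    "Name": ["anil", "vishal"],
    "subject_id": ["sub1", "sub2"],
})
right = pd.DataFrame({
    "id": [1, 2],
    "Name": ["sumer", "salil"],
    "subject_id": ["sub2", "sub4"],
})

# Only 'sub2' appears in both frames.
left_join = pd.merge(left, right, on="subject_id", how="left")    # all keys from left
right_join = pd.merge(left, right, on="subject_id", how="right")  # all keys from right
outer = pd.merge(left, right, on="subject_id", how="outer")       # keys from either side
inner = pd.merge(left, right, on="subject_id", how="inner")       # keys present in both

for name, frame in [("left", left_join), ("right", right_join),
                    ("outer", outer), ("inner", inner)]:
    print(name, len(frame))
```

Columns that exist in both frames but are not join keys (id and Name here) get the suffixes _x and _y in the result.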

Concatenate two DataFrame objects with identical columns

    >>> df1 = pd.DataFrame([['a', 1], ['b', 2]],
    ...                    columns=['letter', 'number'])
    >>> df1
      letter  number
    0      a       1
    1      b       2

    >>> df2 = pd.DataFrame([['c', 3], ['d', 4]],
    ...                    columns=['letter', 'number'])
    >>> df2
      letter  number
    0      c       3
    1      d       4

    >>> pd.concat([df1, df2])
      letter  number
    0      a       1
    1      b       2
    0      c       3
    1      d       4

Export Pandas DataFrame to a CSV File

e.g.
    import pandas as pd
    cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
            'Price': [22000, 25000, 27000, 35000]}
    df = pd.DataFrame(cars, columns=['Brand', 'Price'])
    # index=False leaves out the row labels; header=True writes the column names
    df.to_csv(r'C:\export_dataframe.csv', index=False, header=True)
    print(df)
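
A quick way to confirm an export worked is to read the file back. The sketch below writes to a temporary directory instead of C:\ so that it runs on any system; the file name export_dataframe.csv is an assumption made for this example.

```python
import os
import tempfile

import pandas as pd

cars = {
    "Brand": ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4"],
    "Price": [22000, 25000, 27000, 35000],
}
df = pd.DataFrame(cars, columns=["Brand", "Price"])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "export_dataframe.csv")
    # index=False: do not write row labels; header=True: write column names.
    df.to_csv(path, index=False, header=True)
    # Read the file back while the temporary directory still exists.
    restored = pd.read_csv(path)

print(restored)
```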
