Data Visualization By Python Using SAS Dataset: Data From .


PharmaSUG SDE Japan
Data Visualization by Python using SAS dataset: Data from Pandas to Matplotlib
Yuichi Nakajima, Principal Programmer, Novartis
September 4, 2018

Pre-requirement
– SAS 9.4 or higher.
– Saspy 2.2.4*. As of July 2018, v2.2.4 is the latest version.
– Python 3.X or higher. Available from Anaconda distribution.
– Jupyter notebook. Previously called "IPython Notebook". Runs Python in the web browser.
* Focus on "Windows PC SAS" connection. See reference for other connection types.
PharmaSUG SDE 2018 Japan, Business Use Only

Overview process
1) Convert SAS dataset to Pandas DataFrame
2) Drawing library in Python
[Figure: SAS Dataset -> Saspy -> Pandas -> Matplotlib.pyplot (Python libraries)]

1. Access to SAS datasets
There are 3 possible ways to handle SAS data in Jupyter notebook.
– Saspy API (Please refer to SAS User group 2018 Poster)
– Jupyter Magic %%SAS
– Pandas DataFrame (DF)

Pandas DataFrame
"Pandas" is the Python package providing an efficient data handling process. Pandas data structures are called "Series" for a single dimension, like a vector, and "DataFrame" for two dimensions with "Index" and "Column".
[Figure: example DataFrame with Index (0, 1, 2, 3, ...) and Columns (USUBJID, SITEID, VISIT)]
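To make the Series / DataFrame distinction concrete, here is a minimal sketch. The column names follow the figure on the slide; the values are hypothetical and only serve as illustration.

```python
import pandas as pd

# Hypothetical values for illustration only; not taken from the slides.
usubjid = pd.Series(["001-001", "001-002", "002-001"], name="USUBJID")

df = pd.DataFrame(
    {
        "USUBJID": ["001-001", "001-002", "002-001"],
        "SITEID": ["001", "001", "002"],
        "VISIT": ["Baseline", "Baseline", "Day 1"],
    }
)

print(usubjid.ndim, df.ndim)   # number of dimensions of each structure
print(list(df.index))          # default integer Index
print(list(df.columns))        # column labels
print(type(df["SITEID"]))      # selecting one column gives a Series
```

Selecting a single column from a DataFrame returns a Series that shares the DataFrame's Index, which is what the later slides rely on when they pull "mean" and "std" out of a summary table.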

1. Access to SAS datasets
Import the necessary libraries in Jupyter notebook.
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import saspy

Access SAS datasets (sas7bdat or xpt) and convert them to a Pandas DF.

1. Use Pandas to read the SAS dataset (both xpt and sas7bdat are acceptable).
    # "%cd" is one of the magic commands.
    %cd C:\Users\NAKAJYU1\Desktop\tempds
    adsl = pd.read_sas('adsldmy.sas7bdat', format='sas7bdat', encoding="utf-8")

2. Use the Saspy API to read the SAS dataset as sas7bdat, then convert it to a Pandas DF.
    # "sas" must be an open Saspy session (connection settings come from the Saspy configuration file)
    sas = saspy.SASsession()
    # Create libname by Saspy API
    sas.saslib('temp', path="C:\\Users\\NAKAJYU1\\Desktop\\tempds")
    # Read SAS datasets in .sas7bdat
    advs = sas.sasdata('advsdmy', libref='temp')
    # Convert SAS dataset to DF
    advsdf = sas.sasdata2dataframe('advsdmy', libref='temp')

Using Saspy is recommended to avoid character set issues.
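The slide does not say what the character set issue looks like. One common symptom, assumed here, is that pd.read_sas returns character columns as bytes when no encoding is passed. The sketch below mimics that situation with a hand-built DataFrame (no SAS file is needed) and decodes the affected columns; decode_bytes_columns is a hypothetical helper, not part of Pandas or Saspy.

```python
import pandas as pd

# Hypothetical frame that mimics read_sas output without an encoding argument:
# character columns arrive as bytes, numeric columns as floats.
raw = pd.DataFrame({"USUBJID": [b"001-001", b"001-002"], "AGE": [34.0, 51.0]})


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with every bytes value decoded to str."""
    out = df.copy()
    for col in out.columns:
        # Only touch columns that actually contain bytes values.
        if out[col].map(lambda v: isinstance(v, bytes)).any():
            out[col] = out[col].map(
                lambda v: v.decode(encoding) if isinstance(v, bytes) else v
            )
    return out


decoded = decode_bytes_columns(raw)
print(decoded)
```

If the wrong encoding is chosen, decode raises UnicodeDecodeError or yields garbled text, which may be one reason the slide suggests letting Saspy handle the transfer instead.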

2. Data Visualization - Get started
In order to plot data with Matplotlib, first generate 1) a figure and 2) a subplot. At least one subplot must be created.
    # 1) Call figure instance
    fig = plt.figure()
    # 2) Call subplot
    ax = fig.add_subplot(111)
    dat = [0, 1]
    # Line plot by plot function
    ax.plot(dat)
    # Display with show function
    plt.show()
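The two-step figure / subplot pattern above can also be written with plt.subplots(), which returns both objects in one call. This is an alternative spelling, not the one used on the slide. The Agg backend line is an assumption so that the sketch runs without a display; it is not needed in Jupyter.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter
import matplotlib.pyplot as plt

# Create the figure and a single subplot in one call
fig, ax = plt.subplots()

dat = [0, 1]
# plot() returns a list of Line2D objects; unpack the only one
(line,) = ax.plot(dat)

# Render the figure (in Jupyter, plt.show() would display it instead)
fig.canvas.draw()
```

When only y values are passed, Matplotlib uses 0, 1, 2, ... as the x values, so the line above runs from (0, 0) to (1, 1).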

2. Data Visualization - Get started
Here is an example showing 3 subplots, with the 'ggplot' style applied (grid lines added).
    # Apply 'ggplot' style to figure
    plt.style.use('ggplot')
    fig = plt.figure()
    ax1 = fig.add_subplot(221)
    ax2 = fig.add_subplot(222)
    ax3 = fig.add_subplot(223)
    dat1 = [0.25, 0.75]
    dat2 = [0.5, 0.5]
    dat3 = [0.75, ...
    [...]
    ... .show()
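The listing above is cut off in the transcript, so the following is only a sketch of how a three-subplot figure might be completed. The second value of dat3 and the three plot calls are assumptions, not recovered from the slide. The Agg backend is selected so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter
import matplotlib.pyplot as plt

# Apply the 'ggplot' style (adds grid lines and a grey background)
plt.style.use("ggplot")

fig = plt.figure()
# 2 x 2 grid: positions 1, 2 and 3 are used, position 4 stays empty
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)

dat1 = [0.25, 0.75]
dat2 = [0.5, 0.5]
dat3 = [0.75, 0.25]  # second value is hypothetical; the source is truncated here

# One line plot per subplot (assumed; these calls are missing from the transcript)
ax1.plot(dat1)
ax2.plot(dat2)
ax3.plot(dat3)

fig.canvas.draw()  # in Jupyter, plt.show() would display the figure
```

The three-digit argument to add_subplot reads as rows, columns, position, so 223 is the third cell of a 2 x 2 grid (bottom left).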

2. Data Visualization - Line Plot 1 / mean with SD plot
Prepare summary statistics from the data (DF). "wk1" is dummy data in a pandas DF that follows the ADaM BDS structure.
    # Calculate summary statistic per ARM, AVISITN
    sum = wk1.groupby(['TRT01P_a', 'AVISITN'])['AVAL'].describe()
    # Get mean and std into pandas Series.
    mean1 = sum.loc['DRUG X', 'mean']
    mean2 = sum.loc['DRUG Y', 'mean']
    mean3 = sum.loc['Placebo', 'mean']
    std1 = sum.loc['DRUG X', 'std']
    std2 = sum.loc['DRUG Y', 'std']
    std3 = sum.loc['Placebo', 'std']
– sum: Pandas DF. Index: [TRT01P_a, AVISITN]. Column: [count, mean, std, ...]
– mean1-3, std1-3: Pandas Series. Index: AVISITN. Column: mean1
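The summary step can be tried without a SAS dataset. The sketch below builds a small hypothetical stand-in for "wk1" (two arms, two visits, made-up AVAL values) and uses the name stats instead of sum to avoid shadowing the Python built-in.

```python
import pandas as pd

# Hypothetical stand-in for the BDS-style dummy data; values are made up.
wk1 = pd.DataFrame(
    {
        "TRT01P_a": ["DRUG X"] * 4 + ["Placebo"] * 4,
        "AVISITN": [0, 0, 1, 1, 0, 0, 1, 1],
        "AVAL": [120.0, 130.0, 118.0, 122.0, 125.0, 135.0, 126.0, 134.0],
    }
)

# One row of summary statistics per (arm, visit) pair
stats = wk1.groupby(["TRT01P_a", "AVISITN"])["AVAL"].describe()

# Select one arm first (leaves AVISITN as the index), then one statistic
mean_x = stats.loc["DRUG X"]["mean"]
std_x = stats.loc["DRUG X"]["std"]

print(stats[["count", "mean", "std"]])
print(mean_x)
```

describe() reports the sample standard deviation (ddof=1), so with two values per cell the SD is simply their difference divided by the square root of 2.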

2. Data Visualization - Line Plot 1 / mean with SD plot
One subplot example (x: AVISITN as index, y: mean).
    # Define array for x-axis label setting
    vis_num = np.array([0, 1, 2, 7, 14, 28, 56])
    vis_order = np.array(["Baseline", "Day 1", "Day 2", "Week 1", "Week 2", "Week 4", "Week 8"])
    plt.style.use('ggplot')
    fig = plt.figure(figsize=(20,10))
    ax = fig.add_subplot(111)
    # subplot setting
    ax.plot(mean1.index-0.5, mean1, color='r', label='DRUG X')
    ax.plot(mean2.index, mean2, color='g', label='DRUG Y')
    ax.plot(mean3.index+0.5, mean3, color='b', label='Placebo')
    # Show legend on upper left.
    ax.legend(loc="upper left")
    # Apply label ticks and labels
    ax.set_xticks(vis_num)
    ax.set_xticklabels(vis_order, rotation=90)
    # Set errorbar by errorbar function
    ax.errorbar(mean1.index-0.5, mean1, yerr=std1, fmt='ro', ecolor='r', capsize=4)
    ax.errorbar(mean1.index, mean2, yerr=std2, fmt='ro', ecolor='g', capsize=4)
    ax.errorbar(mean1.index+0.5, mean3, yerr=std3, fmt='ro', ecolor='b', capsize=4)
    # Figure setting
    plt.title('SBP (mmHg), Mean with SD')
    plt.xlabel('Analysis Visit')
    plt.ylabel('SBP (mmHg)')
    # Display plot
    plt.show()
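As a self-contained variation on the mean with SD plot, the sketch below draws line, markers and error bars with a single errorbar call per arm inside a loop. The means and SDs are randomly generated placeholders, not study data, and the loop structure is a design choice of this sketch rather than the slide's approach. The Agg backend is selected so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

vis_num = np.array([0, 1, 2, 7, 14, 28, 56])
vis_order = ["Baseline", "Day 1", "Day 2", "Week 1", "Week 2", "Week 4", "Week 8"]

# arm -> (color, horizontal offset so the error bars do not overlap)
arms = {"DRUG X": ("r", -0.5), "DRUG Y": ("g", 0.0), "Placebo": ("b", 0.5)}

rng = np.random.default_rng(0)  # placeholder data generator
fig, ax = plt.subplots(figsize=(10, 5))

for arm, (color, offset) in arms.items():
    # Hypothetical per-visit mean and SD, indexed by AVISITN
    mean = pd.Series(130 + rng.normal(0, 3, len(vis_num)), index=vis_num)
    std = pd.Series(rng.uniform(5, 10, len(vis_num)), index=vis_num)
    # fmt="-o" draws the connecting line and the markers in one call
    ax.errorbar(mean.index.to_numpy() + offset, mean.to_numpy(),
                yerr=std.to_numpy(), fmt="-o", color=color,
                ecolor=color, capsize=4, label=arm)

ax.legend(loc="upper left")
ax.set_xticks(vis_num)
ax.set_xticklabels(vis_order, rotation=90)
ax.set_title("SBP (mmHg), Mean with SD")
ax.set_xlabel("Analysis Visit")
ax.set_ylabel("SBP (mmHg)")
fig.canvas.draw()  # in Jupyter, plt.show() would display the figure
```

Keeping color and offset in one dictionary means a fourth arm needs only one more entry, and each arm's markers automatically match its line color.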

2. Data Visualization - Line Plot 2 / Patient level plot
Three subplot example. The settings for each plot can be applied in a for loop.
    # Pre-define DF
    wk1 = wk[(wk['PARAMCD'] == 'STSBPSI') & (wk['AVISITN'] < 199)]
    arm1 = wk1.loc[wk1['TRT01P_a'] == 'DRUG X']
    arm2 = wk1.loc[wk1['TRT01P_a'] == 'DRUG Y']
    arm3 = wk1.loc[wk1['TRT01P_a'] == 'Placebo']
    # Define array for x-axis label setting
    vis_num = np.array([0, 1, 2, 7, 14, 28, 56])
    vis_order = np.array(["Baseline", "Day 1", "Day 2", "Week 1", "Week 2", "Week 4", "Week 8"])
    fig = plt.figure(figsize=(20,15))
    ax1 = fig.add_subplot(222)
    ax2 = fig.add_subplot(223)
    ax3 = fig.add_subplot(224)
    ax1.plot(arm1['AVISITN'], arm1['AVAL'], label='DRUG X', color='r', linewidth=0.5)
    ax2.plot(arm2['AVISITN'], arm2['AVAL'], label='DRUG Y', color='g', linewidth=0.5)
    ax3.plot(arm3['AVISITN'], arm3['AVAL'], label='Placebo', color='b', linewidth=0.5)
    # Common setting to each subplot
    axes = [ax1, ax2, ax3]
    for ax in axes:
        ax.set_ylim(75, 170)
        ax.set_xticks(vis_num)
        ax.set_xticklabels(vis_order, rotation=90)
        ax.set_xlabel("Time")
        ax.set_ylabel("SBP (mmHg)")
        ax.legend(loc="upper right", labelspacing=1.25)
    # Adjust width/height spacing
    fig.subplots_adjust(wspace=0.2)
    plt.show()
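When one DataFrame holds several patients, a single plot call over the whole AVISITN / AVAL columns draws one continuous line, which, depending on row order, may join the last visit of one patient to the first visit of the next. A sketch of one alternative follows, assuming a subject identifier column named USUBJID (it appears in the DataFrame figure earlier in the deck but is not used in the slide code). The data are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical patient-level data for one arm; values are made up.
arm1 = pd.DataFrame(
    {
        "USUBJID": ["P01"] * 3 + ["P02"] * 3,
        "AVISITN": [0, 7, 14, 0, 7, 14],
        "AVAL": [128.0, 124.0, 121.0, 140.0, 133.0, 135.0],
    }
)

fig, ax = plt.subplots()

# Draw one line per patient so separate patients are never connected
for usubjid, pat in arm1.sort_values("AVISITN").groupby("USUBJID"):
    ax.plot(pat["AVISITN"], pat["AVAL"], color="r", linewidth=0.5)

ax.set_xlabel("Time")
ax.set_ylabel("SBP (mmHg)")
fig.canvas.draw()  # in Jupyter, plt.show() would display the figure
```

Sorting by AVISITN before grouping keeps each patient's points in visit order, since groupby preserves the row order within each group.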

2. Data Visualization - Other plots
To try the quick examples below, run the following code first.
    import numpy as np
    import matplotlib.pyplot as plt
    x1 = np.array([1, 2, 3, 4, 5])
    x2 = np.array([100, 200, 300, 400, 500])
    x = np.random.normal(50, 10, 1000)
    y = np.random.rand(1000)

Plot type / Function / Description / Quick example

– Histograms / hist()
  Compute and draw the histogram of x.
    plt.hist(x, bins=16, range=(50, 100), rwidth=0.8, color='red')

– Bar charts / bar()
  The bars are positioned at x with the given alignment. Their dimensions are given by width and height.
    plt.bar(x1, x2, width=1.0, linewidth=3, align='center', tick_label=['Jan', 'Feb', 'Mar', 'Apr', 'May'])

– Pie charts / pie()
  Make a pie chart of array x. The fractional area of each wedge is given by x/sum(x).
    plt.pie(x3, labels=['Tokyo', 'Osaka', 'Hiroshima', 'Kyoto'], counterclock=False, startangle=90, autopct="%1.1f%%")
    plt.axis('equal')

– Scatter plot / scatter()
  A scatter plot of y vs x with varying marker size and/or color.
    plt.scatter(x, y, s=15, c='blue', marker='*', linewidth=2)
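The four quick examples can be combined into one figure for a side-by-side look. In the sketch below, x3 is a hypothetical array of four values (the setup code on the slide does not define it), the random seed is fixed only for repeatability, and the 2 x 2 layout is a choice of this sketch. The Agg backend is selected so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs give the same picture
x1 = np.array([1, 2, 3, 4, 5])
x2 = np.array([100, 200, 300, 400, 500])
x = np.random.normal(50, 10, 1000)
y = np.random.rand(1000)
x3 = np.array([40, 30, 20, 10])  # hypothetical; one value per pie label

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Histogram: 16 bins between 50 and 100 (values below 50 are ignored)
counts, bin_edges, _ = axes[0, 0].hist(
    x, bins=16, range=(50, 100), rwidth=0.8, color="red")

# Bar chart: bars centered on x1 with heights x2
bars = axes[0, 1].bar(
    x1, x2, width=1.0, linewidth=3, align="center",
    tick_label=["Jan", "Feb", "Mar", "Apr", "May"])

# Pie chart: wedge area proportional to x3 / x3.sum(), drawn clockwise from 12 o'clock
axes[1, 0].pie(
    x3, labels=["Tokyo", "Osaka", "Hiroshima", "Kyoto"],
    counterclock=False, startangle=90, autopct="%1.1f%%")
axes[1, 0].axis("equal")

# Scatter plot: star markers of fixed size
axes[1, 1].scatter(x, y, s=15, c="blue", marker="*", linewidths=2)

fig.tight_layout()
fig.canvas.draw()  # in Jupyter, plt.show() would display the figure
```

Calling the plotting methods on each Axes object, rather than through plt, keeps every chart in its own panel.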

Summary
Pandas and Matplotlib enable you to make easy data visualizations from standardized SAS datasets such as CDISC. Using Saspy is a good first step into Python for a SAS programmer. By combining several languages with the available data handling choices (Saspy, Jupyter magic and Pandas), you may find process improvements in your daily work.

References
– Matplotlib manual, release 2.2.3: https://matplotlib.org/Matplotlib.pdf

