Pandas Basics
Reading an Excel sheet
Typically just use:
import pandas as pd
pd.read_excel("workbook_name.xlsx")
Read a specific sheet
pd.read_excel("workbook_name.xlsx", sheet_name="Sheet1")
Combining two data frames
df4 = pd.DataFrame({'B': ['B2', 'B3', 'B6', 'B7'], 'D': ['D2', 'D3', 'D6', 'D7'], 'F': ['F2', 'F3', 'F6', 'F7']}, index=[2, 3, 6, 7])
result = pd.concat([df1, df4], axis=1, sort=False)
see [https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html]
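A self-contained sketch of the concat call above. df1 is not defined in these notes, so the df1 here is a made-up stand-in (an assumption), chosen so that its index only partly overlaps df4's.

```python
import pandas as pd

# Hypothetical df1: not defined in the notes, invented for illustration.
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3']},
                   index=[0, 1, 2, 3])
df4 = pd.DataFrame({'B': ['B2', 'B3', 'B6', 'B7'],
                    'D': ['D2', 'D3', 'D6', 'D7'],
                    'F': ['F2', 'F3', 'F6', 'F7']},
                   index=[2, 3, 6, 7])

# axis=1 puts the frames side by side and aligns rows on the index.
# The default join is 'outer', so rows missing from one frame get NaN.
result = pd.concat([df1, df4], axis=1, sort=False)

# join='inner' keeps only the index labels present in both frames.
inner = pd.concat([df1, df4], axis=1, join='inner')
```

Note that both frames have a column B, so the result has two columns named B. Renaming one of them before concatenating is probably cleaner.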
Slicing a dataframe by row value in specific column
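One common way to do this is boolean indexing. A minimal sketch; the frame and the column names (name, group) are made up for illustration.

```python
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({'name': ['a', 'b', 'c', 'd'],
                   'group': ['x', 'y', 'x', 'z']})

# Build a boolean mask on one column, then index the frame with it.
only_x = df[df['group'] == 'x']

# .loc takes the same mask and can pick columns at the same time.
names_in_x = df.loc[df['group'] == 'x', ['name']]
```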
Reading in a JSON file
import pandas as pd
patients_df = pd.read_json('E:/datasets/patients.json')
see [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html] or [https://stackabuse.com/reading-and-writing-json-files-in-python-with-pandas/]
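To try read_json without a file on disk, a small sketch using an in-memory string. The records are invented and only stand in for whatever patients.json actually contains.

```python
import io
import pandas as pd

# Hypothetical JSON content: a list of records, one dict per row.
raw = '[{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]'

# read_json accepts a path or a file-like object; StringIO wraps the string.
# orient='records' states the layout explicitly (list of row dicts).
patients_df = pd.read_json(io.StringIO(raw), orient='records')
```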
Read in a text file as a table
pd.read_csv('data/nodes.txt', sep="\t", header=None)
see [https://stackoverflow.com/questions/25013792/how-to-read-a-dataset-from-a-txt-file-in-python]
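A small sketch of what header=None does, using an in-memory string instead of data/nodes.txt. The two-column content and the column names id and label are assumptions for illustration.

```python
import io
import pandas as pd

# Hypothetical tab-separated content with no header row.
raw = "1\tnodeA\n2\tnodeB\n"

# header=None: the first line is data, columns are numbered 0, 1, ...
nodes = pd.read_csv(io.StringIO(raw), sep="\t", header=None)

# names= assigns column labels when the file has none.
named = pd.read_csv(io.StringIO(raw), sep="\t", header=None,
                    names=['id', 'label'])
```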
Change an entire series using a dictionary
use Series.map with a dict to replace each value by its lookup; see [https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html]
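A minimal sketch of Series.map with a dictionary; the series and the lookup table are made up. Values with no key in the dict come back as NaN, which is easy to miss.

```python
import pandas as pd

# Hypothetical series and lookup table.
s = pd.Series(['cat', 'dog', 'bird'])
lookup = {'cat': 'feline', 'dog': 'canine'}

# Each value is replaced by lookup[value]; 'bird' has no key, so it becomes NaN.
mapped = s.map(lookup)

# One way to keep the original value where the dict has no entry.
kept = s.map(lookup).fillna(s)
```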
Two series to dictionary
pd.Series(df.A.values,index=df.B).to_dict()
see https://stackoverflow.com/questions/17426292/what-is-the-most-efficient-way-to-create-a-dictionary-of-two-pandas-dataframe-co
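The one-liner above, run on a small made-up frame (columns A and B as in the note, values invented). Column B supplies the keys and column A the values.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({'A': [10, 20, 30], 'B': ['x', 'y', 'z']})

# Index the A values by B, then convert: keys come from B, values from A.
b_to_a = pd.Series(df.A.values, index=df.B).to_dict()

# A plain-Python alternative that should give the same mapping.
alt = dict(zip(df.B, df.A))
```

If B contains duplicates, later rows overwrite earlier ones in both versions.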
Mask dataframe if value is in list
use .isin()
df[df['A'].isin([3, 6])]
see https://stackoverflow.com/questions/12096252/use-a-list-of-values-to-select-rows-from-a-pandas-dataframe
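The same .isin() mask on a small made-up frame, plus the inverted form with ~ for dropping the listed values. The data here is invented for illustration.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({'A': [1, 3, 5, 6], 'B': ['w', 'x', 'y', 'z']})

# Keep rows whose A value is in the list.
selected = df[df['A'].isin([3, 6])]

# ~ negates the mask: keep rows whose A value is NOT in the list.
excluded = df[~df['A'].isin([3, 6])]
```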