Practical 1 - Number of Cyclists

How to Find Columns that are in Common Between Tables?
Use Set and set.intersection(set1, set2) to do this. One use case is to vertically concatenate tables. Such as combine data from different years (2014, 2015, 2016) together.
Year-2014.csv
| Date | Station Number | Number Cyclists |
|---|---|---|
| 2014-01-10 | 897123 | 15 |
| 2014-03-09 | 362300 | 21 |
Year-2015.csv
| Date | Station Number | Number Cyclists | Note |
|---|---|---|---|
| 2015-01-10 | 897123 | 25 | good |
| 2015-03-09 | 362300 | 52 | nice measure |
all_columns = {"Date", "Station Number", "Number Cyclists", "Note" }
common_columns = all_columns.intersection(year_2014.columns.values,
year_2014.columns.values)
The above code will drop the Note column.
info
Set intersection example:
first = {"c", "d", "e"}
second = {"c", "f", "g"}
all_columns = {*first, *second} # {'c', 'd', 'f', 'e', 'g'}
result = all_columns.intersection(first, second)
print(result) # 'c'
How to Convert Date String Into Datetime Object?
Use dataframe.apply(func, axis) or pandas.to_datetime(data["Date"], format="%d/%m/%Y")
Date Information is Located in a Single Column
| Date | Number Cyclists |
|---|---|
| 2014-03-22 | 91 |
| 2014-03-23 | 21 |
data["Date"] = pandas.to_datetime(data["Date"], format="%Y-%m-%d")
Date Information is Spread Across Multiple Columns
| Year | Month | Day | Number Cyclists |
|---|---|---|---|
| 2014 | 03 | 22 | 91 |
| 2014 | 03 | 23 | 21 |
from datetime import datetime
# A new column called 'Date' will be created and its data type is datetime64
data["Date"] = data.apply(lambda row: datetime(row.Year, row.Month, row.Day), axis = 1)
How to Concatenate Tables?
There are two types of concatenation. Horizontal and Vertical.
Horizontal (left and right)
pandas.merge(left,right, on="Column Name")
Vertical (Stack)
pandas.concat([table1, table2, table3])
How to Filter Rows By Date?
For example, get the data between 2014~2018
# 2014-2018
data = data[ (2014 <= data["Date"].dt.year)
& (data["Date"].dt.year <= 2018)
]
warning
- & stands for
and - Operands need to be surrounded by brackets.
References
Practical 1 link: https://github.com/xiaohai-huang/cab420-workspace/tree/master/work/week1/CAB420_Prac1