General Notes
Everything is an object in Python.
in command prompt: type 'python file_name.py' to run a script.
in command prompt: type 'python' to access Python directly.
in the Python prompt: type 'quit()' to quit Python.
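# – symbol for single line comments.
""" – triple quotes create a multi-line string, commonly used for multi-line comments and docstrings.
B = None # placeholder for creating a variable without giving it a value yet.
Variables are dynamically typed, so a name is not tied to one data type. Comparing values does not change them either. In the following code:
x = 4
x == 4.0 # a comparison only, returns True
x # still equals 4, not 4.0.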
Commenting Best Practices
# – symbol for single line comments.
"""
This is a multi line comment.
"""
Put a comment block (docstring) at the start of a function.
Comment anything that is unclear, like math.
Comment loops, if statements, etc. if they are complex.
Commenting is about explaining motive: why the code does something.
Data Types
Boolean – True or False
Integer – no decimals
Floating point – decimals, real numbers. 0.1+0.2-0.3 should equal 0, but because floats are stored in binary the result is a tiny non-zero number (about 5.55e-17).
Strings – characters, basically text. They are set by single or double quotes. Both work the same way, so pick one style and use it consistently.
NoneType – None. This is Python's equivalent of null: it represents the absence of a value.
type() – this function will tell you what data type a variable is
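A small sketch of the floating point behaviour described above, plus type() applied to each basic data type. The variable names are only for illustration, and math.isclose with an absolute tolerance is one common way to compare floats, not the only one.

```python
import math

total = 0.1 + 0.2 - 0.3

# Comparing floats with == is fragile because a tiny rounding error is left over.
exact_match = (total == 0)

# Comparing with a tolerance is usually what you want instead.
close_enough = math.isclose(total, 0.0, abs_tol=1e-9)

print(total)
print(exact_match, close_enough)

# type() reports the data type of any value.
print(type(total), type(3), type("text"), type(True), type(None))
```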
Math
- # subtraction
+ # addition
/ # Divide
// # Divide but round down to nearest whole number.
* # multiply
% # modulus, calculates remainder of a division
** # power
() # for order, as usual.
== # Equals
!= # Does not equal
a > b # a is greater than b
a < b # a is less than b
a >= b # a is greater than or equal to b
a <= b # a is less than or equal to b
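round() # Rounds to the nearest whole number. Exact halves go to the nearest even number, so round(2.5) is 2 and round(3.5) is 4.
int() # Converts to integer by dropping the decimal part (rounds toward zero), so int(2.7) is 2 and int(-2.7) is -2.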
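The operators above can be tried out with a few throwaway values. This is a minimal sketch; the variable names are made up for the example, and math.floor is included only to contrast it with int().

```python
import math

true_division = 7 / 2        # always gives a float
floor_division = 7 // 2      # rounds down to a whole number
negative_floor = -7 // 2     # rounds down, so toward negative infinity
remainder = 7 % 2
power = 2 ** 10

# round() sends exact halves to the nearest even number.
half_cases = [round(0.5), round(1.5), round(2.5), round(3.5)]

# int() drops the decimal part, math.floor() always rounds down.
truncated = int(-2.7)
floored = math.floor(-2.7)

print(true_division, floor_division, negative_floor, remainder, power)
print(half_cases, truncated, floored)
```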
Variables
Python is dynamically typed: a variable can start as an integer and later be reassigned to a string. You don't need to declare a data type.
my_variable = 3 # assign the number 3 to the variable 'my_variable'.
type(my_variable) # return what type of value 'my_variable' holds.
A variable is a pointer: the name refers to an object rather than holding the value itself. Assigning one variable to another makes both names point to the same object.
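One way to see what "variable as a pointer" means is to give a list two names and change it through one of them. This is an illustrative sketch with made-up names, not a full account of Python's object model.

```python
list_a = [1, 2, 3]
list_b = list_a            # no copy is made: both names refer to the same list
list_b.append(4)           # so the change is visible through list_a too
same_object = list_a is list_b

list_c = list(list_a)      # list() builds a new, separate list
list_c.append(5)           # list_a is not affected

# Dynamic typing: the same name can later refer to a different type.
value = 3
value = "three"

print(list_a, list_c, same_object, type(value))
```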
Strings
Strings are immutable in Python: once created they cannot be changed in place.
Strings in Python are treated as a sequence of characters, much like a list. As a result, the following code:
my_variable = 'John'
list(my_variable)
Will output:
['J', 'o', 'h', 'n']
my_variable = 'john'
my_variable = "john"
3 * "john" # returns "johnjohnjohn"
"o" in my_variable # returns True. Searches my_variable for the letter o.
Char – Index
j – 0
o – 1
h – 2
n – 3
IE: my_variable[1] would return the letter o, by itself.
Reverse indexing also exists, so my_variable[-1] would return the letter n.
Slicing: my_variable[1:3] would output oh, from index 1 up to but not including 3.
my_variable[2:] would output hn, from index 2 onward.
my_variable[:2] would output jo, from index 0 up to but not including 2.
my_variable[::] would output john, from beginning to end.
my_variable[::2] would output jh, from beginning to end with jump sizes of 2.
my_variable[0:1:2] would output j, from 0 to 1 with jump sizes of 2.
my_variable[::-1] would output nhoj, the string reversed.
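The indexing and slicing rules can be checked with a short script. A minimal sketch: the stop index of a slice is not included, and because strings are immutable the last line builds a new string rather than editing the old one.

```python
my_variable = "john"

first_letter = my_variable[0]
last_letter = my_variable[-1]
middle = my_variable[1:3]          # index 1 up to, but not including, 3
from_two = my_variable[2:]
up_to_two = my_variable[:2]
every_second = my_variable[::2]
reversed_text = my_variable[::-1]

# Strings cannot be changed in place, so build a new one instead.
capitalised = "J" + my_variable[1:]

print(first_letter, last_letter, middle, from_two, up_to_two)
print(every_second, reversed_text, capitalised)
```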
\n #add new line
len("I am") # counts the string's characters, including the space; in this case it would be 4.
String concatenation: my_variable + ' Smith' # this would output john Smith
If you type my_variable. and press tab (in IPython or Jupyter) you will get a list of methods such as lower, capitalize, etc. that let you transform the string.
my_variable.upper() # this would output JOHN, all in capitals. It needs the parentheses.
.format() method
print('This is a string {}'.format('INSERTED_TEXT')) # This will output 'This is a string INSERTED_TEXT'
print('The {} {} {}'.format('TEXT_A','TEXT_B','TEXT_C')) # This will output 'The TEXT_A TEXT_B TEXT_C'
print('The {2} {1} {0}'.format('TEXT_A','TEXT_B','TEXT_C')) # This will output 'The TEXT_C TEXT_B TEXT_A'. The order is reversed, as per the numbers.
print('The {2} {2} {2}'.format('TEXT_A','TEXT_B','TEXT_C')) # This will output 'The TEXT_C TEXT_C TEXT_C'. TEXT_C is all of the outputs now.
print('The {C} {C} {C}'.format(A='TEXT_A',B='TEXT_B',C='TEXT_C')) # This will output 'The TEXT_C TEXT_C TEXT_C'. In a way this is the same as a variable assignment, assigning 'TEXT_A' to the name A.
variable = 200/777
print("The result is {z:1.3f}".format(z=variable)) # This passes variable in as z. In {z:1.3f} the 1 is the minimum width, .3 is the precision (three decimal places) and f means fixed-point notation. Output: The result is 0.257
New in Python 3.6:
name = "John"
print(f'Hello, my name is {name}') # the f at the start makes this an f-string, a concise way to format strings.
my_variable = input('Type your input here:') # gets input from the user as a string. 'Type your input here:' is the prompt that is displayed.
print('the contents of the variable are ' + my_variable + '.')
'a' in 'apple' # This checks if the letter a is in apple, and returns boolean True or False.
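The format specification works the same way in .format() and in f-strings. The sketch below uses a made-up variable called ratio; the number before the dot is the minimum width and the number after it is the count of decimal places.

```python
ratio = 200 / 777

with_format = "The result is {z:1.3f}".format(z=ratio)
with_fstring = f"The result is {ratio:1.3f}"

# A larger width pads the number with spaces on the left.
padded = f"[{ratio:10.3f}]"

# Positional indexes let you reorder or repeat arguments.
reordered = "The {2} {1} {0}".format("A", "B", "C")

print(with_format)
print(with_fstring)
print(padded)
print(reordered)
```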
Lists
list_a = ['string',5,15.5] # This creates a list called 'list_a'. Lists in Python can store different data types.
len(list_a) # checks the length of list_a, in the case of list_a the answer would be 3.
list_a[0] # chooses the first item from the list, in this case it would be 'string'
list_a[1:] # chooses items from index 1 onward, in this case it would be 5 and 15.5. Remember that the count in Python starts from 0.
After typing list_a. hit tab to get a list of methods you can apply to the list.
list_a.append('hello') # this would append the word hello to the end of the list
list_a.pop() # this removes and returns the item at the end of the list. The index defaults to -1.
list_a.pop(1) # this removes and returns the item at index 1, so the second item.
list_a.sort() # this sorts the list in place in alphabetical or numerical order. The items must be comparable, so a list mixing strings and numbers raises a TypeError.
list_a = [5,[4,15.5]] # It is possible to nest one list inside another, in this case [4,15.5] is a list nested within list_a.
Examples:
list_a = [5,15.5]
sum(list_a) # This will sum the contents of list_a, giving 20.5
['a1', 'b2', 3] * 3 # repeats the list three times
cheeses = ['Cheddar', 'Edam', 'Gouda']
'Edam' in cheeses # True
'Edam' not in cheeses # False
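The list methods above can be run end to end as follows. This is only a sketch with made-up values. It also shows that sort() fails on a list that mixes strings and numbers in Python 3.

```python
list_a = ["string", 5, 15.5]
list_a.append("hello")
last = list_a.pop()        # removes and returns the last item
second = list_a.pop(1)     # removes and returns the item at index 1

numbers = [15.5, 5, 10]
numbers.sort()             # sorts in place and returns None

# Mixed types cannot be ordered, so sort() raises a TypeError.
try:
    ["b", 1].sort()
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True

nested = [5, [4, 15.5]]
inner = nested[1][0]       # first item of the nested list

repeated = ["a1", 3] * 2

print(list_a, last, second, numbers, mixed_sort_failed, inner, repeated)
```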
Dictionaries
Dictionaries retrieve items by key name rather than by position like a list. Since Python 3.7 they keep insertion order, but items are still looked up by key, not by index.
Dictionaries are mutable: entries can be added, changed and removed.
dict_name = {'key_name_1':'value1','key_name_2':'value2'} # This creates a dictionary whose values can be looked up by referencing the key name.
dict_name['key_name_1'] # Typing this command would output 'value1'
dict_name = {'key_name_1':'value1','key_name_2':{'embeded_key_name':500}} # Note you can have a dictionary or list inside another dictionary. Basically stacked one on the other.
dict_name['key_name_2']['embeded_key_name'] # This would output 500. The first brackets are the first outer key, the second brackets are the inner key.
dict_name.keys() # Lists all the key names
dict_name.values() # Lists all the values
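A short sketch of the dictionary operations described above. The keys and values are placeholders. The get() call is an extra that the notes do not mention; it returns a fallback instead of raising a KeyError when a key is missing.

```python
dict_name = {"key_name_1": "value1", "key_name_2": {"inner_key": 500}}

# Chained brackets walk into the nested dictionary.
inner_value = dict_name["key_name_2"]["inner_key"]

# Dictionaries are mutable: add a new key and overwrite an existing one.
dict_name["key_name_3"] = [1, 2, 3]
dict_name["key_name_1"] = "changed"

# get() avoids a KeyError for a missing key.
missing = dict_name.get("no_such_key", "default")

key_list = list(dict_name.keys())
value_count = len(dict_name.values())

print(inner_value, missing, key_list, value_count)
```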
Tuples
Like a list, but immutable.
tuple_name = (1,2,3) # Created the same way as a list, but using parentheses rather than square brackets.
len(tuple_name) # will output 3, as that is the length of the tuple.
tuple_name. # If you hit tab after, there are some methods that can act on the tuple.
Sets
# Note: set('abc') treats the string as a sequence of characters and gives {'a', 'b', 'c'}. To get a set containing the whole string, use {'abc'}.
set_name = set() # Creates an empty set
set_name.add(1) # Adds the number 1 to the set. You cannot add copies; values need to be unique. There can only be one '1' value, so adding 1 twice has no effect, unlike appending to a list.
set_name # Would output {1}
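The single character versus whole string behaviour of sets comes from set() iterating over whatever it is given. A minimal sketch with made-up names:

```python
set_name = set()
set_name.add(1)
set_name.add(1)            # adding a duplicate has no effect
set_name.add(2)

# set() iterates over its argument, so a string is split into characters.
from_string = set("hello")

# Curly braces make a set containing the string as one item.
whole_string = {"hello"}

# A common use: remove duplicates from a list.
unique_sorted = sorted(set([3, 1, 3, 2, 1]))

print(set_name, from_string, whole_string, unique_sorted)
```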
Booleans
Needs first letter capitalised. True and False.
B = None # placeholder for creating a variable without giving it a value yet.
I/O
%%writefile text_file.txt # Jupyter cell magic: writes the rest of the cell to a file called text_file.txt
myfile = open('text_file.txt') # Open the file called text_file.txt and assign it to myfile. A full file path can go here.
myfile.read() # Returns the contents of the file from the cursor to the end, then leaves the cursor at the end of the file.
myfile.seek(0) # Places cursor at the very start of the file. Changing 0 to another value will change this.
myfile.readlines() # Reads the file and returns a list of strings, one per line.
myfile.write('John Smith') # This writes the string 'John Smith' to the file. The file must be opened in a mode that allows writing, such as 'w' or 'a'.
file_contents = myfile.read() # Assigns contents of read to file_contents variable. This must happen before the file is closed.
myfile.close() # Closes the opened file.
with open('text_file.txt') as myfile: # Automatically closes the file when the indented block ends.
    file_contents = myfile.read() # Assigns the file's contents to file_contents
with open('text_file.txt', mode='r') as myfile: # mode 'r' stands for read. There are other options like 'w' for write. It restricts what can be done with the file, essentially its permissions.
    file_contents = myfile.read()
Modes:
r – read
w – write (overwrites the file, or creates it if it does not exist)
a – append
r+ – read and write (the file must already exist)
w+ – write and read (overwrites the file, or creates it)
pwd # In IPython or Jupyter, shows the current working directory.
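The read, seek and mode behaviour can be tried without touching real files by working in a temporary folder. This sketch assumes a throwaway file called text_file.txt; tempfile and os.path are used only to keep the example self-contained.

```python
import os
import tempfile

# Work in a temporary folder so nothing real is overwritten.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "text_file.txt")

# 'w' creates the file (or overwrites it if it already exists).
with open(path, mode="w") as myfile:
    myfile.write("John Smith\n")

# 'a' adds to the end of the existing file.
with open(path, mode="a") as myfile:
    myfile.write("Jane Doe\n")

# 'r' reads; the file is closed automatically when the block ends.
with open(path, mode="r") as myfile:
    first_read = myfile.read()     # whole file, cursor is now at the end
    second_read = myfile.read()    # nothing left to read
    myfile.seek(0)                 # move the cursor back to the start
    lines = myfile.readlines()     # list of lines

print(repr(first_read), repr(second_read), lines)
```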
Comparison Operators
1 == 1 # Checks for equality, in this case it will output True. Works for strings.
1 == 2 # Checks for equality, in this case it will output False. Works for strings.
1 != 1 # Checks for inequality, in this case False because 1 is equal to 1.
1 > 2 # Checks if 1 is greater than 2, False
1 < 2 # Checks if 1 is less than 2, True
1 >= 2 # Checks if 1 is greater than or equal to 2, False
1 <= 2 # Checks if 1 is less than or equal to 2, True
Logical Operators
1 < 2 and 1 < 3 # Returns True only if both tests pass: 1 is less than 2 and 1 is less than 3. If either test fails, the whole expression is False.
(1 > 2) or (1 < 3) # If at least one of the conditions is True, the statement is True. Else it is False. The parentheses are optional here but make things cleaner.
not (1 < 2) # Inverts the boolean: 1 < 2 is True, so this returns False.
Statements – Control Flow
Python uses indentation as part of the syntax: it marks which lines belong to a block.
If Statements (Branching)
if condition_x: # if condition_x is met, indented code will be executed.
elif condition_y: # else if condition_y is met, indented code will be executed.
else: # else execute this indented code when all other conditions have failed to be met.
Remember that indentation is required. Four spaces per level is the convention.
pass # a do-nothing placeholder, so you can leave a block empty and add the internal logic later.
Examples:
if 3 * 2 == 5:
    print("Yes")
var = "Word"
if var == "word":
    print("lower case w")
elif var == "Word":
    print("upper case w")
else:
    print("not word at all")
For Loops
list_name = [1,2,3,4,5,6,7,8,9,10] # just creating a list
for variable_name in list_name: # for every entry in list_name
    print(variable_name) # print its value
list_name = [1,2,3,4,5,6,7,8,9,10] # just creating a list
list_sum = 0
for variable_name in list_name:
    list_sum = list_sum + variable_name # this adds each item in list_name to the running total in list_sum, which was set to 0 before the loop.
print(list_sum) # Displays the variable list_sum, which is 55. If this were indented it would be considered part of the for loop, and would print the running total on every pass until it reached 55.
for variable_name in 'random text':
    print(variable_name) # This would print each character, one per line. If you put 'variable_name' in quotes instead it would print the text variable_name every time the loop occurred.
list_name = [(1,2),(3,4),(5,6)] # Although each item is a tuple holding two values, the list itself has 3 items.
for variable_name in list_name:
    print(variable_name) # This will print 3 lines: (1, 2) then (3, 4) then (5, 6)
# If instead of the above, the following code were run:
for a,b in list_name:
    print(a) # This will print 3 lines: 1 then 3 then 5. Each tuple is unpacked into a and b.
# A dictionary can also be used as follows
dictionary_name = {}
for variable_name in dictionary_name:
    print(variable_name) # Prints each key. Nothing is printed here because the dictionary is empty.
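Looping over a dictionary directly yields its keys. The sketch below uses a made-up prices dictionary to show the three common variants: keys, key/value pairs via items(), and values via values().

```python
prices = {"apple": 0.5, "pear": 0.75}   # hypothetical example data

# Looping over the dictionary itself gives the keys.
keys_seen = []
for name in prices:
    keys_seen.append(name)

# items() gives (key, value) tuples, which can be unpacked like any tuple.
pairs_seen = []
for name, price in prices.items():
    pairs_seen.append((name, price))

# values() gives just the values.
total = 0
for price in prices.values():
    total += price

print(keys_seen, pairs_seen, total)
```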
While Loops
while boolean_condition: # execute indented code for as long as the condition is True.
else: # runs once when the condition becomes False, unless the loop was ended with break.
a_variable = 0
while a_variable < 5:
    print(f'The value of a_variable is {a_variable}')
    a_variable = a_variable + 1 # This can also be written as a_variable += 1
else:
    print('a_variable is not less than 5 so loop stops')
# This will print the value of a_variable and then add 1 to it, stopping when it reaches 5.
a_list = [1,2,3]
for variable_name in a_list:
    # a comment on its own is not enough here, as Python expects code
    pass # gives Python something to run while the body remains empty. A placeholder so the code can still execute, ie: pass over this.
string_var = 'John'
for letter in string_var:
    if letter == 'o': # If this condition is met continue runs, which jumps back to the start of the loop without running what comes after continue, so the letter o will not be printed.
        continue
    print(letter)
# Output is J, h, n, one letter per line.
string_var = 'John'
for letter in string_var:
    if letter == 'o': # If this condition is met break runs, which will stop the loop
        break
    print(letter)
# Note that only J will be output, because the moment it gets to o the loop stops.
Examples:
num = 0
while num <= 3:
    print(num)
    num += 1
print("Out of loop")
print(num)
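The loop examples above can be checked by collecting values in lists instead of printing them. This is a sketch with made-up list names. It also shows that a loop's else block is skipped when the loop ends through break.

```python
# while: runs until the condition becomes False.
printed = []
num = 0
while num <= 3:
    printed.append(num)
    num += 1

# continue: skip the rest of this pass and move on to the next item.
kept = []
for letter in "John":
    if letter == "o":
        continue
    kept.append(letter)

# break: leave the loop immediately; the else block does not run.
stopped_early = []
for letter in "John":
    if letter == "o":
        break
    stopped_early.append(letter)
else:
    stopped_early.append("finished")

print(printed, num, kept, stopped_early)
```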
Operators
for num in range(0,10,2): # 0 is the starting number, 10 is the stop value (not included) and 2 is the step between numbers.
    print(num) # This will output 0, 2, 4, 6, 8, each on its own line.
Enumerate:
# The following code steps through the letters a,b,c,d,e using a manual counter.
index_count = 0
word = 'abcde'
for letter in word:
    print(word[index_count])
    index_count += 1
# An alternative to the above is to use the enumerate function, which keeps the count for you.
word = 'abcde'
for item in enumerate(word):
    print(item)
# Basically the above is a simpler way, it will return (index, letter) tuples such as (0, 'a').
Zip:
a_list1 = [1,2,3]
a_list2 = ['a','b','c']
for item in zip(a_list1,a_list2):
    print(item) # This will print (1, 'a') then (2, 'b') then (3, 'c')
# Zip will only pair items up to the length of the shortest list. So if there were a list 1,2,3 and another a,b,c,d the 'd' would be cut off as it goes beyond the 3 of the smaller list.
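A compact sketch of range, enumerate and zip side by side. Wrapping each in list() makes the results visible, since all three produce their items lazily. The variable names are illustrative.

```python
evens = list(range(0, 10, 2))          # the stop value 10 is not included

word = "abcde"
indexed = list(enumerate(word))        # (index, letter) tuples

# Unpacking the tuple gives the index and the letter separately.
labels = []
for index, letter in enumerate(word):
    labels.append(f"{index}:{letter}")

# zip stops at the shortest input, so 'd' is dropped here.
pairs = list(zip([1, 2, 3], ["a", "b", "c", "d"]))

print(evens, indexed, labels, pairs)
```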
in:
'b' in [1,2,3] # Checks if b is in the list, in this case will return False.
'b' in ['a','b','c'] # Checks if b is in the list, in this case will return True.
'b' in 'bravo' # Checks if b is in the string, in this case will return True.
min and max:
list_name = [1,10,100]
min(list_name) # This will return the minimum value in list_name, which is 1
max(list_name) # This will return the maximum value in list_name, which is 100
from random import shuffle # This imports the function shuffle from the random library.
shuffle(list_name) # This shuffles the contents of list_name in place. It returns None, so look at list_name afterward to see the new order.
List Comprehensions:
a_string = 'word'
a_string = [x for x in 'word'] # This will return ['w','o','r','d']
# General form: [expression for item in iterable if condition]. The if part is optional.
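To take list comprehensions a little further, here is a sketch showing a transformation, a filter, and the equivalent ordinary loop for comparison. The names are made up for the example.

```python
letters = [x for x in "word"]

# Apply an expression to every item.
squares = [n ** 2 for n in range(5)]

# Add a condition to keep only some items.
even_squares = [n ** 2 for n in range(10) if n % 2 == 0]

# The same result as squares, written as a regular for loop.
loop_version = []
for n in range(5):
    loop_version.append(n ** 2)

print(letters, squares, even_squares, loop_version)
```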
Methods and Functions
attributes are objects stored inside other objects
methods are functions associated with an object
Functions basically wrap code, called encapsulation.
a_list = [1,2,3] # Create a list with values 1,2,3.
a_list. # after typing this hit tab to see all potential methods available.
help(a_list.method_name) # This will provide some help documentation that explains what method_name does.
Functions:
def my_function(): # this creates a function called my_function. Inside the parentheses goes the list of parameters.
    return [expression] # This ends the function and sends a value back. Without a return statement the function returns None; it can return any object, including tuples.
Examples:
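As one possible example, the sketch below defines a function with a required parameter, a default parameter and a docstring, and returns several values as a tuple. Both function names are hypothetical. The second function shows that a function without a return statement gives back None.

```python
def describe_list(values, label="list"):
    """Return the smallest value, the largest value and a short summary."""
    smallest = min(values)
    largest = max(values)
    summary = f"{label} has {len(values)} items"
    return smallest, largest, summary   # returned together as a tuple


def no_return():
    pass   # no return statement, so calling this gives None


# The returned tuple can be unpacked into separate names.
low, high, text = describe_list([1, 10, 100], label="list_name")
nothing = no_return()

print(low, high, text, nothing)
```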
PIP
Libraries
pysvg – library for creating svg graphic files in python, useful for creating web displays at times?
http://codeboje.de/pysvg/ –
SQL
psycopg2 – Library for interacting with a PostgreSQL database from Python.
import psycopg2 as pg2 # this imports the library and gives it the alias pg2, for simplicity.
conn = pg2.connect(database='database_name',user='postgres',password='the_password') # this connects to the database. database is the database name, user is the username (commonly postgres) and password is the login password.
cur = conn.cursor() # cur stands for cursor, which is the control structure. Basically it allows you to run SQL commands.
cur.execute('SQL_GOES_HERE') # Whatever you want to do, the SQL goes in here.
cur.fetchmany(10) # this will fetch the next 10 rows of the result. By pressing tab after typing cur. you will get a list of commands.
data = cur.fetchmany(10) # Assigns the rows returned by fetchmany(10) to data, as a list of tuples.
data[0][2] # selects a specific value from the fetched rows: the first row [0], third item [2].
conn.close() # closes the connection to the database.
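The connect, cursor, execute, fetch and close steps can be practised without a PostgreSQL server. The sketch below swaps in Python's built-in sqlite3 module as a stand-in for psycopg2. It follows the same overall pattern, but it is a different library and database, and the people table and its rows are invented for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for the PostgreSQL connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and rows, just to have something to query.
cur.execute("CREATE TABLE people (id INTEGER, first_name TEXT, last_name TEXT)")
cur.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "John", "Smith"), (2, "Jane", "Doe"), (3, "Ann", "Lee")],
)
conn.commit()

cur.execute("SELECT id, first_name, last_name FROM people ORDER BY id")
data = cur.fetchmany(2)        # the next 2 rows, as a list of tuples
third_item = data[0][2]        # first row, third column
remaining = cur.fetchall()     # whatever rows are left

conn.close()
print(data, third_item, remaining)
```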
Data Science
To Start
Conda – Package and environment manager. Miniconda is the bare minimum install; Anaconda is Miniconda plus a large set of preinstalled packages.
Using this you can install packages with:
$ conda install numpy
You can create test environments with:
$ conda create -n
Libraries
Importing Example:
%matplotlib inline
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
The main five libraries:
scipy – For computation. Plotly.
MatplotLib – Basic data visualization. It is a lot like matlab. Pandas and Matplotlib work well together.
IPython – For an interactive shell.
NumPy – Unified array library. A scientific computing library for data arrays. A lot of core math such as Fourier transforms is done in this library. The key to using NumPy is vectorization. This is the key math library and is for numerical calculations as opposed to symbolic ones.
Pandas – Data analysis library for reading and manipulating from multiple sources, csv, etc. If you have data stored somewhere you use pandas to get it, basically. Pandas can also do SQL-like grouping operations. Pandas and Matplotlib work well together.
The other libraries:
Seaborn – For creating statistical plots.
scikit-learn – Machine learning, regressions, etc. Basically fitting models to data.
PySpark – Big data technology
TensorFlow – Machine learning library from Google.
Transfer learning: Reusing what a model has already learnt as the starting point for a new task.
Bokeh – renders plot in browser, interactive.
Plotly – renders plot in browser, interactive.
plotnine – a Python version of ggplot, the plotting system used in R. Incomplete though.
Dask – Parallel computation.
Numba – code optimization. Speeds up math by compiling Python to machine code through LLVM as it runs.
Cython – code optimization. Speeds up python by converting python to C.
Scrapy – for scraping data from the web https://scrapy.org/
Sympy – For doing symbolic mathematics, in contrast to numpy doing numerical maths.
Mayavi – for advanced 3d beyond what matplotlib offers http://code.enthought.com/pages/mayavi-project.html
MoviePy – for video editing https://zulko.github.io/moviepy/
NumPy
import numpy as np
Pandas
import pandas as pd
pd.<TAB> # to open selection menu.
Pandas Series
The following code will create a pandas Series: a one-dimensional array whose items can be looked up by the index labels attached to them:
dataVar = pd.Series([1,2.4,3],index=['First','Second','Third'])
dataVar['First'] # This will access the item labelled 'First' in dataVar
dataVar['First':'Second'] # This will access all items in the range from 'First' to 'Second'. Label slices include the end label.
If no index is defined, a default integer index is assigned, starting at 0, aka 0,1,2,3:
dataVar = pd.Series([1,2.4,3])
dataVar[0] # This will access the first item in dataVar
dataVar[0:1] # Positional slice from 0 up to but not including 1, so only the first item.
You can load dictionaries into a pandas Series. In many ways a pandas Series behaves like a dictionary with an ordered index.
dataVar = pd.Series(dictionaryVar) # Converts dictionary to a pandas Series. Index defaults to the dictionary keys
Pandas DataFrame
fnameVar = pd.DataFrame({'ColumnName1':dataVar, 'ColumnName2':dataVar2}) # This creates a DataFrame with two columns, called ColumnName1 and ColumnName2. It then assigns dataVar to ColumnName1 and dataVar2 to ColumnName2.
fnameVar # Displays the DataFrame called fnameVar
DataFrames have an index (row labels) and columns.
fnameVar.index # Returns the row index of the DataFrame called 'fnameVar'
fnameVar.columns # Returns the column labels of the DataFrame called 'fnameVar'
Indexing a DataFrame with square brackets, as in fnameVar['ColumnName1'], selects a column by default.
Pandas Index
ind = pd.Index([2, 3, 5, 7, 11])
indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])
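indA.intersection(indB) # intersection – items appearing in both
indA.union(indB) # union – items appearing in either
indA.symmetric_difference(indB) # symmetric difference – items appearing in only one of the two
indA.difference(indB) # difference – items in A but not in B
# Older pandas versions also accepted &, | and ^ for these set operations. Newer versions treat those operators as element-wise logical operations, so the methods are the safer choice.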
MatplotLib
Matplotlib is basically for plotting graphs and displaying them.
import matplotlib as mpl
import matplotlib.pyplot as plt
plt.style.use('classic') # This sets the visual style, we are going with classic.
plt.show() # This will display the plot in a window when the code is run. The parentheses are needed to call it.
Styles
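plt.style.use('classic') # The classic style, which was the default before Matplotlib 2.0.
plt.style.use('seaborn-whitegrid') # This style is a nice clean white grid. In Matplotlib 3.6 and later the name is 'seaborn-v0_8-whitegrid'.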
Jupyter Notebook Specific
%matplotlib inline # This will display static plots in the notebook
%matplotlib notebook # This will display interactive plots in the notebook.
Object Oriented Interface
import matplotlib as mpl # imports ...
import matplotlib.pyplot as plt # imports ...
import numpy as np
plt.style.use('seaborn-whitegrid') # Sets the style for the graph.
fig = plt.figure()
ax = plt.axes()
x = np.linspace(0, 10, 1000) # Start at 0, end at 10; 1000 is the number of evenly spaced points from 0 to 10.
ax.plot(x, np.sin(x)); # first value, x, is the x-axis. Second value, np.sin(x), is y or f(x).
plt.show()
MatLab Style Interface
plt.plot(x, np.sin(x));
Differences between MATLAB style and object-oriented style.
plt.xlabel() → ax.set_xlabel()
plt.ylabel() → ax.set_ylabel()
plt.xlim() → ax.set_xlim()
plt.ylim() → ax.set_ylim()
plt.title() → ax.set_title()
Coloring a plot
plt.plot(x, np.sin(x - 0), color='blue') # specify color by name
plt.plot(x, np.sin(x - 1), color='g') # short color code (rgbcmyk)
plt.plot(x, np.sin(x - 2), color='0.75') # Grayscale between 0 and 1
plt.plot(x, np.sin(x - 3), color='#FFDD44') # Hex code (RRGGBB from 00 to FF)
plt.plot(x, np.sin(x - 4), color=(1.0,0.2,0.3)) # RGB tuple, values 0 to 1
plt.plot(x, np.sin(x - 5), color='chartreuse'); # all HTML color names supported
Line Style of Plot
plt.plot(x, x + 0, linestyle='solid')
plt.plot(x, x + 1, linestyle='dashed')
plt.plot(x, x + 2, linestyle='dashdot')
plt.plot(x, x + 3, linestyle='dotted');
# For short, you can use the following codes:
plt.plot(x, x + 4, linestyle='-') # solid
plt.plot(x, x + 5, linestyle='--') # dashed
plt.plot(x, x + 6, linestyle='-.') # dashdot
plt.plot(x, x + 7, linestyle=':'); # dotted
Both at Once
plt.plot(x, x + 0, '-g') # solid green
plt.plot(x, x + 1, '--c') # dashed cyan
plt.plot(x, x + 2, '-.k') # dashdot black
plt.plot(x, x + 3, ':r'); # dotted red
Adjusting Plot Axes
plt.plot(x, np.sin(x))
plt.xlim(-1, 11)
plt.ylim(-1.5, 1.5);
plt.plot(x, np.sin(x))
plt.axis([-1, 11, -1.5, 1.5]);
plt.plot(x, np.sin(x))
plt.axis('tight');
plt.plot(x, np.sin(x))
plt.axis('equal');
Labelling Plots
plt.plot(x, np.sin(x))
plt.title("A Sine Curve")
plt.xlabel("x")
plt.ylabel("sin(x)");
plt.plot(x, np.sin(x), '-g', label='sin(x)')
plt.plot(x, np.cos(x), ':b', label='cos(x)')
plt.axis('equal')
plt.legend();
Scatter Plots
x = np.linspace(0, 10, 30) # This defines x as ranging from 0 to 10, with 30 values placed between.
y = np.sin(x) # This defines y as sin(x)
plt.plot(x, y, 'o', color='black'); # with the 'o' marker and no line, plt.plot draws a scatter plot from x and y.
plt.scatter(x, y, marker='o'); # this creates a scatter plot directly, and lets each point have its own size and color.
rng = np.random.RandomState(0)
x = rng.randn(100)
y = rng.randn(100)
colors = rng.rand(100)
sizes = 1000 * rng.rand(100)
plt.scatter(x, y, c=colors, s=sizes, alpha=0.3,
            cmap='viridis')
plt.colorbar(); # show color scale
Note: plt.plot is more efficient than plt.scatter for large datasets.
Error Bar
Error bars show the uncertainty around each data point, like the shaded range on the hockey stick graph of climate change.
One cool thing is that you can use a Gaussian process regressor from scikit-learn to generate a confidence interval around the data, even on a continuous chart like a line graph.
Histogram
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-white')
data = np.random.randn(1000)
plt.hist(data);
Contour Plot
Plot Legends
leg = ax.legend(); # enables a plot legend, defaults to a box with some basic details.
3d Plotting
mplot3d
https://jakevdp.github.io/PythonDataScienceHandbook/04.12-three-dimensional-plotting.html
This comes by default with matplotlib and is imported with the following command:
from mpl_toolkits import mplot3d # enables 3D axes via projection='3d'
Mayavi
Mayavi is more detailed than mplot3d, but more complex to install and use as well.
http://code.enthought.com/pages/mayavi-project.html
https://www.scipy-lectures.org/advanced/3d_plotting/index.html
Common Issues
One day something will go here, but today is not that day.
Practice Questions
Basic Practice:
http://codingbat.com/python
More Mathematical (and Harder) Practice:
https://projecteuler.net/archives
List of Practice Problems:
http://www.codeabbey.com/index/task_list
A SubReddit Devoted to Daily Practice Problems:
https://www.reddit.com/r/dailyprogrammer
A very tricky website with very few hints and tough problems (not for beginners but still interesting):
http://www.pythonchallenge.com/
Extra
https://colab.research.google.com/ – A potential alternative to Anaconda. Runs online in a web browser.
https://tools.google.com/seedbank/ – machine learning examples.