Monday, June 29, 2020

Numbers

 Numbers

An int can be of unlimited length; it is limited only by available memory.

Very large or very small floats are automatically printed in scientific notation, in an a.be+dd format where a is a single nonzero digit. The switch happens when the decimal exponent is 16 or greater, or less than -4. Otherwise, the number is written out in full with a decimal point.


population  = 327E+6

print(population)

# result is:
327000000.0

-----------------

population  = 327E+14

print(population)

# result is:

3.27e+16

-----------------

population  = 327E+13

print(population)

# result is:

3270000000000000.0

-----------------------


population  = 327E-1

print(population)

# result is:

32.7

-----------------------
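The switch-over points can be checked directly. This is a small sketch that assumes CPython's standard float formatting; the names samples and shown are only illustrative.

```python
# In Python 3, print() shows a float with the same text that repr() gives.
samples = [1e15, 1e16, 1e-4, 1e-5]
shown = [repr(value) for value in samples]

for text in shown:
    print(text)

# prints:
# 1000000000000000.0
# 1e+16
# 0.0001
# 1e-05
```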


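Math can be performed on very small floats, even when the result is smaller still. Values below about 2.2e-308 gradually lose precision, and anything much smaller than 5e-324 underflows to 0.0.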
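Here is a short sketch of how far small floats can go. It assumes the usual 64-bit floats that CPython uses on mainstream platforms, and the variable names are illustrative.

```python
import sys

tiny = 1e-300
smaller = tiny * 1e-10                 # about 1e-310: still nonzero, with reduced precision
smallest_normal = sys.float_info.min   # smallest float that keeps full precision
gone = 5e-324 / 2                      # too small to represent, so it becomes zero

print(smaller > 0)        # True
print(smallest_normal)    # 2.2250738585072014e-308
print(gone)               # 0.0
```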


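Printing a complex number written as a negative imaginary literal shows a -0 for the real part. This happens because -9j is the negation operator applied to 9j, which also flips the sign of the zero real part.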

imag = -9j

print(imag)

# result is:

(-0-9j)

pos_imag = 9j

print(pos_imag)

# result is:

9j
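To see where the -0 comes from, compare the negated literal with a value built by the complex() constructor. This is only a sketch; the variable names are illustrative.

```python
negated = -9j              # unary minus applied to the literal 9j
built = complex(0, -9)     # real part is +0.0, imaginary part is -9.0

print(negated)             # (-0-9j)
print(built)               # -9j
print(negated == built)    # True, because -0.0 compares equal to 0.0
```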

Import the random module to get random numbers.

The randrange() function takes two integers, A and B. The range covered is from A to B-1 inclusive, so B must be greater than A.

randrange(1,8) covers from 1 to 7 and returns an integer.
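A quick way to convince yourself of the range is to draw many values and look at the extremes. This sketch uses an arbitrary seed so that repeated runs behave the same; the name rolls is illustrative.

```python
import random

random.seed(2020)  # arbitrary seed, only to make the run repeatable

rolls = [random.randrange(1, 8) for _ in range(1000)]

print(min(rolls) >= 1)   # True
print(max(rolls) <= 7)   # True: 8 itself is never returned
```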








Saturday, June 27, 2020

User Input and String Formatting

 User Input


Use the input() function to read a line typed by the user and assign it to a variable. The value is always returned as a string. The input function takes an optional string prompt that is displayed on the screen. Python will wait indefinitely for the user's input.


basket = input("groceries to buy?:")
print("The food is", basket)


After typing the response at the input function's prompt, just press Enter to move the code along.
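Because input() hands back text, a number typed by the user has to be converted before doing math with it. The sketch below uses a hypothetical helper named to_quantity; the input() call is left in a comment so the example runs without a keyboard.

```python
def to_quantity(text):
    """Convert the text typed by the user into a whole number."""
    return int(text.strip())

# Typical use (hypothetical prompt text):
# count = to_quantity(input("how many apples?: "))

count = to_quantity(" 12 ")
print(count + 1)   # 13
```

If the text is not a valid whole number, int() raises a ValueError.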


String Formatting


Use the format() function of a string: the arguments passed into format() are matched up, from left to right, with the pairs of curly braces in the string variable.


vacations = "There are {} , {} and {} islands."

# format is a member function of the vacations string object.
# just by making a string assignment, you have created a string 
# object and you do not need a constructor for that.

print(vacations.format("Caribbean", "Greek", "Japanese"))

# result is:

There are Caribbean , Greek and Japanese islands.

If the string variable has no curly braces, a call to the format function simply ignores its arguments and returns the string unchanged.

vacations = "There are  ,  and islands."

# format is a member function of the vacations string object.

print(vacations.format("Caribbean", "Greek", "Japanese"))

# result is:

There are  ,  and islands.


If you make the mistake of having more curly braces in a target variable than arguments to the format function, Python raises an IndexError. The message shown below is tuple index out of range; the exact wording varies by Python version.

vacations = "There are {} , {} and {} islands."

# format is a member function of the vacations string object.

print(vacations.format("Caribbean", "Greek"))

# result is:

IndexError: tuple index out of range


If, on the other hand, the number of arguments passed into format() exceeds the number of {}, then the extra rightmost arguments are ignored and no errors are generated.

vacations = "There are {} , {} and islands."

# format is a member function of the vacations string object.

print(vacations.format("Caribbean", "Greek", "Japanese"))

# result is:

There are Caribbean , Greek and islands.


You can specify a large number of field properties also known in python as "standard format specifiers". They allow you to fine-tune the format, placement, or style of a string.


debt = 1000000

# :e is the standard format specifier for scientific notation

pay = "I owe this much: {:e}"


print(pay.format(debt))

# result is:

I owe this much: 1.000000e+06


Arguments to a format function call can be positional (comma separated), key-value pairs, or both. They can be of any data type, including an item indexed out of a list.



There are three options for formatting the curly braces pairs in a target string variable: 


1. You can put the name of a key in the {}; in that case a key-value pair must be passed in for the format function.


dessert = "I like {cookie} cookies, {cake} cakes and {ice_cream} ice cream."

print(dessert.format(cookie = "oatmeal raisin", cake = "vanilla", ice_cream = "strawberry"))

# result is:


I like oatmeal raisin cookies, vanilla cakes and strawberry ice cream.


2. You can use an integer index, starting at 0, to indicate which of the positional arguments in the format call is to be inserted at each place. Placement can be out of order, and arguments can be skipped (not used). The arguments must be passed positionally; key=value pairs cannot be reached by index.

dessert = "I like {0} cookies, {1} cakes and {2} ice cream."

print(dessert.format(cookie = "oatmeal raisin", cake = "vanilla", ice_cream = "strawberry"))

# result is an error because you used key-value pairs:

IndexError: tuple index out of range

dessert = "I like {0} cookies, {1} cakes and {2} ice cream."

print(dessert.format("oatmeal raisin", "vanilla", "strawberry"))

# result is valid because the flavors are given as positional arguments:

I like oatmeal raisin cookies, vanilla cakes and strawberry ice cream.

This also works, but note you have to index into the list.

foods = ["oatmeal raisin", "vanilla", "strawberry"]
dessert = "I like {0} cookies, {1} cakes and {2} ice cream."

print(dessert.format(foods[0], foods[1], foods[2]))

# result is:

I like oatmeal raisin cookies, vanilla cakes and strawberry ice cream.
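As an alternative to indexing each item by hand, the list can be unpacked with the * operator, or a single field can index into the list itself. This is a sketch of those two options, not the only way to do it; the names unpacked and indexed are illustrative.

```python
foods = ["oatmeal raisin", "vanilla", "strawberry"]

# The * operator unpacks the list into three positional arguments.
unpacked = "I like {0} cookies, {1} cakes and {2} ice cream.".format(*foods)

# A field can also index into a single list argument.
indexed = "I like {0[0]} cookies, {0[1]} cakes and {0[2]} ice cream.".format(foods)

print(unpacked)              # I like oatmeal raisin cookies, vanilla cakes and strawberry ice cream.
print(unpacked == indexed)   # True
```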

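3. If you just have plain empty pairs of {}, pass the arguments positionally; they are filled in from left to right. Key-value pairs need named fields instead, as in the second example below.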

dessert = "I like {} cookies, {} cakes and {} ice cream."
# arguments passed positionally
print(dessert.format("oatmeal raisin", "vanilla", "strawberry"))

# result is:

I like oatmeal raisin cookies, vanilla cakes and strawberry ice cream.

The same result is achieved with:

dessert = "I like {a} cookies, {b} cakes and {c} ice cream."
# arguments as key-value pairs
print(dessert.format(a = "oatmeal raisin", b = "vanilla", c = "strawberry"))

# result is:

I like oatmeal raisin cookies, vanilla cakes and strawberry ice cream.


You can even mix key=value pairs and list items in the same format() call:

dessert = "I like {} cookies, {b} cakes and {c} ice cream."

print(dessert.format("oatmeal raisin", b = "vanilla", c = "strawberry"))

# result is:

I like oatmeal raisin cookies, vanilla cakes and strawberry ice cream.


Python is even smart enough to handle key-value pairs and list items out of order:

dessert = "I like {b} cookies, {} cakes and {c} ice cream."

print(dessert.format("oatmeal raisin", b = "vanilla", c = "strawberry"))


# result prints without errors but now the cookies are vanilla and cakes are oatmeal raisin:

I like vanilla cookies, oatmeal raisin cakes and strawberry ice cream.


However, forgetting one of the keys, which is the same as having too many empty {} in a mixed call, will confuse Python and generate an error:

# the b key is left out by mistake, and there are more {} than list items now.

dessert = "I like {} cookies, {} cakes and {c} ice cream."
# oatmeal raisin is a list-type item
print(dessert.format("oatmeal raisin", b = "vanilla", c = "strawberry"))


# result is

IndexError: tuple index out of range


When you use field properties (the standard format specifiers), you only use the colon once, even if you combine several options. If you try to use two colons, say one for each of two options, Python raises a ValueError.

boats = 334
marinas = 7

# :+ puts a plus sign in front of positive numbers by brute force
# :e is for scientific notation
# a specifier of :+:e will generate an error; drop the rightmost of the colons as shown correctly below.

docking = "{1:+e} boats are too many for {0} marinas."

print(docking.format(marinas, boats))

# result is:

+3.340000e+02 boats are too many for 7 marinas.

The example above also runs without a Python error if you leave off the indices 0 and 1, because both arguments are numbers. However, the arguments are then filled in from left to right and the meaning of the sentence is confused, so be careful about using indices out of order.


boats = 334
marinas = 7

# :+ puts a plus sign in front of positive numbers by brute force
# :e is for scientific notation
# a specifier of :+:e will generate an error; drop one of the colons as shown correctly below.

docking = "{:+e} boats are too many for {} marinas."

print(docking.format(marinas, boats))

# result is:

+7.000000e+00 boats are too many for 334 marinas.


However, with strings passed in, it will fail because the e specifier only works with numbers.


docking = "There are only 7 {0} for 334 {1:+e}."

print(docking.format("bays", "ships"))

# results in:

ValueError: Unknown format code 'e' for object of type 'str'


# Also incorrect:

docking = "There are only 7 {} for 334 {1:+e}."

print(docking.format("bays", "ships"))

# results in:

ValueError: cannot switch from automatic field numbering to manual field specification

# meaning: you cannot mix empty curly braces with indexed curly braces.


However, passing a numeric variable for the ships field instead of the string "ships" gets the job done without error.

Also, mixing numeric variables and string literals in a format() function call is valid code.

ships = 334

# standard format specifiers and empty curly braces are OK
# in the same string.

docking = "There are only 7 {} for {:e} ships."

print(docking.format("bays", ships))

# results in:

There are only 7 bays for 3.340000e+02 ships.

Finally, you can even use the same index more than once.

docking = "There are only 7 {1} for 334 {1}."

print(docking.format("bays", "ships"))

# results in:

There are only 7 ships for 334 ships.



Anyhow, you get the idea that there are many permutations for mixing and matching indices, empty curly braces, variables and key=value pairs to get the format function to line up with the {} in a target string variable.


See the following website for the full description of all the standard format specifiers, also known as custom string formatters:


A few examples are:

:,      --> display thousands separator
:e      --> scientific format
:f      --> fixed point number format
:>      --> right align the result
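Each of these specifiers can be tried on its own, and several can be combined after a single colon. The sketch below collects the results in a list named results, which is only an illustrative name; the square brackets in the last two strings are there to make the padding visible.

```python
results = [
    "{:,}".format(1000000),               # thousands separator
    "{:e}".format(1000000),               # scientific format
    "{:f}".format(3.14159),               # fixed point, six decimals by default
    "[{:>8}]".format("boat"),             # right aligned in a field 8 wide
    "[{:>15,.2f}]".format(1234567.891),   # alignment, separator and precision together
]

for line in results:
    print(line)

# prints:
# 1,000,000
# 1.000000e+06
# 3.141590
# [    boat]
# [   1,234,567.89]
```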


Thursday, June 25, 2020

try ... except blocks

try ... except blocks


Put the try keyword, followed by a colon, in front of a block of code that might fail or generate an error. If it does fail, the except keyword, also followed by a colon, catches or traps that error or failure.

try:
  print(football)
except:
  print("An exception occurred")

Optionally, you can add an else: block, which runs only if the try block finishes cleanly without error or failure, and a finally: block, which executes regardless of whether the try block's code fails or succeeds. The finally block is good for closing files or cleaning up your variable space.

football = "inflated"

try:
  print(football)
except:
  print("An exception occurred")
else:
  print("Caught for a touchdown!")
finally:
  print("score may have changed")
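The order in which the clauses run can be traced by recording each one as it executes. This is a sketch with a hypothetical function named play; it triggers the failure by calling a string method on a value that may not be a string.

```python
def play(ball):
    """Return the names of the clauses that ran, in order."""
    ran = []
    try:
        ran.append("try")
        ball.upper()          # raises AttributeError if ball is not a string
    except AttributeError:
        ran.append("except")
    else:
        ran.append("else")
    finally:
        ran.append("finally")
    return ran

print(play("inflated"))   # ['try', 'else', 'finally']
print(play(None))         # ['try', 'except', 'finally']
```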

You can use a series of excepts, one for each type of error. Each one names the error type it handles; a bare except, if used, must come last.

else will only execute if nothing went wrong, and it must be written after all the except clauses. There can only be one else for each try.

football = "inflated"

try:
  print(football)
except NameError:
  print("A name exception occurred")
except TypeError:
  print("A type exception occurred")
else:
  print("Caught for a touchdown!")
finally:
  print("score unchanged")
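To see each except clause actually fire, the failing code can be passed in as a function. This is a sketch; describe, undefined_name and bad_types are hypothetical names made up for the illustration.

```python
def describe(action):
    """Run action() and report which kind of error, if any, it raised."""
    try:
        action()
    except NameError as err:
        return "name error: " + str(err)
    except TypeError as err:
        return "type error: " + str(err)
    else:
        return "no error"

def undefined_name():
    print(football_score)   # this name is never defined

def bad_types():
    return "score" + 7      # adding a str and an int is not allowed

print(describe(undefined_name))   # starts with "name error:"
print(describe(bad_types))        # starts with "type error:"
print(describe(lambda: None))     # no error
```

The as err part gives the handler access to the exception object, so its message can be shown or logged.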

You can force (throw) an exception, a programmer-induced error or failure, with the raise keyword, which takes no colon. raise is given an exception object, typically an instance of the built-in Exception class or one of its subclasses. At the terminal it behaves just like an error raised by Python itself; the two look the same apart from the customized text message, if any (the Exception object can be created with nothing in its parentheses).

time_on_clock = 5

if time_on_clock == 0:
    raise Exception("Game over.")
else:
    print("Keep playing")

# result is

Keep playing

another example:

try:
        print(score)
except:
        raise Exception()

# result is:

NameError: name 'score' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./yourscript", line 123, in <module>
Exception

Note there are 2 errors, one naturally occurring from Python and the other forced by the programmer.


You can also raise more specific error types such as TypeError or NameError. See this URL for a complete list of python errors: 


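When the programmer uses the raise keyword, the rest of the current block is not executed. Control jumps to the nearest matching except block, running any finally blocks on the way, or the program stops with a traceback if nothing catches the exception.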
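The sketch below raises a more specific error type and records what runs afterward. The function name check_clock is hypothetical, and ValueError is just one reasonable choice of error type here.

```python
def check_clock(time_on_clock):
    """Return the steps that ran for a given clock value."""
    steps = []
    try:
        if time_on_clock <= 0:
            raise ValueError("Game over.")
        steps.append("kept playing")      # skipped when the raise fires
    except ValueError as err:
        steps.append("caught: " + str(err))
    finally:
        steps.append("clock checked")     # runs either way
    return steps

print(check_clock(5))   # ['kept playing', 'clock checked']
print(check_clock(0))   # ['caught: Game over.', 'clock checked']
```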




PIP Package Manager

PIP

PIP is a package manager. You should check to see if it's installed by going to your Python scripts subdirectory and typing

pip --version

Python version 3.4 and newer should already have PIP, but if not, go here to get it:



Typical ways to use PIP would be:

pip install package-name
pip uninstall package-name
pip list

You can find packages at pypi.org

Packages are really just collections of modules, so use import to bring the package's functionality into your code.



Tuesday, June 23, 2020

Regular Expressions

Regex

The name of the built-in package for doing regular expression parsing is re

 An asterisk * means there are zero or more occurrences and * needs a character to the left of it to know what to match against.

 A .* pair is like a wild card: it matches any run of characters other than a newline.

 \b matches at a word boundary, so it can check whether the target string is at the start or the end of a word. You can use the optional lowercase r outside of the double quotes for "raw strings". A "raw string" is one where the \ escape character is NOT honored by Python, and is instead treated just like any other character, so it reaches the regex engine intact. This is particularly useful for Windows directory pathnames that are full of \ or when a string has \ before a letter that normally would be escaped.
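Both ideas can be checked in a few lines. This is a small sketch; the sample sentence and variable names are made up for the illustration.

```python
import re

print(len("\n"))     # 1: backslash-n is a single newline character
print(len(r"\n"))    # 2: a backslash followed by the letter n

text = "This is his island"
whole_words = re.findall(r"\bis\b", text)   # "is" as a complete word
word_starts = re.findall(r"\bis", text)     # "is" at the start of a word

print(whole_words)   # ['is']
print(word_starts)   # ['is', 'is']
```

Without the r prefix, Python would read "\b" as a backspace character before the regex engine ever saw it.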

 Generally lower case specials are for positive findings of a target string and uppercase for NOT finding the target string.

 [] notes a set of eligible matching characters.

 + * . | () $ {} have no special meaning in a set so just treat them literally.

 The indices of a match on a substring go from the first character matched up to, but not including, the first character not matched.

.search() returns what's called a match object that has its own functions.

.span() returns a tuple holding the start and end positions of the match

.string is an attribute (not a function) that holds the searched string itself and

.group() returns the part of the string that actually matched
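Here is a short sketch of a match object in use. The pattern f\w+ is just an example chosen to find the first word starting with "f".

```python
import re

txt = "Basketball is my favorite sport."
m = re.search(r"f\w+", txt)

print(m.group())                 # favorite
print(m.span())                  # (17, 25)
print(m.start(), m.end())        # 17 25
print(m.string == txt)           # True
print(txt[m.start():m.end()])    # favorite
```

Slicing the original string with the span gives back the same text as group(), which shows that the end index is not included.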

The re module has four workhorse functions: findall(), search(), split() and sub()

Regex patterns have 10 main metacharacters

[] \ . ^ $ * + {} | ()

A metacharacter is a character that has a special meaning inside the pattern passed to the calling function, instead of matching itself literally.

Regex patterns have 10 primary special characters

\A \b \B \d \D \s \S \w \W \Z

The special characters are useful when the position of the match matters or when distinct words or whitespace considerations matter.

Regex patterns have 6 primary types of sets. Each set matches a single character:

[letters]: any one of the listed characters, possibly unrelated
[a-z]: any one character in that continuous range
[^letters]: any one character NOT in the set
[digits]: any one of the listed digits, possibly unrelated
[0-9]: any one digit in the range
[a-zA-Z]: broadens out for capitalization matches
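The set types above can be sketched as follows. The sample strings and variable names are illustrative only.

```python
import re

txt = "Basketball is my favorite sport."

vowels = re.findall(r"[aeiou]", "sport")                 # listed characters
odd_ones = re.findall(r"[^a-z ]", txt)                   # NOT lowercase, NOT a space
number = re.findall(r"[0-9]+", "That will be 43 pesos")  # one or more digits in a row
literal = re.findall(r"[+*]", "2+3*4")                   # + and * are literal inside a set

print(vowels)     # ['o']
print(odd_ones)   # ['B', '.']
print(number)     # ['43']
print(literal)    # ['+', '*']
```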

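findall(): returns a list of all matches in the order found.

search(): returns a match object but only for the first match found.

split(): returns a list of the pieces of the string, split at each match.

sub(): replaces all matches with a substitution string.

When there is no match, search() returns None, findall() returns an empty list, split() returns a list holding the whole string, and sub() returns the string unchanged.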
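What each function gives back when nothing matches can be sketched with a pattern that is known not to occur in the text. The pattern "xyz" is an arbitrary choice for the illustration.

```python
import re

txt = "Basketball is my favorite sport."

found = re.findall("xyz", txt)
first = re.search("xyz", txt)
pieces = re.split("xyz", txt)
replaced = re.sub("xyz", "_", txt)

print(found)      # []
print(first)      # None
print(pieces)     # ['Basketball is my favorite sport.']
print(replaced)   # Basketball is my favorite sport.
```

Since search() can return None, check its result before calling .span() or .group() on it.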

Examples

import re

txt = "That will be 43 pesos"

# Find any 2 consecutive digit characters (the r prefix keeps \d intact):

x = re.findall(r"\d{2}", txt)
print(x)

# result is ['43']

------------------------

import re

txt = "Basketball is my favorite sport."

#Search for a sequence that starts with "ke", followed by two (any) characters, and an "a":

x = re.findall("ke..a", txt)
print(x)


# result is ['ketba']

------------------------------------------


import re

txt = "Basketball is my favorite sport."

# Search for a sequence that begins the string only

x = re.findall("^Bas", txt)
print(x)


# result is ['Bas']


---------------------------------

import re

txt = "Basketball is my favorite sport."

# Search if the target string ends in ort

x = re.findall("ort$", txt)
print(x)


# result is [] because the string ends with a period, not with ort


import re

txt = "Basketball is my favorite sport."


# Check if the string contains "ll" followed by zero or more literal periods

# note that since a period is a regex metacharacter, we have to escape it with a \.

x = re.findall(r"ll\.*", txt)
print(x)

# result is ['ll'] because no period follows the "ll"; the unescaped pattern "ll.*" would return ['ll is my favorite sport.']

------------------------------

import re

txt = "Basketball is my favorite sport."



# Match "a" followed by any characters, twice in a row. Because the group is repeated, findall() reports only what the groups captured on the second repetition: the outer group starts at the last "a", and the inner group holds everything that comes after it:

x = re.findall("(a(.*)){2}", txt)

print(x)

# result is [('avorite sport.', 'vorite sport.')]


-----------------------------


import re

txt = "Basketball is my favorite sport."

x = re.search("is", txt)

print(x)

# This is what a raw dump of the Match Object looks like.

# result is <re.Match object; span=(11, 13), match='is'>
# (Python 3.6 and older show <_sre.SRE_Match object; span=(11, 13), match='is'> instead)


------------------------------------


import re

txt = "Basketball is my favorite sport."

x = re.split(r"\s", txt)

print(x)

# result is ['Basketball', 'is', 'my', 'favorite', 'sport.']

-------------------------------------


import re

txt = "Basketball is my favorite sport."

# replace all whitespace characters with underscores
y = re.sub(r"\s", "_", txt)

print(y)

# result is Basketball_is_my_favorite_sport.

# Note that the special character \s has the same meaning (a whitespace character) in the sub() function and the split() function; only the action taken on each match differs.

---------------------------------------------


import re

# The search() function returns a Match object:

txt = "Basketball_is_my_favorite_sport."

x = re.search("or", txt)
print(x)



# result is <re.Match object; span=(20, 22), match='or'>
# (Python 3.6 and older show <_sre.SRE_Match object; span=(20, 22), match='or'> instead)

# Notice that it only found the first instance of "or" and ignored the "or" in "sport"
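If the position of every occurrence is needed, not just the first, one option is re.finditer(), which yields a match object for each match. This is a sketch of that alternative; the name spans is illustrative.

```python
import re

txt = "Basketball_is_my_favorite_sport."

# Collect the (start, end) positions of every "or" in the string.
spans = [m.span() for m in re.finditer("or", txt)]

print(spans)   # [(20, 22), (28, 30)]
```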