Posted: November 3rd, 2015
Intro to programming
Part a
Your goal is to generate strings of the form “Lastname, Firstname” where the last
name is randomly chosen from a surnames list and the first name is randomly chosen
from either a female names list or a male names list.
The three lists of names can be downloaded from these URLs:
• Surnames list: http://www.ics.uci.edu/~harris/python/surnames.txt
• Female names list: http://www.ics.uci.edu/~harris/python/femalenames.txt
• Male names list: http://www.ics.uci.edu/~harris/python/malenames.txt
(1) Write a function called random_names that takes an integer and returns a list
of that many strings, with each string a randomly generated name as described
above.
This function is the ultimate goal of part (a) of this lab assignment. If you and
your partner feel comfortable designing a solution without further guidance, go
right ahead. But if you’d like a little more guidance, that’s also fine. The
remaining subparts of part (a) break the task down and give you some hints and
approaches.
(2) To start, you’ll need to read the three files of names into your program.
You’ll also notice that there’s more data on each line of the name files than
just the name. The first line of the surnames file, for example, is
“SMITH 1.006 2,501,922 1”, which means that the surname Smith accounts for
1.006% of the surnames in America. For this assignment, you can get by with just
extracting the name and fixing its capitalization.
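As a minimal sketch of the extraction step, assuming the name is always the first whitespace-separated field on a line (as in the “SMITH” example above): the helper names extract_name and read_names are hypothetical, not part of the assignment.

```python
def extract_name(line):
    """Return the name at the start of a data line, capitalized like 'Smith'."""
    # Assumption: the name is the first whitespace-separated field;
    # everything after it on the line is statistical data we can ignore.
    return line.split()[0].capitalize()


def read_names(filename):
    """Read a names file and return a list of cleaned-up names."""
    with open(filename, 'r') as infile:
        # Skip blank lines so that split()[0] never fails on an empty line.
        return [extract_name(line) for line in infile if line.strip()]


print(extract_name("SMITH 1.006 2,501,922 1"))  # prints: Smith
```

Note that capitalize() lowercases everything after the first letter, so a name such as MCDONALD would come out as “Mcdonald”; that is probably acceptable here.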
(3) Your function random_names should call a function to generate a single random
name: a random surname, a random choice of male or female, and a random first
name chosen from that list. It will be most convenient for your
single-random-name function to call a function that takes one of the three name
lists as a parameter and returns a name chosen at random from that list.
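The decomposition described in (3) could look roughly like the sketch below. The helper names random_choice_from and random_name are hypothetical. The three name lists are passed in as extra parameters only so the example runs without the downloaded files; in the assignment itself, random_names takes just the integer and the lists would come from the files.

```python
import random


def random_choice_from(names):
    """Return one name chosen at random from the given list."""
    return random.choice(names)


def random_name(surnames, female_names, male_names):
    """Return one random name in the form 'Lastname, Firstname'."""
    # First pick which first-name list to use, then pick a name from it.
    first_names = random.choice([female_names, male_names])
    return random_choice_from(surnames) + ", " + random_choice_from(first_names)


def random_names(count, surnames, female_names, male_names):
    """Return a list of `count` randomly generated names."""
    return [random_name(surnames, female_names, male_names)
            for _ in range(count)]


# Tiny stand-in lists (assumptions; the real lists come from the three files).
print(random_names(3, ["Smith", "Johnson"], ["Mary", "Linda"], ["James", "John"]))
```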
Part b
You might not be surprised to know that the Caesar cipher we’ve been working with
(one key for the whole message, spaces and punctuation unchanged) is pretty easily
breakable by hand with messages as short as one moderate-length sentence.
(1) Write a function called Caesar_break that takes a ciphertext string (encrypted
using a Caesar cipher as we did last week) and returns the plaintext for that
string, without having the key.
We’ll take a “brute force” approach:
• We’ll generate decryption alphabets for each of the 26 possible keys.
• We’ll “decrypt” the ciphertext using each of the 26 alphabets. (Only one of
these attempted decryptions will be the correct plaintext message, of course. But
we don’t know which one in advance. Trying all the possibilities is what we mean
when we call this a “brute force” approach.)
• For each of the 26 possibly-decrypted messages, our program needs to figure out
whether it “looks like English” instead of encrypted gibberish. Here’s how: We’ll
take each word in the possibly-decrypted message and look it up in a dictionary
(a list of English words). If the word is in the dictionary, then it’s an English
word; if there are a lot of English words in this possibly-decrypted message,
it’s likely that this message is the correct decrypted plaintext. (If very few
words in the message are in the dictionary, then this message isn’t the English
plaintext.) So we need to count up how many of the words in each
possibly-decrypted message we find in the dictionary, saving that total along
with the message that produced it.
• Once we’re done with all 26 possible decryptions, we should expect that the
possibly-decrypted message that had the most “hits” in the dictionary is in fact
the correctly decrypted plaintext, and that’s the message we return.
To get the dictionary, download the file
http://www.ics.uci.edu/~harris/python/wordlist.txt onto your machine and read it
into your program. Remove newline characters if necessary.
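The brute-force steps above can be sketched as follows. This is one possible approach, not the required solution. The helper names decrypt_with_key and count_hits are hypothetical, and the dictionary is passed in as a second parameter (a set of lowercase words) so the example runs without the word list file. The sketch also assumes that punctuation attached to a word should be stripped before the lookup.

```python
import string


def decrypt_with_key(ciphertext, key):
    """Shift every letter back by `key` places; leave other characters alone."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = key % 26
    # Build the decryption alphabet for this key, for both letter cases.
    table = str.maketrans(lower + upper,
                          lower[-k:] + lower[:-k] + upper[-k:] + upper[:-k])
    return ciphertext.translate(table)


def count_hits(message, dictionary):
    """Count how many words of message appear in the dictionary."""
    hits = 0
    for word in message.split():
        # Assumption: ignore case and surrounding punctuation when looking up.
        if word.strip(string.punctuation).lower() in dictionary:
            hits += 1
    return hits


def Caesar_break(ciphertext, dictionary):
    """Return the candidate decryption with the most dictionary hits."""
    best_message = ciphertext
    best_hits = -1
    for key in range(26):
        candidate = decrypt_with_key(ciphertext, key)
        hits = count_hits(candidate, dictionary)
        if hits > best_hits:
            best_message, best_hits = candidate, hits
    return best_message


words = {"the", "quick", "brown", "fox"}  # stand-in for wordlist.txt
print(Caesar_break("Wkh txlfn eurzq ira", words))  # prints: The quick brown fox
```

Storing the dictionary as a set rather than a list keeps each lookup fast, which matters when the word list is large and 26 candidate messages must be checked.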
Part c
This Python code copies a file, line by line. It presumes that the input and
output files will be in the same directory (folder) as the code itself.
infile_name = input("Please enter the name of the file to copy: ")
infile = open(infile_name, 'r')
outfile_name = input("Please enter the name of the new copy: ")
outfile = open(outfile_name, 'w')
for line in infile:
    outfile.write(line)
infile.close()
outfile.close()
(1) Copy this code into your lab7.py file on your own system (or, temporarily,
into a separate file if that makes it easier for you to experiment). Package it
into a function called copy_file that takes no parameters and returns no value
(because it does all its work by prompting the user and reading and writing
files). Test it out by copying a short text file.
Then download the Project Gutenberg version of The Adventures of Sherlock Holmes
from http://www.gutenberg.org/cache/epub/1661/pg1661.txt (Project Gutenberg is a
wonderful resource for non-copyright-protected texts). Call your file-copying
function to make a copy of this file. [Some problems have been reported with
reading Project Gutenberg files. If you run into messages saying that Python
can’t decode a character, open the file with
open(infile_name, 'r', errors='ignore').]
(2) Modify your copy_file function to take one parameter, a string. If the
parameter is 'line numbers', the copied file includes line numbers at the start
of each line:
    1: Project Gutenberg’s The Adventures of Sherlock Holmes, by Arthur Conan Doyle
    2:
    3: This eBook is for the use of anyone anywhere at no cost and with
…
13052: subscribe to our email newsletter to hear about new eBooks.
If the parameter is anything else, the function just copies the file as before.
Note that the line number is formatted and right-justified in a five-character
field.
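The five-character, right-justified field can be produced with a format specification. Here is a minimal sketch that works on a list of lines rather than on files; the helper name number_lines is hypothetical, and the single space after the colon is an assumption based on the sample output above.

```python
def number_lines(lines):
    """Return the lines with a right-justified, five-character line number."""
    numbered = []
    for number, line in enumerate(lines, start=1):
        # {number:5d} pads the integer with spaces on the left to width 5.
        numbered.append(f"{number:5d}: {line}")
    return numbered


for text in number_lines(["first line\n", "\n", "third line\n"]):
    print(text, end="")
```

Inside copy_file, the same format expression could be applied to each line just before it is written to the output file.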
(3) If you examine the file from Project Gutenberg, you see that it contains some
“housekeeping” information at the beginning and at the end. You’ll also see that
the text itself starts after a line beginning with “*** START” and ends just
before a line beginning with “*** END”. Modify your copy_file function so that if
its parameter is 'Gutenberg trim' it will copy only the body of a Project
Gutenberg file, omitting the “housekeeping” material at the front and end. (You
may assume, without having to check, that if this parameter is specified, there
will be a “*** START” line and an “*** END” line in the file.)
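One way to approach the trimming is to track whether the loop is currently inside the body. The sketch below works on a list of lines and uses a hypothetical helper name, gutenberg_body; it relies on the assumption stated above that both marker lines are present, with the start marker first.

```python
def gutenberg_body(lines):
    """Return only the lines between the '*** START' and '*** END' markers."""
    body = []
    in_body = False
    for line in lines:
        if line.startswith("*** END"):
            break                      # everything after this is housekeeping
        if in_body:
            body.append(line)
        elif line.startswith("*** START"):
            in_body = True             # the body begins on the next line
    return body


sample = ["Header\n", "*** START OF THIS EBOOK ***\n",
          "Chapter 1\n", "*** END OF THIS EBOOK ***\n", "Footer\n"]
print(gutenberg_body(sample))  # prints: ['Chapter 1\n']
```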
(4) Modify your copy_file function so that if its parameter is 'statistics' it
will copy the file as before but also print out these statistics (which should be
familiar) about the text in the file, following the formatting shown:
16824 lines in the file
483 empty lines
53.7 average characters per line
65.9 average characters per non-empty line
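The statistics could be computed along these lines. This is a sketch under several assumptions: a line's length excludes its trailing newline, an “empty” line is one with no characters other than the newline, and the numbers are printed in a five-character field (the exact spacing of the sample above is not certain). The helper names line_statistics and print_statistics are hypothetical.

```python
def line_statistics(lines):
    """Return (total lines, empty lines, avg chars/line, avg chars/non-empty line)."""
    # Assumption: the trailing newline does not count as a character.
    lengths = [len(line.rstrip("\n")) for line in lines]
    total_lines = len(lengths)
    empty_lines = lengths.count(0)
    total_chars = sum(lengths)
    non_empty = total_lines - empty_lines
    # Guard against dividing by zero for an empty file or all-empty lines.
    avg_all = total_chars / total_lines if total_lines else 0.0
    avg_non_empty = total_chars / non_empty if non_empty else 0.0
    return total_lines, empty_lines, avg_all, avg_non_empty


def print_statistics(lines):
    """Print the four statistics, one per line."""
    total, empty, avg_all, avg_non_empty = line_statistics(lines)
    print(f"{total:5d} lines in the file")
    print(f"{empty:5d} empty lines")
    print(f"{avg_all:5.1f} average characters per line")
    print(f"{avg_non_empty:5.1f} average characters per non-empty line")


print_statistics(["abcd\n", "\n", "ab\n"])
```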