Apr 16, 2025 | 356 words | 4 min read
9.2.4. File Analysis
Instructions
Write a program that processes the two provided text files (python_1.txt and python_2.txt) and produces the following output files.
- python_1_word_frequency.txt: This file should contain all of the unique words in the file python_1.txt, printed one per line in alphabetical order. Each word should be followed by a colon, a space, and the number of times that word appears in python_1.txt.
- python_2_word_frequency.txt: This file should contain all of the unique words in the file python_2.txt, printed one per line in alphabetical order. Each word should be followed by a colon, a space, and the number of times that word appears in python_2.txt.
- common_words.txt: This file should contain all of the words that appear in both python_1.txt and python_2.txt, printed one per line in alphabetical order.
- eitherbutnotboth.txt: This file should contain all of the words that appear in either python_1.txt or python_2.txt, but no words that appear in both. The words should be printed one per line in alphabetical order.
For comparison purposes, all the words should be converted to lower case and all leading and trailing punctuation should be removed.
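The normalization rule above can be sketched as follows. This is a minimal illustration rather than the required solution: the helper name `normalize` is hypothetical, and the sketch assumes that `str.strip` combined with `string.punctuation` is an acceptable way to remove leading and trailing punctuation.

```python
import string


def normalize(word):
    """Lower-case a word and strip leading/trailing ASCII punctuation.

    Hypothetical helper, shown only to illustrate the rule.
    """
    # str.strip removes the given characters from both ends only, so
    # punctuation inside the word (hyphens, apostrophes, slashes) is kept.
    return word.strip(string.punctuation).lower()


print(normalize("Python,"))       # python
print(normalize("(High-Level)"))  # high-level
print(normalize("don't!"))        # don't
```

A token made up entirely of punctuation (for example `--`) normalizes to an empty string, so a program would probably want to skip empty results. Whether the provided files contain such tokens is not stated here.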
Hints:
- The string module includes a punctuation string which contains all of the ASCII punctuation characters.
- It might be helpful to write a function that accepts a filename as an argument and returns a list of all the words in that file. Then have another function that accepts a list of words as an argument and returns a dictionary with the unique words in the list as keys and the number of times each word appears in the file as values.
- You can use the sorted() function to get a sorted list of the keys and values in a dictionary.
- You can use the set() function to create a set from the keys in a dictionary.
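The hints above could fit together roughly as sketched below. This is one possible arrangement under stated assumptions, not a reference solution: the function names `read_words` and `count_words` are hypothetical, words are assumed to be separated by whitespace, and the two tiny demo files are invented purely for illustration.

```python
import os
import string
import tempfile


def read_words(filename):
    """Return a list of the normalized words in a file (hypothetical helper)."""
    words = []
    with open(filename, encoding="utf-8") as infile:
        for token in infile.read().split():  # assumes whitespace-separated words
            word = token.strip(string.punctuation).lower()
            if word:  # skip tokens that were only punctuation
                words.append(word)
    return words


def count_words(words):
    """Return a dict mapping each unique word to its number of occurrences."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Two made-up input files, written to a temporary directory for the demo.
tmpdir = tempfile.mkdtemp()
path_1 = os.path.join(tmpdir, "demo_1.txt")
path_2 = os.path.join(tmpdir, "demo_2.txt")
with open(path_1, "w", encoding="utf-8") as outfile:
    outfile.write("Python is easy. Python is fun!")
with open(path_2, "w", encoding="utf-8") as outfile:
    outfile.write("Is C easy? C is fast.")

counts_1 = count_words(read_words(path_1))
counts_2 = count_words(read_words(path_2))

# sorted() on a dictionary yields its keys in sorted order.
frequency_lines = [f"{word}: {counts_1[word]}" for word in sorted(counts_1)]

# set() on a dictionary builds a set of its keys.
words_1 = set(counts_1)
words_2 = set(counts_2)
common = sorted(words_1 & words_2)           # words in both files
either_not_both = sorted(words_1 ^ words_2)  # words in exactly one file

print("\n".join(frequency_lines))
print(common)
print(either_not_both)
```

In the actual assignment, each of these lists would be written to its output file, one entry per line, instead of being printed.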
Sample Output
Ensure your program’s output matches the provided samples exactly. This includes all characters, white space, and punctuation. In the samples, user input is highlighted like this for clarity, but your program should not highlight user input in this way.
Contents of common_words.txt:
a
also
an
and
application
applications
are
as
available
be
but
c
can
data
development
easy
even
extension
for
has
have
high-level
in
is
it
its
language
languages
library
many
may
modules
more
new
not
of
on
or
other
programming
programs
python
several
simple
standard
the
to
types
will
with
write
you
Contents of eitherbutnotboth.txt:
1
able
about
additional
after
all
allows
api
appetite
applicable
approach
areas
around
arrays
at
attempt
automate
awk
basic
basis
batch
because
being
best
binary
books
built
bunch
c/c++/java
callable
changing
checking
code
collection
comes
commonly
complicated
comprehensive
computers
concepts
contains
could
cover
covering
custom
customizable
cycle
database
definition
depth
described
description
design
developer
dictionaries
distributed
distributions
do
documentation
does
domain
don't
done
dynamic
easily
effective
efficient
elegant
embedding
error
eventually
every
example
examples
experience
extended
extending
extensions
extensive
feature
features
files
find
first-draft
flavor
flexible
form
formal
free
freely
from
functions
game
games
general
get
give
gives
good
gui
hand
hands-on
handy
help
helps
https://www.python.org
idea
ideal
if
implement
implemented
informally
instead
interpreted
interpreter
into
introduces
job
just
language's
large
larger
learn
least
libraries
like
lot
macos
major
make
manual
maybe
most
moving
much
nature
noteworthy
number
object-oriented
objects
off-line
offer
offering
offers
operating
over
party
perform
perhaps
perl
photo
platforms
pointers
powerful
problem
professional
program
python's
python/c
quickly
rapid
read
reader
reading
ready
real
rearrange
reference
rename
reused
same
script
scripting
scripts
search-and-replace
see
self-contained
shell
simpler
single
site
slow
small
so
software
some
source
specialized
split
structure
structures
style
such
suitable
suite
support
syntax
system
systems
take
task
tasks
tedious
test
testing
text
than
that
there
there's
these
things
third
this
those
time
together
too
tools
tutorial
typing
unix
use
used
usual
various
very-high-level
want
way
web
well
well-suited
whetting
whole
windows
wish
work
write/compile/test/re-compile
writing
written
yet
you'd
you're
you've
your
Contents of python_1_word_frequency.txt:
a: 5
able: 1
about: 1
additional: 1
after: 1
all: 2
also: 3
an: 3
and: 20
api: 1
application: 1
applications: 1
approach: 1
are: 3
areas: 1
as: 2
attempt: 1
available: 1
basic: 1
be: 5
binary: 1
books: 1
but: 2
c: 5
callable: 1
can: 1
commonly: 1
comprehensive: 1
concepts: 1
contains: 1
cover: 1
covering: 1
customizable: 1
data: 2
definition: 1
depth: 1
described: 1
description: 1
development: 1
distributed: 1
distributions: 1
documentation: 1
does: 1
dynamic: 1
easily: 1
easy: 1
effective: 1
efficient: 1
elegant: 1
embedding: 1
even: 1
every: 2
examples: 1
experience: 1
extended: 1
extending: 1
extension: 1
extensions: 1
extensive: 1
feature: 2
features: 2
flavor: 1
for: 5
form: 1
formal: 1
free: 1
freely: 2
from: 2
functions: 1
give: 1
gives: 1
good: 1
hands-on: 1
handy: 1
has: 1
have: 1
helps: 1
high-level: 1
https://www.python.org: 1
idea: 1
ideal: 1
implemented: 1
in: 6
informally: 1
instead: 1
interpreted: 1
interpreter: 4
introduces: 2
is: 3
it: 5
its: 1
language: 6
language's: 1
languages: 1
learn: 2
library: 4
major: 1
make: 1
manual: 1
many: 3
may: 1
modules: 4
more: 2
most: 2
nature: 1
new: 1
not: 1
noteworthy: 1
object-oriented: 1
objects: 1
of: 6
off-line: 1
on: 1
or: 5
other: 1
party: 1
platforms: 2
pointers: 1
powerful: 1
programming: 2
programs: 2
python: 16
python's: 2
python/c: 1
rapid: 1
read: 3
reader: 1
reading: 1
ready: 1
reference: 2
same: 1
scripting: 1
see: 1
self-contained: 1
several: 1
simple: 1
single: 1
site: 2
so: 1
source: 1
standard: 4
structures: 1
style: 1
suitable: 1
syntax: 1
system: 1
the: 17
there: 1
third: 1
this: 2
to: 9
together: 1
tools: 1
tutorial: 4
types: 1
typing: 1
used: 1
various: 1
web: 1
well: 1
will: 3
with: 2
write: 2
you: 3
Contents of python_2_word_frequency.txt:
1: 1
a: 21
allows: 1
also: 1
an: 1
and: 10
appetite: 1
applicable: 1
application: 2
applications: 1
are: 2
around: 1
arrays: 1
as: 4
at: 2
automate: 1
available: 1
awk: 1
basis: 1
batch: 2
be: 1
because: 1
being: 1
best: 1
built: 1
bunch: 1
but: 4
c: 1
c/c++/java: 2
can: 4
changing: 1
checking: 1
code: 1
collection: 1
comes: 1
complicated: 1
computers: 1
could: 3
custom: 1
cycle: 1
data: 3
database: 1
design: 1
developer: 1
development: 1
dictionaries: 1
do: 1
domain: 1
don't: 1
done: 1
easy: 1
error: 1
even: 2
eventually: 1
example: 1
extension: 1
files: 5
find: 3
first-draft: 1
flexible: 1
for: 7
game: 1
games: 1
general: 1
get: 2
gui: 2
hand: 1
has: 1
have: 1
help: 1
high-level: 1
if: 2
implement: 1
in: 5
into: 1
is: 6
it: 4
its: 1
job: 1
just: 1
language: 5
languages: 1
large: 3
larger: 1
least: 1
libraries: 1
library: 1
like: 2
lot: 1
macos: 1
many: 1
may: 2
maybe: 1
modules: 2
more: 4
moving: 1
much: 4
new: 1
not: 1
number: 1
of: 7
offer: 1
offering: 1
offers: 1
on: 3
operating: 1
or: 8
other: 2
over: 1
perform: 1
perhaps: 2
perl: 1
photo: 1
problem: 1
professional: 1
program: 4
programming: 1
programs: 3
python: 8
quickly: 1
real: 1
rearrange: 1
rename: 1
reused: 1
script: 1
scripts: 2
search-and-replace: 1
several: 1
shell: 3
simple: 2
simpler: 1
slow: 1
small: 1
software: 1
some: 2
specialized: 1
split: 1
standard: 1
structure: 1
such: 2
suite: 1
support: 1
systems: 1
take: 1
task: 2
tasks: 1
tedious: 1
test: 1
testing: 1
text: 2
than: 3
that: 4
the: 6
there's: 1
these: 1
things: 1
those: 1
time: 1
to: 10
too: 1
types: 2
unix: 2
use: 4
usual: 1
very-high-level: 1
want: 1
way: 1
well-suited: 1
whetting: 1
whole: 1
will: 1
windows: 2
wish: 1
with: 2
work: 2
write: 3
write/compile/test/re-compile: 1
writing: 2
written: 1
yet: 1
you: 11
you'd: 2
you're: 2
you've: 1
your: 4
Deliverables
Save your finished program as file_analysis_login.py, replacing login with your Purdue login. Then submit it along with all the deliverables listed in Table 9.8 below.
| Deliverable | Description |
|---|---|
| file_analysis_login.py | Your finished program. |
| Screenshot(s) | PNG(s) capturing the test case. |