2025 FALL CS-007 PROJECT #6. Jumbles

YOU -MUST- EXECUTE LIKE THIS: C:> java Jumbles dictionary.txt jumbles.txt
You must put the dictionary file FIRST on the cmd line.

YOU MUST HAND IN A FILE NAMED: Jumbles.java

There are many possible overall approaches to solving the Jumbles problem. You may solve it any way you can as long as you avoid processing any dictionary word more than once. Also you must generate the EXACT OUTPUT and you must *NOT* use TreeSet, TreeMap, HashSet, HashMap etc or any variant of these templated <> containers. Stick to plain arrays and ArrayLists. You may emulate a HashMap using ArrayList and or plain arrays if you happen to know what a HashMap is, but you may not use Java's Hash or Tree containers because those containers actually trivialize the problem. For this exercise you must use only the tools we have given you thus far.

READ THIS HELP DOCUMENT: A JUMBLES STRATEGY
REMOVE ALL COMMENTS AND ALL TIMING CODE FROM YOUR PROGRAM BEFORE HANDING IN

The Assignment

You do not get a starter file. You must write it completely from scratch. Use Rec#7

There is NO PARTIAL CREDIT ON OUTPUT.

Your program should take two command args: a dictionary file and a jumbles file. For each jumbled word you must find all the matching dictionary words and print them after the jumbled word in the line.

Here is a tiny version of each input that you should test with until you are sure your code is correct:
tinyDictionary.txt and tinyJumbles.txt and correct output for it: tinyCorrectOuput.txt.

Here is the dictionary.txt and jumbles.txt I will test with.

Here is the correct execution command and output for the big files from a sample run of my solution: YOUR OUTPUT MUST MATCH EXACTLY. Notice the list of jumbles words is sorted vertically and the matching dictionary words that come after are sorted left to right.


java  Jumbles dictionary.txt jumbles.txt   // <== dictionary.txt is args[0]   then jumbles.txt is args[1]
addej jaded
ahicryrhe hierarchy
alvan naval
annaab banana
baltoc cobalt
braney nearby
celer creel
couph pouch
cukklen knuckle
dica acid cadi caid
dobeny beyond
dobol blood
dufil fluid
dupled puddle
eslyep sleepy
ettniner renitent
ettorp potter
genjal jangle
gluhc gulch
hartox thorax
hoybis boyish
hucnah haunch
iddec diced
irrpo prior
kutbec bucket
lappor poplar
lasia alias
laurib burial
lubly bully
meefal female
milit limit
mopsie impose
mycall calmly
nekel kneel
nokte token
noper prone
nwae anew wane wean
nyegtr gentry
perrim primer
preko poker
pudmy dumpy
pypin nippy
rebisc scribe
rodug gourd
rpeoims imposer promise semipro
shewo howes whose
wardty tawdry
warllc
yaldde deadly

Solving the jumbles problem requires comparing each jumbled word against all the words in the dictionary. The tricky part is finding an efficient way to test the jumbled word against a word from the dictionary. You can't just use the .equals() method to compare them since the jumbled word has it's letters scrambled from the real word it represents. A brute force algorithm would take the jumbled word and generate every possible permutation (reordering) of the letters. As each permutation is generated it would compare that permutation to a particular dictionary word.

Can you think of some issues created by this brute force approach? What if you were to use the all permutations approach with some of the following words?

otorhinolaryngological - 22 letters
immunoelectrophoretically - 25 letters
psychophysicotherapeutics - 25 letters
thyroparathyroidectomized - 25 letters
pneumoencephalographically - 26 letters
radioimmunoelectrophoresis - 26 letters
psychoneuroendocrinological - 27 letters
hepaticocholangiogastrostomy - 28 letters
spectrophotofluorometrically - 28 letters
antidisestablishmentarianism - 28 letters
floccinaucinihilipilification - 29 letters
pseudopseudohypoparathyroidism - 30 letters
pneumonoultramicroscopicsilicovolcanoconiosis - 45 letters

The deal killer for using a brute-force all-permutations approach is that the number of permutations on a String having N letters is N factorial ( N! ). To understand why this is a problem we need to understand the explosiveness of the factorial function.

We cannot solve this problem in general using the brute force approach. We must look for a heuristic. A heuristic is an approach which eliminates a large portion of the search space and focuses only on a portion of the space that is likely to contain solutions. Our heuristic will be to transform of both the jumbled word and the dictionary word to canonical form for easy comparison. Even with this efficient means of comparing a jumble to real word, be sure that your overall approach is better than something like this:


THIS IS A VERY INEFFICIENT OVERALL APPROACH (but it does produce the correct out)
for each jWord from the sorted ArrayList of jumbles (read from the jumbles file)
{	print jWord + " "
	covert it to canonical: i.e. jCanonical
	open up the dictionary file, again ... sigh :(
	for each dWord in the dictionary
	{
		convert to canonical: i.e. dCanonical
		if jCanonical.equals(dCanonical) // YOU GOT A HIT :)
			print dWord + " "
	}
	print "\n";
	close dictionary file
}

Study the above algorithm. Recognize that even though it does correctly generate the answer it does so in a very, very inefficient manner. Your solution should only open the dictionary file ONCE and convert the dictionary words to canonical form ONCE (and store them in a container).