Portfolio element – Haskell
Unit: Programming languages: principles and design (6G6Z1110); Programming languages – SE frameworks (6G6Z1115)
Lecturer: Dr Ivan Olier
Week: 11
Portfolio element: Haskell (15% of coursework)
Introduction
Figure 1 shows the “main” function of a Haskell program that prompts the user for the name of a plain text file, reads the file, and reports:
1) The total number of words in the text.
2) The total number of occurrences in the text of the 20 most commonly used English words, according to the Oxford English Corpus (OEC) rank (see the list in Figure 2).
3) A histogram of the 20 most frequent words in the text, excluding those common words.
Figure 3 shows an example of the program execution. The plain text file used for this example (text_sample.txt) is available on Moodle. Use it to check that the output of your implementation matches the one shown in the figure.
Figure 1. “main” function of the program (file available on Moodle)
the | and | have | not | as
be | a | I | on | you
to | in | it | with | do
of | that | for | he | at
Figure 2. List of the top 20 most frequently used words in English according to the OEC rank.
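As a rough sketch, the list in Figure 2 could be written as a Haskell constant and used for the two common-word functions named in Table 1. The name commonWords is an illustrative choice, not one required by the assignment. The type signatures are assumptions based on the Table 1 descriptions, and "I" is stored in lowercase on the assumption that the words have already been lowercased.

```haskell
-- The 20 words from Figure 2, read down each column of the figure.
-- `commonWords` is a hypothetical helper name, not part of the assignment.
-- "I" is written as "i" on the assumption that input words are lowercase.
commonWords :: [String]
commonWords =
  [ "the", "be", "to", "of", "and", "a", "in", "that", "have", "i"
  , "it", "for", "not", "on", "with", "he", "as", "you", "do", "at" ]

-- Counts how many words in the list are common words (repeats included).
countCommonWords :: [String] -> Int
countCommonWords = length . filter (`elem` commonWords)

-- Keeps only the words that are not in the common-word list.
dropCommonWords :: [String] -> [String]
dropCommonWords = filter (`notElem` commonWords)
```

Loaded in GHCi, both functions can be checked against the calls given in Table 1. A linear search with elem is enough for a 20-element list; a Data.Set would be an alternative for larger lists.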
Assignment
You must complete the program shown in Figure 1 by implementing the missing functions. Your complete program should execute as shown in Figure 3. Table 1 lists the missing functions you should implement, along with brief descriptions and examples of use. You can use those examples to test the output of your functions in GHCi before you add them to your program.
You may need to implement additional functions, but your program must contain at least the functions listed in Table 1. Modifications of the “do” block (Figure 1) are not permitted.
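The following is one possible sketch of the text-processing functions from Table 1, meant only to show the shape such functions might take. The type signatures are assumptions, since they depend on the “do” block in Figure 1. The sorting step in toWordList is inferred from the Table 1 example, whose output is in alphabetical order, and keeping whitespace so that words stay separated is also an assumption.

```haskell
import Data.Char (isAlpha, isSpace, toLower)
import Data.List (group, sort, sortOn)
import Data.Ord (Down (..))

-- Lowercases the text, keeps only letters and whitespace, splits into words.
-- The final `sort` is an assumption inferred from the Table 1 example.
toWordList :: String -> [String]
toWordList = sort . words . filter keep . map toLower
  where
    keep c = isAlpha c || isSpace c

-- Pairs each distinct word with the number of times it occurs.
-- Sorting first places equal words next to each other so `group` can collect them.
countWords :: [String] -> [(String, Int)]
countWords ws = [ (w, length g) | g@(w : _) <- group (sort ws) ]

-- Orders (word, frequency) pairs by frequency, highest first.
-- `sortOn` is stable, so ties keep their input order.
sortWords :: [(String, Int)] -> [(String, Int)]
sortWords = sortOn (Down . snd)

-- Renders one line per word: n asterisks, " -> ", the word, and a newline.
makeHistogram :: [(String, Int)] -> String
makeHistogram = concatMap line
  where
    line (w, n) = replicate n '*' ++ " -> " ++ w ++ "\n"
```

After loading the file in GHCi, each function can be compared with the corresponding call in Table 1, for example putStr (makeHistogram [("her",4),("she",2),("friend",1)]).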
6G6Z1110 & 6G6Z1115, Dr Ivan Olier
Figure 3. Example of the program execution
Table 1. List of the missing functions that must be implemented
Function name | Function call example | Brief description
toWordList | > toWordList "Hello, World! HELLO!! :-)" returns ["hello","hello","world"] | Takes a string, lowercases it, drops any character that is not a letter, and returns a list with the words in the string.
countCommonWords | > countCommonWords ["the","planet","of","the","apes"] returns 3 | Takes a list of strings and returns the number of times the 20 most frequently used English words appear in the list.
dropCommonWords | > dropCommonWords ["the","planet","of","the","apes"] returns ["planet","apes"] | Takes a list of strings and drops any word that is within the top 20 most commonly used in English. Returns a list of strings without those words.
countWords | > countWords ["friend","she","she"] returns [("friend",1),("she",2)] | Takes a list of strings and returns a list. Each element of the returned list is a tuple which contains a string (a word) and an integer (representing the number of times the word appears in the text).
sortWords | > sortWords [("friend",1),("she",2)] returns [("she",2),("friend",1)] | Sorts words by their frequency in descending order. It takes and returns a list of tuples. Each tuple consists of one string (the word) and one integer (its frequency).
makeHistogram | > makeHistogram [("her",4),("she",2),("friend",1)] returns "**** -> her\n** -> she\n* -> friend\n" | Makes a histogram using asterisks. It takes a list of tuples (string, integer) and returns a string which contains the histogram. The string also contains the required line breaks (“\n”) so that the histogram prints as shown in Figure 3 when using a function like “putStr” (see last code line,