‘Found Poem’ Generator – Counting Syllables and Words

I was thinking about future possibilities for the ‘Found Poem’ Generator to make it a much more interactive experience, or at least to reduce the hassle of setting up the phrases csv file. Where I’d left the last version, anyone could customise an installation by adding their own phrases to the csv file, but they would still have to provide a syllable count and word count for each phrase line. On the face of it, it doesn’t sound like too much hassle; after all, it only needs doing once. But what if we were dealing with hundreds or even thousands of lines? What if we were crowd-sourcing the phrases, reading them in from a Twitter feed or automating the input in some other way?

I did a quick internet search and was surprised to find that syllable counting actually breaks down to a very simple set of rules.

Syllable Counting

Another search found some sample code which I was able to bolt into the program to count the syllables of any phrase where the syllable count hasn’t been supplied in the csv file. Having played with it for a while I’m not overly happy with some of the numbers it comes up with, for example this phrase:

“When your hopes and fears are drowned”

I count this as having 7 syllables; at a stretch I could turn ‘drowned’ into ‘drown-ed’ and give it two syllables, making 8. The code I have used counts it as 10. I can easily overcome this by having the correct syllable count in the csv file, but I will be looking at the code further to try to improve it.
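To give a feel for the kind of rules involved, here is a minimal sketch of a vowel-group syllable counter. It is not the sample code used in the program: the function names and the particular rules (silent final ‘e’, silent ‘es’ and ‘ed’ endings) are assumptions chosen for illustration, and plenty of English words will still fool it.

```python
import re


def count_syllables_in_word(word):
    """Estimate the syllables in one word using simple vowel-group rules.

    Illustrative heuristic only, not the code used in the program.
    """
    word = re.sub(r"[^a-z]", "", word.lower())  # keep letters only
    if not word:
        return 0

    # Rule 1: each run of consecutive vowels (including 'y') counts once.
    count = len(re.findall(r"[aeiouy]+", word))

    # Rule 2: a final 'e' is usually silent ('are'), except in '-le' ('table').
    if word.endswith("e") and not word.endswith("le"):
        count -= 1
    # Rule 3: a final 'es' is usually silent ('hopes') unless it follows
    # a hissing sound ('boxes').
    elif word.endswith("es") and len(word) > 2 and word[-3] not in "sxzh":
        count -= 1
    # Rule 4: a final 'ed' is usually silent ('drowned') unless it follows
    # 't' or 'd' ('wanted').
    elif word.endswith("ed") and len(word) > 2 and word[-3] not in "td":
        count -= 1

    return max(count, 1)  # every real word has at least one syllable


def count_syllables(phrase):
    """Estimate the syllables in a whole phrase."""
    return sum(count_syllables_in_word(w) for w in phrase.split())


print(count_syllables("When your hopes and fears are drowned"))  # prints 7
```

With these rules the problem phrase comes out at 7, but the approach is only as good as its list of exceptions, which is why a hand-entered count in the csv file still takes priority.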

Word Counting

A further search uncovered some code that counts the number of words in a phrase. I also added this to the program to count the words wherever the word count hasn’t been provided in the phrases csv file.
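Word counting is the simpler of the two jobs. A minimal sketch, assuming that words are separated by whitespace and that a missing count arrives from the csv file as an empty string (both function names are hypothetical, not taken from the program):

```python
def count_words(phrase):
    """Count the whitespace-separated words in a phrase."""
    return len(phrase.split())


def word_count_or_supplied(phrase, supplied):
    """Use the count from the csv file if present, otherwise count the words.

    'supplied' is the raw text of the word-count column (an assumption
    about how the value reaches this function).
    """
    supplied = supplied.strip()
    return int(supplied) if supplied else count_words(phrase)


print(count_words("When your hopes and fears are drowned"))  # prints 7
```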

Tidying Up

One last little update was to start making the csv file import more robust. At the moment I am in complete control of the csv import file, so I know it will work. Looking ahead to the future again, the import mechanism needs to be much more robust to handle any issues in csv files created by other people.
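As a rough idea of what ‘more robust’ could mean, the sketch below skips blank rows, trims stray spaces and treats anything that is not a whole number as a missing count. The three-column layout (phrase, syllable count, word count) and the function name are assumptions for illustration, not the program’s actual file format.

```python
import csv
import io


def read_phrases(csv_file):
    """Read (phrase, syllables, words) tuples, tolerating untidy rows.

    Assumed column order: phrase, syllable count, word count.
    """
    phrases = []
    for row in csv.reader(csv_file):
        # Skip blank rows and rows with an empty phrase column.
        if not row or not row[0].strip():
            continue
        phrase = row[0].strip()
        counts = []
        for index in (1, 2):
            value = row[index].strip() if len(row) > index else ""
            # Anything that is not a whole number is treated as "missing".
            counts.append(int(value) if value.isdecimal() else None)
        phrases.append((phrase, counts[0], counts[1]))
    return phrases


# io.StringIO stands in for a real file opened with open(..., newline="").
sample = io.StringIO(
    "When your hopes and fears are drowned,7,7\n"
    "\n"
    "  Lead me by the hand  ,five\n"
    ",3,1\n"
)
for entry in read_phrases(sample):
    print(entry)
```

Rows that come back with None for a count can then be passed to the syllable and word counters.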

An updated version of the source code is here in my Dropbox.

 

 

Pony Songs – Bo

Bo is a rescue pony, rescued from a pharmaceutical test laboratory, and was the inspiration for this song. Unfortunately he was a bit camera shy for the video and so has Crumble here standing in for him.

V = Am C Dsus2 Am
C = Am Am Am C

Dreaming of a blue sky
Inside gets me down

Blinded by the neon lights
I can barely see the ground

Here you come again
Lead me by the hand

Take me now, I’m waiting
For the promised land

Where you gonna run to?
Where you gonna run to?
Where you gonna run to?
Where they can’t find you

Where you gonna run to?
Where you gonna run to?
Where you gonna run to?
Where they can’t find you

Dreaming of the blue skies
And those golden plains

I know I’m never going to leave this place
Come stay with me a while

Where you gonna run to?
Where you gonna run to?
Where you gonna run to?
Where they can’t find you

Where you gonna run to?
Where you gonna run to?
Where you gonna run to?
Where they can’t find you

Where you gonna run to?
Where they can’t find you

‘Found Poem’ Generator – Loading Data From A File

#poem #poetry #python #programming

The next stage in my Found Poem Generator project was to work on item number 6 in my list of enhancements. This is to update the program to read in the found poem lines from a csv text file rather than having them hard-coded in the program itself:

1. Output the created poems to a larger screen ie a graphical, large-font output rather than using the console window.
2. Provide two methods of running:
a. Auto-generate a new poem every 5 minutes or so
b. Provide buttons for user input to create poems on demand
3. Package the program up to be self-contained
4. Deploy the program onto some sort of platform that wasn’t my Mac laptop ie a Raspberry PI and large screen monitor or TV
5. Make the program auto-run when the device is started / restarted

Further enhancements could then be:

6. Use a text / csv file to load in the words and lines of the poems rather than the hardcoded ones used currently
7. Check the first letter of each line to capitalise it for poems and un-capitalise it for haikus, and add commas to the end of lines if they are missing or remove for haikus.

This tutorial got me most of the way there, opening the csv file and reading in the records. I could then access the individual fields of each row by indexing the ‘row’ variable ie ‘row[0]’ gives me the data from column 0 of the current row, ‘row[1]’ gives me the data from column 1 etc. I was then ready to read the lines from the csv file into my data structure in the program. This is where I came a little unstuck.
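The reading step looks roughly like the sketch below, which uses Python’s standard csv module. The sample rows and the column order are assumptions for illustration, and an in-memory string stands in for the real file.

```python
import csv
import io

# Stand-in for the real file; with a file on disk you would use
# open("phrases.csv", newline="") instead (the filename is hypothetical).
csv_text = (
    "When your hopes and fears are drowned,7,7\n"
    "Lead me by the hand,5,5\n"
)

rows = []
with io.StringIO(csv_text) as csv_file:
    for row in csv.reader(csv_file):
        # Each row is a list of strings, one per column.
        print(row[0], "|", row[1], "|", row[2])
        rows.append(row)
```

Note that every field arrives as a string, so the counts need converting with int() before they can be compared as numbers.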

 

My project was based on each poem line or phrase being an individual object. Each object was created one after the other and given a sequential name using code like this:

p1 = phrase(blah, blah, blah)
p2 = phrase(blah, blah, blah)

So ‘p1’ becomes an object called ‘p1’ which has some attributes ie the poem phrase or line, and the word and syllable counts. Once all the objects are created they get loaded into a registry so I can keep track of them and mark them off as used once they have been selected and displayed for a particular poem. My problem here was auto-generating the ‘p1 =’ part of the code for each record in the csv file.

There just didn’t seem to be an easy way of generating the new object names in order to instantiate objects for them. A re-think was required.
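For what it’s worth, the individual names are not strictly needed: a sketch like the one below keeps the objects in a list, so each csv row simply becomes another entry. The Phrase class and its fields are hypothetical stand-ins for the real ones, and this is not the route the program ended up taking.

```python
class Phrase:
    """Hypothetical stand-in for the program's phrase object."""

    def __init__(self, text, syllables, words):
        self.text = text
        self.syllables = syllables
        self.words = words
        self.used = False  # the 'line has been used' flag


# Rows as they might come back from the csv reader (all strings).
rows = [
    ["When your hopes and fears are drowned", "7", "7"],
    ["Lead me by the hand", "5", "5"],
]

# One object per row, held in a list instead of in variables p1, p2, ...
registry = [
    Phrase(text, int(syllables), int(words))
    for text, syllables, words in rows
]

print(registry[0].text)  # plays the role of p1
print(len(registry))
```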

In my previous life I would have read the data from the csv into a multi-dimensional array and worked with that, so I thought I would explore this option first. Unfortunately Python doesn’t appear to have a built-in multi-dimensional array type. It has lists, which are single-dimensional arrays, and dictionaries, which store values against keys, but in order to make either of these multi-dimensional one has to make a list of lists or a dictionary of dictionaries. That started to look quite complex for my dataset, which has 4 columns plus the ‘line has been used’ flag in the registry. So I googled for possible solutions and came across someone asking a similar question, and one of the answers intrigued and surprised me: Python has SQLite built in and it can be instantiated and run solely in memory.

Neat. Now all I have to do is chuck out most of my code and rewrite using a SQL table in memory instead of instantiating all those phrase objects and using a registry to track them all.
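A minimal sketch of the idea, using the sqlite3 module from Python’s standard library. The table name, column names and sample rows are assumptions for illustration rather than the program’s actual schema.

```python
import sqlite3

# ":memory:" asks sqlite3 for a database that lives only in RAM.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE phrases ("
    " id INTEGER PRIMARY KEY,"
    " phrase TEXT NOT NULL,"
    " syllables INTEGER,"
    " words INTEGER,"
    " used INTEGER NOT NULL DEFAULT 0)"  # replaces the registry's flag
)

rows = [
    ("When your hopes and fears are drowned", 7, 7),
    ("Lead me by the hand", 5, 5),
]
connection.executemany(
    "INSERT INTO phrases (phrase, syllables, words) VALUES (?, ?, ?)", rows
)

for record in connection.execute("SELECT id, phrase, used FROM phrases"):
    print(record)
```

The table disappears when the connection is closed, which is fine here because the phrases are reloaded from the csv file each time the program starts.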

This is of course a lesson in how not to develop, or one of the pitfalls of having bespoke software written, and potentially a pitfall of the agile software development methodology. If we don’t know at the start where we are going to end up then we can end up writing ourselves into a corner. Ever asked for what feels like a small, simple change to a piece of software and been quoted an extortionate amount of effort for it? This is where I was right at this moment: my whole design no longer supported the new functionality I needed to write into it.

In reading the poem phrases in from a file I could no longer use my object based data structure. Now that I was no longer using this structure it meant that my method and code for selecting the lines of poem no longer worked and were no longer appropriate either. 

I began working methodically, finishing the function that reads the data in from the csv file and then working through the two functions that created the poem output and the Haiku output. Once I had finished the poem function, I was able to reuse much of the code to update the haiku function.

The previous methods used a random number generator to pick a line number and then simply iterated through all the poem line objects until they reached that line, then checked a flag to see if it had already been used. If it hadn’t been used before, the line was selected and the flag updated. Either way the process then ran again until all the required lines had been selected. This sort of continuous scrolling through data is just about passable as a solution when working with a small array of data in memory, but the conversion of the data storage to a SQL based system meant this method was not really best practice. Sure it would work but it wouldn’t be pretty under the bonnet. It just would not do.

The change was relatively straightforward: the random line number is still generated, the line is read directly from the table using a SELECT and then checked for validity. If it passes it gets used; if it fails then we go round again until the requisite number of lines have been retrieved. After working on this method for the ‘poem’ function I made it a little more efficient by updating the SQL to only select lines that had not been used and had the correct number of syllables. This still meant the program spent an amount of time firing ‘SELECT’ statements and getting back empty recordsets when the random line selected had already been used or didn’t have the required number of syllables. An update for the future would be to first identify all the valid lines and then select one randomly from that subset. This would bring the SQL interaction down to a maximum of two SELECTs per poem line rather than the variable number happening now.
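That future update might look something like the sketch below. It is not the current code: the table layout and the function name are assumptions, and this version gets by with one SELECT plus one UPDATE per poem line.

```python
import random
import sqlite3


def pick_line(connection, syllables):
    """Return one unused phrase with the given syllable count, or None."""
    # Fetch every line that is still valid for this slot in one query.
    candidates = connection.execute(
        "SELECT id, phrase FROM phrases WHERE used = 0 AND syllables = ?",
        (syllables,),
    ).fetchall()
    if not candidates:
        return None  # nothing left, so the caller can stop instead of looping
    line_id, phrase = random.choice(candidates)
    # Mark the chosen line as used so it is not picked again.
    connection.execute("UPDATE phrases SET used = 1 WHERE id = ?", (line_id,))
    return phrase


# Hypothetical table and sample rows, just enough to run the function.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE phrases (id INTEGER PRIMARY KEY, phrase TEXT,"
    " syllables INTEGER, used INTEGER DEFAULT 0)"
)
connection.executemany(
    "INSERT INTO phrases (phrase, syllables) VALUES (?, ?)",
    [
        ("Lead me by the hand", 5),
        ("For the promised land", 5),
        ("When your hopes and fears are drowned", 7),
    ],
)

print(pick_line(connection, 7))
```

Because the query only ever returns valid lines, an empty result means there really is nothing left to pick, rather than just an unlucky random number.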

This version also contains a little tidying up. I’ve learnt more about the ‘Pack’ command and organised the screen a little better, and I now change the font size based on the number of lines of poem that need to be displayed, which prevents the buttons disappearing off the end of the screen on long poems. In the back of my mind I’m also wondering what will happen if I have long lines as well. These are likely to disappear off the sides of the screen, so I ought to write in some sort of dynamic font sizing routine to check for and handle these potential issues.
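The font sizing by line count boils down to a small calculation along the lines of this sketch. The function name and all of the numbers (usable height, size limits, line spacing factor) are hypothetical values for illustration, not the ones in the program.

```python
def font_size_for(line_count, largest=48, smallest=12, usable_height=600):
    """Pick a font size in pixels so that line_count lines fit vertically.

    All default values are illustrative assumptions.
    """
    # Allow roughly 1.5 times the font size per line for line spacing.
    size = int(usable_height / (max(line_count, 1) * 1.5))
    # Clamp so short poems are not enormous and long ones stay readable.
    return max(smallest, min(largest, size))


for lines in (3, 20, 100):
    print(lines, font_size_for(lines))
```

Tkinter treats a negative font size as a size in pixels, so the result could be passed as something like font=("Helvetica", -font_size_for(n)). A similar calculation based on the longest line and the screen width would cover the long-line case.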

The updated source code has been placed in Dropbox for those of you enjoying working through it yourselves.

One final thought I had to add to the ‘To Do’ list is for the program to output each of the poems created to a text or log file to create a permanent record of all the combinations created.
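A first stab at that might be as simple as the sketch below, which appends each poem to a plain text file with a timestamp. The function name and the default filename are hypothetical.

```python
from datetime import datetime


def log_poem(lines, log_path="poems.log"):
    """Append one poem, with a timestamp, to a plain text log file."""
    # Mode "a" appends, so earlier poems are never overwritten.
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(datetime.now().isoformat(timespec="seconds") + "\n")
        for line in lines:
            log_file.write(line + "\n")
        log_file.write("\n")  # blank line between poems
```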

 

 

 

Found Poem Generator – User Interaction – Adding Buttons

Working through my list of enhancements for my ‘Found Poem’ Generator that I documented in a previous blog post: 

1. Output the created poems to a larger screen ie a graphical, large-font output rather than using the console window.
2. Provide two methods of running:
a. Auto-generate a new poem every 5 minutes or so
b. Provide buttons for user input to create poems on demand
3. Package the program up to be self-contained
4. Deploy the program onto some sort of platform that wasn’t my Mac laptop ie a Raspberry PI and large screen monitor or TV
5. Make the program auto-run when the device is started / restarted

Further enhancements could then be:

6. Use a text / csv file to load in the words and lines of the poems rather than the hardcoded ones used currently
7. Check the first letter of each line to capitalise it for poems and un-capitalise it for haikus, and add commas to the end of lines if they are missing or remove for haikus.

This blog post describes the update for item number 2(b):

2. Provide two methods of running:
a. Auto-generate a new poem every 5 minutes or so
b. Provide buttons for user input to create poems on demand

My previous experience with programming GUIs started a long time ago with Microsoft Access 2 and Visual Basic 4 and then, later, a number of Borland Delphi iterations with Microsoft SQL Server. With these development environments you got, well, development environments and that meant a WYSIWYG screen designer and a palette of buttons, text boxes, counters and other widgets to choose from. Not so with Python.

Tkinter Screen Layout Design

I had a google around for a WYSIWYG screen designer but there really doesn’t seem to be anything that is recommended. One suggestion was to use graph paper but I’m not sure this project is complex enough to go that far. It seems that hand-coding buttons and labels is the way to go, using one of the three Layout or Geometry Managers that Tkinter possesses. These are Pack, Grid and Place:

‘Pack’, I’m informed, is the easiest to use but gives least control over placement of widgets, merely their relative positions to each other.

‘Place’ is used explicitly to set the position and size of a window, either in absolute terms or relative to another window. This didn’t sound like the sort of thing for creating and laying out a bunch of buttons.

‘Grid’ is similar to Pack but allows a certain amount of control of the placement of widgets by allocating them row and column positions within a notional two dimensional grid. The size of each row and column is then determined by the Grid Manager. This sounded the most likely candidate to try and use.

Tkinter Command Buttons

So now I have an inkling of how to place my buttons, my next task is to find out how to create the buttons and then assign the ‘poem’ and ‘haiku’ commands to them such that I could generate some interactive output.

Turning to the code, the first thing I found was that the label I was using to place the text on the screen, using an example copied from the internet, was using the ‘pack’ command. In order to keep things simple, I decided to change course and stick with the ‘pack’ function and see what the outcome was before deciding if I needed to get more complex with my button placement.

In the end I created three buttons and a slider bar. The buttons were for quitting the program (just to make things simpler for testing, I’ll remove this one later), a ‘haiku’ button and a ‘poem’ button. The slider allows the user to slide the scale from 1 to the number of ‘found’ lines there are available before pressing the ‘poem’ button to generate a poem of that number of lines.
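A stripped-down sketch of that layout using ‘pack’ is shown below. It is not the program’s actual code: the function names, the font, the window title and the placeholder poem and haiku functions are all assumptions for illustration.

```python
def build_window(line_count, on_poem, on_haiku):
    """Build the window: a text label, a 1..line_count slider, three buttons."""
    import tkinter as tk  # imported here so the module loads without a display

    root = tk.Tk()
    root.title("Found Poem Generator")

    # Label that shows the generated text.
    output = tk.Label(root, text="", font=("Helvetica", 24))
    output.pack()

    # Slider for the number of lines wanted in the next poem.
    slider = tk.Scale(root, from_=1, to=line_count, orient=tk.HORIZONTAL)
    slider.pack()

    def show(lines):
        output.config(text="\n".join(lines))

    # Each button's 'command' is the function called when it is pressed.
    tk.Button(root, text="Poem",
              command=lambda: show(on_poem(slider.get()))).pack(side=tk.LEFT)
    tk.Button(root, text="Haiku",
              command=lambda: show(on_haiku())).pack(side=tk.LEFT)
    tk.Button(root, text="Quit", command=root.destroy).pack(side=tk.LEFT)
    return root


def demo_poem(number_of_lines):
    """Placeholder poem function returning the requested number of lines."""
    return ["line %d" % n for n in range(1, number_of_lines + 1)]


def demo_haiku():
    """Placeholder haiku function."""
    return ["first line", "second line", "third line"]


# To try it on a machine with a display:
# build_window(10, demo_poem, demo_haiku).mainloop()
```

Passing the poem and haiku functions in as arguments keeps the screen code separate from the code that picks the lines.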

[Screenshot: FoundPoemGenerator_Screen_Buttons]

In doing this I uncovered what I thought was a bug in the system where after generating one haiku the program would appear to freeze with the spinning pointer icon, yet the ‘poem’ button worked fine. I say it worked fine, a little more button pressing revealed that the ‘poem’ button also eventually ran into the same problem, and then the penny dropped. The code is designed to mark any lines used in a poem as such ie already used, this meant that the ‘haiku’ function was very quickly running out of 3 and 5 syllable lines to use and so the program got stuck in an endless loop of looking for lines that weren’t there.

The fix was easy: clear the ‘line already used’ flag between each press of a button. I’m not altogether happy with this outcome as my design was to ensure each poem produced was unique, which, thinking about it logically, couldn’t actually work unless I had infinite lines of poem to draw from. A case of having to admit my design was flawed and forge ahead with the new working design.
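In outline, the fix and a guard against the endless loop could look like the sketch below. The Phrase class, the function names and the 5-7-5 pattern used in the demo are assumptions for illustration rather than the program’s own code.

```python
import random


class Phrase:
    """Hypothetical stand-in for a poem line with its 'already used' flag."""

    def __init__(self, text, syllables):
        self.text = text
        self.syllables = syllables
        self.used = False


def pick_line(registry, syllables):
    """Pick an unused line with this syllable count, or None if none is left."""
    candidates = [p for p in registry if not p.used and p.syllables == syllables]
    if not candidates:
        return None  # report failure rather than searching forever
    chosen = random.choice(candidates)
    chosen.used = True
    return chosen


def make_poem(registry, syllable_pattern):
    """Build one poem, clearing the 'used' flags first."""
    for phrase in registry:
        phrase.used = False  # the fix: every button press starts afresh
    lines = [pick_line(registry, count) for count in syllable_pattern]
    if None in lines:
        return None  # not enough matching lines for this pattern
    return [line.text for line in lines]


registry = [
    Phrase("Lead me by the hand", 5),
    Phrase("For the promised land", 5),
    Phrase("When your hopes and fears are drowned", 7),
]
for _ in range(3):
    print(make_poem(registry, (5, 7, 5)))
```

With the flags cleared at the start of each poem, lines are still unique within a poem but can reappear in the next one, and a request that cannot be satisfied returns None instead of freezing the screen.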

Source code available here for this update.