Minutes of Weekly Meeting, 2011-02-07

Meeting called to order: 11:05 AM EST

1. Roll Call

Eric Cormack
Ian McIntosh
Brian Erickson
Patrick Au
Carl Walker
Brad Van Treuren
Adam Ley
Heiko Ehrenberg
Peter Horwood
Tim Pender

2. Review and approve previous minutes:

01/31/2011 minutes:

3. Review old action items

4. Discussion Topics

  1. Identification of key "Take Away Points"
    - Progress on review of past minutes
    - Search tooling?
    • {Forum posts on Key Points shared}
    • [Ian] After Brad verified my example, I went back and added in a generic example to help explain what each field is, since we didn't really have that explained here.
    • [Eric] That helped me. I wasn't quite sure what the double quote indicated.
    • [Ian] The other thing I had to do was use two spaces for each indent. For whatever reason, the forum software ignored a single space at the start of a line.
    • [Eric] You've got a double quote with no spaces before it - should there be?
    • [Ian] Ah, no. The date in the entry above starts in column 1 to show it's the first field. The quote is also in column 1 to carry over the date from the previous entry.
    • [Eric] OK, I see now.
    • [Ian] My example was only the records for the first set of minutes in my block. Later, I converted some more, and that revealed a couple of things. In one, I had nested bullets. In effect this meant there was a sub-context for some of the comments, in this case POST in an embedded system rather than simply POST. This left me wondering whether the sub-context was part of the comment or part of the context. In the end I decided that it was refining the context, so I handled it that way.
    • [Brad] That's about the only thing you could do if you only have one level of context.
    • [Tim] You could have added it to the comment.
    • [Ian] Yes, as I said, and maybe it doesn't really make any difference either way. I just wanted to talk through my thought process so we could see if it works and try to get a consistent approach by everyone.
    • [Ian] Keywords also proved tricky. I tried to confine myself to one-word or two-word phrases. But sometimes I found that I needed to modify a word from how it appears in the comment to something that is more likely to be the term you'd search for. Or even add words that seem pertinent even though they don't actually appear in the text.
    • [Brad] Keywords are the only thing that's likely to be inconsistent.
    • [Ian] Yes. The trouble is we don't have a defined lexicon we can use here.
    • [Brad] I see that we need to be coming up with a baseline glossary. Clearly, some words like "architecture" and others will be words we'll be looking for.
    • [Ian] But many others will only suggest themselves as we work through this.
    • [Brad] We can add words as we find we need to.
    • [Ian] I've still got about 4 sets of minutes to get into the new format. I wonder if I should carry on with that. I guess we still need to get this populated, else we'll be doing nothing.
    • [Eric] I'm about half done, but I have it in a separate document.
    • [Brad] The time I was going to use on this this week I spent writing a Python script to take in the new format and separate it into the fields.
    • [Ian] That was going to be one of my other questions today - now that we've created a machine readable format, how do we get a machine to read it?
    • [Brad] I was thinking that maybe I can use SQLAlchemy. But at least I have code that puts it into fields. It's about a half page of Python and also uses the double space indents.
    • [Ian] I guess I was thinking, apprehensively, about writing some sort of web front end to a MySQL database. I just didn't fancy the work involved right now.
    • [Brad] There's Django, a Python framework for web applications. It might be something we could use, but it's not easy to say right now. Which database do you use, Ian, MySQL?
    • [Ian] Yes, it's MySQL. Actually, it has MySQLi (MySQL Improved) support, so you can use either interface.
    • [Brad] What I've got allows me to dump out the SQL statements so you can create the tables in another database. The real key thing is how we want to partition the fields into tables.
    • [Ian] I don't think there are too many tables - The date, URL and comment are all related one to one, so could probably go in one table.
    • [Brad] Yes, date and URL are secondary keys to the comment.
    • [Ian] And the keywords are pointers to the comments.
    • [Brad] It's a many to many relationship: A keyword relates to many comments, and each comment may have many keywords. I'm imagining that you'd ask for all the comments that relate to a certain keyword. I think there's a minimum of two tables.
    • [Ian] Two tables is what I thought I was seeing. But will you always be searching on keyword? Might you look for all comments within a defined context?
    • [Brad] Depends on how you're going to represent the data. You might have a drop down list of the possible contexts; it's a freeform field, so you can't expect someone to be able to type in the text to search for.
    • [Ian] Which would possibly make context an indexed table in its own right. So we'd need to build a catalogue from what we've got.
    • [Ian] What I'm sensing is that we really need to get these keywords and contexts from the current exercise before we can start to rationalize them?
    • [Brad] I'm concerned that there may be an element of rework if we wait too long.
    • [Ian] So can we create tables of keywords and contexts now?
    • [Brad] I think keywords are a more realistic goal.
    • [Tim] What if you don't want your search to be context driven?
    • [Ian] I'm imagining a search for keywords either with or without a given context.
    • [Brad] It's no different to search on a keyword or a context. You're still searching on a relation.
    • [Brad] What we haven't talked about is searching on an aggregation of keywords.
    • [Ian] Hmm, yes, that could bring in boolean AND/OR into the search. That does add complication.
    • [Brad] That's what I was thinking.
    • [Ian] Our existing website search does a lot of that; I wonder if we can rob some of the code out of there.
    • [Brad] It's why I wanted to automate the conversion from text to tables. If we find we need different tables then it's just a case of changing the front end then cranking the handle again to get a new set of tables.
    • [Ian] So the idea would be to collect everything from these "code boxes" on the forum into a single text file and parse that?
    • [Brad] That's the plan, yes.
    • [Ian] So, after I finish updating my minutes records, should I be looking at compiling a list of contexts?
    • [Brad] More beneficial would be to come up with a list of keywords.
    • [Ian] OK. Some of the ones I came up with are two words, as a single word didn't really make much sense.
    • [Brad] Two words shouldn't be a problem, but I wouldn't want to go any more than two words.
    • [Brad] Perhaps we can use what time we have left today to go through the keywords we have now?
    • [Ian] OK, well if we start with the first of mine:
      constraints, board-edge, system test
    • [Brad] I wonder if "board-edge" should maybe become "edge test"?
    • [Ian] I suppose there are several options - board-edge, board boundary, edge test.
    • [Heiko] Maybe we need a list of terms that we all try to use.
    • [Brad] I think that's what we're trying to do here.
    • [Ian] Maybe I need to take this away and compile the list offline, for review next week?
    • [Brad] To understand the usage of the keyword, we'd still need to see the context, so we can decide if it's appropriate.
    • [Tim] Is this a duplication of effort? Will Brad's Python not generate that list?
    • [Ian] Yes, but we're trying to rationalize the keyword list, prior to people filling out the records.
    • [Tim] But maybe the script can generate the initial list.
    • [Brad] I can generate the list, but it'll lose context.
    • [Tim] You could prefix the keyword with the context.
    • [Tim] Sometimes the context won't be enough and you'll need to go back to the full minutes.
    • [Ian] Actually, I thought that was always something we'd have to do. These summaries we're doing are really just a means to give an indication of whether or not a particular discussion is more or less likely to be relevant to a chosen subject.
    • [Tim] With that example, board-edge, maybe we're losing the real intent.
    • [Ian] I don't think we can ever make this perfect. I guess this is something that no-one else has really done; I'm used to finding that someone else has already had the same problem and there's a ready made solution on the internet.
    • [Brad] What's different for us is that we're trying to deal with multiple references. That's not the kind of thing that big searches like Google do.
    • [Ian] We're out of time today. There's clearly still a lot to think about. I don't want to end up creating a lot of rework, but I feel we do still need to move on with compiling these records.
    • [Brad] There will be some level of rework.
    • [Ian] Yes, but it should be limited to the keywords. The rest is still useful.
    • [Brad] Well, leave the field empty, or ", for the keywords just now.
    • [Ian] OK, but it'd be useful to know what keywords people would have proposed.
    • [Brad] They could send them to you by email.
    • [Eric] That's a better idea.
    • [Ian] OK, I'll set that as an action then: Send me a list of the keywords that would go against your comments. Best to include dates too, so we have a chance to reconstruct the contexts. {ACTION}
    • [Eric] Do you want that as a spreadsheet or a text file?
    • [Ian] Either really, but a text file is fine and probably easiest.
    • [Tim] One thing I was thinking about on expanding the keyword list. In some of the comments I've looked at, you might have something on SERDES but it wasn't important enough to actually appear in the comment.
    • [Ian] Yes, I had that too. Since it'd be a relevant search term for that comment, it would be OK to add it even though it doesn't appear in the comment.
  2. White Paper
    - Volumes 1 and 2 are "stable": How do we progress Volumes 3 to 5?
    • {Not discussed due to lack of time}
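
Illustrative sketch for topic 1 (not Brad's script, which is not reproduced in these minutes): the many-to-many relationship between comments and keywords, a keyword search with optional AND/OR and context filtering, and a keyword list prefixed with its context as Tim suggested. Everything named here is an assumption made for illustration: the table and column names, the function names, the sample records, and the use of Python's built-in sqlite3 in place of MySQL so that the example is self-contained. It uses three tables, since a link table is one common way to hold a many-to-many relationship.

```python
import sqlite3

# Hypothetical records: (date, url, context, comment, keywords).
# Dates, URLs and comment text are placeholders, not real entries.
SAMPLE = [
    ("2010-01-01", "http://example.org/a", "board test", "Comment A",
     ["constraints", "board-edge", "system test"]),
    ("2010-01-01", "http://example.org/a", "POST", "Comment B",
     ["constraints", "architecture"]),
    ("2010-01-08", "http://example.org/b", "POST", "Comment C",
     ["architecture", "SERDES"]),
]


def build_db(records):
    """Load records into an in-memory database with a keyword link table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE comment (
            id INTEGER PRIMARY KEY,
            date TEXT, url TEXT, context TEXT, body TEXT
        );
        CREATE TABLE keyword (
            id INTEGER PRIMARY KEY,
            word TEXT UNIQUE
        );
        -- One row per (comment, keyword) pair: the many-to-many relation.
        CREATE TABLE comment_keyword (
            comment_id INTEGER REFERENCES comment(id),
            keyword_id INTEGER REFERENCES keyword(id),
            PRIMARY KEY (comment_id, keyword_id)
        );
    """)
    for date, url, context, body, words in records:
        cur = db.execute(
            "INSERT INTO comment (date, url, context, body) VALUES (?, ?, ?, ?)",
            (date, url, context, body))
        comment_id = cur.lastrowid
        for word in words:
            # Reuse the keyword row if this word has been seen before.
            db.execute("INSERT OR IGNORE INTO keyword (word) VALUES (?)", (word,))
            (keyword_id,) = db.execute(
                "SELECT id FROM keyword WHERE word = ?", (word,)).fetchone()
            db.execute("INSERT OR IGNORE INTO comment_keyword VALUES (?, ?)",
                       (comment_id, keyword_id))
    return db


def search(db, words, match_all=False, context=None):
    """Return comments tagged with any (OR) or all (AND) of the given keywords."""
    words = list(dict.fromkeys(words))  # drop duplicates, keep order
    if not words:
        return []
    marks = ",".join("?" * len(words))
    sql = ("SELECT c.body FROM comment c "
           "JOIN comment_keyword ck ON ck.comment_id = c.id "
           "JOIN keyword k ON k.id = ck.keyword_id "
           f"WHERE k.word IN ({marks}) ")
    params = list(words)
    if context is not None:
        sql += "AND c.context = ? "
        params.append(context)
    sql += "GROUP BY c.id "
    if match_all:
        # AND search: the comment must match every requested keyword.
        sql += "HAVING COUNT(DISTINCT k.id) = ? "
        params.append(len(words))
    sql += "ORDER BY c.id"
    return [row[0] for row in db.execute(sql, params)]


def keywords_with_context(db):
    """List each keyword prefixed with the context it was used in."""
    sql = ("SELECT DISTINCT c.context, k.word FROM comment c "
           "JOIN comment_keyword ck ON ck.comment_id = c.id "
           "JOIN keyword k ON k.id = ck.keyword_id "
           "ORDER BY c.context, k.word")
    return [f"{context}: {word}" for context, word in db.execute(sql)]
```

With the sample records, search(db, ["constraints", "architecture"]) is the OR search, passing match_all=True makes it an AND search, and passing context="POST" restricts either form to one context. Because the tables are rebuilt from the records each time, changing the table layout only means changing build_db and running it again.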

5. Key Takeaway for today's meeting

6. Schedule next meeting

Schedule for February 2011:

Patrick will be unavailable until March 7th.

7. Any other business


8. Review new action items

9. Adjourn

Eric moved to adjourn at 12:21 PM EST, seconded by Brad.

Respectfully submitted,
Ian McIntosh