
Minutes of Weekly Meeting, 2009-11-16

Meeting called to order at 10:36 AM EST

1. Roll Call

Brad Van Treuren
Eric Cormack
Ian McIntosh
Carl Walker
Michele Portolan
Adam Ley
Tim Pender

Excused:
Patrick Au

2. Review and approve previous minutes:

11/09/2009 minutes:

3. Review old action items

4. Discussion Topics

  1. White Paper Review - Review of Virtual Systems
    • [Ian] Brad added some proxy comments to last week's notes. In one, he asked if we could discuss Adam's remark on defining requirements a bit further.
    • [Brad] I was really trying to get some scope on the requirements space - it can mean many different things.
    • [Adam] As I recall, I was asked to comment on the breakdown of the application model. I didn't say anything about the model itself, but was asked to comment on where boundaries are drawn, whether these map onto real world applications. I think we need to stay focused on the requirements: If we see where we need to allow one application to interface with another, then that's useful. I'm saying you shouldn't put artificial boundaries in place before the applications are defined.
    • [Brad] I agree, decomposing applications into solutions is the wrong way for us to go. What you say is warranted and makes sense - identify the need for the boundary.
    • [Adam] I think we're very much on the same page. It's very early to be mapping these virtual systems onto real applications.
    • [Ian] Then Peter commented that embedded solutions may be stand-alone circuitry in a corner of the board or may be hosted in some existing asset of the design.
    • [Brad] For POST, we've done that in hardware, similar to what Firecron and Intellitech have shown. It is a very dedicated set of hardware for that purpose.
    • [Ian] And that's pretty much the way we've elected to go. It may not be the most logical of reasons, but a major factor was the desire to avoid any need for 'software' which would be implied by using a processor: That demarcation opens up a whole load of issues, for organizational and process reasons.
    • [Brad] It's interesting you mention that, because one of our product groups has delegated the BScan diagnostics software back to the firmware people; it's software, but written by the hardware guys.
    • [Ian] I suspect that's where we'll eventually get to, too. We have similar issues on using soft processor cores in FPGAs.
    • [Brad] One other thing: while we have a level of POST, we also have firmware self-test, but that's not on-demand test. Do everything you can in boot; then there are things you can test running concurrently with the software loading, etc.
    • [Ian] That's very similar to the PBIT-1 and PBIT-2 phases we have on some products.
    • [Brad] Then there are the Green initiatives, where we see a lot more blocks powering up into low power states; that presents other issues over what can be tested.
    • [Ian] From a different angle, we're seeing similar requirements: To reduce the load on an aircraft's auxiliary power unit, we may have to start in a low power mode, but are still expected to report 'readiness'. It's hard to confirm there are no faults if you haven't fully powered everything up.
    • [Brad] There's also the thermal factor. We're trying to cram more equipment into the confines of old facilities.
    • [Carl] We have the same issue.
    • [Brad] You have the same issues as we have; your systems may be distributed amongst a number of boxes and you have to test the interconnects between the boxes.
    • [Ian] Often that's the hard part. They're not covered by the board JTAG, so often it means in-process monitoring of the data. If you don't see data on one link for a while you might start suspecting it has gone faulty.
    • [Brad] We have piggybacked some BScan onto links like that. There was a paper on a distributed base station[1] where one part was remote by maybe 20 km. It's an extreme case of a distributed system.
    • [Ian] Brad, can you provide a citation for that, so I can record it correctly?
    • [Brad] OK, I'll dig it out. {ACTION}
    • [Brad] Can we map this back onto the virtual systems? How do we show distribution? Do we need or want to? Are there other places we can have distribution?
    • [Ian] I think what you're suggesting is showing the hierarchical delegation of the test control. How we look at this is that each board can run its own suite of tests. The LRU's BIT system only needs to know how to run the board tests and collect the results via some API; it doesn't need to know what the tests are (see the sketch at the end of this topic). That scales up to a system comprising several LRUs. But I'm not sure whether trying to show that would just add confusion.
    • [Brad] I was reminded of Gunnar's paper[2] where the BScan Test Manager resides on the FRU and has interfaces to a higher level that can apply tests on demand.
    • [Brad] This can get you into the philosophical debate of whether you have a 'push model' or a 'pull model'.
    • [Brad] Some of our products have self-test, but also have multidrop capability.
    • [Ian] I think that's essential: All of these embedded solutions have some dependency on a degree of functionality being present, so you need a 'back door' if the unit is apparently dead. It's one of the other things with distributed systems. The fault reporting has to be communicated over a mission bus. If that fails, you might not be able to tell whether it's the link or the remote unit that has gone down.
    • [Brad] Is this requiring that the link needs to be of some highly reliable type?
    • [Ian] Possibly, but maybe we need supplementary signals outside of JTAG. The sort of thing we'll do is provide some critical discretes to show that power is OK or that the main controller is alive. That can help you figure out what is wrong if the BIT tests return nothing. We've even had a big debate about whether or not the boxes should have 'power on' LEDs. I can see an argument based on electrical noise but not on cost.
    • [Brad] In the telecomm industry, certain LEDs are required to be on every board and the colors are specified for particular indications.
    • [Brad] There is an argument that if you know a board has a fault then you'll need to exchange it anyway, so is there any point in conducting further tests? But can you reproduce the fault? Will you be able to determine the root cause? Do you need to take a snapshot? Was it a thermal overload or a software glitch?
    • [Ian] In that vein, I've been looking at Single Event Upsets (SEUs). These are more likely to occur at high altitude than near sea level. Since they affect SRAM and SRAM-based FPGAs, the effects can be either momentary or more persistent. With some FPGAs you can detect an SEU by the configuration SRAM CRC changing; in other cases it's difficult to distinguish from a hard fault, except that it goes away after the power is cycled.
    • [1] 'Testing and remote field update of distributed base stations in a wireless network', Chen-Huan Chiang; Wheatley, P.J.; Ho, K.Y.; Cheung, K.L. - Lucent Technol., Bell Labs, Holmdel, NJ, USA; ITC 2004 Proceedings.
    • [2] 'Remote boundary-scan system test control for the ATCA standard', Backstrom, D.; Carlsson, G.; Larsson, E. - Embedded Syst. Lab., Linkopings Universitet, Linkoping; ITC 2005 Proceedings.
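    The following is a minimal sketch of the delegation pattern Ian describes above: an LRU-level BIT manager runs each board's tests through a common API and collects the results without knowing what the individual tests are. All class and method names (BoardBIT, LruBIT, run_all, etc.) are hypothetical illustrations, not any member's actual implementation.

        # Minimal sketch (assumed names) of hierarchical delegation of test control:
        # the LRU-level manager only knows the generic per-board API and aggregates
        # results; each board owns and runs its own test suite.
        from dataclasses import dataclass
        from typing import Dict, List

        @dataclass
        class TestResult:
            name: str      # test identifier as reported by the board
            passed: bool   # pass/fail outcome
            detail: str    # optional diagnostic text

        class BoardBIT:
            """Generic interface each board exposes; the board owns its test suite."""
            def __init__(self, board_id: str, tests: Dict[str, bool]):
                self.board_id = board_id
                self._tests = tests  # stand-in for the board's real test suite

            def run_all(self) -> List[TestResult]:
                # The board decides what "all tests" means; the caller does not.
                return [TestResult(name, ok, "" if ok else "see board log")
                        for name, ok in self._tests.items()]

        class LruBIT:
            """LRU-level manager: delegates to each board and collects the results."""
            def __init__(self, boards: List[BoardBIT]):
                self.boards = boards

            def run_all(self) -> Dict[str, List[TestResult]]:
                return {b.board_id: b.run_all() for b in self.boards}

        if __name__ == "__main__":
            lru = LruBIT([
                BoardBIT("board_A", {"interconnect": True, "memory": True}),
                BoardBIT("board_B", {"interconnect": False}),
            ])
            for board_id, results in lru.run_all().items():
                for r in results:
                    print(board_id, r.name, "PASS" if r.passed else "FAIL", r.detail)

    The same pattern would scale another level up: a system-level manager could hold several LruBIT instances and aggregate results across LRUs in the same way.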
  2. 2009 Survey
    • [Ian] So far, we've had maybe a half-dozen responses, but a few have generated additional referrals. As people complete the survey, I'll delete them from the mailing list. I'll let the survey run for a couple of weeks then send a reminder to the people still on the list; that worked quite well last year.
    • [Brad] I'd like to know, are people trying to answer the whole questionnaire or just sections?
    • [Ian] People are pretty much filling out the whole thing. There are some bits getting missed out, but not a lot.
    • [Ian] I can post a link to the results page, but we have to treat the data as 'privileged'.
    • [Brad] Yeah, when we did the 2006 survey, Ben tried to keep the detailed data to just the officers and then give a summary, the way you did last year, Ian. I'm just a bit cautious about privacy here.
    • [Ian] That's a good point. What I can do is create a version of the results page that removes names, companies and email addresses. {ACTION}
    • [Brad] That would be good.

5. Schedule next meeting

Schedule for November 2009:
Monday, November 23, 2009, 10:30 AM EST
Monday, November 30, 2009, 10:30 AM EST

6. Any other business

7. Review new action items

8. Adjourn

Tim moved to adjourn at 11:47 AM EST, seconded by Brad.

Respectfully submitted,
Ian McIntosh