Minutes of Weekly Meeting, 2008-01-23

Meeting was called to order at 8:20 am EST

1. Roll Call


Brad Van Treuren
Patrick Au
Ian McIntosh
Heiko Ehrenberg
Carl Nielsen
Adam Ley

Email Proxy on the Structural Test Use Case Discussion:

Guoqing Li

  • [Peter Horwood] - I agree with Guoqing the topic has been covered well.

2. Review and approve 1/14/2008 minutes

Heiko noted an error in the minutes: the SJTAG web site address pointed to the old web site. The new web site is http://www.sjtag.org. With this correction, the minutes were approved (moved by Heiko, seconded by Ian).

3. Review old action items:

  • Adam proposed we cover the following at the next meeting:
    • Establish consensus on goals and constraints
    • What are we trying to achieve?
    • What restrictions are we faced with?
  • Establish whether TRST needs to be addressed as requirements in the ATCA specification if it is not going to be managed globally (All)
  • Provide feedback of more use cases not yet identified to Brad (All)
  • Review tables (Goals vs. use case matrix) on slides 38-41; (All)
  • Register on new SJTAG web site (www.sjtag.org) (All)
  • All need to check and add any missing documents to the site (All)
  • Respond to Brad and Ian with suggestions for static web site structure (Brad suggests we model the site after an existing IEEE web site to ease migration of tooling later) (All)
  • Look at proposed scope and purpose from ITC 2006 presentation (attached slides) and propose scope and purpose for ATCA activity group (All)
  • Look at use cases and capture alternatives used to perform similar functions to better capture value add for SJTAG (All)
  • Contact Guoqing regarding alternate meeting time process (Brad)
  • Set up Use Case categories for forum discussions (Ian)
  • Volunteers needed for Use Case Forum ownership (All)
  • Send Ian list of volunteers for Use Case champions (Brad) (Only Ian responded to request)
  • Continue Fault Injection/Insertion discussion on SJTAG Forum page (All)
  • Send additional sub topics to Heiko for the continued Structural Test use case discussion for 1/14/08. (All)
    • Guoqing's list is a very good start
  • We will need to begin writing a white paper for the System JTAG use cases to provide to the ATCA working group (All)
    • Most likely, champions will own their subject section and draft the section with help from others.
    • This paper will be based on the paper Gunnar Carlsson started in 2005.

4. Discussion Topics

  1. SJTAG Value Proposition - Structural Testing (Continued ...)
    • [Heiko] - In summary, the benefits of "structural test" are diagnostics, ease of test development, reusability, and protection of module/card vendor IP. Diagnostics provided by boundary scan may also help find design problems in specific cards under certain conditions (e.g. temperature) by revealing failure trends, which would help in resolving such issues. GO/NO-GO results alone are not good enough to warrant the cost of implementation. Structural test runs at different product states and helps prove-in designs. Test diagnostics must reside on the board. A format is needed for tests and diagnostics, and the data storage for diagnostics may need compression to store as much information as possible.
    • [Patrick] - How do we get diagnostics off the system if they are stored in FLASH (simple commands)? Possibly by displaying diagnostics in the system itself? We need detailed diagnostics over a period of time (like a log file) and enough memory resources to store a sufficient amount of diagnostics. Also, would we "mirror" the results (store the results in two different locations to avoid problems with data corruption)?
    • [Brad] - writing to FLASH takes a long time, so writing to two FLASH devices may take too much time under normal conditions (if the system is fault free)
    • [Patrick] - How many runs do you store in FLASH?
    • [Heiko] - Store data over a period of time.
    • [Brad] - to err on the side of too much data is better than not enough data
    • [Patrick] - At IBM, we often don't have enough data stored, mostly because there isn't enough room to store it in the system FLASH. Ideally, it would be nice to have a dedicated diagnostic data FLASH.
    • [Patrick] - What happens if the data FLASH becomes corrupted?
    • [Brad] - A console output helps if you cannot access the FLASH for writing.
    • [Brad] - when would structural test be executed at system level? power on self test, initiated system self test or targeted module by operator, automated periodic test, ...; what to do with diagnostic results in those different cases; need to store type of test initiation with test results;
    • [Patrick] - is there only one power on self test (POST) or does SJTAG suggest different POST for different stages (manufacturing vs. in field, for example);
    • [Brad] - We generally use the same POST, with additional tests for the specific environments added that run after the system has booted. There is a time budget that must be met for booting a system into service. The POST is not supposed to exceed this.
    • [Brad] - What information should we store for diagnostics?
    • [Carl N.] - Store bit level or some mapping of failing bit information. We also need to store what test the failure occurred in as well as the vector number.
    • [Brad] - I would agree.
    • [Carl N.] - For all this to work, you need to have tests that make expected data available.
    • [Heiko] - Should we also have some way to compress the data?
    • [Silence...]
    • [Brad] - Have we really dealt with the Value Proposition question raised by Adam?
    • [Heiko] - can we really measure the benefit of boundary scan based structural test in numbers yet? We have addressed Adam's question the best we can without real numbers.
    • [Ian] - We need good DFT in the first place to get good numbers.
    • [Brad] - achievable test coverage is key to determine whether structural test makes sense
    • [Ian] - If someone is going to invest in functional test, then it is difficult to justify boundary-scan testing.
    • [Heiko] - Should we have an example of test coverage and results?
    • [Ian] - It is really defined by how small the ambiguity group from the results needs to be. In the end, it is going to be as important as the lowest level of replacement that may be achieved.
    • [Ian] - required detail of diagnostics depends on where/when structural test is executed and what information is needed (if I only need to know which board/module is failing, then I don't care which two pins are shorted); may change from application to application;
    • [Ian] - If one of our avionics systems goes bad, we pull the whole black box out and use the next level of testing location to open it up and diagnose it to identify which board is failing. We don't pull out a soldering iron by the airplane.
    • [Brad] - So the FRU is the important thing for you?
    • [Adam] - but you may need diagnostic results in order to troubleshoot problems that only occur in the field under certain circumstances and not when the failing unit is removed from the system and is retested later ("no trouble found")
    • [Brad] - The repair depot process plays an important role in the decision process as well. We ran into a case where the depot would update firmware as the first step when a board came in. This would end up masking firmware faults that appeared to be hardware defects because the functional tests had bugs.
    • [Brad] - some faults may be caused by software issues (bad firmware) that boundary scan would be more immune to
    • [Heiko] - need to create a summary of our discussions, defining "structural test" (what?), benefits? (why?), when to run it? (when?), infrastructure, where to store results (how?) ...
    • By Proxy:
    • Guoqing:
      • It seems that SJTAG architecture based structural test has been discussed enough.
      • My understanding is that excessive diagnosis would not make sense in field service. The structural test would be useful when it can help people to judge which FRU is failing, not which component or joint is failing. Because storage space in the system is limited, a test data log that is useful and simple would be enough. When a failing board or system is delivered to the repair site, it is easy to use tools to do detailed test and diagnosis at board or system level.
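
Illustrative sketch (not a decision of the group): the items raised above for a stored diagnostic record (failing bit information, the test name, the vector number, the type of test initiation, compression, and protection against corruption) could be combined roughly as follows. The names DiagnosticRecord and Initiation, the byte layout, and the use of zlib compression with a CRC-32 check are all assumptions made for this example; the discussion did not settle on any format.

```python
import struct
import zlib
from dataclasses import dataclass
from enum import IntEnum


class Initiation(IntEnum):
    """How the test run was started (categories taken from the discussion)."""
    POWER_ON_SELF_TEST = 0
    OPERATOR_INITIATED = 1
    PERIODIC = 2


@dataclass
class DiagnosticRecord:
    """Hypothetical record layout; not a format agreed by the group."""
    test_name: str          # which test the failure occurred in (max 255 bytes)
    vector_number: int      # vector at which the miscompare was seen
    failing_bits: list      # scan-chain bit positions that miscompared
    initiation: Initiation  # type of test initiation

    HEADER = "<BBIH"        # initiation, name length, vector, bit count

    def encode(self) -> bytes:
        """Pack, compress, and prefix the record with a CRC-32."""
        name = self.test_name.encode("utf-8")
        body = struct.pack(self.HEADER, int(self.initiation), len(name),
                           self.vector_number, len(self.failing_bits))
        body += name
        body += struct.pack(f"<{len(self.failing_bits)}I", *self.failing_bits)
        packed = zlib.compress(body)
        return struct.pack("<I", zlib.crc32(packed)) + packed

    @classmethod
    def decode(cls, blob: bytes) -> "DiagnosticRecord":
        """Verify the CRC-32, then decompress and unpack the record."""
        (crc,) = struct.unpack_from("<I", blob, 0)
        packed = blob[4:]
        if zlib.crc32(packed) != crc:
            raise ValueError("diagnostic record is corrupted")
        body = zlib.decompress(packed)
        initiation, name_len, vector, nbits = struct.unpack_from(cls.HEADER, body, 0)
        offset = struct.calcsize(cls.HEADER)
        name = body[offset:offset + name_len].decode("utf-8")
        offset += name_len
        bits = list(struct.unpack_from(f"<{nbits}I", body, offset))
        return cls(name, vector, bits, Initiation(initiation))
```

A record built this way can be written to FLASH or echoed to a console as bytes and read back with DiagnosticRecord.decode, which rejects a blob whose CRC-32 does not match. Whether compression pays off for small records, and how many records to keep, are left open, as in the discussion.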

5. Schedule next meeting

Monday, 1/28/2008, 8:15am EST

6. Review new action items


7. Any other business


8. Adjourn

Meeting adjourned at 9:20 am EST