ChaOS Diary Dec 2008 - Apr 2009

ChaOS Diary - monoblog and links to reference documents

Many golden nuggets lie herein.

30/4/2009 IBM eServer 440x Quad Xeon/PCI ChaOS PCI only identifies devices on Bus 0. This is because the 440x has multiple Host PCI bridge devices (PCI type 0x0600), rather than just the one encountered on my other testbeds, which connect to other buses through a PCI-PCI bridge device (PCI type 0x0604). Not yet located a tech ref for the Winnipeg PCI-X Host Bridges, but can see that PCI:0x40 seems to be the PCI bus config register. In the meantime, though, will have to modify PCI scan to pre-scan dev/fn 0/0 for buses 0x00-0x0f, as 440x has six Winnipeg Host Bridges on buses 0x00,0x01,0x02,0x05,0x07 and 0x09!
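
A minimal sketch of the pre-scan described above, not the ChaOS PCI code itself. It assumes a hypothetical 32-bit config read helper (`pci_read32_fn`) and uses a simulated config space (`fake_read32`, with made-up vendor/device IDs) in place of real hardware. It probes dev/fn 0/0 on buses 0x00-0x0f and records every bus whose device reports class 0x0600.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CLASS_HOST_BRIDGE 0x0600u  /* class 0x06, subclass 0x00 */
#define PRESCAN_BUSES         0x10u    /* buses 0x00-0x0f, as in the entry */

/* Hypothetical accessor: read a 32-bit config register for bus/devfn. */
typedef uint32_t (*pci_read32_fn)(uint8_t bus, uint8_t devfn, uint8_t reg);

/* Returns a bitmask: bit N is set when bus N has a host bridge at dev/fn 0/0. */
uint16_t prescan_host_bridges(pci_read32_fn read32)
{
    uint16_t roots = 0;
    assert(read32 != 0);
    for (uint8_t bus = 0; bus < PRESCAN_BUSES; bus++) {
        uint32_t id = read32(bus, 0, 0x00);
        if ((id & 0xffffu) == 0xffffu)
            continue;                        /* vendor ID 0xffff: nothing there */
        uint32_t class_reg = read32(bus, 0, 0x08);
        if ((class_reg >> 16) == PCI_CLASS_HOST_BRIDGE)
            roots |= (uint16_t)(1u << bus);
    }
    return roots;
}

/* Simulated config space standing in for the hardware: host bridges on the
   six buses listed in the entry, nothing elsewhere. IDs are placeholders. */
uint32_t fake_read32(uint8_t bus, uint8_t devfn, uint8_t reg)
{
    static const uint8_t bridge_buses[] = { 0x00, 0x01, 0x02, 0x05, 0x07, 0x09 };
    if (devfn != 0)
        return 0xffffffffu;
    for (unsigned i = 0; i < sizeof bridge_buses; i++) {
        if (bridge_buses[i] == bus) {
            if (reg == 0x00) return 0x12345678u;  /* placeholder IDs */
            if (reg == 0x08) return 0x06000000u;  /* class 0x06, subclass 0x00 */
            return 0;
        }
    }
    return 0xffffffffu;
}
```

Each bit set in the returned mask marks a root bus from which the normal per-bus device scan can start.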

The eServer 440x is an awesome piece of kit, weighing about 50kg, with twin redundant fans and power supplies, and mirrored SCSI RAID drives. It disassembles like a field gun into huge chunks, the most impressive being the processor/memory module with the 4 Xeons and 16 DIMM slots.

24/4/2009 IBM eServer 440x Quad Xeon Astonishingly, ChaOS booted first time from CD!

24/4/2009 IBM eServer 440x Quad Xeon Added this vintage server to my collection, to further ChaOS MP and SCSI development.

21/4/2009 LSI1030/MPT Fusion To test MPT driver, tried a routine to read all sectors of a SCSI hard disk. MPT driver hangs after 282 interrupts, which happens to be 256 plus 13 interrupts each for the init code for each channel. I think the reply fifo depth must be 256, and I am stuffing unnecessary reply addresses in the fifo. Reply fifo needs to be primed, but IOC only needs an extra address after a reply is received with the top bit set. When this happens, a write to the reply fifo inside the interrupt handler seems to fix the problem.
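
The fix in this entry can be pictured with a small simulation. This is only a sketch of the logic as described: the adapter is modelled by a struct, the "empty" value 0xffffffff and the names `read_reply_fifo`/`write_reply_fifo` are assumptions, and the way the frame address is encoded in the reply word is not modelled. The point is that the handler hands a frame back only when a reply arrives with the top bit set.

```c
#include <assert.h>
#include <stdint.h>

#define REPLY_FIFO_EMPTY 0xffffffffu   /* assumed "nothing queued" value */
#define ADDRESS_REPLY    0x80000000u   /* top bit: a reply frame was consumed */

/* Simulated adapter state; on hardware these would be register accesses. */
struct sim_ioc {
    const uint32_t *pending;    /* replies the adapter has queued */
    unsigned count, next;
    unsigned frames_returned;   /* writes made to the reply FIFO */
};

static uint32_t read_reply_fifo(struct sim_ioc *ioc)
{
    return ioc->next < ioc->count ? ioc->pending[ioc->next++] : REPLY_FIFO_EMPTY;
}

static void write_reply_fifo(struct sim_ioc *ioc, uint32_t frame)
{
    (void)frame;
    ioc->frames_returned++;
}

/* Interrupt-time drain: returns the number of replies handled. */
unsigned drain_replies(struct sim_ioc *ioc)
{
    unsigned handled = 0;
    uint32_t r;
    assert(ioc != 0);
    while ((r = read_reply_fifo(ioc)) != REPLY_FIFO_EMPTY) {
        handled++;
        if (r & ADDRESS_REPLY)
            write_reply_fifo(ioc, r & ~ADDRESS_REPLY);  /* give the frame back */
        /* otherwise no frame was used, so nothing is re-posted */
    }
    return handled;
}

/* Example run: two replies without the top bit, two with it. */
unsigned example_frames_returned(void)
{
    static const uint32_t replies[] = {
        0x00000011u, 0x80001000u, 0x00000012u, 0x80002000u
    };
    struct sim_ioc ioc = { replies, 4, 0, 0 };
    drain_replies(&ioc);
    return ioc.frames_returned;
}
```

Posting an address on every interrupt instead would add one entry per reply regardless of whether a frame had been used, which matches the overflow described above.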

17/4/2009 IP Had a go at pinging a remote ChaOS session over the internet. Did not work, because router returned the ping. I will try a UDP packet tomorrow.

16/4/2009 LSI1030/MPT Fusion Success of 13/4/2009 a little premature - whilst I had a successful SCSI hard disk read for the first time, the same code from cold boot delivered only odd bytes from the disk sector and the IOC returned a SCSI DATA UNDERRUN error. Same thing on SCSI INQUIRY data returned by the adapter. Clearly this is an error in DMA, but had me foxed for a day or two. Workaround: The last thing I had done on 13/4/2009 was to clean up all my test functions until no errors were present. Today, after an adapter reset, I deliberately sent a corrupt SCSI INQUIRY message to force the card into an error reply. A second correct SCSI INQUIRY command worked perfectly. Update: Never having programmed SCSI before, I am thinking TEST UNIT READY might be a requirement before media access. By calling TEST UNIT READY immediately before READ10 or INQUIRY, everything is hunky dory. But: DMA error still persists unless adapter reset is sent to devfn1 after devfn0 reset had finished. Surely there has to be a quicker way of initialising this hardware than to wait 24 seconds. Still, the LSI takes ages to boot up, so perhaps I have to accept that is just the way it is.

13/4/2009 LSI1030/MPT Fusion Added adapter reset to SCS0 driver init, with inreset flag to await doorbell interrupt when adapter goes ready. This allows startup code to move on and install SCS1 driver and interrupt handler for second port (cable) on the LSI1030. When the reset-complete doorbell interrupts occur, both drivers go ready. MPT message formatting was a pain to conquer; strangely, on SCS1 the adapter requires the PCI devfn of SCS0 in the message. Needed to set 0x40 bit in lun[1] for some unknown reason. Control field, despite being heavily featured in the Linux drivers, needs to be set to zero, otherwise results are pants. Basic SCSI IO commands now working: INQUIRY, TEST UNIT READY, START STOP UNIT, and just managed a READ10 to access a hard disk on this SCSI adapter. Happy Days.

10/4/2009 LSI1030/MPT Fusion The story so far: in the absence of a technical reference on the LSI SCSI adapter in my Dual Xeon machine, I have been sending random messages to the adapter until I receive a reply with iocstatus set to 0x0000 i.e. success. Incorrectly structured messages usually result in 0x0004 (INTERNAL ERROR), so a trial and error exercise then establishes which fields in the request message are essential. At boot-time, the adapter reply fifo contains addresses posted by the BIOS. These could be cleared by sending a random message without pushing an address into the reply fifo, until the message times out because the reply fifo is empty. At the moment I do a hard adapter reset but this takes 12 seconds before the adapter doorbell announces the adapter to be ready again. Then, send IOCINIT with who=4 (HOST DRIVER, but can be any number I think) and repsiz=0x50. This message returns no information, but a successful IOCINIT message and reply seem to be essential otherwise subsequent messages, although returning valid data, have iocstatus set to INTERNAL ERROR. Next send a blank IOCFACTS message, and the adapter sends back the 80-byte IOCFACTSREPLY structure with who set to the value received by the adapter on the previous IOCINIT message.

3/4/2009 USB/Fujitsu Stylistic ST4110 Reprogramming the PCI IRQ routing registers to make the Cardbus slot work (4/1/2009) causes USB interrupts to be lost. I will try a little more generic code in the PCI HST driver to check device PCI IRQ# and the respective routing register during DEV init. Hopefully the only box failing this test will be the Stylistic, at which point a little corrective code can be inserted.

1/4/2009 Had a break from ChaOS to deal with the effects of the credit crunch on my business - banks start to turn the screws on their remaining customers because of all the money lost due to imprudent lending, i.e. no warning, and charging me 6.5% over base rate instead of the agreed 3.5%. Totally predictable, because CTPP has been in business for 24 years, and I have been on the receiving end of a lending slowdown before. Last time I paid a high price for not reacting fast enough to the slowdown, carried on paying employees instead of making them redundant, hoping to muddle through. This time it is different: the signs of an impending slowdown were obvious well before Gordon Brown became Prime Minister. I bit the bullet nearly four years ago, and made my last employees redundant. This allows me to clear my overdraft from savings the day before the bank starts to charge 29.25% interest as the current agreement expires. Correction - because of the case with the OFT on unfair overdraft charges, the bank now calls my overdraft unplanned borrowing. The name of my bank: Yorkshire/Clydesdale. I have had to live through the period when Sir Fred Goodwin ran this bank, before he moved on to take down the Royal Bank of Scotland. Fortunately I had the good sense to get the deeds to my factory out of the bank when my bank manager was made redundant a few years ago in the Goodwin shredding fest. A shame really. Property worth 500,000 pounds sterling, no debt, banking with a call centre 200 miles away in Scotland staffed by robots.

16/2/2009 COM Posted source code to COM driver in the ChaOS Source Index.

11/2/2009 AC97/Fujitsu Stylistic ST4110 As usual the Stylistic has a quirk - trying to read the AC97 mixer registers at boot time hangs the machine! This is because the AC97 Cold Reset bit (NABMBAR+0x2c)&1 is not set by BIOS on power-up. Programming the Stylistics can be infuriating when these quirks are first encountered. This is why I love and hate the Stylistic in equal measure.

8/2/2009 AC97/PCM/ADPCM ChaOS has no sound support at the moment, but since my (new) Dual Xeon machine, my 845 development workhorse and my Fujitsu Stylistic ST4110 all have AC97 codecs, I have been revisiting work done three years ago decoding WAV files and passing them to the AC97 hardware. PCM is simple enough, as 16-bit stereo files are more or less the same as CD audio. ADPCM is more of a challenge, but I managed to get a decode for Microsoft 4-bit ADPCM mono working, other ADPCM formats are just variations of that basic algorithm. The ChaOS device driver system has come a long way in the last couple of years, so it will be easy now to add a sound driver with capability to stream audio to the AC97 bus master to provide background sounds or continuous music.
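
For reference, a sketch of the per-nibble step of Microsoft 4-bit ADPCM decoding, using the standard adaptation and coefficient tables for that format. It is not the ChaOS decoder: the struct layout and function names are illustrative, block-header parsing is left out, and rounding of negative predictions may differ slightly from other implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Standard Microsoft ADPCM tables (fixed point, scaled by 256). */
static const int adapt_table[16] = {
    230, 230, 230, 230, 307, 409, 512, 614,
    768, 614, 512, 409, 307, 230, 230, 230
};
static const int coef1[7] = { 256,  512, 0, 192, 240,  460,  392 };
static const int coef2[7] = {   0, -256, 0,  64,   0, -208, -232 };

struct adpcm_state {
    int predictor;   /* coefficient set index from the block header, 0..6 */
    int delta;       /* current step size */
    int sample1;     /* most recent output sample */
    int sample2;     /* the one before that */
};

/* Decode one 4-bit code and return the 16-bit PCM sample. */
int16_t adpcm_decode_nibble(struct adpcm_state *s, unsigned nibble)
{
    assert(s->predictor >= 0 && s->predictor < 7 && nibble < 16);
    int code = (nibble & 8) ? (int)nibble - 16 : (int)nibble;  /* sign-extend */
    int pred = (s->sample1 * coef1[s->predictor] +
                s->sample2 * coef2[s->predictor]) / 256;
    int out = pred + code * s->delta;
    if (out > 32767)  out = 32767;
    if (out < -32768) out = -32768;

    s->delta = adapt_table[nibble] * s->delta / 256;  /* adapt the step size */
    if (s->delta < 16)
        s->delta = 16;
    s->sample2 = s->sample1;
    s->sample1 = out;
    return (int16_t)out;
}

/* Example: coefficient set 0, previous sample 100, step size 16. */
int16_t example_decode(unsigned nibble)
{
    struct adpcm_state s = { 0, 16, 100, 0 };
    return adpcm_decode_nibble(&s, nibble);
}
```

With these starting values, code 1 gives 116 and code 0xF (minus one) gives 84.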

1/2/2009 SCSI/PCI BUSES I have been researching the SCSI on my Dual Xeon machine. Also there are lots of powerful SCSI machines around going cheap, ex-corporate servers and it will be fun to get ChaOS booting from SCSI. Until now my only brush with SCSI was using ASPI to run the old Sharp JX330 scanners (interrupt 0x33 over DOS I seem to remember), so I have never got down and dirty with SCSI hardware before. Had to rewrite PCI scan to handle the multiple PCI buses on the Xeon mainboard, never quite understood PCI secondary and subordinate bus numbers until now, you really need to run your code on one of these machines to appreciate how configuration cycles propagate through the PCI bridges.
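
The role of the secondary and subordinate bus numbers can be sketched as a single routing decision made by each PCI-PCI bridge. The struct and function names here are illustrative rather than taken from ChaOS; the register offsets in the comments are those of the standard bridge header.

```c
#include <assert.h>
#include <stdint.h>

enum bridge_action {
    BRIDGE_IGNORE,        /* target bus is not behind this bridge */
    BRIDGE_TO_TYPE0,      /* target is the bridge's own secondary bus */
    BRIDGE_FORWARD_TYPE1  /* target lies deeper: pass the cycle on unchanged */
};

struct pci_bridge {
    uint8_t primary;      /* config register 0x18 */
    uint8_t secondary;    /* config register 0x19 */
    uint8_t subordinate;  /* config register 0x1a */
};

/* What a bridge does with a Type 1 configuration cycle aimed at target_bus. */
enum bridge_action route_config_cycle(const struct pci_bridge *b, uint8_t target_bus)
{
    assert(b->secondary <= b->subordinate);
    if (target_bus == b->secondary)
        return BRIDGE_TO_TYPE0;
    if (target_bus > b->secondary && target_bus <= b->subordinate)
        return BRIDGE_FORWARD_TYPE1;
    return BRIDGE_IGNORE;
}

/* Example bridge: bus 0 on the primary side, buses 1..3 behind it. */
const struct pci_bridge example_bridge = { 0, 1, 3 };
```

A scan that assigns bus numbers therefore has to set the subordinate number high enough to cover every bus found behind the bridge, otherwise cycles for the deeper buses never get through.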

LSI 1020/1030 SCSI adapter uses MPI (Message passing interface), which embeds a SCSI CDB (Command data block) within a Message Frame wrapper. It's starting to make sense, shouldn't take long to hack a function to read sectors from SCSI hard disks.
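
As an illustration of the CDBs that end up inside the message frame, here is a sketch that builds the three commands mentioned in this diary. The helper names are made up; the byte layouts are the standard SCSI ones. The MPI wrapper around the CDB is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCSI_TEST_UNIT_READY 0x00
#define SCSI_INQUIRY         0x12
#define SCSI_READ10          0x28

/* TEST UNIT READY: 6-byte CDB, every byte zero (the opcode is 0x00). */
void build_test_unit_ready(uint8_t cdb[6])
{
    memset(cdb, 0, 6);
    cdb[0] = SCSI_TEST_UNIT_READY;
}

/* INQUIRY: 6-byte CDB, allocation length in byte 4. */
void build_inquiry(uint8_t cdb[6], uint8_t alloc_len)
{
    memset(cdb, 0, 6);
    cdb[0] = SCSI_INQUIRY;
    cdb[4] = alloc_len;
}

/* READ(10): 10-byte CDB, big-endian LBA in bytes 2-5, block count in 7-8. */
void build_read10(uint8_t cdb[10], uint32_t lba, uint16_t blocks)
{
    memset(cdb, 0, 10);
    cdb[0] = SCSI_READ10;
    cdb[2] = (uint8_t)(lba >> 24);
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[7] = (uint8_t)(blocks >> 8);
    cdb[8] = (uint8_t)blocks;
}

/* Example: byte i of a READ(10) for 8 blocks starting at LBA 0x00012345. */
uint8_t example_read10_byte(unsigned i)
{
    uint8_t cdb[10];
    assert(i < 10);
    build_read10(cdb, 0x00012345u, 8);
    return cdb[i];
}
```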

19/1/2009 CTPPNET/VESA Wrote V2 utility, running in VESA graphics screen to display the screen contents of a remote computer, and using REMOTEQMSG to poke keystrokes into the system message queue of the remote. Remote computer now can be controlled over the network, useful for editing source and recompiling. Removed network poll block whilst in debugger, so remote computer can be debugged over the network. The PAUSE/BREAK key causes the local computer to break into the local debugger, so to enter the remote debugger I added a button which sends a REMOTEDEBUGTRAP message. Great Fun!

13/1/2009 CTPPNET My old ChaOS network was originally designed for short peer-to-peer messages (such as machine control packets) around the factory. It was never fast because each command packet generates a reply. Increased polling frequency in network TSR speeds things up by a factor of 10, allowing file transfer at 1Mb/sec, which is fast enough to propagate and reboot new versions of ChaOS over a network for development purposes. Used some empty space in the partition table to store a network name - so net user names exist immediately and don't need a filesystem - also this means I don't get confused as to which drive I have in which machine (all my development drives are in removable caddies).

10/1/2009 CTPPNET A few mods to ChaOS allow old CTPPNET packets to be routed through the new network device drivers, thus re-enabling CTPP peer-to-peer networking. Had to reduce sector count for remote drive accesses from 3 to 2 as router truncates ethernet packets. Can now install ChaOS over a network in 8 seconds. Modified network command processor to remain active even when stopped in the debugger. Added new REMOTEQMSG command to place keystrokes in the remote system queue. Because the ChaOS debugger now uses the system message loop for keystrokes, remote debugging is also possible.
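
One plausible reading of the sector-count change is simple arithmetic: three 512-byte sectors come to 1536 bytes, which is more than the 1500-byte payload of a standard Ethernet frame, while two sectors fit with room for a header. The sketch below assumes that reading; the CTPPNET header size is not given here, so `header_bytes` is a placeholder.

```c
#include <assert.h>

#define SECTOR_SIZE 512u

/* Whole sectors that fit in one packet payload after the protocol header.
   header_bytes is a placeholder: the CTPPNET header size is not given. */
unsigned sectors_per_packet(unsigned payload_limit, unsigned header_bytes)
{
    assert(header_bytes <= payload_limit);
    return (payload_limit - header_bytes) / SECTOR_SIZE;
}
```

With a 1500-byte limit the result is 2 for any header up to 476 bytes.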

4/1/2009 MAP5/VESA/GPS MAP5 now running Fujitsu Stylistic ST4110 with CF GPS adapter in Cardbus slot. A couple of extra mods to ChaOS to make all this work - 1: added custom PCI register setups to ChaOS HST device to fix the PCI IRQ routing on the Stylistic. By matching PCI Host Bridge vendor ID and model number, user-defined values can now be poked into PCI register before any devices are initialised; 2: added custom driver for touch screen pen device to COM.DRV. MAP5 GPS logging and replay working too.

30/12/2008 COM/Digitiser Digitiser coordinates in PTR (pen/mouse device structure) were borrowed during VESA development to hold VESA display size. Modified mousevent() to use VESA or VGA pointers to calculate mouse position directly from the graphics mode - pointing device structures no longer need to be fiddled with each time the graphics mode changes.

29/12/2008 GPS (MAP5) /VESA MAP5 project (porting MAP4 to VESA graphics) 70% done. This was scarily easy, about 6000 lines of code revised in one day. GPS moving map, all the main displays up and running in VESA modes from 640x480x8 to 1280x1024x32. Added transform() to WND system, a function to apply a 2x2 transformation matrix, if required, to all WNDs during screen blit for painless switch between portrait and landscape modes. Tried Radeon 7000 AGP card, just one issue in getfreeromseg() due to odd VGA ROM size, caused cardbus driver to fail, easy enough to fix. Similar romseg issue on Fujitsu Stylistic ST4110, then worked no problems. Just need a DRV to run the touch screen on the ST4110 and the system is ready to try in the plane.
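
A sketch of how a 2x2 matrix can switch between landscape and portrait coordinates. This is not the ChaOS transform() itself: the type and function names are illustrative, and the shift added after the rotation (to keep coordinates non-negative) is an assumption, since a 2x2 matrix alone cannot express it.

```c
#include <assert.h>

struct point { int x, y; };
struct mat2  { int a, b, c, d; };   /* rows (a b) and (c d) */

/* Apply the matrix: x' = a*x + b*y, y' = c*x + d*y. */
struct point transform_point(struct mat2 m, int x, int y)
{
    struct point r;
    r.x = m.a * x + m.b * y;
    r.y = m.c * x + m.d * y;
    return r;
}

/* Map a landscape pixel to portrait for a screen `height` pixels tall:
   a quarter turn, then a shift so x stays in 0..height-1. */
struct point to_portrait(int x, int y, int height)
{
    const struct mat2 quarter_turn = { 0, -1, 1, 0 };
    struct point r;
    assert(y >= 0 && y < height);
    r = transform_point(quarter_turn, x, y);
    r.x += height - 1;
    return r;
}
```

For a 640x480 landscape screen, the corner (0,0) maps to (479,0) and the opposite corner (639,479) maps to (0,639).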

28/12/2008 GPS/VESA Multiple monitor support now means I can finally start MAP5 project, porting GPS moving map to VESA graphics. Revised GPS support in COM driver to provide separation between GPS devices (to allow for multiple GPS inputs, may be useful later to build redundancy into my aviation GPS systems). NMEA parsing now done inside the COM driver. Added system-wide GPS message generated by driver. MAP5 will be able to get GPS info from the message queue, instead of by callback from the COM driver. Fixed incorrect .drvtbl length in COM driver. Took a couple of hours to realise (you idiot)  that VBE Get Palette Data function (needed for 8-bit graphics modes) returns incorrect data when called with a linear graphics mode active - I had been trying to get the palette data before changing modes. Call Get Palette data just after ROM init instead - no longer any need to access palette each time graphics mode is changed to 8 bit.
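
One small part of NMEA parsing that is easy to show in isolation is the checksum test: the two hex digits after `*` must equal the XOR of every character between `$` and `*`. The function below is a standalone sketch with an illustrative name, not the code in the COM driver.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 when the sentence's checksum field matches, 0 otherwise. */
int nmea_checksum_ok(const char *sentence)
{
    assert(sentence != 0);
    if (sentence[0] != '$')
        return 0;
    const char *star = strchr(sentence, '*');
    if (star == 0 || strlen(star) < 3)
        return 0;                            /* no "*hh" field */
    if (!isxdigit((unsigned char)star[1]) || !isxdigit((unsigned char)star[2]))
        return 0;

    unsigned char sum = 0;
    for (const char *p = sentence + 1; p < star; p++)
        sum ^= (unsigned char)*p;            /* XOR of everything between $ and * */

    char hex[3] = { star[1], star[2], '\0' };
    return strtoul(hex, 0, 16) == sum;
}
```

Sentences that fail the test can be dropped inside the driver before any GPS message is generated.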

27/12/2008 Multiple Monitors/Cursors Multiple cursor support is built in to ChaOS as the Fujitsu Stylistics have built-in digitisers, but having multiple screens exposes a flaw in my design - cursors were sharing a data area for status and position, which is wrong. Redefined cursor internal structure to cure this. Also looked at cursor drawing routines - do I draw the graphics cursor inside the device interrupt handler for instant response, or each time a mtMM (mouse move) message is generated? Questions like this make my brain hurt. Using mtMM messages only works fine, but I know this only applies as long as the application regularly calls getmsg(). How often do we see Microsoft Windows go unresponsive when an app fails to do this? So I think both is the answer, with a flag to stop the cursor update from being re-entered. But part of me says use mtMM only, to ensure that the code I write is properly GUI-aware. This requires a different approach to programming, in which no function should ever take more than a few milliseconds to execute. Reading a large file into memory, for example, could take several seconds, so we need multiple short calls to a function which manages a state-machine for the file-load process, with indication of the time to completion (hopefully better than Microsoft Minutes).
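
The file-load state machine mentioned at the end of this entry might look something like the sketch below. All names are hypothetical, and an in-memory buffer stands in for the file so the example is self-contained; a real loader would read a few sectors per call instead of calling memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum load_state { LOAD_BUSY, LOAD_DONE };

struct loader {
    enum load_state state;
    const unsigned char *src;   /* stands in for the file on disk */
    unsigned char *dst;
    size_t total, done, chunk;  /* chunk = bytes moved per call */
};

void load_begin(struct loader *l, const void *src, void *dst,
                size_t total, size_t chunk)
{
    assert(chunk > 0);
    l->state = total ? LOAD_BUSY : LOAD_DONE;
    l->src = src;
    l->dst = dst;
    l->total = total;
    l->done = 0;
    l->chunk = chunk;
}

/* Do one short slice of work; returns percent complete (0..100). */
unsigned load_step(struct loader *l)
{
    if (l->state == LOAD_BUSY) {
        size_t n = l->total - l->done;
        if (n > l->chunk)
            n = l->chunk;
        memcpy(l->dst + l->done, l->src + l->done, n);
        l->done += n;
        if (l->done == l->total)
            l->state = LOAD_DONE;
    }
    return l->total ? (unsigned)(l->done * 100 / l->total) : 100;
}

/* Example: count the calls needed for a 10-byte load in 4-byte chunks.
   In a GUI program each call would be made from the message loop. */
unsigned example_steps(void)
{
    static const unsigned char file[10] = "ChaOS-GUI";
    unsigned char buf[10];
    struct loader l;
    unsigned steps = 0;
    load_begin(&l, file, buf, sizeof file, 4);
    while (l.state != LOAD_DONE) {
        load_step(&l);
        steps++;
    }
    return memcmp(file, buf, sizeof file) == 0 ? steps : 0;
}
```

The percentage returned by each step is what a progress indicator would display.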

24/12/2008 Multiple Monitors It has taken a week to implement multiple VGA, but I can now run the GUI on one screen and the debugger on another. VESA struct in VGA struct reorganised, and changed to VESA* (pointer). Bootstrap can set a graphics mode before ChaOS starts. VGA->VESA* set to VESA area in the bootstrap for the one PCI or AGP VGA device active at boot time. (because boot VGA is not necessarily the first VGA device detected during PCI scan!).

22/12/2008 Multiple monitors VESA structure added to each VGA device, boot device uses pointers to the VESA info blocks in the bootstrap. VGA card on AGP identified by parent device being Bus 0 Dev 0 Function 1. Not sure if this will work for all mainboards, but it identifies the AGP bridge on all my computers. Int 0x10 vector for each VGA BIOS ROM saved in the VGA structure. Observe PCI BIOS INIT calling convention - Set AH:AL register to bus:devfn of PCI device before calling ROMSEG:0x0003 (old MACH64 PCI card needs this). Function added to pmrm to call real-mode far pointer as an interrupt (pushf then call <seg:off>). Video BIOS for each VGA card now accessed by a direct call to the appropriate BIOS ROM rather than through int 0x10.

21/12/2008 Multiple monitors Second monitor BIOS ROM init running using the mechanism described below. When one VGA is on AGP and the other is on PCI both VGAs can be active at the same time, just touch the VGA_EN bit on the AGP bridge to switch between the two. Extra field added to bootstrap to record interrupt 0x10 vector during cold boot. Boot VGA device identified by checking PCI command register at PCI[4]&7 to find which adapter is active on startup. Boot VGA device is correctly identified irrespective of the order in which VGA devices appear during the PCI device scan. For hot reboot, all but the boot VGA device have PCI[4]&7 reset to zero (simulating the cold boot condition), so the bootstrap sequence is repeatable. Each VGA device now needs a VESA substructure.

20/12/2008 Multiple Monitors Revisiting work done years ago on multiple ATI display cards. Since modern cards have VESA compatibility it should be possible to run multiple monitors without card-specific code. At boot time only one VGA card is active, and this can be determined by checking the PCI Command reg bits 2-0 at PCI[4]. If zero, the card is dormant, but can be initialised by its own ROM. First map the ROM into linear address space: aperture=PCI[0x10]; PCI[0x10]=0 (set aperture to zero so it does not interfere with memory cycles accessing ROM); PCI[0x30]=aperture|1 (set ROM address to aperture and enable bit 0). Find a free space in option ROM memory. Set the PAM registers on the PCI Host bridge to Read/write, copy the ROM from the aperture to option ROM space, save ROM address to PCI[0x30] (used on hot reboot to indicate option ROM is ready to use), and reset aperture (PCI[0x10]=aperture). Next, disable any active VGA devices on the PCI bus (PCI[4]&7 = 0). If the card to be initialised is on the AGP, set the Host Bridge to forward VGA accesses to AGP. Save real mode interrupt 0x10 vector. Initialise the card by realmodefarcall to ROMSEG:0x0003, then reset PAM registers to lock down the ROM image. Note this will overwrite real mode interrupt vector 0x10. Call video BIOS to get VESA mode info, and set video mode as required. This needs to be MDA or a linear graphics mode to avoid a resource conflict when the primary VGA is re-enabled. Graphics modes can be changed as desired on the cards by reinstating real mode vector 0x10, disabling PCI io (PCI[4]&7 = 0) to the other cards and calling video BIOS as normal. (22/12/2008, int10 ROM entry point can now be called directly.)
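
The first decision in this procedure, telling the active boot VGA from dormant cards by the low three bits of the PCI command register, can be sketched as follows. The `pci_dev` struct is a simplified stand-in for whatever the PCI scan records, and `rom_init_ax` packs bus and devfn as the AH:AL convention in the 22/12/2008 entry describes; none of the names come from ChaOS.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CLASS_VGA   0x0300u  /* display controller, VGA compatible */
#define PCI_CMD_ENABLES 0x0007u  /* I/O, memory and bus-master enable bits */

struct pci_dev {
    uint8_t  bus, devfn;     /* devfn = (device << 3) | function */
    uint16_t class_code;     /* class:subclass from config register 0x0a */
    uint16_t command;        /* config register 0x04 */
};

/* Index of the VGA device that firmware left active, or -1 if none. */
int find_boot_vga(const struct pci_dev *devs, int count)
{
    assert(devs != 0 && count >= 0);
    for (int i = 0; i < count; i++)
        if (devs[i].class_code == PCI_CLASS_VGA &&
            (devs[i].command & PCI_CMD_ENABLES) != 0)
            return i;
    return -1;
}

/* AX value for the option ROM init call: AH = bus, AL = devfn. */
uint16_t rom_init_ax(const struct pci_dev *d)
{
    return (uint16_t)((d->bus << 8) | d->devfn);
}

/* Example scan result: a dormant PCI card found first, the active card second. */
const struct pci_dev example_devs[] = {
    { 0, (9 << 3) | 0, PCI_CLASS_VGA, 0x0000 },  /* dormant */
    { 1, (0 << 3) | 0, PCI_CLASS_VGA, 0x0007 },  /* boot VGA */
};
```

Because the test looks at the command register rather than scan order, the boot device is found wherever it appears in the list.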

18/12/2008 GUI Wondering about adding soft corners to windows in the GUI, to make them look more modern. To do this properly the corners need a bit mask to determine when the mouse pointer is in or out of the curved windowframe, rather than just a simple inrectangle test. But since all windows are memory buffered, and I have for years reserved the colour 0xfefefe as transparent (this is how ChaOS mouse cursors and icons are masked on to the screen, instead of using a separate mask as per MS Windows), the curved edges can be created by setting the relevant parts of the MEMBUF to transparent, and adding an extra test to under( ) (the function which returns the WND* of the window under the cursor) to ignore a window where the MEMBUF is showing a transparent pixel. Simple, elegant and fast. Moreover, using this mechanism, windows can be any shape, just like icons and cursors. This whole idea fits with the standardization of ChaOS bitmaps, icons and cursors into new ftype 0x00000007 last week.
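
A sketch of the extra test in under( ), assuming a much-simplified window structure. The `struct wnd` fields here are hypothetical (the real WND is not shown in this diary) and windows are kept in a plain top-to-bottom list rather than a tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRANSPARENT 0xfefefeu   /* reserved colour from the entry */

/* Simplified stand-in for a ChaOS window; field names are hypothetical. */
struct wnd {
    int x, y, w, h;             /* screen rectangle */
    const uint32_t *membuf;     /* w*h pixels, 0x00RRGGBB */
    struct wnd *below;          /* next window down in focus order */
};

/* Topmost window showing an opaque pixel at (px, py), or NULL. */
struct wnd *under(struct wnd *top, int px, int py)
{
    for (struct wnd *w = top; w != NULL; w = w->below) {
        int lx = px - w->x, ly = py - w->y;
        assert(w->membuf != NULL);
        if (lx < 0 || ly < 0 || lx >= w->w || ly >= w->h)
            continue;                           /* outside the rectangle */
        if ((w->membuf[ly * w->w + lx] & 0xffffffu) == TRANSPARENT)
            continue;                           /* cut-away pixel: look below */
        return w;
    }
    return NULL;
}

/* Example: a 2x2 window whose top-left pixel is transparent, over a 4x4 one.
   Returns 2 for the front window, 1 for the back one, 0 for neither. */
int example_hit(int px, int py)
{
    static const uint32_t front_px[4] = { TRANSPARENT, 0x336699, 0x336699, 0x336699 };
    static const uint32_t back_px[16] = { 0 };  /* all black, opaque */
    struct wnd back  = { 0, 0, 4, 4, back_px, NULL };
    struct wnd front = { 1, 1, 2, 2, front_px, &back };
    struct wnd *hit = under(&front, px, py);
    return hit == &front ? 2 : hit == &back ? 1 : 0;
}
```

The same loop handles arbitrarily shaped windows, since only the pixel under the cursor is ever inspected.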

17/12/2008 Dec 2008 Demo Simple GUI up and running based on WND project and using VESA linear graphics modes. Windows are moveable, sizeable, hierarchical and can be overlapped. All drawing is to MEMBUFs, screen is drawn by traversing the window tree and blitting the MEMBUFs to the linear graphics aperture in reverse focus order. Highlight under cursor is supported by redrawing window on mtOVER, mtGONE or mtMM (mouse moved) message when in focus (i.e. on top), and doing a blit of the window straight to the graphics memory. Fit to window working now for 8,15,16,24 and 32 bit VESA modes. Reworking HTML parsing function tokenise( ) to display simple HTML files (such as ChaOS source files) in these windows.

16/12/2008 Multiprocessor support Bought an old DELL Twin Xeon workstation to begin ChaOS multiprocessor support. ChaOS boots fine, but obviously is only using one CPU. It shouldn't be too difficult to set the second CPU to run a short test loop to see it working, then improve the debugger to see it single-stepping. Interrupts will be a bit harder. I've already had ChaOS running on the APIC, but more work is needed to set up the APIC on the second CPU. Then it should be possible to load another copy of ChaOS for the second CPU to run, with the prospect of being able to single step the whole bootstrap, including the real-to-protected mode switch which to date has been a black art.

13/12/2008 Dec 2008 Demo Reworking some of the old L4 code to introduce a graphical element to the next demo. Standardised CUR, ICO and LYR file formats into one new format CTPP ftype 0x00000007 (ftLYR). Image fit to window working for 24-bit ftLYR to 24-bit VESA mode. Simple job now to expand this template to handle 15,16 and 32 bit VESA modes.

7/12/2008 Zen cPanel Could not quite understand why HTML files are taking so long to update. Just discovered HTML editor in cPanel has inserted hundreds of empty <span> and <font> tags into this file and no doubt all my others. Do I have to write my own HTML editor as well??????

Pulled this file from site using ftp, and cleaned it in the ChaOS source editor. Size reduced from 240k to 3.5k!!!!!!

5/12/2008 ISO CD Revising ISO CD image construction to create multiple partitions on one ISO image. 48k extra space allocated at the start of ramdrive buffers for construction of ISO CD header. Complex CD compilations with multiple bootable partitions can now be constructed entirely in memory. The ramdrive buffer and ISO CD image are one and the same. Simply burn the ramdrive buffer to a CD to make a disk.

4/12/2008 HTTP/TCP Had some trouble backgrounding the TCP process whilst running a while(1){getmsg( );} loop. No problems if I put a debug breakpoint in the test program, which was curious. This was because once the debugger hit the breakpoint, it idles in getkey( ) which calls poll( ) continuously whilst waiting for a keypress - getmsg( ) was calling poll( ) only once before idling in a sub-block. poll( ) must be called regularly to dequeue network traffic. The code in getmsg( ) looked simple and harmless, but had become flawed when I recently introduced process polling to run the network TSRs.