ChaOS Diary Sep 2009 - Dec 2009


ChaOS Diary - monoblog and links to reference documents

Many golden nuggets lie herein.

29/12/2009 ChaOS type system: MORE TYPES! The ChaOS compiler and linker need expanding to encompass 64-bit types, processor modes and opcodes. Have been revising type system, refining #defines for type flags and weeding out hard-coded values. The existing type system has two limitations: operand size is carried in the lower 4 bits of the type dword, and user-defined types are flagged at bit 12 and indexed by the lower 12 bits of the type dword. This limits operand size to 15 bytes (need more to encode 64-bit XMM instructions), and user-defined types to a maximum of 4096. There are now around 3500 types in the current build of ChaOS, and each process can carry another 4096 types, so this has not been a pressing issue. But a complete reorganisation is easier with an OS image that has the compiler and linker built in, because the whole type system can then be revised in one pass. Problem is, the compiler and linker need another 1500 types, and the resulting 5000 types are beyond the 4096 types per process limitation.

This is very nearly unfixable, because the OS contains the type system, which is the bedrock of type-safe dynamic linking. The compiler and linker use the operating system to generate types.

However: the compiler needed for a NEWTYPE system can be produced on the OLDTYPE system, by compiling existing compiler source code with a NEWTYPE header file. Similarly for the linker. We now have a special compiler and linker pair, for a non-existent NEWOS. With these two programs, NEWOS (with modified dynamic-link behaviour) can be compiled using the NEWTYPE header. Similarly for the source editor (not needed if the source code is error-free!). Finally, the compiler and linker recompile themselves AGAIN, to produce the native NEWTYPE compiler and linker for NEWOS. These programs will only run on NEWOS, and can then be used to recompile the editor, then other source code for NEWOS.

As a test run, I ran the above sequence with two type #define bits swapped, to completely alter the ChaOS type system. My old familiar types looked strange, but the new OS image runs normally. Then by moving tUSER to bit 13 (previously tREF), changing mUSER from 0xfff to 0x1fff, and running the above rebuild sequence, the number of user-defined types in ChaOS is increased to 8192 per process.
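The effect of moving the user-type flag and widening the index mask can be sketched in a few lines of C. Only tUSER and mUSER are names from the diary; the OLD_/NEW_ prefixes and the helper functions are assumptions made for illustration, and the real ChaOS type dword carries other flags not shown here.

```c
#include <stdint.h>

/* Hypothetical reconstruction of the two layouts described above. */
#define OLD_tUSER (1u << 12)   /* old user-type flag: bit 12 */
#define OLD_mUSER 0x0fffu      /* old index mask: 12 bits */
#define NEW_tUSER (1u << 13)   /* new user-type flag: bit 13 (previously tREF) */
#define NEW_mUSER 0x1fffu      /* new index mask: 13 bits */

/* Non-zero if the type dword is flagged as user-defined. */
static int is_user_type(uint32_t type, uint32_t t_user)
{
    return (type & t_user) != 0;
}

/* Index of a user-defined type within the per-process table. */
static uint32_t user_type_index(uint32_t type, uint32_t m_user)
{
    return type & m_user;
}

/* Number of distinct indices the mask can address. */
static uint32_t max_user_types(uint32_t m_user)
{
    return m_user + 1u;
}
```

With the old mask this gives 4096 indices, with the new one 8192, so the 5000 types needed by the combined image fit.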

The way is now clear to combine OS/compiler/linker into one bootable, executable image. It will be about 10Mb when compiled with full embedded source code, and will be the first true self-modifying ChaOS image.

19/12/2009 Dec2009 Demo The largest ISO so far, at 8.5Mb, dec2009.iso is a glimpse at the environment in which ChaOS is developed. This demo includes the ChaOS compiler and linker, and auto-extracts source files for the operating system image into a RAMdisk directory tree. A small HELLO program demonstrates the edit, make and debug cycle. The operating system can be rebuilt once and relaunched from the RAMDISK. As ever, source files and symbols for the operating system can be browsed in the system debugger.

18/12/2009 VESA graphics/Nov2009 Demo Just posted nov2009.iso, the new ChaOS VESA graphics kernel. This ISO demo is set to boot into VESA mode 0x101 (257), but the new VESA engine handles all 8-bit and 32-bit modes. ChaOS debugger now runs in frame 2 of the current VESA environment, instead of switching back to VGA text mode. All VESA frames are memory-buffered. Finally, I have a GUI system capable of debugging itself without switching graphics modes. The VESA debugger also runs in IA32e (64-bit) mode, which was a pleasant surprise. There is currently no easy way of doing a real-mode callback from IA32e mode. But IA32e processors are multi-core, so BIOS calls could be performed by one of the APs. I just tried it. Scarily easy.

3/12/2009 FTPSRV/ChaOS ISO Downloads FTPSRV is now stable enough over the LAN to transfer ISO downloads to a WinXP session, then forward over FTP to Zen web server. The object has always been to transfer files directly from ChaOS to the website without the use of a Microsoft box, and this should be achieved within the next week. Nov2008test.iso now revised to concur with current ChaOS version. Oct2008.iso is a little more of a challenge, as it used SVGA graphics, which need to be ported to VESA.

3/12/2009 FTPSRV Added STOR support so FTPSRV can accept file uploads. Fixed 1/12/2009 problem of dropped directory entries during FTP LIST data transfer, which was due to an incorrect sequence number on data channel FIN packet.

1/12/2009 FTPSRV Max TCBs on chaos.ctpp.co.uk server increased to 45, to cope with the many PASV commands issued by browsers during FTP sessions. Packet interval timers increased massively to transfer files over the internet. Over the LAN a 3Mb transfer takes .36 seconds, over the internet it takes 3 minutes at the moment but at least it works! Some directory entries are being dropped during internet browsing of FTPSRV, not yet worked out why.

29/11/2009 FTPSRV/TCP checksums ChaOS now has a simple FTP server, which allows remote management of chaos.ctpp.co.uk. Bomb-testing FTPSRV using a Windows XP box to transfer larger files (>1Mb) exposed a bug in TCP checksum calculations. It's logical to accumulate all the 16-bit carry bits in the top half of a 32-bit data entity, and add them to the checksum just before converting to one's complement. But this addition can cause one further carry. I am sure I am not the first to make this mistake, resulting in approximately one in 256 TCP checksums being wrong by one bit. This error can be masked by TCP/IP packet retransmission, if the retry carries a different ip->pktno, resulting in a different checksum. Setting the IP and TCP checksum offload bits in the RTL8101, to get the network chip to do the checksums for me didn't clear this bug.
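A minimal sketch of the fix, assuming the usual RFC 1071 style of summing big-endian 16-bit words into a 32-bit accumulator. The function names are illustrative and not taken from IP.DRV, and the TCP pseudo-header is left out. The point is that the carries must be folded back in until none remain, which can take two passes.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry.
   The loop runs at most twice; folding only once is the bug described above. */
uint16_t fold_carries(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* One's complement checksum over big-endian 16-bit words.
   Assumes len is under 64k so the 32-bit accumulator cannot overflow. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len)                          /* odd trailing byte is padded with zero */
        sum += (uint32_t)data[0] << 8;

    return (uint16_t)~fold_carries(sum);
}
```

An accumulator of 0x0001ffff shows the problem: one fold gives 0x10000, which truncates to 0, whereas the second fold gives the correct 0x0001.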

18/11/2009 Nov 2008 ISO/Nehalem i7-920 Preparing to revise last year's ISO downloads, can't quite remember where I saved all my development files, so tried downloading nov2008test.iso straight from Zen server to a ChaOS session via HTP utility. I wrote HTP six or seven months ago to experiment with sending simple HTTP GET commands to web servers, usually 50-100k max file size. 2.6Mb download worked first time which was a complete surprise. Modified MC (makecd) to accept .iso image, burned image from the downloaded file, boot CD worked OK on Intel Atom, but hits a #GP fault on Nehalem i7-920 when reading MSR 0xcd (trying to read processor bus speed). The Nehalem MSRs were not featured in Intel SDM Vol 3b when I wrote this download last October.

18/11/2009 Unlimited Bootable partitions To capitalise on the recent work, modified bbd: (BIOS boot drive) driver to work for GPART EFI partitions anywhere on disk up to 2Tb. As mentioned before, due to the limitation of the MBR exec, a cold boot must be from an MBR partition; but a warm boot can now be performed by any ChaOS partition, on any BIOS boot device. This finally breaks the 4-partition limitation for creating and testing OS images on each disk device.

17/11/2009 Intel i7-920/IDE.DRV 6-SATA ports, 1 external SATA and a PATA controller on the i7 took a little while to master, with several different BIOS options for the IDE controllers on this mainboard (SATA.IDE, SATA Enhanced, AHCI and RAID). With write support for the BIOS boot drive (Fizbook - 28.10.2009), ChaOS can now be recompiled without a native hard disk driver. So this was an opportunity to develop a new IDE driver, and pull loads of scrappy code from years gone by out of the main ChaOS image. CHS IDE drives are now obsoleted, but can still be read/written using BIOS. IDE.DRV uses 48-bit LBA calls, and UDMA by default if ATA IDENTIFY shows the drive to be capable. So IDE.DRV takes maximum drive size from 128Gb (28-bit LBA) up to 2Tb (32-bit LBA), with scope to go to full 48-bit LBAs when the time comes.
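The drive-size limits quoted above follow directly from the LBA width and 512-byte sectors. A small sketch of the arithmetic; the function name is illustrative and not part of IDE.DRV.

```c
#include <stdint.h>

#define SECTOR_BYTES 512u

/* Largest addressable capacity in bytes for an LBA of the given width.
   Valid for lba_bits up to 54, so that the result fits in 64 bits. */
uint64_t lba_capacity_bytes(unsigned lba_bits)
{
    return ((uint64_t)1 << lba_bits) * SECTOR_BYTES;
}
```

A 28-bit LBA gives 128Gb, a 32-bit LBA gives 2Tb, and the full 48-bit LBA would give 128Pb.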

9/11/2009 Intel i7-920 Took delivery of my i7 box today, ChaOS DVD booted first time. This is the first time I have bought a development box purely for ChaOS. RTL8101.DRV connected to the internet immediately! Quad core 2.66GHz, 8 processor cores. The PATA on the ASUS 1366 PE mainboard is on bus 4, so ChaOS standard IDE will not find PATA drive; however, BIOS will boot the PATA, so BIOS boot drive can be used to recompile the OS (since it is now read/write - see 28/10/2009). First clean compile of OS took 19 seconds, but I want more speed. I suppose each of the 8 cores could compile modules simultaneously, using about 32Mb of RAM each. Kludged the JMicron PATA registers into bbd fast ata routine, so, once booted, PATA hard disk access is all protected mode. SATA hard drives will arrive tomorrow, which can emulate PATA on the standard IDE ports, but I will run ChaOS from bus 4 PATA whilst experimenting with SATA RAID configurations. Ran Int 0x15/0xe820 BIOS memory call to see all of the 6Gb of RAM in the machine. Remembered my old Compaq Portable286, which cost me about 700 quid to upgrade to 6Mb of RAM, back in the 80s. 20 years later, an 8-core CPU with 1000 times the RAM costs less than a 1989 memory upgrade.

8/11/2009 CDR/DVD+R Bootable disks/ATAPI.DRV Rewrote MC (MAKECD) to use new MMC functions in ATAPI.DRV. ATAPI.DRV polls drive for status to detect tray open, tray closed and disk change events. If disk inserted is ISO data/ChaOS boot CD, logical drives are created as necessary to use the filesystems. On tray open, logical drives are destroyed. MC now uses the hardware functions in ATAPI.DRV to detect empty media for writing, and to burn the disc, making it 85% smaller.

7/11/2009 DVD+R LG DVDRW drive goes into a sulking state when asked to CLOSETRACK with write parameters/multisession set to zero. Set byte 2 of write parameters mode page to 0xc1 (multisession allowed + track-at-once), and set BUFE (burn-proof) and all is well. Now I have a backup routine which works with CDR or DVD+R, the only media-specific tweak being a difference in the way the disc is finalised. Strange to see a boot DVD of 1Gb with less than a quarter of the DVD used.

6/11/2009 CDR/DVD+R Bootable ChaOS DVD Since my main ChaOS CDR backup is approaching 700Mb (the maximum capacity of CDR), I spent a couple of hours researching DVD+R format with a view to writing larger backup disks. The capacity of a CDR is stored in DISCINFO, in MM:SS:FF format which maxes out at 100 minutes - far less than the capacity of DVD. Had a fiddle with GET PERFORMANCE DATA command, to find out where the usable capacity of the empty DVD is stored, but without success. Then I realised that the capacity field is in the same place as CDR, i.e. in the structure returned by READDISCINFO, but 32-bit in hexadecimal format (maximum 8Tb). Ran existing CDR write routine on empty DVD+R - hit an error on RESERVETRACK, but after skipping over this error the drive received 800Mb of write data without reporting an error. Initially the drive would not eject the disc, but an UNLOCK command sorted that. Browsing the resulting disc showed data sectors all correctly written, but with track 1 open and session 1 incomplete. This of course did not stop the disc from booting first time. Ran the code on a CDR without RESERVETRACK, the resulting disc is fine. So it appears only minor modification of CLOSETRACK and CLOSESESSION are needed to cope with DVD+R, increasing the ChaOS backup facility to 4.6Gb. RESERVETRACK can be retired.
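The two capacity encodings can be compared with a short calculation. This sketch assumes 75 frames per second and 2048-byte data sectors, the usual figures for data CDs; the function names are illustrative and do not come from ATAPI.DRV, and no lead-in offset is applied to the MM:SS:FF value.

```c
#include <stdint.h>

#define CD_FRAMES_PER_SECOND 75u
#define DATA_SECTOR_BYTES    2048u

/* Convert a CDR capacity in MM:SS:FF form to a sector count. */
uint32_t msf_to_sectors(uint32_t minutes, uint32_t seconds, uint32_t frames)
{
    return (minutes * 60u + seconds) * CD_FRAMES_PER_SECOND + frames;
}

/* Convert a sector count (as held in the 32-bit DVD field) to bytes. */
uint64_t sectors_to_bytes(uint32_t sectors)
{
    return (uint64_t)sectors * DATA_SECTOR_BYTES;
}
```

An 80-minute CDR comes out at 360000 sectors, and even 99:59:74 is under 450000 sectors, whereas a 32-bit sector count of 2048-byte sectors reaches the 8Tb mentioned above.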

6/11/2009 ACCDIS There has been a bug in ACCDIS for years, and today I found it. Very occasionally, ChaOS crashes with a screen full of graphics characters (0xff's). I've always suspected ACCDIS, but mistakenly thought that (because ACCDIS uses MALLOC) it was due to a memory allocation error during complex routines, such as debugger display breaks inside interrupt handlers. It turns out to be no more than a wayward CH* being passed to ACCDIS as a format string, which can happen if a variable is uninitialised, or the format string is corrupted in some way. Adding a debug break at the beginning of vaccept() traps the bug: if(!isprint(*format)){brk return 1;} My excuse is that vaccept was written 20 years ago, when I was a mere novice programmer, rather than the rank amateur I have become. Added similar traps to other format-string-based display routines such as vcatprint(), which were derived from ACCDIS.

6/11/2009 MALLOC Doing some housekeeping, decided to add a topmalloc function for one-time allocation of memory-aligned hardware buffers during device init. Much simpler than alignedmalloc() which has to pad the heap with small freespace items to create an aligned buffer. Works by simply adjusting the size of the last heap item (it should be a large free record stretching up to memtop), and returning the new, lower memtop value. Alignment is hard-coded on 4k boundaries.
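A rough sketch of the idea, under stated assumptions: the heap_state structure below is invented for illustration, since the diary does not show the real ChaOS heap records, and addresses are modelled as plain integers rather than live pointers. The function shrinks the last free item from the top and returns the new, 4k-aligned memtop.

```c
#include <stddef.h>
#include <stdint.h>

#define TOP_ALIGN 4096u   /* alignment is hard-coded on 4k boundaries */

/* Hypothetical bookkeeping for the tail of the heap. */
struct heap_state {
    uintptr_t last_free_start; /* address of the last (free) heap item */
    size_t    last_free_size;  /* its size, stretching up to memtop */
    uintptr_t memtop;          /* current top of the heap */
};

/* Carve an aligned buffer off the top of the heap.
   Returns the buffer address (the new memtop), or 0 if it does not fit. */
uintptr_t topmalloc(struct heap_state *h, size_t size)
{
    uintptr_t new_top;
    size_t taken;

    if (size > h->memtop)
        return 0;

    /* Step down by size, then round down to the alignment boundary. */
    new_top = (h->memtop - size) & ~(uintptr_t)(TOP_ALIGN - 1u);
    taken = (size_t)(h->memtop - new_top);

    if (taken > h->last_free_size)
        return 0;

    h->last_free_size -= taken;   /* shrink the last free record */
    h->memtop = new_top;          /* lower the top of the heap */
    return new_top;
}
```

Because the buffer is taken from the very top, no padding records are needed below it, which is what makes this simpler than alignedmalloc().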

5/11/2009 TELSRV/HTPSRV Both simple servers ran 24/7 last night for the first time, with some minor issues related to recycling of timed-out TCP sessions. Server-specific data is now reset when TCP layer switches timed-out session back to LISTEN, rather than waiting for the next incoming SYN.

5/11/2009 EDD BIOS boot/ChaOSnet over IP As promised yesterday, EDD and non-EDD partition table executables harmonised, and can be swapped with a simple MBR command. Non-EDD partition table runs EDD check, to advise whether EDD is possible on the current boot device. ChaOSnet NETUSER now carries two IP addresses; if NETUSER->ip is a WAN address, NETUSER->lanip is the address of a computer on a remote LAN.

4/11/2009 EDD BIOS boot As a final tweak to the work on EDD partition tables and boot sector code, I will add EDD test to the partition table executable. Two flags in the partition table will indicate EDD-capable, and EDD-boot. In this way, MBR <hdx> /edd, and sys <hdxn:> <os-image> /edd can ensure that the new boot code is only written to devices which report an EDD capability, and the 8Gb boot sector limitation can be broken. Eventually all boots will use EDD, as older computers are retired.

4/11/2009 ChaOSnet over IP Following on from 30/8/2009 (UDP datagram used to get WAN IP), and improvements to the reliability of ChaOSnet datagrams to copy whole drives (28/10/2009), it is a small step to load IPX datagrams into UDP packets for transmission over the internet. Only one extra field is needed in NETUSER, to hold the WAN IP of the destination ChaOS server. The LAN address of the remote target can be carried in encrypted UDP packet data. At the receiver, the ChaOS box on a selected UDP open port will match the target LAN IP to an ARP cache entry, and forward the packet over LAN as necessary.

4/11/2009 TELSRV/HTPSRV/IP Added inactivity timer to TCP in IP.DRV, so servers TELSRV and HTPSRV no longer handle session timeouts. This tidies up the server code considerably, and for the first time remote HTTP and TELNET sessions to the ChaOS box on 82.68.176.217 time out gracefully and are recycled after three minutes. Increased max sessions on the TCP layer to 20, which is still a ridiculously low number, but adequate for testing the reaction of the system to connection overload. My TCP supports only a subset of the TCP/IP specification, but if remote browsers get the files they ask for, then who cares. It is interesting to see the rogue traffic hitting the servers, asking for php admin files etc, or attempting to use our server as a proxy to upload other websites.

3/11/2009 TELSRV/HTPSRV Duly chastened by my attempt at making a WinXP/ChaOS dual boot system, I realise once again that life is too short to fathom Microsoft software. Revised TELSRV and HTPSRV, this time using the TCP/IP layer to create extra server listening ports as connections are established, rather than calling passiveopen() from the server layer. Still not really stable yet, but I am being hard on the TCP/IP layer with only 10 connections shared between TELSRV and HTPSRV.

Felt so impatient today I bought an Intel i7 quad core box, to bring down compilation times and extend 64-bit development over the coming year. Also bought the www.chaos64.com domain name.

3/11/2009 EDD BIOS boot services Having gone the whole hog last week and revised ChaOS partition table, boot sector and bootstrap to use EDD int 0x13/0x42 disk services, I find the SiS mainboard BIOS on my Win XP machine does not support EDD for USB key boot. Bugger! In a couple of years this will be a non-issue, but for the time being, I have reverted to standard CHS boot code for the ISO releases, so ChaOS will boot on the broadest machine base. My own EFI partitioned hard disks continue to use EDD, to allow boot from up to 128 partitions per drive.

2/11/2009 ChaOS vs WinXP Had a go at resizing NTFS partitions, with mixed results. Tried Knoppix on the Fizbook, and EASEUS on a spare WinXP disk. Both claim to have resized the NTFS partition, then <WINROOT>\system32\hal.dll NOT FOUND error. Funnily enough I had exactly the same error when trying to boot a WinXP NTFS partition which I had placed in a ChaOS EFI logical drive, just to see what would happen. ChaOS EDD USB key did not work on XP machine when trying to use ChaOS to set bootflag in partition table to boot RECOVERY partition rather than NTFS system partition. Had to revert to ChaOS v1.01.25154, because the USB HDD BIOS does not do EDD. Changed the bootflags, and Win XP recovery boot IMMEDIATELY rewrites the partition table boot flags, which was a surprise, but typical Microsoft. The future is with multiple operating systems on one machine, so there is not much point in thrashing around with the current myopic Microsoft offerings.

1/11/2009 Fizbook Spin Running from USB key, managed to get MAP5 GPS program running SIRF USB GPS dongle - but without dual-mode interrupt handlers for the GPS and e-Turbo touch screen the system struggles. Once initialised, the USB GPS sends NMEA strings continually, and I have not yet found a way of pausing the USB during a real-mode callback which does not stall the USB GPS, the USB touch-screen, or both. Of course the UHC can be stopped to prevent interrupts, but this does not stop the device on the USB port from sending data. Although mandatory, the USB HALT Feature (whilst accepted by the PL2303) does not halt anything. These problems do not exist when running ChaOS from hard disk, but I am still loth to delete Win XP from the Fizbook before researching a dual-boot.

28/10/2009 Fizbook Spin Rather expensive for a WinXP 1Gb/60Gb wireless netbook, but it converts to a tablet with touch-screen so is the natural successor to my Fujitsu Stylistic GPS, with a 32-bit Atom HT inside. PCI express network chip is so new I have yet to find a datasheet for it. However it is similar enough to the RTL8101 on my Atom 330 to make no difference. Not wanting to destroy the WinXP installation, I am using ChaOS booted from a USB key to transfer the Windows partition to a backup EFI partition on the Dell. The USB support on the Fizbook is impressive - so stable I have rewritten the ChaOS bbd DEV to allow read-write access to the boot media (in this case the USB key). Even better, the USB key can be removed, modified and plugged back in without disturbing the BIOS Int 0x13 support. So drivers can be modified and recompiled on or off the Fizbook, without touching the WinXP hard drive. In preparation for an attempt to resize the NTFS partition on the Fizbook (and to make sure the network driver is stable), I am making two copies of the NTFS partition, and will ensure they are identical before attempting to change the Fizbook hard drive.

Although I have done plenty of work on TCP/IP, I am using the old ChaOSnet protocol to suck the data off the Fizbook. TCP/IP SYN and sequence numbers are handy but they wrap round at 4Gbytes, and the Fizbook hard drive is 60Gig. When mirroring drives, TCP is a sledgehammer to crack a nut - the physical ordering of LBAs on a hard disk is already an inviolable sequence. To prove the point, many times during development and testing of the RTL8101 network driver the Fizbook would hang part way through a backup - after reboot the drive mirror could be resumed by a simple retry on the read LBA error by the client making the mirror copy.

25/10/2009 Atom 330 and ATOM N270 Testing on these two processors threw up a swine of a problem related to VESA graphics modes. Both mainboards for these Atoms support only a subset of the usual VESA graphics modes, which is understandable. Interestingly they use the same BIOS template, reporting the same graphics mode list. But when calling for VESA mode info, many entries seem to be just blank, whilst reporting a successful inquiry (AX=0x004f)!!!!!!!! Sure I know Intel and the UEFI group are trying to slim down and speed up the BIOS, but damn them for returning a mode list which references blank tables. Am I missing something? Maybe the missing information is in another standard document.

23/10/2009 IA32e 64-bit mode Reorganised GDT in bootstrap and OS image to use 16-byte descriptors, so the same GDT can be used after switching to IA32e mode. This was an upheaval, as I have used 0x0008 for 4Gb code and 0x10 for 4Gb data segments since the year dot. Now 4Gb code is 0x0010, data is 0x0020, 0x0030 and 0x0040 are zero-based 16-bit/64k segs for real-mode callbacks, 0x0050 and 0x0060 do the same job for the BIOS segment, and 0x0070 is now the 64-bit linear code selector. Added code to 64-bit interrupt handlers to wrap in a 32-bit-style IREGS structure just before calling back into the 32-bit debug handlers. Added a flag to IREGS to tell the debugger that an IA32e (64-bit) exception is in progress. By checking this flag, the debugger display can be adjusted to cope with 64-bit mode. At present I just display the 64-bit general registers RAX, RDX etc, nice to see these for the first time, and single-step some 64-bit instructions. All-in-all, 64-bit mode is not much different to 32-bit mode, since only addresses default to 64-bits, operands default to 32-bits. The 64-bit registers only kick in with an override on an instruction-by-instruction basis.
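The new selector layout can be written down as constants, together with the kind of check a debugger might make on the code selector saved in an interrupt frame. The selector values are those listed above; the names are made up for this sketch, and the _A/_B suffixes are used because the entry does not say which of each 16-bit pair is code and which is data.

```c
#include <stdint.h>

/* Selector values from the entry above; names are illustrative only. */
enum {
    SEL_CODE32   = 0x0010,  /* 4Gb code */
    SEL_DATA32   = 0x0020,  /* 4Gb data */
    SEL_RM16_A   = 0x0030,  /* zero-based 16-bit/64k, real-mode callbacks */
    SEL_RM16_B   = 0x0040,
    SEL_BIOS16_A = 0x0050,  /* same job for the BIOS segment */
    SEL_BIOS16_B = 0x0060,
    SEL_CODE64   = 0x0070   /* 64-bit linear code */
};

/* Bits 0-1 of a selector hold the RPL and bit 2 selects GDT or LDT,
   so they are masked off before comparing against the GDT offset. */
static int frame_is_64bit(uint16_t cs)
{
    return (cs & 0xfff8u) == SEL_CODE64;
}
```

Every selector is a multiple of 16, which is what allows each GDT slot to hold a 16-byte descriptor where IA32e mode needs one.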

21/10/2009 IA32e 64-bit mode Had an idea that a 64-bit interrupt could call a wrapper function to transfer control back to a 32-bit handler, thus allowing the Atom processor to run in IA32e mode using the existing interrupt handler code. Not as easy as I thought, because the 0x9a CALL opcode is illegal in 64-bit mode. But it can be done. And without a call gate. I just did it. This is a major step forward, as it will allow ChaOS to run largely unchanged in IA32e mode, whilst enabling 64-bit code to be developed and debugged real-time. By careful design of the wrapper function, and addition of one flag on the stack, ChaOS debugger will be able to adjust to callbacks occurring whilst in IA32e mode. And by snooping the code selector on the stack at the time of the interrupt, the debugger can decide whether to process the exception mode as 32-bit, or 64-bit.

To support 64-bit mode, memory needs to be reserved for system tables such as 64-bit IDT, 64-bit TSS, and PAE pagetables. These are now located in the memory area 0x100000-0x120000, with the default OS image load address at 0x120000.

16/10/2009 ATAPI CDRW/DVDRAM Burn CD would not work on LG GH22 DVDRAM. Problem was that old chestnut write parameters mode page - data read from this mode page cannot be written back to drive - a few fields need to be tweaked.

14/10/2009 Boot CD encryption Added encryption to ChaOS boot image in preparation for the next ISO release.

14/10/2009 ISO CD/RAM drives ChaOS ISO CD images have always been built by a tortuous process in a makeshift RAM drive, before burning to disk in a fixed-disk, single-partition, track-at-once format. By streamlining the RAM drive code, ChaOS RAM drives are formatted with EDD partition tables and boot sectors. This means any RAM drive (which may contain multiple partitions) can now be burned raw (with the addition of a short ISO header) to create a bootable ChaOS CD.

14/10/2009 EDD bootstrap After careful backup of existing system, ChaOS now rebuilt with EDD partition table, boot sector and bootstrap. This is important stuff, because EDD BIOS calls need no CHS translation, and CHS can now be consigned to the dustbin. Using a 32-bit LBA, ChaOS partitions are bootable anywhere within the first 2Tb of a hard disk. The 6 bytes of CHS information in the MBR partition record can now be used for other purposes. This area could hold the length, and load address of a bootstrap. Or the partition exec could be completely redesigned, maybe using the EFI_PART instead. Provided the bootstrap is patched with some logical drive info, the boot sector will be redundant. The original ChaOS bootstrap took months to develop, but has taken only a couple of days to revise.

13/10/2009 Logical drive names With the advent of GUID EFI partition table, logical drive names are having a makeover. Logical drive names for up to four MBR partitions are now zero based, i.e. hda1: becomes hda0: etc. EFI logical drives take their name from the partition name prepended with the hard disk letter (so that multiple hard drives with the same EFI partition names generate different logical drive names), e.g. EFI partition named test: on hda becomes atest:, EFI partition named test: on hdc becomes logical drive ctest: Provided logical drives have the same physical length in sectors, diskcopy drive1: drive2: transfers the contents of drive1: to drive2: in double-quick time, otherwise use copytree drive1: drive2:
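The naming rules can be sketched as two small formatting helpers. Both function names are hypothetical; the diary does not show how ChaOS actually builds these strings, and the sketch assumes the EFI partition name already carries its trailing colon.

```c
#include <stddef.h>
#include <stdio.h>

/* MBR partitions: "hd" + disk letter + zero-based partition number + ":". */
int mbr_drive_name(char *out, size_t n, char disk_letter, unsigned part_index)
{
    return snprintf(out, n, "hd%c%u:", disk_letter, part_index);
}

/* EFI partitions: disk letter prepended to the partition name. */
int efi_drive_name(char *out, size_t n, char disk_letter, const char *part_name)
{
    return snprintf(out, n, "%c%s", disk_letter, part_name);
}
```

With these, the first MBR partition on hda is hda0:, and an EFI partition named test: yields atest: on hda and ctest: on hdc.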

12/10/2009 IA32e 64-bit mode After success of 10/10/2009 in running 64-bit code for the first time, I turn to modifications to the ChaOS assembler and C compiler to generate 64-bit code. Firstly, 64-bit mode uses opcodes 0x40 to 0x4f (INC Rv,DEC Rv) as new REX overrides, so 0xff00 to 0xff0f (INC Ev/DEC Ev) has to be used instead. Implementing this for the ChaOS assembler took about 5 minutes! - simply a case of moving the entries for INC Rv/DEC Rv down in the opcode map so that the assembler finds and encodes INC Ev/DEC Ev by default. The C compiler generates no INC Rv/DEC Rv so needs no modification. Next step is to add UQ, SQ to generate some quadword types to submit to the assembler, then cause the assembler to generate REX overrides when these types are encountered.
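To illustrate the encoding change, here is a sketch of how an assembler might emit INC or DEC for a 32-bit register in each mode. The function is hypothetical and unrelated to the ChaOS opcode map; it only shows that the one-byte 0x40-0x4f forms give way to the two-byte 0xff forms with a register-direct ModRM byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode INC/DEC of a 32-bit register (reg 0..7 = EAX..EDI).
   Writes the instruction to out and returns its length in bytes. */
size_t encode_incdec_r32(uint8_t *out, unsigned reg, int is_dec, int mode64)
{
    reg &= 7u;

    if (!mode64) {
        /* 32-bit mode: short forms 0x40+r (INC) and 0x48+r (DEC). */
        out[0] = (uint8_t)((is_dec ? 0x48u : 0x40u) + reg);
        return 1;
    }

    /* 64-bit mode: 0x40-0x4f are REX prefixes, so use opcode 0xff
       with ModRM mod=11, reg field 0 (INC) or 1 (DEC), rm=register. */
    out[0] = 0xff;
    out[1] = (uint8_t)(0xc0u | ((is_dec ? 1u : 0u) << 3) | reg);
    return 2;
}
```

So DEC ECX is the single byte 0x49 in 32-bit mode but 0xff 0xc9 in 64-bit mode, where 0x49 would be read as a REX prefix.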

11/10/2009 EDD partition table Modified partition table exec to correctly propagate passed-in DL value through to BIOS int 0x13 reads. Now boot <hdx> <bios drive> will execute the partition table of any hard disk and pass in <bios drive> in DL. I have yet to come across a BIOS which passes anything but 0x80 to the partition exec on a cold boot, but ChaOS will now boot if this happens. There is insufficient space in the partition table exec to implement an EDD bios read, to make it possible to cold-boot from an LBA above 8Gb, and there is little reason to leave 8Gb empty at the start of any hard disk. But it would be neat to use partition entries with no CHS information - all that is needed is start LBA and length. Scrapping CHS info saves 24 bytes which is probably enough to squeeze an EDD call in. This non-standard partition table would make ChaOS drives unrecognisable to mainstream operating systems, which would be fun.

10/10/2009 Intel Atom 330/IA32e mode Tried to switch Atom into 64-bit native mode yesterday, not realising how much work was involved. System descriptors grow to 16 bytes (to hold 64-bit offsets) so IDT has to be changed. Paged mode must be enabled. A 64-bit TSS has to be loaded. All interrupt handlers have to be written in native 64-bit code. If interrupts occur whilst running 32-bit code, IRET has to be prefixed with REX.W. Despite this, it is refreshing to see that 64-bit mode sweeps away most of the segment protection mechanism which I have always hated. Managed to write a 64-bit mode switch with working timer interrupt handler, then switch back to 32-bit mode. It is probably possible to use a 64-bit wrapper for existing 32-bit interrupt handlers, to allow limited interrupt handling whilst in 64-bit mode. This would allow development and testing of a 64-bit compiler and debugger.

9/10/2009 Intel Atom 330/RTL8102E/Windows 7 RC1 Got my Intel Atom box yesterday, with Windows 7 RC1 pre-installed. There is nothing special about the hard disk format, no EFI GUID partition table. ChaOS booted first time from a USB key, so I now have a 64-bit development box. Substituted an old PATA IDE drive with a full ChaOS development system to have a proper look inside this tiny machine. As expected, compilation time is about 50% longer than the Dell Xeon 2.4GHz, since the Atom only runs 1.6GHz. Biggest surprise is that the Atom has four processor cores! (dual-core with HT). An awful lot of processor power for 60 pounds sterling including mainboard! Gigabyte mainboard with 945 chipset uses an RTL PCI Express lan chip, which is new to me. Hacked together a working driver for RTL8102E in about 4 hours. VESA graphics mode switches also work fine on the new box which is another pleasant surprise.

7/10/2009 EDD BIOS Disk services/GUID partitions Using a GUID partition table I now have all my development drives on one hard disk. CHS boot sector is limited to partitions which start below LBA 0xfac53f (about 8Gb). BIOS EDD service allows 64-bit LBAs, bringing any GUID partition within range of a boot sector. Wrote prototype boot sector and bootstrap loader for ChaOS to use EDD BIOS read rather than Int 13h/02h. Success in under 2 hours, booted partition 16 of my GUID from an address around 22Gb from the start of my 200Gb development drive. This is another major leap forward. My MBR partition code remains, so a cold boot of ChaOS has to be one of the four MBR partitions. But with EDD boot sectors, ChaOS partitions anywhere on the hard disk can be warm-booted by simply copying the boot sector to address 0x7c00, and jumping into it. It is a small step to rewrite the partition table executable to make it GUID-aware, and allow selection of any partition from a cold boot.
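For reference, the extended read (int 0x13, AH=0x42) takes a 16-byte disk address packet pointed to by DS:SI. A sketch of that packet in C follows; real boot-sector code would build it in 16-bit assembler, and the helper names here are invented. The CHS limit constant is the figure quoted above.

```c
#include <stdint.h>

/* Disk address packet for BIOS int 0x13, AH=0x42 (EDD extended read). */
struct edd_dap {
    uint8_t  size;        /* packet size in bytes: 0x10 */
    uint8_t  reserved;    /* must be zero */
    uint16_t count;       /* number of sectors to transfer */
    uint16_t buf_offset;  /* real-mode buffer offset */
    uint16_t buf_segment; /* real-mode buffer segment */
    uint64_t lba;         /* 64-bit starting LBA */
};

_Static_assert(sizeof(struct edd_dap) == 16, "DAP must be 16 bytes");

/* Highest start LBA reachable by the CHS boot sector, as quoted in the diary. */
#define CHS_BOOT_LIMIT_LBA 0xfac53fULL

/* Fill in a packet for reading count sectors at lba into seg:off. */
struct edd_dap edd_make_dap(uint64_t lba, uint16_t count, uint16_t seg, uint16_t off)
{
    struct edd_dap d;
    d.size = 0x10;
    d.reserved = 0;
    d.count = count;
    d.buf_offset = off;
    d.buf_segment = seg;
    d.lba = lba;
    return d;
}

/* Non-zero if a partition starting at lba is out of reach of a CHS boot sector. */
int edd_needed(uint64_t lba)
{
    return lba >= CHS_BOOT_LIMIT_LBA;
}
```

A partition starting around 22Gb into the disk (LBA 46137344 with 512-byte sectors) is well past the CHS limit, so only the EDD packet can describe it.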

6/10/2009 IDE interrupts Introduction of UDMA as the default mode for hard disks threw up a slight problem with ChaOS legacy mode (boot ChaOS with SCROLL LOCK on and the system starts with no load-on-demand device drivers). Legacy mode setups finish before PCI device scan, so IDE interrupt handler cannot know where the BMIBA register is. Workaround: clear irq number for createATAdevice, so IDE runs with interrupts masked (polled I/O). Running IDE like this means irq sharing would not work, but irqs 14 and 15 are internal to the PCI South Bridge and are never shared anyway. Now all IDEs (including SATA) are read/write usable in legacy mode - to edit and recompile device drivers which crash the system during development.

4/10/2009 GUID Partitions Added GUID partition table to ChaOS, created by command GPTBL hdx. This creates an EFI_PART header at lba1, and mirror entries in sector 2 for the existing MBR partitions. New command bkp <logical drive> hdx creates a new GPTBL entry, and copies the contents of a logical drive to hdx. logHD() modified to scan GPTBL for partitions not accessible via the MBR, and create logical drives for these. For the first time I can hold more than 4 partitions on one hard disk. Revised drive logging code extensively so that all logical drives (including USB drives) are now created through one logdrive(DEV*,.......) function. LDRIVE table handling modified to cure problems arising when device scan creates logical drives before initIDE. It may be some time before I encounter a BIOS which supports GUID partition booting, however at least I now have a disk format ready to try.
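For orientation, the header written at LBA 1 has the layout below, as defined in the UEFI specification. This is a sketch and not the ChaOS structure; the field names are my own. Note that the on-disk signature is the eight characters "EFI PART" with a space, whatever the structure is called in source code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* GPT header as stored at LBA 1; all fields are little-endian on disk. */
struct gpt_header {
    uint8_t  signature[8];          /* "EFI PART" */
    uint32_t revision;
    uint32_t header_size;           /* usually 92 */
    uint32_t header_crc32;
    uint32_t reserved;
    uint64_t my_lba;                /* LBA of this header (1 for the primary) */
    uint64_t alternate_lba;         /* LBA of the backup header */
    uint64_t first_usable_lba;
    uint64_t last_usable_lba;
    uint8_t  disk_guid[16];
    uint64_t partition_entry_lba;   /* start of the entry array, normally 2 */
    uint32_t num_partition_entries;
    uint32_t partition_entry_size;  /* normally 128 */
    uint32_t partition_array_crc32;
};

/* Check the field offsets rather than sizeof, since the compiler may pad the tail. */
_Static_assert(offsetof(struct gpt_header, partition_entry_lba) == 72, "layout");
_Static_assert(offsetof(struct gpt_header, partition_array_crc32) == 88, "layout");

/* Non-zero if the sector starts with a valid GPT signature. */
int gpt_signature_ok(const struct gpt_header *h)
{
    return memcmp(h->signature, "EFI PART", 8) == 0;
}
```

A scan such as the one logHD() performs would presumably check the signature first, then walk num_partition_entries entries starting at partition_entry_lba.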

2/10/2009 EFI/GUID Discovered GUID is a subset of EFI, which will undoubtedly replace PC BIOS. Included in EFI is a proper system boot manager. By adding a GUID partition table to ChaOS drives, newer BIOSes will choose EFI over PC BIOS, and maybe allow ChaOS to coexist with other operating systems on one host machine. Apparently some 64-bit Microsoft systems are now using the GUID partition table. I will buy an Intel Atom and install Windows 7, see how this all works. EFI Bios is some way off, and may or may not become mainstream, as the GUID partition table provides enough for an OS to provide the multiboot service. For the moment I will add a GUID partition table to ChaOS disks, boot from one of the 4 MBR partitions, then allow a reboot/launch from the GUID. In this way the current OS can remain stable, yet act as a superloader for more exotic OS images in the future.

1/10/2009 Partition Tables/GUID MBR partition table of only 4 partitions on a hard disk has long been a limitation to ChaOS, especially since development has been on FAT16 which limits partition sizes to 2Mb for a cluster size of 64k. Disk capacities continue to grow, with the result that current ChaOS development resides on a 200Mb disk of which only 8Mb is actually used. Recent work on FAT32 breaks the 2Mb-per-partition boundary, but a new boot loader is needed to load an OS image from a FAT32. Also, I would like to hold mirror partitions of systems working on the Cobden Chadwick and C02 Laser, in order to blend these into current development, replacing IPX networking with UDP, and retiring DOS4GW. For this, a dozen or more partitions are needed. Stumbled across GUID partition table structure in Wiki which allows unlimited partitions on a disk, so it seems reasonable to follow this standard.

30/9/2009 Multiple Monitors/Dell Precision 450 Revising work done 20-28/12/2008 on multiple VESA graphics cards, to put a second monitor on my Dell. Hung horribly initially, but fortunately each problem area has been visited before. PAM registers on E7505 and 845 chipsets are in different PCI config registers, with different dword alignment. PCI bus devices each have a VGAEN bit at 0x3c&0x80000, which must be set to claim the VGA transactions. E7505 has five PCI bridges, you have to set VGAEN for the right bus! Only one VGAEN is allowed at any one moment. References indicate ISAEN at 0x3c&0x40000 must be set on ALL buses for the VGA cycles to be claimed by the right target.
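
The bit handling described above can be sketched as a pure function on the config dword at 0x3c. This is only an illustration: the function name is hypothetical, and reading and writing PCI config space is left to the caller. It assumes, as the references suggest, that ISAEN is set on every bridge and VGAEN on exactly one.

```c
#include <stdint.h>

#define BRIDGE_CTRL_DWORD 0x3Cu        /* config dword holding the bridge control bits */
#define VGAEN             0x00080000u  /* claim VGA cycles on this bridge              */
#define ISAEN             0x00040000u  /* ISA enable, set on all bridges               */

/* Hypothetical helper: compute the new value of config dword 0x3c for one
   bridge. owns_vga is non-zero only for the bridge leading to the active card. */
uint32_t bridge_ctrl_for_bus(uint32_t old_value, int owns_vga)
{
    uint32_t v = old_value | ISAEN;
    if (owns_vga)
        v |= VGAEN;
    else
        v &= ~VGAEN;
    return v;
}
```

Calling this once per bridge, with owns_vga set for a single bus, keeps the "only one VGAEN at a time" rule in one place.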

Getting ready to retire my 845 P4 2.0GHz in favour of the Dell. LSI1030 SCSI in the Dell takes ages to start (over a minute, yawn) so I pulled it out and boot now takes 11 seconds from cold. Faster memory and processor means a clean recompile of ChaOS takes 40 seconds compared to over 80 on the P4. Without the 10k SCSI disk whining away the office is a lot quieter too.

29/9/2009 LFN/FAT32 Bomb-testing longfilename support, resolved a couple of latent bugs for longfilenames in createdirectoryentry (1) new longfilename entries now placed in the first empty slot of the required size... (2) longfilenames now straddle sector and cluster boundaries, even if a new cluster has to be allocated to make space...

28/9/2009 IDE/SATA Reading up on SATA in order to support these drives in ChaOS - stumbled across the fact that two bits in BMIBA+2 are reserved for software to flag drives 0 and 1 as UDMA capable and configured. A quick check on all my machines shows that BIOS sets these flags for UDMA drives, and will do a much better job of configuring drives for maximum performance. Even better, I discovered that the interrupt flag at (BMIBA+2)&4 (which I had always thought was for UDMA only) is set by non-UDMA commands. This is important because an interrupt flag is essential for devices sharing an interrupt line. My IDE disk code is some of the oldest and scrappiest in ChaOS, written when I barely understood what an interrupt was - (it runs whether or not the disk interface is generating interrupts!). Took this opportunity to clean up my UDMA disk read and write code, taking proper account of the interrupt flag.

SATA emulates parallel ATA closely, so with minimal effort and this new UDMA code I see 60Mb/sec read speeds using 64k blocks (1000 read operations per second). To prove the worth of (BMIBA+2)&4, I now have a SATA disk sharing an irq with a network card.
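
A minimal sketch of decoding the bus master status byte at BMIBA+2. The interrupt flag at mask 4 is as described above; the positions of the two drive flags (bits 5 and 6) are taken from the bus master IDE register layout as I understand it, and the function names are hypothetical. Reading the port itself is left to the caller.

```c
#include <stdint.h>

#define BM_STATUS_OFFSET   2     /* status register lives at BMIBA+2          */
#define BM_STATUS_INTR     0x04  /* set when this IDE channel raised the irq  */
#define BM_STATUS_DRV0_DMA 0x20  /* software flag: drive 0 DMA capable        */
#define BM_STATUS_DRV1_DMA 0x40  /* software flag: drive 1 DMA capable        */

/* Hypothetical helper: on a shared irq line, was it our channel that fired? */
int bm_irq_is_ours(uint8_t status)
{
    return (status & BM_STATUS_INTR) != 0;
}

/* Hypothetical helper: did the BIOS flag this drive (0 or 1) as DMA capable? */
int bm_drive_dma_capable(uint8_t status, int drive)
{
    uint8_t mask = drive ? BM_STATUS_DRV1_DMA : BM_STATUS_DRV0_DMA;
    return (status & mask) != 0;
}
```

A shared-irq handler would return immediately when bm_irq_is_ours() is zero, leaving the interrupt for the other device on the line.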

23/9/2009 HTPSRV Adding log to HTPSRV, a seemingly simple job throws up a few OS problems: Log needs to be saved to disk (easy enough) but continuation of log after reboot requires file systems to be running before HTPSRV loads. Presently TSRs HTPSRV, TELSRV load at priority 1 after protocol drivers (TCP/IP, ARP) but before devices are initialised, which is no good. Abolished initSRV() (which loads all TSRs, in priority order) in favour of loadSRV(max,min) which loads a block of TSRs from max to min priority. loadSRV is now called twice during startup, once before devices are initialised and once after, and by setting the appropriate priorities, TSR loading is more controllable but still automatic (i.e. I have still not found a need for a config.sys file!).
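
The two-pass idea can be sketched as below. This is a guess at the shape of the mechanism, not the ChaOS source: the TSR table, its fields and the name loadSRV_sketch are all hypothetical, and "loading" is reduced to setting a flag.

```c
#include <stddef.h>

/* Hypothetical TSR table entry. */
typedef struct {
    const char *name;
    int priority;   /* higher priority loads earlier        */
    int loaded;     /* set once the TSR has been started    */
} TSR;

/* Load every not-yet-loaded TSR whose priority lies in [min, max],
   highest priority first. Returns the number of TSRs loaded. */
int loadSRV_sketch(TSR *table, size_t count, int max, int min)
{
    int n = 0;
    for (int p = max; p >= min; p--) {
        for (size_t i = 0; i < count; i++) {
            if (table[i].priority == p && !table[i].loaded) {
                table[i].loaded = 1;   /* a real loader would start the TSR here */
                n++;
            }
        }
    }
    return n;
}
```

Startup would then call it once with the high-priority block before device initialisation and once with the remaining block afterwards, so a server that needs the file system simply gets a priority in the second block.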

22/9/2009 HTP/HTPSRV Improved long filename support in ChaOS (something I started several years ago) so that HTP and HTPSRV can upload and download files with longfilenames, particularly important since the CTPP website uses .html extensions rather than plain vanilla .HTM. Fixed a latent problem with the ChaOS filesystem to allow longfilename entries to be deleted, especially where the directory entry straddles a disk sector boundary. Further improved HTP download utility to scan incoming HTTP header for file length, display upload progress and save files to disk automatically. Whilst longfilename support is still new, use a separate logical drive to store a downloaded copy of the CTPP website for HTPSRV. HTPSRV running on IP 82.68.176.217 with A records mapped to chaos.ctpp.co.uk and www.chaos.ctpp.co.uk, so ChaOS is now running a mirror HTTP server!

20/9/2009 HTP/HTPSRV HTP improved, now using command-line arguments for host and target file. After a successful DNS query for the host IP address, it despatches an HTTP GET command for the target file, downloads the response and if successful (HTTP code 200), strips off the HTTP response header and saves the file to disk. An invaluable tool now for grabbing files off websites. HTPSRV improved to cope much better with multiple sessions, and loads files from disk in response to GET commands. HTPSRV is now capable of hosting a simple multi-page website!
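
The header-stripping step can be sketched as follows, assuming the whole response is already in one buffer. The function name http_strip_header is hypothetical and this is not the HTP source: it reads the three-digit status code from the status line and returns a pointer to the first byte after the blank line that ends the header.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the body, or NULL if the buffer does not hold a
   complete header. *status receives the HTTP code (e.g. 200) whenever the
   status line itself parses, and 0 otherwise. */
const char *http_strip_header(const char *resp, size_t len, int *status)
{
    *status = 0;
    if (len < 12 || memcmp(resp, "HTTP/", 5) != 0)
        return NULL;

    /* The status code follows the first space: "HTTP/1.1 200 OK". */
    const char *sp = memchr(resp, ' ', len);
    if (sp == NULL || (size_t)(sp - resp) + 4 > len)
        return NULL;
    int code = 0;
    for (int i = 1; i <= 3; i++) {
        if (sp[i] < '0' || sp[i] > '9')
            return NULL;
        code = code * 10 + (sp[i] - '0');
    }
    *status = code;

    /* The header ends at the first empty line (CR LF CR LF). */
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(resp + i, "\r\n\r\n", 4) == 0)
            return resp + i + 4;
    }
    return NULL;
}
```

A downloader would save the bytes from the returned pointer onwards only when *status is 200. In a multi-frame download the header may not be complete in the first frame, which the NULL return leaves the caller to handle.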

Left HTPSRV running all day - lots of phishers out there so all 10 TCP sessions in IP.DRV end up used, timing out to LISTEN on port 80, which is OK. Multi-frame delivery on GET command is not yet 100% reliable, but can be retried easily by stopping the remote browser and then clicking reload. TCP part of IP.DRV is less than 8k of code, early days.

I have always imagined the ChaOS internet presence might be hosted by a native ChaOS server. This is now a distinct possibility.

19/9/2009 TCP/IP/HTPSRV Improved TCP as per 14/9/2009 to delay ACK for incoming multi-packet transfers. A delay of 1/5th second works well, servers deliver data in bursts of increasing length. Revised HTPSRV to convert incoming GET command into upload of file from disk, build HTTP response header and attach file data, then despatch data packets at timed intervals with ACK, ACK+PUSH on the last one. This is simplistic TCP, but it satisfies both Internet Explorer and Firefox which is a start. Both browsers immediately send further GET commands for links within index.html, e.g. style sheets or cgi-bin.

17/9/2009 TCP/IP HTTP Installed Suse Linux 11 on IBM eServer, mapped it to one of our IPs and tried to configure Apache for multiple hosts. Meanwhile added A records to ctpp.co.uk domain directing bob.ctpp.co.uk and chaos2.ctpp.co.uk to the Apache server. Tried in vain for four hours to configure Apache through Yast to handle multiple hosts, should it be so difficult? Each time I set multiple hosts and use HTTP header for the second host, Yast says I cannot send them to the same IP. What rubbish.

Came home. Created A records for bob.ctpp.co.uk and ian.ctpp.co.uk, mapped them both to my home IP. Opened port 80 on router through to my ChaOS box. Hacked ChaOS TELSRV into HTPSRV, running on port 80 instead of port 23. Added code to scan first incoming packet from HTTP client for bob.ctpp.co.uk or ian.ctpp.co.uk, and to select different HTML file accordingly. Open HTTP session in Windows Explorer/Mozilla FireFox with ian.ctpp.co.uk or bob.ctpp.co.uk, different index pages are seen. HTPSRV now sends different data according to Host: in the HTTP/1.1 header, my home IP address is effectively co-hosting two websites. Two hours, job done. Only geeks could make rocket science difficult. I will never cease to be infuriated. This is why I wrote ChaOS.
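
Selecting a page from the Host: line can be sketched in a few lines. This is an illustration only: the function name and the returned file names (bob.htm, ian.htm, index.htm) are hypothetical, the request is assumed to be NUL-terminated, and the header name is matched case-sensitively for simplicity.

```c
#include <stddef.h>
#include <string.h>

/* Return the name of the index file to serve for this request,
   chosen by the host name in the Host: header. */
const char *select_index(const char *request)
{
    const char *host = strstr(request, "Host:");
    if (host == NULL)
        return "index.htm";          /* no Host: line, serve a default page */

    host += 5;                       /* skip "Host:" */
    while (*host == ' ')
        host++;

    if (strncmp(host, "bob.ctpp.co.uk", 14) == 0)
        return "bob.htm";
    if (strncmp(host, "ian.ctpp.co.uk", 14) == 0)
        return "ian.htm";
    return "index.htm";
}
```

Because both names resolve to the same IP address and port, the Host: line is the only thing that distinguishes the two sites.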

14/9/2009 TCP/IP TELNET Rewrote ChaOS TELNET, taking in all that has been learned implementing TELSRV. Added command-line IP address and port, to test ChaOS TELNET against commercial servers. Encountered wildly different scenarios when TELNETting router and Zen mailbox - after closely examining the packet exchange for each I finally understand that TCP options (embedded in the TCP header) are distinct from TELNET options (embedded in the TELNET data stream). Also realised that TCP options need to be duplicated in the TCB, one set for the sender, another for the receiver. Only by swapping packets with a server can the black art of TCP be understood. Semi-successful TELNET session with mail server (mistakenly allowed TCP->window to go to zero - which causes the sender to wait) generated server-keepalive ACK packets with a sequence number wrong by one byte - interesting. Modified TCP to move the sliding window and transferred my first full email from the server. The same mechanism should work for HTTP web servers. Mail server sends multi-frame data in quick succession, so quick that the second frame arrives before I can ACK the first. ACKing first frame after the server has sent another causes the server to resend the second frame and slow down. Clearly ACKing the sender should be delayed when multi-frame data is expected. Last frame in the sequence carries the PSH flag, which could be the trigger, or reset a short ACK timeout each time a frame is received, and ACK when the timer expires. Much to test.
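
The two triggers floated above (PSH flag, or a short timer restarted by each frame) can be combined in one small state machine. A minimal sketch under stated assumptions: the structure and function names are hypothetical, and the timer is assumed to tick 100 times per second, so 20 ticks gives a delay of 1/5th second.

```c
#include <stdint.h>

#define TCP_FLAG_PSH    0x08  /* PSH bit in the TCP flags byte            */
#define ACK_DELAY_TICKS 20    /* assumption: 100 Hz tick, 20 ticks = 0.2s */

typedef struct {
    int ack_pending;   /* data received but not yet ACKed   */
    int ticks_left;    /* ticks until the delayed ACK fires */
} DelayedAck;

/* Called for each received data frame. Returns 1 if an ACK should go out now. */
int dack_on_frame(DelayedAck *d, uint8_t tcp_flags)
{
    if (tcp_flags & TCP_FLAG_PSH) {      /* last frame of a burst: ACK at once */
        d->ack_pending = 0;
        d->ticks_left = 0;
        return 1;
    }
    d->ack_pending = 1;                  /* more frames expected: restart timer */
    d->ticks_left = ACK_DELAY_TICKS;
    return 0;
}

/* Called on every timer tick. Returns 1 when the delayed ACK is due. */
int dack_on_tick(DelayedAck *d)
{
    if (!d->ack_pending)
        return 0;
    if (--d->ticks_left > 0)
        return 0;
    d->ack_pending = 0;
    return 1;
}
```

The timer path covers senders that never set PSH, so a burst is still acknowledged once it has gone quiet for the delay period.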

10/9/2009 Zen File Manager I have complained before about the HTML editor which is used to create this diary, and spent hours cleaning automatically-generated <font>,<strong> and <u> tags from my writings. Finally saw how it is happening. This editor creates HTML source with overlapped tags, especially when using the text Bold and Underline buttons with selected text ranges. When the source file is saved to the Zen server, it is scanned and the tags are changed, presumably to eliminate open tags. I see </strong> tags being added and </p> tags disappearing. I suspect the scanner cannot handle overlapping tags, and the result is extra open tags are added to the file. The next time the file is uploaded, then saved, another crop of tags is added. Sack the programmer.

The only fix is to carefully check the raw source file, and ensure all HTML tags are properly nested.

10/9/2009 TCP Sniffers Added display output each time an unrecognised SYN packet is received by TCP.DRV. Several of these unsolicited packets arrive each hour. TCP ports seen so far include 8800-Sun Web Server Admin, 9090-WebSM IBM RS/6000 Remote Execution, 4899-RAdmin Remote Admin, 3000-Home Banking Computer Interface. Most common is port 445, the Microsoft replacement for NetBIOS on ports 137,138,139, and by many accounts an insecure back door into MS Windows for remote procedure execution.

9/9/2009 TELSRV/TCP TELNET server (TELSRV) running yesterday with a maximum of 6 open sessions, for testing over the internet. Sessions need to be terminated by the client sending FIN or this simple server runs out of resources. Managed 5 open sessions at once, then tried later on and all sessions timed out. On checking this morning I find the TELSRV sessions had been used up by rogue IP addresses trying to establish a connection from Poland and from Colombia. TCP/IP modified now with improved timers to recycle old sessions, and TELSRV modified to terminate the session after 2 minutes of inactivity. TELSRV runs as a ChaOS protocol DRV at priority 2, is only 7k data and code. As such it takes 1.5 seconds to recompile (hard disk UDMA on), 3 seconds to reboot ChaOS with a new server running.
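
Recycling idle sessions can be sketched as a sweep over the session table. The Session structure, the function name and the use of a seconds counter are assumptions made for illustration; only the 2 minute limit comes from the entry above.

```c
#include <stddef.h>
#include <stdint.h>

#define IDLE_LIMIT_SECONDS 120u   /* terminate after 2 minutes of inactivity */

/* Hypothetical session record. */
typedef struct {
    int in_use;               /* slot currently holds an open session       */
    uint32_t last_activity;   /* time of last received data, in seconds     */
} Session;

/* Free every session idle for IDLE_LIMIT_SECONDS or longer.
   Unsigned subtraction keeps the test correct if the counter wraps.
   Returns the number of slots recycled. */
size_t reap_idle_sessions(Session *s, size_t count, uint32_t now)
{
    size_t freed = 0;
    for (size_t i = 0; i < count; i++) {
        if (s[i].in_use && (uint32_t)(now - s[i].last_activity) >= IDLE_LIMIT_SECONDS) {
            s[i].in_use = 0;   /* a real server would also send FIN or RST here */
            freed++;
        }
    }
    return freed;
}
```

Run from a periodic timer, a sweep like this stops half-open connections from strangers holding every slot indefinitely.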

7/9/2009 TCP/TELNET/SMTP/POP Improved TCP in IP.DRV to successfully handle activeopen() (which runs a simple HTTP logon to the CTPP web server) as well as passiveopen() (which runs TELSRV, a simplistic TELNET server). Managed to get one Windows XP TELNET session to connect to TELSRV over the internet. Modified TELSRV to handle multiple open connections. Annoyed that Microsoft TELNET sends a frame for every character, added code to handle BACKSPACE characters correctly - essential to correct typing errors. Tested multi-session TELNET server running on 82.68.176.217:23 over the internet, now seems to work fine. I will leave this running, along with the simple UDP datagram server on 82.68.176.217:51717. Windows XP command.com does not run TELNET as well as cmd.exe, with command.com the screen sometimes comes up blank with no character echo. I didn't realise how easy it was to extend TELNET to other server protocols, until I logged on to our SMTP server and sent a simple email, then logged on to the POP3 server and retrieved it from our mailbox. The SMTP and POP3 servers at Zen cannot handle BACKSPACE characters in the input stream, which made me smile. Happy Days.

6/9/2009 CC Compiler bug encountered - type cast of array name to pointer to arraymember[0] does not work where the array name is a member of a structure being passed to a function. Not serious, instead of struct->memberarrayname, use &struct->memberarrayname[0]. There is heaps of C code out there which uses this notation, so this sort of compiler bug must be common.
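
For reference, the two notations are equivalent in standard C: an array member used as a function argument decays to a pointer to its first element. The structure and function names below are made up for the example.

```c
#include <stddef.h>

struct packet {
    int length;
    unsigned char payload[8];
};

/* Sum n bytes starting at p. */
size_t sum_bytes(const unsigned char *p, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Array member decays to a pointer: the form that tripped the compiler bug. */
size_t sum_via_decay(const struct packet *pk)
{
    return sum_bytes(pk->payload, sizeof pk->payload);
}

/* Explicit address of element 0: the workaround, same pointer value. */
size_t sum_via_address(const struct packet *pk)
{
    return sum_bytes(&pk->payload[0], sizeof pk->payload);
}
```

On a conforming compiler both functions receive the same address and return the same result.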

4/9/2009 COBDEN MCS/CTPPNET/IP Further debugged the new COBDEN MCS this week during running of the print machine. Issues: (1): HEATERON state not recognised - old MCS checked HEATERON state from inside timer interrupt handler, i.e. 100 times per second, in order to gauge when the machine was ready to run (Heaters HOT) or safe to stop without setting fire to the paper (Heaters COLD). Fix: call readCOBDENstatus inside tmrhnlder() rather than int_timer() which is an old function no longer used! (2) NE1000 network card transmitting frames with zero MAC address - had commented out memcpy of MAC address into NETUSER in IPX.HTM (this is now done in network card .DRV). Fix: since COBDEN is an ISA machine, compile with a special version of NE.HTM (old NE1000 code) which pokes MAC address into NETUSER. (3) Incoming network frames with IEEE 802.3 length >=0x800 hit breakpoint - these are IP packets hitting the system now that the 10baseT network is connected to the 10/100 LAN through a switch. Fix: ignore the packets. (NB: by choosing to process these packets, COBDEN is connected through the router to the internet!!)

Another small issue on the network was breakpointed this week: IEEE 802.2 frames with length = 0x600. Every now and again, the ChaOS network receives a packet which generates phantom users A??????, B??????, C?????? and so on. Trapped one of these frames this week - they are sent when a MS Windows box starts up, or is woken from standby. Apparently this is a XEROX (XNS) network frame, not sure what it does, but I think XNS is still used in LCP (Link Control Protocol) which is seen during router ADSL connection.
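
The frames in issue (3) and the 0x600 frames can be told apart by the 16-bit field at offset 12 of the Ethernet frame: values up to 1500 are an IEEE 802.3 length, larger values are a protocol type. A minimal sketch, with hypothetical names, covering only the types mentioned in this entry plus ARP:

```c
#include <stdint.h>

typedef enum {
    FRAME_8023_LENGTH,   /* field is a length, e.g. IPX over 802.3 */
    FRAME_XNS,           /* type 0x0600, Xerox XNS                 */
    FRAME_IPV4,          /* type 0x0800                            */
    FRAME_ARP,           /* type 0x0806                            */
    FRAME_OTHER
} FrameKind;

/* Classify a frame by the big-endian 16-bit field after the two MAC addresses. */
FrameKind classify_frame(const uint8_t *frame)
{
    uint16_t v = (uint16_t)((frame[12] << 8) | frame[13]);

    if (v <= 1500)
        return FRAME_8023_LENGTH;
    switch (v) {
    case 0x0600: return FRAME_XNS;
    case 0x0800: return FRAME_IPV4;
    case 0x0806: return FRAME_ARP;
    default:     return FRAME_OTHER;
    }
}
```

A receive handler could drop everything except FRAME_8023_LENGTH to ignore the packets quietly, or hand FRAME_IPV4 and FRAME_ARP to an IP driver if one is loaded.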

1/9/2009 RTL8139 Had a problem with RTL8139 network card on IBM Thinkcentre and Dell XEON, resulting in horrible crash when receiver ring buffer wrapped around. Tried modifying code, but kept losing control of keyboard as the buffer wrapped, even though the network card then appeared to be working fine. Tried sending EOI to hardware after each packet but this doesn't work. Finally changed a while(){} loop to if(){} in the receiver interrupt handler, effectively meaning the handler is called once for each packet received, rather than checking for additional incoming packets before returning. The while(){} loop mimics the old NE1000/NE2000 handler (RTL 8029), my first successful network driver. Clearly this is wrong for RTL8139, or the hardware needs additional outputs for this method to work. The bug was nasty, because the receiver indicates that another frame has been received before completing the DMA transfer of the frame to memory.