Inaccuracies in the LZX documentation

A few more days of debugging passed.

I really appreciate having documentation available from Microsoft, so that I can give somewhat correct names to the binary structures.

But progress is slow, as the available pages only provide twitter-style sentences to explain how the blocks tie together, and even these are sometimes misleading. Still, I can't complain much: progress is slow, but not frozen.

I've now reached a point where an implementation of the Microsoft LZX-2 algorithm is required to compress data. Looking at alternative implementations made by others, such as the WINE team, I found this funny comment:
/* LZX decruncher */
/* Microsoft's LZX document and their implementation of the
* com.ms.util.cab Java package do not concur.
*
* In the LZX document, there is a table showing the correlation between
* window size and the number of position slots. It states that the 1MB
* window = 40 slots and the 2MB window = 42 slots. In the implementation,
* 1MB = 42 slots, 2MB = 50 slots. The actual calculation is 'find the
* first slot whose position base is equal to or more than the required
* window size'. This would explain why other tables in the document refer
* to 50 slots rather than 42.
*
* The constant NUM_PRIMARY_LENGTHS used in the decompression pseudocode
* is not defined in the specification.
*
* The LZX document does not state the uncompressed block has an
* uncompressed length field. Where does this length field come from, so
* we can know how large the block is? The implementation has it as the 24
* bits following after the 3 blocktype bits, before the alignment
* padding.
*
* The LZX document states that aligned offset blocks have their aligned
* offset huffman tree AFTER the main and length trees. The implementation
* suggests that the aligned offset tree is BEFORE the main and length
* trees.
*
* The LZX document decoding algorithm states that, in an aligned offset
* block, if an extra_bits value is 1, 2 or 3, then that number of bits
* should be read and the result added to the match offset. This is
* correct for 1 and 2, but not 3, where just a huffman symbol (using the
* aligned tree) should be read.
*
* Regarding the E8 preprocessing, the LZX document states 'No translation
* may be performed on the last 6 bytes of the input block'. This is
* correct. However, the pseudocode provided checks for the *E8 leader*
* up to the last 6 bytes. If the leader appears between -10 and -7 bytes
* from the end, this would cause the next four bytes to be modified, at
* least one of which would be in the last 6 bytes, which is not allowed
* according to the spec.
*
* The specification states that the huffman trees must always contain at
* least one element. However, many CAB files contain blocks where the
* length tree is completely empty (because there are no matches), and
* this is expected to succeed.
*/

Funny, because a decade has passed and we still see official specifications from Microsoft for other formats containing lapses, mistakes and misplacements. Even the patent claim they made for the WIM specification contains inaccuracies.

Would be nice to see things change.
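
For reference, the slot calculation quoted in the comment above ("find the first slot whose position base is equal to or more than the required window size") can be sketched in a few lines of C. This is only a sketch: the function names are mine, and the extra-bits table (0,0,0,0,1,1,2,2,... capped at 17) is an assumption taken from how open-source LZX decoders build it, not from the Microsoft document.

```c
#include <assert.h>
#include <stdint.h>

#define LZX_MAX_WINDOW (1u << 21)  /* 2MB, the largest window the comment mentions */

/* Extra (footer) bits per position slot: 0,0,0,0,1,1,2,2,...,16,16,17,17,17,...
 * Assumption: this table follows common open-source LZX decoders. */
static unsigned lzx_extra_bits(unsigned slot)
{
    unsigned bits = (slot < 4) ? 0 : slot / 2 - 1;
    return (bits > 17) ? 17 : bits;
}

/* Number of position slots: index of the first slot whose position base
 * is equal to or more than the window size. */
unsigned lzx_position_slots(uint32_t window_size)
{
    uint32_t base = 0;   /* position base of the current slot */
    unsigned slot = 0;

    assert(window_size <= LZX_MAX_WINDOW);
    while (base < window_size) {
        /* The next slot starts where this slot's range ends. */
        base += (uint32_t)1 << lzx_extra_bits(slot);
        slot++;
    }
    return slot;
}
```

With this calculation a 1MB window gives 42 slots and a 2MB window gives 50, which are the values the comment attributes to the implementation rather than to the document.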


Inaccuracy in the WIM documentation

For those brave souls in the future trying to interpret the WIM headers using the documentation provided by Microsoft: there is a typo in the declaration of the RESHDR_BASE_DISK data structure.

Where one reads:
typedef struct _RESHDR_BASE_DISK
{
    ULONGLONG ullSize;
    BYTE sizebytes[7];
    LARGE_INTEGER liOffset;
}

It should actually read:
typedef struct _RESHDR_BASE_DISK
{
    BYTE bFlags;
    BYTE sizebytes[7];
    LARGE_INTEGER liOffset;
} RESHDR_BASE_DISK;

The only change is replacing ULONGLONG ullSize with BYTE bFlags. It matters because a BYTE takes only 8 bits whereas a ULONGLONG takes 64 bits, so the documented declaration pushes every following field 7 bytes too far.

If you read the header from a binary file using the documented declaration, you will be stuck with wrong results.
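
To make the shift visible, here is a minimal sketch in portable C. The two struct names and the helper functions are hypothetical, byte arrays stand in for the Windows types so that the compiler adds no padding, and I am assuming the fields are stored little-endian on disk. Check it against your own hex dump before trusting it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-array stand-ins for the Windows types, so no padding is inserted:
 * ULONGLONG and LARGE_INTEGER are 8 bytes, BYTE is 1 byte. */
typedef struct {
    uint8_t ullSize[8];    /* as documented: ULONGLONG ullSize */
    uint8_t sizebytes[7];
    uint8_t liOffset[8];
} reshdr_as_documented;    /* hypothetical name */

typedef struct {
    uint8_t bFlags;        /* as corrected: BYTE bFlags */
    uint8_t sizebytes[7];
    uint8_t liOffset[8];
} reshdr_as_corrected;     /* hypothetical name */

static_assert(sizeof(reshdr_as_documented) == 23, "documented layout is 23 bytes");
static_assert(sizeof(reshdr_as_corrected) == 16, "corrected layout is 16 bytes");

/* Decode an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t value = 0;
    for (size_t i = n; i > 0; i--)
        value = (value << 8) | p[i - 1];
    return value;
}

/* Read liOffset from a raw 16-byte header using the corrected layout. */
uint64_t reshdr_read_offset(const uint8_t *raw)
{
    return read_le(raw + offsetof(reshdr_as_corrected, liOffset), 8);
}
```

With the documented declaration liOffset would be looked up at byte 15 of the header, with the corrected one at byte 8. Reading field by field from a byte buffer, instead of casting the buffer to a struct, also avoids surprises from compiler padding.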

I had actually noted this detail over a year ago. Now I was looking at it again and had to spend around two days doing the math and printing hex dumps to see why things were not looking right (and getting some grey hairs).

So now I've decided to write it down once and for all on the blog, so that it won't get forgotten again. If it helped you, do let me know.

Happy Christmas!

:)

Fresh start on the Alexa rank


I like watching the Alexa rank to get a notion of how different sites fare in terms of mainstream audience.

Wonko "the Sane" has a very different opinion about the usefulness of this rank. Yet it is interesting to observe how we are starting from a fresh new domain that was ranked around 2 million about a week ago and is already breaking the top 30 000 barrier.

Not something that we get to see very often.


It is more interesting to note that over the past weeks our rank had been around 60 000. Moving to a new domain has also hurt our relevance to search engines, as an old domain carries far more trust than a brand new one, and this brings fewer visitors from Google and the like.

I'm just glad that this transition is going so smoothly. A lot of things could have gone wrong and a lot more still needs to be fixed but so far, I'm really happy to see how we are doing.

:)

Support for Android and iPhone

One of the advantages of upgrading the board is that we also get to enjoy some of its recent changes.

Here is one of the nice surprises: support for mobile browsing is already available by default.

The Reboot empire.

Boot Land was rebooted. Why?

Over the past few months, anyone could notice that, despite our growth in popularity, page views and visits, we were no longer staying true to our founding principles.

Our once peaceful netizens turned the public forums into a circus arena for never-ending quarrels.

Looking back, I see how much energy was wasted in defending or destroying opposing positions between aristocracy members, without practical results. Instead of seeing progress, I'd risk stating that we actually saw regress and crisis settle into our development/research projects across 2010.

-------------------------

The Boot Land republic


The flagship weapon of our community, Winbuilder, saw script warlords requesting so many new features from a script engine craftsman who seldom said no. Encouraged by a multitude of feedback and requests, he began an effort to enforce syntax correctness that would last the following two years.

From my perspective, this craftsman was (and is) well intentioned. But each new version would disrupt scripts coded for older versions of the weapon. This forced warlords to re-train themselves and update all scripts in their weaponry stock.

The script warlords were (and are) well intentioned. They know that winbuilder is one of the strongest weapons in their arsenal. They desire new specialized features that may give them an advantage on battlefields, not just in open plains as before, but also in mountains, swamps and tropical environments.

I would have preferred to see other weapons being used together with winbuilder to achieve optimal performance in combat, rather than seeing both the engine craftsman and these warlords creating a tool that served their specific situation alone. Since design simplicity was no longer present, we really lost the single most important combat advantage.

The engine kept evolving toward a reality ever further from the practical reality of present battlefields. While warlords of conquered domains kept using older versions and considered other weapons for conquering new territories, new versions were ignored or deeply criticized.


Personal conflicts escalated to unprecedented levels of animosity, with each side looking to find guilt in the others.

Rage settles in, projects are removed from public sight, opposing parties verbally attack each other at the first sign of public movement. This discourages thousands of netizens from joining the public forum. Vengeance, rather than reason, becomes a frequent dish served at public gatherings that are now only frequented by a few surviving senators who observe, but seldom intervene for fear of retaliation.

Ironically enough, thanks to the plural investment in many other projects instead of being just a winbuilder-centric community, we also saw the Boot Land domain reach unparalleled growth across the boot disk universe on the Internet.

We see for the first time the barrier of 700 000 page views being broken and also celebrate the success of many excellent projects promoted by unaligned brave souls in our community, who put their heart into the work and advance the boot disk state of the art by themselves.


Conflicts between aristocratic members stall any decision or course of action for the future.

Our online republic fails.

-------------------------

The reboot empire begins

Situations of this kind are not uncommon in any community of reasonable size. I clearly remember a sequence of similar events that took place at 911CD.net some years ago. The work of Bart, the author of BartPE, far surpassed in popularity that of DoctorXP, author of the 911CD project.

DoctorXP stepped down from public activity and BartPE became the de facto tool in the coming years. Bart lost interest after some years and his work was left to other initiatives such as Reatogo and ubcd4win. Conflicts soon started to fray the once peaceful environment at 911CD.net, escalating to a state of pandemonium until everyone was unhappy.



As time passed, our state of conflict came to resemble ever more closely the one observed at 911CD.

Nobody was accepted as right by the others, nor admitted wrongdoing on their own side, yet decisions still needed to be made, so the state of Imperium was declared.




The reboot empire comes to life.

Martial law is instituted to restore a sense of order amidst the political chaos. Those who cross the line of civilized manners are handled summarily regardless of their rank in society. A sad period, but necessary to prevent our public forum from turning back into a public arena.

We live in the age of pax romana.

The goal is clear. We work to rebuild stability, to define the new milestones of expansion for our domain and to ensure that our society regains its own balance to conquer new territories. These decisions will surely not please everyone but we are a breed of fighters.

We reboot.



RawReg included in pwning bootkit

Just hearing the name RawReg brings back some really good memories from the attic.

So, it was kind of fun to read IceCube mentioning that it was included by default in the Stoned bootkit, a project described by the author as:
Stoned Bootkit is a new Windows bootkit which attacks all Windows versions from 2000 up to 7. It is loaded before Windows starts and is memory resident up to the Windows kernel. Thus Stoned gains access to the entire system. It has exciting features like integrated file system drivers, automatic Windows pwning, plugins, boot applications and much much more. The project is partly published as open source under the European Union Public License. Like in 1987, "Your PC is now Stoned! ..again".

Peter Kleissner, Software Developer in Vienna
The project can be found at http://stoned-vienna.com/

-------------

Well, this certainly brought back good memories of a time when I didn't worry about the integration of enterprise applications and their survivability in the long term.





Here is a screenshot of the "about" screen of RawReg, running under Wine on Mac OS.










Looking to the future, many plans lie ahead. However, I still question every day whether there will ever be the time and commitment to see them through.

700 000 page views

For the first time since its inception in 2006, Boot Land has reached a new record with more than 700 000 page views served in a single month.

This means that over the course of 11 months, we have grown by more than 77% in popularity when compared to last year.

Over the same period of time, bandwidth consumption on the server has surpassed 100Gb per day, while our RAM usage was reduced to a little above 4Gb.

Things look bright when looking at these numbers and there are indeed many good reasons to be proud. However, not all the news is rosy, as seen in the case of the winbuilder wars.

The good part is that our community has finally outgrown this state of warfare and other projects are giving very solid signs of growth as well. Across 2010 we can see grub4dos, Sardu, multiPE, Wimb's work and many other good projects rising to become top tools in this industry. These projects are indeed becoming the de facto tools in this arena.

At this rate, 2011 is already promising to become a really interesting year in terms of community achievements.

:)