Inaccuracies in the LZX documentation

A few more days of debugging passed.

I really appreciate having documentation available from Microsoft, so that I can give somewhat correct names to the binary structures.

But progress is slow, as the available pages only provide Twitter-style sentences to explain how the blocks tie together, and even these are sometimes misleading. Nevertheless, I can't complain much: progress is slow but not frozen.

I've now reached a point where an implementation of the Microsoft LZX-2 algorithm is required to compress data. Looking at alternative implementations made by others, such as the WINE team, I found this funny comment:
/* LZX decruncher */
/* Microsoft's LZX document and their implementation of the
* com.ms.util.cab Java package do not concur.
*
* In the LZX document, there is a table showing the correlation between
* window size and the number of position slots. It states that the 1MB
* window = 40 slots and the 2MB window = 42 slots. In the implementation,
* 1MB = 42 slots, 2MB = 50 slots. The actual calculation is 'find the
* first slot whose position base is equal to or more than the required
* window size'. This would explain why other tables in the document refer
* to 50 slots rather than 42.
*
* The constant NUM_PRIMARY_LENGTHS used in the decompression pseudocode
* is not defined in the specification.
*
* The LZX document does not state the uncompressed block has an
* uncompressed length field. Where does this length field come from, so
* we can know how large the block is? The implementation has it as the 24
* bits following after the 3 blocktype bits, before the alignment
* padding.
*
* The LZX document states that aligned offset blocks have their aligned
* offset huffman tree AFTER the main and length trees. The implementation
* suggests that the aligned offset tree is BEFORE the main and length
* trees.
*
* The LZX document decoding algorithm states that, in an aligned offset
* block, if an extra_bits value is 1, 2 or 3, then that number of bits
* should be read and the result added to the match offset. This is
* correct for 1 and 2, but not 3, where just a huffman symbol (using the
* aligned tree) should be read.
*
* Regarding the E8 preprocessing, the LZX document states 'No translation
* may be performed on the last 6 bytes of the input block'. This is
* correct. However, the pseudocode provided checks for the *E8 leader*
* up to the last 6 bytes. If the leader appears between -10 and -7 bytes
* from the end, this would cause the next four bytes to be modified, at
* least one of which would be in the last 6 bytes, which is not allowed
* according to the spec.
*
* The specification states that the huffman trees must always contain at
* least one element. However, many CAB files contain blocks where the
* length tree is completely empty (because there are no matches), and
* this is expected to succeed.
*/
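The "first slot whose position base is equal to or more than the required window size" rule from the comment can be sketched in Java. This is only a minimal sketch, not the WINE or Microsoft code: the class and method names are hypothetical, and the rule used to build the position base table (no extra bits for the first four slots, then one more bit every two slots, capped at 17) is my reading of publicly available LZX decoders, so treat it as an assumption.

```java
// Hypothetical helper, not taken from any existing LZX implementation.
final class LzxSlots {

    // Assumption: 51 entries are enough to cover windows up to 2MB.
    private static final int MAX_SLOTS = 51;

    /** Builds the position base table: base[i+1] = base[i] + 2^extraBits(i). */
    static int[] positionBases() {
        int[] base = new int[MAX_SLOTS];
        for (int i = 0; i + 1 < MAX_SLOTS; i++) {
            // Slots 0..3 carry no extra bits; afterwards one more bit
            // every two slots, capped at 17 (assumed rule, see the text above).
            int extraBits = (i < 4) ? 0 : Math.min(i / 2 - 1, 17);
            base[i + 1] = base[i] + (1 << extraBits);
        }
        return base;
    }

    /** Returns the first slot whose position base is >= windowSize. */
    static int positionSlots(int windowSize) {
        int[] base = positionBases();
        for (int slot = 0; slot < base.length; slot++) {
            if (base[slot] >= windowSize) {
                return slot;
            }
        }
        throw new IllegalArgumentException("window too large: " + windowSize);
    }

    public static void main(String[] args) {
        System.out.println("1MB window: " + positionSlots(1 << 20) + " slots");
        System.out.println("2MB window: " + positionSlots(1 << 21) + " slots");
    }
}
```

Under these assumptions the sketch gives 42 slots for a 1MB window and 50 slots for a 2MB window, which are the values the comment attributes to the implementation rather than the 40 and 42 from the table in the document.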

Funny because a decade has passed and we still see specifications for other formats containing lapses, mistakes and misplacements in official docs from the MS corporation. Even the patent claim they made for the WIM specification contains inaccuracies.

Would be nice to see things change.


Inaccuracy in the WIM documentation

For those brave souls in the future trying to interpret the WIM headers using the documentation provided by Microsoft: there is a typo in the declaration of the data structure for RESHDR_BASE_DISK.

Where one reads:
typedef struct _RESHDR_BASE_DISK
{
ULONGLONG ullSize;
BYTE sizebytes[7];
LARGE_INTEGER liOffset;
}

It should actually be read as:
typedef struct _RESHDR_BASE_DISK
{
BYTE bFlags;
BYTE sizebytes[7];
LARGE_INTEGER liOffset;
}

The only difference is replacing ULONGLONG ullSize with BYTE bFlags. It matters because a BYTE is only 8 bits wide whereas a ULONGLONG is 64 bits wide, so with the documented declaration every field that follows lands at the wrong offset.

If you're trying to read the header from a binary file using the documented layout, you'd be stuck with the wrong results.

I had actually noted this detail over a year ago. Now I was looking at this again and had to spend around two days doing the math and printing hex dumps to see why things were not looking right (and got some grey hairs).

So, now I've decided to write it down once and for all in the blog so that it won't get forgotten again. If it helped you, do let me know.

Happy Christmas!

:)

Fresh start on the Alexa rank


I like watching the Alexa rank to get a notion of how different sites fare in terms of mainstream audience.

Wonko "the Sane" has a very different opinion about the usefulness of this rank. Yet it is indeed interesting to observe how we are starting from a fresh new domain that was ranked around 2 million about a week ago and is already breaking the top 30 000 barrier.

Not something that we get to see very often.


It is also interesting to note that over the past weeks our rank had been around 60 000. Moving to a new domain has also hurt our relevance to search engines, as an old domain carries far more trust than a brand new one, and this brings fewer visitors from Google and the like.

I'm just glad that this transition is going so smoothly. A lot of things could have gone wrong and a lot more still needs to be fixed but, so far, I'm really happy to see how we are doing.

:)

Support for Android and iPhone

One of the advantages of upgrading the board is that we also enjoy some of the recent changes.

Here is one of the nice surprises: support for mobile browsing is already available by default.

The Reboot empire.

Boot Land was rebooted, why?

Over the past few months, anyone could note that despite our growth in popularity, page views and visits, we were no longer working true to our founding principles.

Our once peaceful netizens turned the public forums into a circus arena for never-ending quarrels.

Looking back, I see how much energy was wasted in defending or destroying opposing positions between aristocracy members, without practical results. Instead of seeing progress, I'd risk stating that we actually saw regression and crisis settle into our development/research projects across 2010.

-------------------------

The Boot Land republic


The flagship weapon of our community, Winbuilder, saw script warlords requesting so many new features from a script engine craftsman who seldom said no. Encouraged by a multitude of feedback and requests, he began an effort to establish syntax correctness that would last the following two years.

From my perspective, this craftsman was (and is) well intentioned. But each new version would disrupt scripts coded for older versions of the weapon. This forced warlords to retrain themselves and update all the scripts in their weaponry stock.

The script warlords were (and are) well intentioned. They know that winbuilder is one of the strongest weapons in their arsenal. They desire new specialized features that may give them an advantage on battlefields, not just in open plains as before, but also in mountains, swamps and tropical environments.

I would have preferred to see other weapons being used together with winbuilder to achieve optimal performance in combat, rather than seeing both the engine craftsman and these warlords creating a tool that served their specific situation alone. Since design simplicity was no longer present, we really lost the single most important combat advantage.

The engine kept evolving toward a reality ever further removed from the practical reality of present battlefields. While warlords of conquered domains kept on using older versions and considered other weapons for conquering new territories, new versions were ignored or deeply criticized.


Personal conflicts escalated to unprecedented levels of animosity to find guilt in others.

Rage settles in, projects are removed from public sight, and opposing parties verbally attack each other at the first sight of public movement. This discourages thousands of netizens from joining the public forum. Vengeance, rather than reason, becomes a frequent dish served at public gatherings that are now only frequented by a few surviving senators who observe but seldom intervene for fear of retaliation.

Ironically enough, thanks to the plural investment in many other projects instead of being just a winbuilder-centric community, we also saw the Boot Land domain rise to unparalleled growth across the boot disk universe on the Internet.

We see the barrier of 700 000 page views being broken for the first time and also celebrate the success of many excellent projects promoted by unaligned brave souls in our community, who put their heart into the work and move the boot disk state of the art forward by themselves.


Conflicts between aristocratic members stall any decision or course of action for the future.

Our online republic fails.

-------------------------

The reboot empire begins
Situations of this kind are not uncommon in any community of reasonable dimension. I clearly remember a sequence of similar events that took place at 911CD.net some years ago. Bart, the author of BartPE, surpassed by far in popularity the work of DoctorXP, author of the 911CD project.

DoctorXP stepped down from public activity and BartPE became the de facto tool in the coming years. Bart lost interest after some years and his work was left to other initiatives such as Reatogo and ubcd4win. Conflicts soon started to fray the once peaceful environment at 911CD.net, descending into a state of pandemonium until everyone was unhappy.



As time passed, our state of conflict came to resemble ever more closely the one observed at 911CD.

Since nobody was accepted as right by the others, nor admitted wrongdoing on their own side, and decisions still needed to be made, the state of Imperium was declared.




The reboot empire comes to life.

Martial law is instituted to restore a sense of order amidst the political chaos. Those who cross the line of civilized manners are handled summarily regardless of their rank in society. A sad period, but necessary to prevent our public forum from turning back into a public arena.

We live in the age of pax romana.

The goal is clear. We work to rebuild stability, to define the new milestones of expansion for our domain and to ensure that our society once again regains its own balance to conquer new territories. These decisions will surely not please everyone, but we are a breed of fighters.

We reboot.



RawReg included in pwning bootkit

Just hearing the name of RawReg brings back some really good memories from the attic.

So, it was kind of fun to read IceCube mentioning that it was included by default in the Stoned bootkit, a project described by the author as:
Stoned Bootkit is a new Windows bootkit which attacks all Windows versions from 2000 up to 7. It is loaded before Windows starts and is memory resident up to the Windows kernel. Thus Stoned gains access to the entire system. It has exciting features like integrated file system drivers, automatic Windows pwning, plugins, boot applications and much much more. The project is partly published as open source under the European Union Public License. Like in 1987, "Your PC is now Stoned! ..again".

Peter Kleissner, Software Developer in Vienna
The project can be found at http://stoned-vienna.com/

-------------

Well, this certainly brought back good memories of a time when I didn't worry about the integration of enterprise applications and their survivability in the long term.





Here is a screenshot of the "about" screen of rawreg while running under Wine on Mac OS.










Looking forward to the future, many plans lie ahead. However, I still question every day if there will ever be time and commitment to see them through.

700 000 page views

For the first time since its inception in 2006, Boot Land has reached a new record with more than 700 000 page views served in a single month.

This means that over the course of 11 months, we have grown more than 77% in terms of popularity when compared to last year.

Over the same period of time, the daily consumption of bandwidth on the server has surpassed 100 GB per day while our RAM usage was reduced to a little above 4 GB.

Things look bright when looking at these numbers and there are indeed many good reasons to be proud. However, not all the news is rosy, as seen in the case of the winbuilder wars.

The good part is that our community has finally outgrown this state of warfare and other projects are also giving very solid signs of growth. Across 2010 we can see grub4dos, Sardu, multiPE, Wimb's work and many other good projects rising to become top tools in this industry. These projects are indeed becoming the de facto tools in this arena.

At this rate, 2011 is already promising to become a really interesting year in terms of community achievements.

:)

To EJB, or not to EJB?

I've found myself asking this question: what advantages does EJB bring?

I googled a lot and found a lot of bulleted lists repeated across many sites, but it is not so easy to find the real reasons that might drive a person to consider EJB over other options.

Finally, I found a really good article by Humphrey Sheil that was written in 2000. It covers the fundamental questions that one should ask oneself while looking at this technology, even after a decade has passed.

Below is the introduction:
To EJB, or not to EJB: that is the question.
Whether 'tis nobler in the mind, to suffer
The slings and arrows of outrageous licensing;
Or to take arms against a sea of potential overheads and features,
And by opposing end them? To roll your own: to reinvent the wheel;
No more; and by reinvent, to say, we continue
The heart-ache of low-level systems maintained in-house,
and the thousand natural shocks
That flesh is heir to; 'tis a consummation
Devoutly to be avoided.



Disabling SVN on Windows

Windows has some fascinating characteristics and others that are not so amusing.

Antivirus software often gets in the way of newly created files, and this turned my use of SVN into a true nightmare.

Hence, I needed to remove SVN from one of the projects that I'm working on, and there is no single straightforward way of doing this from the coding IDE.

Looking around for advice, I see many people recommending just removing all the folders named ".svn". However, this is a lot of manual work when the project is of medium dimension.

I tried it once but it's too time-consuming. So, looking around the web, I found a nifty way of cleaning SVN from a project.

Create a .bat file, place it in the root of your project and put the following code inside:

FOR /F "tokens=*" %%G IN ('DIR /B /AD /S *.svn*') DO RMDIR /S /Q "%%G"

I take no credit for the snippet as it came from this page: http://www.sean-barton.co.uk/2009/07/how-to-recursively-remove-svn-directories/
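If you would rather do the same cleanup from Java instead of a batch file, here is a minimal sketch of the idea. The class and method names are hypothetical, and unlike the batch pattern `*.svn*` it only removes directories named exactly ".svn". It sticks to java.io.File so it should also compile on older Java versions.

```java
import java.io.File;

// Hypothetical utility: deletes every directory named ".svn" below a root.
final class SvnCleaner {

    /** Recursively deletes a file or directory; returns true on success. */
    static boolean deleteTree(File target) {
        File[] children = target.listFiles(); // null for plain files
        if (children != null) {
            for (File child : children) {
                deleteTree(child);
            }
        }
        return target.delete();
    }

    /** Removes each ".svn" directory under root and returns how many were removed. */
    static int clean(File root) {
        int removed = 0;
        File[] children = root.listFiles();
        if (children == null) {
            return 0; // not a directory, or not readable
        }
        for (File child : children) {
            if (!child.isDirectory()) {
                continue;
            }
            if (child.getName().equals(".svn")) {
                if (deleteTree(child)) {
                    removed++;
                }
            } else {
                removed += clean(child); // keep looking deeper
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        System.out.println("Removed " + clean(root) + " .svn directories");
    }
}
```

As with the batch file, run it against a copy of the project first, since the deletion cannot be undone.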

Hope it helps if you ever run into the same nuisance.

Getting real, the software development model


Found a very nice description of a software development model called "getting real". In a world where processes become the salvation to achieve efficient results in complex systems, we find a group with a different twist of perspective.

They advise that, when competing, one should scale down the features instead of scaling up, and avoid decisions that are commonly made too early in the project development and eventually lead to a result that nobody is happy about.

I can't say that I agree with all their comments but, in all honesty, I should admit that our development of WinBuilder followed a good part of their guidelines to thrive.

Here's one of the things you might find inside their pages:

Be An Executioner

It's so funny when I hear people being so protective of ideas. (People who want me to sign an nda to tell me the simplest idea.)

To me, ideas are worth nothing unless executed. They are just a multiplier. Execution is worth millions.

Explanation:

  • Awful idea = -1
  • Weak idea = 1
  • So-so idea = 5
  • Good idea = 10
  • Great idea = 15
  • Brilliant idea = 20
  • No execution = $1
  • Weak execution = $1000
  • So-so execution = $10,000
  • Good execution = $100,000
  • Great execution = $1,000,000
  • Brilliant execution = $10,000,000

To make a business, you need to multiply the two.

The most brilliant idea, with no execution, is worth $20. The most brilliant idea takes great execution to be worth $20,000,000.

That's why I don't want to hear people's ideas. I'm not interested until I see their execution.

—Derek Sivers, president and programmer, CD Baby and HostBaby

You can read about them here: http://gettingreal.37signals.com/toc.php

:)

Partnership with 360Amigo

We've become the official discussion forum for products released by 360amigo, a company that recently began releasing its tools to the public, such as System Speedup, a free tool for home users.

Their forum section is hosted at VirusRemoval.pro right next to the discussion forum of Ninja pendisk - http://virusremoval.pro/forumdisplay.php?fid=12

At VirusRemoval we sometimes receive invitations from security companies to test a given security suite. As a house policy on these cases, we only test and help products that are at minimum free for home users.

One of these security companies introduced us to 360amigo and the dev team behind it are good guys, so we decided to help them and bring some visibility to their free tool while also bringing some diversity to our forum discussions. A win-win situation for both ends.

You can visit the company site at http://360amigo.com and let us know at VirusRemoval your opinion.

:)

http://ous.in acquired

It cost a whopping 8 dollars to acquire this short domain that allows setting up domain hacks like http://fabul.ous.in, http://danger.ous.in or even marvel.ous.in, amongst any others that you might imagine.

Despite having no use in sight for "ous" at the moment, it was a good opportunity at a reasonable price. Good enough to compose hundreds of common English words as domain hacks for future projects.

This kind of reminds me of the Delicious website, which initially used http://del.icio.us before it was acquired by Yahoo! Inc. Funnily enough, Delicious itself was founded by a Carnegie Mellon student, Joshua Schachter.

But back to earth, if you have any suggestions regarding how to put this URL into some good use, then do let me know!

:)

Boot Land keeps rising up the charts.

I'm really proud to see Boot Land achieving such good results in terms of popularity and overall ranking on a global scale.

Over these past few days we've kept moving up, surpassing a giant site like MSFN and reaching a position among the top 30 000 sites around the globe.

30k is certainly far from the top 1000 but, considering that our community does not deal with mainstream topics like fashion, movies or even generic PC support, I would say that things are looking bright in our arena when compared to other sites in a similar specialty field.

Congratulations are due to the Boot Land community!

:)




New laptop - Toshiba R630

Lo and behold, I'm back on Windows!

Yesterday I got myself a brand new machine that comes with Windows 7 x64 preinstalled.

Guess this concludes a cycle that began two years ago when I removed Vista and installed Ubuntu to use it as a full-time replacement desktop OS.

At that time, Vista was a serious nuisance and moving to Ubuntu was a real blessing that prevented me from going bald at an early age.

Now Windows 7 has come. Lots of mistakes were learned the hard way, hardware kept evolving and I have even more experience on my side, as I've been using Ubuntu and Mac OS X during this interregnum.

I won't spend time talking about the good or bad things on either side, but I can truly say that using each one of them for a certain period of time is an enriching experience.

Now, I'm back to my roots and ready to have fun.

Choosing a new machine wasn't easy. So I asked for the opinion of a hardline, no-nonsense structural engineer who follows the laptop trends. He needs to work intensively with AutoCAD, probably the most resource-hungry application in the world, so who better to ask for an opinion? :)

This guy also happens to be my younger brother and it is interesting to hear his thoughts on new technology.

He recommended the Toshiba R630, a laptop that judging by the pictures that I saw online was probably one of the ugliest and un-sexiest machines on the market.

However, the machine is in fact a hidden gem when you read between the lines and compare it against other laptops.

It comes with an i5 processor (with 4 Intel x64 CPU cores), 4 GB of RAM, a battery that runs up to 8 hours in economy mode and weighs a stunning 1.5 kilograms packed with all that processing power.

And when looking at the machine in real life, it is very discreet and small.

There's no fluff on the laptop. The display is not glossy, there are no dummy buttons and it goes against the current trend of PC manufacturers to look like a cheap copy of the MacBooks.

It's a real PC and I'm happy with it.

:)

Snippet: Fetching a regular expression

I'm including a small handy snippet to use regular expressions in your Java code.

Regular expressions save time when you need to retrieve a very specific portion of text within a string. In this case, I've used one to gather text from an HTML page.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Gets a string value from laptop characteristics based on a given pattern.
 * A Matcher object is used internally.
 *
 * @param source string containing the text to be parsed
 * @param reg regular expression pattern to use
 * @param group index of one of the groups found by the pattern
 * @return String containing the found pattern, or null otherwise
 */
private String findRegEx(String source, String reg, int group) {
    String out = null;

    Pattern p = Pattern.compile(reg);     // Prepare the search pattern.
    Matcher matcher = p.matcher(source);  // Retrieve our items.

    // Only read the group when there is a match and the index is valid,
    // instead of swallowing the exception that an invalid index would throw.
    if (matcher.find() && group >= 0 && group <= matcher.groupCount()) {
        out = matcher.group(group);
    }

    return out;
}
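To show how the method can be called, here is a small usage sketch. The class name, the HTML fragment and the patterns are all made up for illustration, and the helper is repeated in a compact static form only so that the example compiles on its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the HTML fragment and the patterns are made up.
final class RegexDemo {

    // Same idea as the snippet above, repeated so this example compiles alone.
    static String findRegEx(String source, String reg, int group) {
        Matcher matcher = Pattern.compile(reg).matcher(source);
        if (matcher.find() && group >= 0 && group <= matcher.groupCount()) {
            return matcher.group(group);
        }
        return null;
    }

    public static void main(String[] args) {
        String html = "<tr><td>RAM</td><td>4 GB</td></tr>";

        // Group 1 captures whatever sits in the cell after the "RAM" label.
        String ram = findRegEx(html, "<td>RAM</td><td>(.*?)</td>", 1);
        System.out.println(ram); // prints "4 GB"

        // No match: the helper returns null instead of throwing.
        System.out.println(findRegEx(html, "<td>CPU</td><td>(.*?)</td>", 1));
    }
}
```

Group 0 would return the whole matched text, while group 1 returns only the part inside the parentheses, which is usually what you want when scraping a value out of a page.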



You can use a regular expression tester such as the one available at http://gskinner.com/RegExr/ to test your regular expressions and also adapt the templates available over there.

:)

WinBuilder featured on C'T 2010

WinBuilder got featured (again) in the German C'T Computer magazine.
This was three months ago, but I only noticed it while looking at the visitor log at Boot Land.

The good guys from C'T are making available online a screenshot of the pages where the article is mentioned. You can see it in more detail at this link:

And below are two screenshots of the magazine pages:



Guess my goal for 2011 is learning German to properly read the article as intended.

:)

Using the URL shortener from Google

Google has made available its own URL shortener service at http://goo.gl

They require that you install the Google Toolbar to use the service, or you can use the tool made available by Alexandre Gaigalas at http://gaigalas.net/lab/

Using their service gives some sense of stability. I was a fan of another service at http://tr.im, but when it became inactive, so did all my URLs from their site.

Hope you enjoy this tip.

:)

ipkr.net sold to game developer

Last week I sold one of the domains in my portfolio, http://ipkr.net, to a game developer from Ubisoft.

It was a good trade for both ends: he ended up with a good, short domain for his project, and I wouldn't really have put this domain to any proper use.

All that is left is wishing the new owner good luck and every success.

:)

HTTP servers for Java 5


In my project it is necessary to ensure that each client can also become an HTTP server on its own, so that it can communicate with other clients.

One would think that this task would be easier using JMS or another communication protocol like XMPP.

However, I'm assuming a worst-case scenario where the proxy for a LAN only allows connections to the outside world on port 80 and, even then, checks all packets to ensure that the content of each message is real HTTP content.

I'm also assuming that the users of the client have no administrative permissions and no power to allow ports to be opened or controlled by applications with guest permissions.

So, port 80 is a nice way to communicate since LAN proxies often leave this door open, but we still can't get past the administrative permissions required to control port 80. For the moment I'm using 8080 as an alternative.

I've considered several communication protocols over the past two weeks, and I lost so much time looking around that I became a bit disappointed at some point. Much of what you find in Java nowadays is targeted at enterprise applications, and there are good reasons why, whenever someone mentions "enterprise", words like "slow", "fat" and "octopus" come to mind.

Let's try to change this picture then.


For the HTTP server, I've abandoned the option of JMS and moved forward with plain HTTP, exchanging HTML messages back and forth (possibly marshaled with XML instructions).

Looking for HTTP servers, I discovered that Java 6 already comes with a built-in HTTP server. However, Java 5 is the minimum version we need to support, so I went looking for other projects. I found quite a few of them, but my favorite was this one: nanoHTTPD.

It is self-contained inside a single Java file (about 24 KB in size) and brings all the basic support for exchanging pages back and forth.
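For comparison, this is roughly what the built-in server from Java 6 looks like in use. It is only a minimal sketch: the class name and the response text are made up, it relies on the com.sun.net.httpserver package and therefore does not help on Java 5, and it says nothing about how nanoHTTPD itself is used.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// Hypothetical class name; requires Java 6 or later.
final class TinyHttp {

    /** Starts a server on the given port (0 picks any free port). */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", new HttpHandler() {
            public void handle(HttpExchange exchange) throws IOException {
                // Made-up page body, sent back for every request.
                byte[] body = "<html><body>hello</body></html>".getBytes("UTF-8");
                exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
                exchange.sendResponseHeaders(200, body.length);
                OutputStream out = exchange.getResponseBody();
                out.write(body);
                out.close();
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // 8080 because binding to port 80 usually needs administrative rights.
        HttpServer server = start(8080);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

Each client would run something like this on 8080 and talk to the others with plain HTTP requests, which is the same shape of solution as the single-file server, just without the Java 5 compatibility.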


As wishful thinking, it would be nice to use the power of servlets since the application is intended to be flexible and allows plugins to be integrated, but I'm running out of time and other priorities need to be met on time.

Nevertheless, here is a list of other small web servers in plain Java that you might be interested in taking a look at:

Jibble - http://www.jibble.org/jibblewebserver.php (small sized)
WikiWebServer - http://www.wikiwebserver.org (user editable)
TJWs - http://tjws.sourceforge.net (requires 7beee dependency to build)
WinStone - http://winstone.sourceforge.net (Servlet, looks professional, multiple hosts, lite version is 170Kb)



I've stumbled upon an interesting article about design in a distributed computing environment.

Looking at the past does help to prevent some (common) design errors in the future, and it sure is good to keep them in mind (regardless of how many times you hear them...)

KISS. Keep it (the design) simple and stupid. Complex systems tend to fail. They are hard to tune. They tend not to scale as well. They require smarter people to keep the wheels on the road. In short, they are a pain in the you-know-what. Conversely, simple systems tend to be easy to tune and debug and tend to fail less and scale better and are usually easier to operate. This isn’t news. As I’ve argued before, spreadsheets and SQL and PHP all succeeded precisely because they are simple and stupid—and forgiving. Interestingly, standards bodies tend to keep working on standards long after they should have been frozen. They forget these lessons and add a million bells and whistles that would, if adopted, undoubtedly cause the systems to fail. Luckily this doesn’t happen because by then there is a large body of installed code (and even hardware) out there that assumes the simpler spec and cannot handle the new bells and whistles. Therefore, no one uses them and we are all protected.

Hope you enjoy the read. You can grab the full article here:
http://queue.acm.org/detail.cfm?id=1103833

:)